CN110543291A

CN110543291A - Finite field large integer multiplier and implementation method of large integer multiplication based on SSA algorithm

Info

Publication number: CN110543291A
Application number: CN201910502766.2A
Authority: CN
Inventors: 谢星; 孙玲; 孙海燕; 杨玲玲
Original assignee: Nantong University
Current assignee: Nantong University
Priority date: 2019-06-11
Filing date: 2019-06-11
Publication date: 2019-12-06

Abstract

The invention discloses a finite field large integer multiplier, which is a 768K-bit large integer multiplier based on the SSA algorithm and comprises: a first input end for receiving first input data; a second input end for receiving second input data; an output end; a first finite field processing module and a second finite field processing module for performing NTT processing on the received input data; a control module for sorting the first input data and the second input data and outputting the sorted first input data to the first finite field processing module; and a carry processing module for performing carry processing on the data that has undergone inverse NTT processing to generate a final calculation result, which is output through the output end. The invention also provides an implementation method for large number multiplication based on the SSA algorithm. The multiplier and the method disclosed by the invention can be widely applied to public key encryption schemes.

Description

Finite field large integer multiplier and implementation method of large integer multiplication based on SSA algorithm

Technical Field

The invention relates to the technical field of encryption, and in particular to a finite field large integer multiplier and a method for performing large integer multiplication in fully homomorphic encryption operations.

Background

With the advent of the cloud computing era and the growing demand for cloud services, the security and privacy of user data have become a major concern. Although a cloud platform stores a user's data in encrypted form, the key is known to the cloud service provider, so the security and privacy of that data cannot be guaranteed. To address this problem, fully homomorphic encryption in the prior art allows the cloud server to perform arbitrary operations directly on the ciphertext. The data remains encrypted throughout the computation and no plaintext information is exposed, so fully homomorphic encryption is regarded as one of the effective technologies for protecting user data in the cloud era.

However, in fully homomorphic encryption (FHE) schemes such as Gentry's, the key must be very long to keep homomorphic computation secure. This makes the schemes highly complex and very slow on existing microprocessors. For example, at the lowest security setting, with dimension 2048, a 1-bit plaintext grows to about 785,000 bits after encryption, and encrypting each bit takes about 1 s. All subsequent operations on these ciphertexts are large-number operations, and the resulting computation delay hinders the practical application of FHE schemes.

Disclosure of Invention

To solve the problems of low encryption efficiency and low calculation speed, the inventors constructed a finite field large integer multiplier. For fully homomorphic encryption, the efficiency of large-number modular multiplication is key to the performance of the scheme. The large number multiplier allows the cloud server to perform arbitrary operations directly on the ciphertext. Because the multiplier is based on the SSA algorithm, operations are fast, the data remains encrypted throughout, no plaintext information is exposed, and the security of user data is improved. Performing homomorphic computation on encrypted data with the efficient large integer multiplication algorithm SSA (Schönhage-Strassen algorithm) greatly improves operation speed and efficiency, and no excessively long key needs to be set, so data security is protected.

According to one aspect of the present invention, there is provided a large integer multiplier, which is a 768K-bit large integer multiplier based on the SSA algorithm and comprises: a first input end for receiving first input data; a second input end for receiving second input data; an output end; a first finite field processing module and a second finite field processing module for performing NTT processing on the received input data; a control module for sorting the first input data and the second input data, outputting the sorted first input data to the first finite field processing module, outputting the sorted second input data to the second finite field processing module, and acquiring the NTT-transformed second input data from the second finite field processing module and outputting it to the first finite field processing module; and a carry processing module for performing carry processing on the data that has undergone inverse NTT processing to generate a final calculation result, which is output through the output end. The first finite field processing module is further configured to perform dot product (point-wise) modular multiplication of the NTT-transformed first input data and the NTT-transformed second input data, perform inverse NTT processing on the result, and output it to the carry processing module. Large-number multiplication with SSA requires an NTT of each operand and an inverse NTT of the dot product of the two transformed operands, so the first and second finite field processing modules are the key modules.
The control module sorts the first and second input data and then feeds them to the first and second finite field processing modules respectively, as the SSA algorithm requires. With the carry processing module, carry processing resolves the problem in the prior art of complex NTT operations caused by performing the multiplication directly. This improves the efficiency of large-number operations and effectively protects information security.

In some embodiments, the first and second finite field processing modules each include a first-stage processing component for performing first-stage operation processing on input data and a second-stage processing component for performing second-stage operation processing on the result of the first stage. The first-stage processing component includes sixty-four consecutive 1024-point NTT processing units, each formed by connecting two stages of base-32-point NTT processing units in series. The second-stage processing component includes a modular multiplier, which performs modular multiplication, and a base-64-point NTT processing unit. The modular addition, modular subtraction and modular multiplication in the NTT operate on powers of 2, and if 64-bit or other wide operations were implemented directly, every result would need a modular reduction, which occupies a large amount of hardware resources. The two finite field processing modules therefore each contain a first-stage and a second-stage processing component, so that the input data is processed hierarchically and the complexity of direct computation is avoided. Both modules use base-32-point and base-64-point NTT processing as basic units, so only shift and modular addition operations are needed, which reduces the complexity of the NTT operation.

In some embodiments, the base-32-point NTT processing unit includes thirty-two shift units and a tree-shaped large number summation processing unit, and the base-64-point NTT processing unit includes sixty-four shift units and a tree-shaped large number summation processing unit. Each shift unit shifts its input data and outputs the shifted data to the tree-shaped large number summation processing unit, which accumulates the shifted data and outputs the sum. In the finite field used here, the roots of unity for 64-point, 32-point and 16-point NTTs are powers of 2, so multiplication by a root of unity can be replaced by a shift and no direct multiplication is needed, which reduces the complexity of the NTT operation. The result is therefore obtained by placing shift units in the base-32-point and base-64-point NTT processing units and accumulating their outputs in the large number summation processing unit.
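The shift-and-accumulate idea can be sketched in a few lines of Python. The sketch assumes the Solinas prime p = 2^64 - 2^32 + 1 given in the detailed description, for which 2^96 ≡ -1 and 2^192 ≡ 1 (mod p), so that 64 = 2^6 is a primitive 32nd root of unity. The choice of 64 as the root and the function names are illustrative assumptions, not taken from the patent text.

```python
P = 2**64 - 2**32 + 1  # Solinas prime from the detailed description

def ntt32_shift(x):
    """32-point NTT in which every twiddle multiplication is a left shift.

    Assumes the root of unity is 64 = 2**6 (an illustrative choice), so
    multiplying by 64**(n*k) is a shift by 6*n*k bits, taken mod 192
    because 2**192 is congruent to 1 modulo P.
    """
    assert len(x) == 32
    out = []
    for k in range(32):
        acc = 0
        for n in range(32):
            acc += x[n] << ((6 * n * k) % 192)  # shift unit
        out.append(acc % P)  # one reduction after the accumulation
    return out

def ntt32_multiply(x):
    """Reference version using ordinary modular multiplication."""
    return [sum(x[n] * pow(64, n * k, P) for n in range(32)) % P
            for k in range(32)]
```

Both functions compute the same transform; the first one never multiplies two field elements, which is the property the shift units rely on.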

In some embodiments, the tree-shaped large number summation processing unit comprises: a carry-save adder with a 32-input, four-stage series structure for adding the shifted data; a modular reduction unit for converting the bit width of the carry-save adder's result; and a modular adder for performing modular addition on the width-converted data to obtain and output the summation result. To improve efficiency, the summation follows the carry-save principle. A carry-save adder has three operand input ports, two for the addends and one for a carry from the lower bits, and its output port is a carry output. Two carry-save adders connected in series form a 4-input carry-save adder, so a 32-input, 4-stage series structure is used for the tree-shaped large number summation processing unit. The result then passes through the modular reduction unit and the modular adder for bit-width conversion and modular addition, which greatly improves operation speed.
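As an illustration of what the modular reduction step can look like, the following Python sketch reduces a value of up to 192 bits modulo p = 2^64 - 2^32 + 1 using only additions and subtractions of 32-bit words. It relies on the identities 2^64 ≡ 2^32 - 1 and 2^96 ≡ -1 (mod p). The word layout and the function name are assumptions made for this example; the patent does not specify the internal structure of the reduction unit.

```python
P = 2**64 - 2**32 + 1
MASK32 = 2**32 - 1

def reduce_192(x):
    """Reduce a non-negative value below 2**192 modulo P (hypothetical helper).

    With x split into 32-bit words w0..w5, the word weights reduce to:
    2**64 -> 2**32 - 1, 2**96 -> -1, 2**128 -> -2**32, 2**160 -> 1 - 2**32.
    """
    assert 0 <= x < 2**192
    w = [(x >> (32 * i)) & MASK32 for i in range(6)]
    low = w[0] - w[2] - w[3] + w[5]   # coefficient of 2**0
    high = w[1] + w[2] - w[4] - w[5]  # coefficient of 2**32
    # The final "% P" stands in for the few conditional corrections a
    # hardware modular adder would apply to this small signed value.
    return (low + (high << 32)) % P
```

The intermediate value is only a few bits wider than 64 bits, so the last step needs a handful of add or subtract corrections rather than a general division.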

In some embodiments, the multiplier further comprises a storage module for storing operational data. The storage module comprises a first storage unit for storing the operational data of the 1024-point NTT processing units and the base-64-point NTT processing unit, and a second storage unit for storing the twiddle factors of the base-32-point and base-64-point NTT processing units. Because the NTT operation requires a large amount of memory, providing the first and second storage units makes a conflict-free access algorithm possible. The twiddle factors involved are fixed values and are powers of 2. Current mainstream hardware circuits are well suited to operations on powers of 2: only the twiddle factors need to be configured in the storage unit, and the result is obtained directly by shifting.

In some embodiments, the first storage unit is implemented as sixty-four RAM storage groups, each including thirty-two RAM memories. Different RAM memories can thus be accessed repeatedly to obtain calculation data during the NTT operation, which reduces the resource overhead of the design.

In some embodiments, the second storage unit is implemented as a ROM memory, and the first and second finite field processing modules share the second storage unit. To improve calculation efficiency and reduce ROM usage, the base-32-point and base-64-point NTT processing units both use twiddle factors from the same ROM, thereby achieving the effect of same-address calculation.

In some embodiments, the control module includes a preprocessing unit and a read-write unit. The preprocessing unit groups and sorts the acquired first input data and second input data respectively, generating 32K sample data groups that are stored in the thirty-two corresponding RAM memories. The read-write unit obtains the read-write addresses of the input data using same-address (in-place) operation and a conflict-free algorithm, reads and writes the input data, and outputs it to the first finite field processing module and the second finite field processing module. Grouping and sorting the first and second input data before storing them in RAM avoids occupying excessive storage resources. Data is read through the read-write unit at the moment it is used, and data storage is based on same-address operation and the conflict-free algorithm, so memory is reused and processing efficiency is greatly improved.

According to another aspect of the present invention, there is provided an implementation method for large number multiplication based on the SSA algorithm, including the following steps: inputting first input data and second input data to a finite field large integer multiplier through a first input end and a second input end respectively, wherein the finite field large integer multiplier is the finite field large integer multiplier described above; grouping and sorting the first input data and the second input data respectively and storing them in the corresponding RAM memories; reading the RAM memories to obtain the first input data and the second input data, and outputting them to the first finite field processing module and the second finite field processing module respectively for finite field transformation; obtaining the transformation result from the second finite field processing module and outputting it to the first finite field processing module, which performs dot product modular multiplication and inverse transformation on the transformed first and second input data; and performing carry processing on the inverse transformation result to obtain and output the final result. The first and second input data are preprocessed and stored in RAM, and the finite field calculation is carried out by the first and second finite field processing modules based on the SSA algorithm, so operations are fast, the data remains encrypted throughout, and the security of user data is improved.
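The steps of the method (grouping, NTT of each operand, dot product modular multiplication, inverse NTT, carry processing) can be sketched in software as follows. This is a minimal illustration at a reduced size, not the hardware design: the 24-bit group width is an assumption (768K bits divided into 32K groups), the generator 7 and the recursive radix-2 transform are choices made for brevity, and all function names are hypothetical.

```python
P = 2**64 - 2**32 + 1  # Solinas prime from the detailed description

def ntt(a, w):
    """Recursive radix-2 NTT; len(a) is a power of two, w a primitive root."""
    n = len(a)
    if n == 1:
        return a[:]
    even = ntt(a[0::2], w * w % P)
    odd = ntt(a[1::2], w * w % P)
    out = [0] * n
    t = 1
    for k in range(n // 2):
        u = t * odd[k] % P
        out[k] = (even[k] + u) % P
        out[k + n // 2] = (even[k] - u) % P
        t = t * w % P
    return out

def ssa_multiply(x, y, group_bits=24, points=64):
    """Multiply non-negative integers via NTT, dot product, INTT and carry."""
    half = points // 2
    assert max(x.bit_length(), y.bit_length()) <= group_bits * half
    mask = (1 << group_bits) - 1
    # Grouping: split into small groups and zero-pad to the transform length.
    a = [(x >> (group_bits * i)) & mask for i in range(half)] + [0] * half
    b = [(y >> (group_bits * i)) & mask for i in range(half)] + [0] * half
    # 7 is a quadratic non-residue mod P, so this root has order `points`.
    w = pow(7, (P - 1) // points, P)
    fa, fb = ntt(a, w), ntt(b, w)
    fc = [u * v % P for u, v in zip(fa, fb)]       # dot product modular multiply
    c = ntt(fc, pow(w, P - 2, P))                  # inverse transform
    inv_n = pow(points, P - 2, P)
    c = [v * inv_n % P for v in c]
    # Carry processing: propagate carries between adjacent groups.
    result, carry = 0, 0
    for i, v in enumerate(c):
        v += carry
        result |= (v & mask) << (group_bits * i)
        carry = v >> group_bits
    return result + (carry << (group_bits * points))
```

With 24-bit groups, each convolution coefficient stays below 2^15 · 2^48 = 2^63 even at 32K groups, so it fits under p without wrapping; this is one plausible reason for such a group width.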

In some embodiments, the finite field transformation performed by the first finite field processing module and the second finite field processing module includes: configuring the twiddle factors in a ROM memory; sequentially reading thirty-two-bit input data from the sixty-four RAM storage groups, outputting it to a 1024-point NTT processing unit for the first-stage NTT operation, and writing the result for each RAM storage group back into that group, until all sixty-four RAM storage groups have undergone 1024-point NTT processing; reading the results of the first-stage NTT operation from the sixty-four RAM storage groups and performing modular multiplication of these results by the twiddle factors; and outputting the modular multiplication results to a base-64-point NTT processing unit for the second-stage NTT operation to obtain and output the result of the finite field transformation. Storing the twiddle factors in ROM and the processing data in the RAM storage groups saves storage resources effectively and greatly improves operation speed.
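The two-stage structure described above corresponds to the usual decomposition of a length N1·N2 transform into N2 transforms of length N1, a twiddle-factor multiplication, and N1 transforms of length N2 (in the patent, N1 = 1024 and N2 = 64). The Python sketch below demonstrates that decomposition at a small size and checks it against a direct transform. The index mapping, the generator 7 and the function names are assumptions for illustration; the patent's exact data ordering is not reproduced here.

```python
P = 2**64 - 2**32 + 1

def root_of_unity(n):
    """Primitive n-th root of unity mod P for n a power of two.

    7 is a quadratic non-residue mod P, so 7**((P-1)/n) has order exactly n.
    """
    return pow(7, (P - 1) // n, P)

def ntt_direct(x, w):
    """Direct O(n^2) transform, used as the small building block."""
    n = len(x)
    return [sum(x[j] * pow(w, j * k, P) for j in range(n)) % P
            for k in range(n)]

def ntt_two_stage(x, n1, n2):
    """Length n1*n2 NTT as n2 length-n1 NTTs, twiddles, then length-n2 NTTs."""
    n = n1 * n2
    assert len(x) == n
    w = root_of_unity(n)
    w1 = pow(w, n2, P)  # primitive n1-th root (first stage)
    w2 = pow(w, n1, P)  # primitive n2-th root (second stage)
    # First stage: one length-n1 transform per residue class j2 modulo n2.
    stage1 = [ntt_direct(x[j2::n2], w1) for j2 in range(n2)]
    out = [0] * n
    for k1 in range(n1):
        # Twiddle multiplication (the role of the modular multiplier).
        row = [stage1[j2][k1] * pow(w, j2 * k1, P) % P for j2 in range(n2)]
        # Second stage: length-n2 transform.
        y = ntt_direct(row, w2)
        for k2 in range(n2):
            out[k1 + n1 * k2] = y[k2]
    return out
```

Calling `ntt_two_stage(x, 1024, 64)` would follow the same structure as the 64K-point transform, although this direct-sum sketch is far too slow to be practical at that size.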

Drawings

FIG. 1 is a schematic block diagram of a finite field large integer multiplier according to an embodiment of the present invention;

FIG. 2 is a schematic block diagram of a first finite field processing module of the finite field large integer multiplier according to an embodiment of the present invention;

FIG. 3 is a schematic block diagram of a second finite field processing module of the finite field large integer multiplier according to an embodiment of the present invention;

FIG. 4 is a schematic block diagram of a basic 32-point NTT processing unit of a finite field large integer multiplier according to an embodiment of the present invention;

FIG. 5 is a schematic block diagram of a basic 64-point NTT processing unit of a finite field large integer multiplier according to an embodiment of the present invention;

FIG. 6 is a schematic diagram of a tree summation unit of a finite field large integer multiplier according to an embodiment of the present invention;

FIG. 7 is a flowchart of a method for performing a large number multiplication based on the SSA algorithm according to an embodiment of the present invention;

Fig. 8 is a schematic block diagram of thirty-two RAM memory read/write address operations of the base-32-point NTT processing unit 32 according to an embodiment of the present invention.

Detailed Description

The present invention will be described in further detail with reference to the accompanying drawings.

FIG. 1 schematically shows a functional block diagram of a finite field large integer multiplier according to one embodiment of the present invention. As shown in FIG. 1:

The large integer multiplier 1 is a 768K-bit large integer multiplier based on the SSA algorithm, and includes: a first input end 11 for receiving first input data, a second input end 12 for receiving second input data, an output end 13, a first finite field processing module 22, a second finite field processing module 23, a control module 21 and a carry processing module 24. In the 768K-bit large number multiplier of this embodiment, the first input end 11 and the second input end 12 receive 64K groups of first input data and second input data respectively, each group being 64 bits, so that the output data of the multiplier totals 768K bits. The first finite field processing module 22 and the second finite field processing module 23 perform NTT (number theoretic transform, a finite field transform) processing on the input data received at the first input end 11 and the second input end 12, and are each implemented as a 64K-point finite field processing module. Because large-number computation with SSA needs the NTT of each input operand, and an INTT (inverse NTT) applied to the dot product of the NTT-transformed operands, the first finite field processing module 22 and the second finite field processing module 23 are the key modules. The control module 21 sorts the first input data and the second input data (the sorting may be set according to the type of data in the control module), outputs the sorted first input data to the first finite field processing module 22, and outputs the sorted second input data to the second finite field processing module 23. The control module 21 also obtains the NTT-transformed second input data from the second finite field processing module 23 and outputs it to the first finite field processing module 22 for further processing.
The carry processing module 24 performs carry processing on the data that has undergone inverse NTT processing, generates the final calculation result, and outputs it through the output end 13. To reduce resource overhead, the first finite field processing module 22 is further configured to perform dot product modular multiplication of the NTT-transformed first input data and the NTT-transformed second input data, perform inverse NTT processing on the result, and output it to the carry processing module 24.

Specifically, as a preferred embodiment, as shown in FIG. 2 and FIG. 3, the first finite field processing module 22 and the second finite field processing module 23 each include a first-stage processing component 221 for performing first-stage operation processing on input data and a second-stage processing component 222 for performing second-stage operation processing on the result of the first stage. The first-stage processing component 221 includes sixty-four consecutive 1024-point NTT processing units 31, each formed by connecting two stages of base-32-point NTT processing units 32 in series. The second-stage processing component 222 includes a modular multiplier 41 for performing modular multiplication and a base-64-point NTT processing unit 42. The modular addition, modular subtraction and modular multiplication in the NTT operate on powers of 2 in the finite field, and for the 64-bit wide operations of this embodiment every result would otherwise need a modular reduction, which occupies a large amount of hardware resources. The inventors therefore extend the 64-bit operands to 192 bits by zero-filling, which avoids a modular reduction after each operation, and select the base-32-point NTT processing unit 32 and the base-64-point NTT processing unit 42 as the basic units, so that only shift and modular addition operations are needed. As shown in FIG. 4, the base-32-point NTT processing unit 32 includes thirty-two shift units 321 and a tree-shaped large number summation processing unit 322. As shown in FIG. 5, the base-64-point NTT processing unit 42 includes sixty-four shift units 321 and a tree-shaped large number summation processing unit 322.
Each shift unit 321 shifts its input data and outputs the shifted data to the tree-shaped large number summation processing unit 322, which accumulates the shifted data and outputs the sum. The tree-shaped large number summation processing unit 322 includes a carry-save adder 61 (CSA) with a 32-input, four-stage series structure, a modular reduction unit 62 and a modular adder 63. The carry-save adder 61 adds the shifted data, the modular reduction unit 62 converts the bit width of the carry-save adder's result, and the modular adder 63 performs modular addition on the width-converted data to obtain and output the summation result. In the base-32-point NTT processing unit 32, the calculation formulas of the finite field NTT and INTT are respectively:

where k is the ordinal number, with $0 \le k \le 31$; n is the sampling point, with $0 \le n \le 31$; and p is a Solinas prime with value $p = 2^{64} - 2^{32} + 1$.
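The formulas themselves are not reproduced in this text. For orientation only, a standard 32-point finite field transform pair that is consistent with the symbols k, n and p defined above would take the following form. The symbols $x(n)$, $X(k)$ and $\omega$ (a primitive 32nd root of unity modulo p) are assumptions introduced here and may differ from the notation of the original formulas.

$$X(k) = \sum_{n=0}^{31} x(n)\,\omega^{nk} \bmod p, \qquad x(n) = 32^{-1} \sum_{k=0}^{31} X(k)\,\omega^{-nk} \bmod p$$

Here $32^{-1}$ denotes the multiplicative inverse of 32 modulo p. When $\omega$ is a power of 2, each product $x(n)\,\omega^{nk}$ reduces to a shift of $x(n)$.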

The thirty-two data items produced by the shift units 321 through the above operation are accumulated by the tree-shaped large number summation processing unit 322. As shown in FIG. 6, the CSA has three operand input ports, two for the first and second addend data and one for carry data from the lower bits, and two output ports, a partial sum output and a carry output. Given three inputs a, b and c, the partial sum is a ⊕ b ⊕ c and the carry is ab + bc + ac. Two carry-save adders connected in series form a 4-input CSA, and the summation result is then obtained through a modular addition operation.
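The carry-save relation above can be checked with a short Python sketch. The first function is a bitwise 3-input carry-save adder using exactly the partial sum a ⊕ b ⊕ c and carry ab + bc + ac; the second reduces many operands with layers of such adders. The generic 3-to-2 layering used here is an assumption for illustration and is not the 32-input, four-stage arrangement of FIG. 6.

```python
def carry_save_add(a, b, c):
    """3-input carry-save adder on non-negative integers.

    Returns (partial_sum, carry) with partial_sum + carry == a + b + c.
    The carry is shifted left because it belongs to the next bit position.
    """
    partial_sum = a ^ b ^ c
    carry = ((a & b) | (b & c) | (a & c)) << 1
    return partial_sum, carry

def csa_tree_sum(values):
    """Sum many operands with layers of carry-save adders (illustrative)."""
    terms = list(values)
    while len(terms) > 2:
        next_terms = []
        for i in range(0, len(terms) - 2, 3):
            next_terms.extend(carry_save_add(terms[i], terms[i + 1], terms[i + 2]))
        # Operands that did not fill a group of three pass through unchanged.
        next_terms.extend(terms[len(terms) - len(terms) % 3:])
        terms = next_terms
    # A single carry-propagating addition finishes the sum.
    return sum(terms)
```

No carry ripples through the word inside `carry_save_add`, which is why the long carry-propagating addition is needed only once at the end.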

In the base-64-point NTT processing unit 42, the calculation formulas of the finite field NTT and INTT are respectively:

where k is the ordinal number, with $0 \le k \le 63$; n is the sampling point, with $0 \le n \le 63$; and p is a Solinas prime with value $p = 2^{64} - 2^{32} + 1$.

The data processed by the sixty-four shift units 321 through the above operation is accumulated by the tree-shaped large number summation processing unit 322.

The multiplier also comprises a storage module 7 for storing operational data. The storage module 7 comprises a first storage unit 71 and a second storage unit 72. The first storage unit 71 stores the operational data of the 1024-point NTT processing units 31 and the base-64-point NTT processing unit 42, and is implemented as sixty-four RAM storage groups, each comprising thirty-two RAM memories. To improve data efficiency during calculation, intermediate data is stored in each RAM (random access memory) by same-address operation: the current layer of results always overwrites the previous layer, so the output data occupies the memory originally used by the corresponding input data. A butterfly computation whose output and input data share the same memory unit is called same-address (in-place) computation, and it is highly efficient when the data of all sample points must be computed. The second storage unit 72 stores the twiddle factors (fixed values that are powers of 2) of the base-32-point NTT processing unit 32 and the base-64-point NTT processing unit 42, and is implemented as one ROM memory. To reduce ROM usage and save resources, the first finite field processing module 22 and the second finite field processing module 23 share the twiddle factors in the second storage unit, i.e. they share the same memory unit.
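The same-address principle, in which each butterfly writes its outputs over its own inputs, can be illustrated with a conventional in-place radix-2 NTT. The patent uses base-32 and base-64 butterflies; radix 2 is used here only because it keeps the sketch short, and the generator 7 and function name are assumptions.

```python
P = 2**64 - 2**32 + 1

def ntt_in_place(a, w):
    """In-place radix-2 NTT of list a (length a power of two).

    w must be a primitive len(a)-th root of unity modulo P. The list is
    overwritten layer by layer; no second buffer is allocated.
    """
    n = len(a)
    # Bit-reversal permutation so the butterflies can run in natural order.
    j = 0
    for i in range(1, n):
        bit = n >> 1
        while j & bit:
            j ^= bit
            bit >>= 1
        j |= bit
        if i < j:
            a[i], a[j] = a[j], a[i]
    length = 2
    while length <= n:
        w_len = pow(w, n // length, P)
        for start in range(0, n, length):
            t = 1
            for k in range(length // 2):
                u = a[start + k]
                v = a[start + k + length // 2] * t % P
                # Outputs overwrite the two locations the inputs came from.
                a[start + k] = (u + v) % P
                a[start + k + length // 2] = (u - v) % P
                t = t * w_len % P
        length <<= 1
```

The memory footprint stays at one word per data point throughout, which is the saving the same-address operation aims for.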

The control module 21 includes a preprocessing unit (not shown) and a read-write unit (not shown). The preprocessing unit groups and sorts the acquired first input data and second input data, generates 32K sample data groups, and stores them in the thirty-two corresponding RAM memories. The read-write unit obtains the read-write addresses of the input data, reads and writes the input data, and outputs it to the first finite field processing module 22 and the second finite field processing module 23. Specifically, as shown in FIG. 8, taking as an example how the thirty-two RAM memories of the base-32-point NTT processing unit 32 obtain data read-write addresses, the implementation is as follows. The memory addresses of the RAM memories in FIG. 8 are converted into two-dimensional addresses: the block address is the memory number and the point address is the data address within that memory. Block addresses range over 0-31 and point addresses over 0-1023; for example, data item 32 has address (1, 1). Let the index of an input data item be written as $[d_{n-1}, d_{n-2}, \ldots, d_2, d_1, d_0]_r$, where n is the number of blocks, d is the original sequence number, r is the number of NTT points, and N is the length of the NTT. The address of the stored data is:

$$\text{Address} = [d_{n-1}, d_{n-2}, \ldots, d_2, d_1],$$

$$\text{Bank\_index} = (d_{n-1} + d_{n-2} + \cdots + d_2 + d_1 + d_0) \bmod r;$$

where r = 32 and N = 1024. In operation, the control module sorts the input data through the preprocessing unit; the read-write unit obtains read-write addresses for the sorted data according to the same-address operation, and the data is stored in dual-port RAM. The read-write address control unit then reads and writes the data: data read out is processed by the base-32-point NTT processing unit, and the result is modular-multiplied by the twiddle factors stored in ROM and written back to RAM.
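The address mapping can be sketched in Python under the following reading, which is an assumption because the subscripts are not legible in the source text: the data index is written in base r, the point address is formed from all digits except the least significant one, and the bank index is the digit sum modulo r. With r = 32 and N = 1024 this reproduces the example in which data item 32 is placed at (1, 1). The function name is hypothetical.

```python
def bank_and_address(index, r=32, digits=2):
    """Map a data index to (bank_index, address) under the assumed scheme.

    The index is split into base-r digits d[0] (least significant) through
    d[digits-1]. The bank is the digit sum mod r; the address keeps every
    digit except d[0].
    """
    d = [(index // r**i) % r for i in range(digits)]
    bank = sum(d) % r
    address = 0
    for digit in reversed(d[1:]):
        address = address * r + digit
    return bank, address
```

Under this mapping, the 32 operands of any base-32 butterfly differ in exactly one digit, so their digit sums differ modulo 32 and they fall into 32 different banks, which allows all of them to be read or written in the same cycle.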

According to the 768kbit large number multiplier based on the SSA algorithm disclosed by the embodiment, the first finite field processing module and the second finite field processing module of the key module are realized by using the base-32-point NTT processing unit and the base-64-point NTT processing unit, so that only the operations of shifting and modular addition are required in the operation process, and the operation efficiency is improved. In addition, in the data operation process, the data access of the storage module is realized by adopting the same-address operation and the conflict-free algorithm, the resource consumption is reduced, the operation speed is improved, and the data security of the user is greatly protected.

Fig. 7 schematically shows a flowchart of an implementation method for performing fully homomorphic large number multiplication based on the SSA algorithm according to an embodiment of the present invention, as shown in fig. 7, the embodiment includes the following steps:

Step S701: and inputting the first input data and the second input data to the fully homomorphic large integer number multiplier through a first input end and a second input end respectively, wherein the fully homomorphic large integer number multiplier is the fully homomorphic large integer number multiplier.

Step S702: and respectively grouping and sequencing the first input data and the second input data and storing the first input data and the second input data into corresponding RAM memories.

Step S703: and reading the RAM to obtain first input data and second input data, and respectively outputting the first input data and the second input data to a first finite field processing module and a second finite field processing module for finite field transformation calculation. And then, the read-write unit in the control module 21 performs read-write operation on the data, the read data is operated by the base-32 NTT module, and the operation result and the twiddle factor stored in the ROM unit are subjected to modular multiplication and then stored in the RAM. The method is specifically realized as follows: firstly, configuring twiddle factors in a ROM, sequentially reading thirty-two bit input data from sixty-four RAM storage groups, outputting the data to a 1024-point NTT processing unit for carrying out first-stage NTT operation, and writing an operation result of each RAM storage group into the corresponding RAM storage group until the sixty-four RAM storage groups finish 1024-point NTT processing; and respectively reading the calculation results of the first-stage NTT operation from sixty-four RAM storage groups, performing modular multiplication operation on the read calculation results and the rotation factors, and outputting the modular multiplication operation results to a base-64-point NTT processing unit for performing second-stage NTT operation to obtain the calculation results of finite field transformation and outputting the calculation results.

Step S704: The first finite field processing module performs a dot product modular multiplication on the transformed first input data and second input data, and then performs inverse transformation on the product.
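The effect of step S704 can be checked with a small Python sketch: multiplying the two transformed sequences point by point and applying the inverse NTT yields the cyclic convolution of the original sequences. The modulus P = 2^64 - 2^32 + 1, the 8-point size and the function names are assumptions chosen for illustration; the O(n^2) transform stands in for the hardware NTT units.

```python
P = 2**64 - 2**32 + 1  # assumed prime modulus; 2 has order 192 mod P


def ntt(x, w):
    """Reference O(n^2) NTT with root of unity w."""
    n = len(x)
    return [sum(x[j] * pow(w, j * k, P) for j in range(n)) % P
            for k in range(n)]


def intt(spectrum, w):
    """Inverse NTT: transform with w**-1, then scale by n**-1."""
    n = len(spectrum)
    n_inv = pow(n, -1, P)  # modular inverse, Python 3.8+
    w_inv = pow(w, -1, P)
    return [v * n_inv % P for v in ntt(spectrum, w_inv)]


def cyclic_convolution(a, b, w):
    """Transform, multiply point by point modulo P, transform back."""
    fa, fb = ntt(a, w), ntt(b, w)
    product = [u * v % P for u, v in zip(fa, fb)]
    return intt(product, w)


# Two 4-digit operands, zero-padded to 8 points so the result does not wrap.
a = [3, 1, 4, 1, 0, 0, 0, 0]
b = [2, 7, 1, 8, 0, 0, 0, 0]
W8 = pow(2, 24, P)  # 2**(192 // 8) is a primitive 8th root of unity
c = cyclic_convolution(a, b, W8)
```

The entries of `c` are the un-normalized digit products of the two operands; they may exceed the digit base and still need carry processing.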

Step S705: Perform carry processing on the result of the inverse transformation to obtain the final processing result, and output it.
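Carry processing as in step S705 turns the convolution coefficients, which can be wider than one digit, into the digits of the final product. The Python sketch below shows one sequential way to do this; the 4-bit digit width, the example coefficients (the digit convolution of 0x1413 and 0x8172) and the function names are illustrative assumptions, and hardware would typically organize the carry chain differently.

```python
def carry_propagate(coeffs, digit_bits):
    """Normalize convolution coefficients into base-2**digit_bits digits."""
    mask = (1 << digit_bits) - 1
    digits, carry = [], 0
    for c in coeffs:
        total = c + carry
        digits.append(total & mask)   # digit kept at this position
        carry = total >> digit_bits   # overflow moves to the next position
    while carry:                      # flush any remaining carry
        digits.append(carry & mask)
        carry >>= digit_bits
    return digits


def digits_to_int(digits, digit_bits):
    """Reassemble little-endian digits into an integer."""
    return sum(d << (digit_bits * i) for i, d in enumerate(digits))


# Coefficients of the digit-wise product of 0x1413 and 0x8172 (4-bit digits).
coeffs = [6, 23, 18, 55, 19, 33, 8, 0]
digits = carry_propagate(coeffs, 4)
product = digits_to_int(digits, 4)
```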

According to the method provided by this embodiment, the use of addition and shift operations can maximize parallel processing during large number multiplication, effectively improve the processing speed, and ensure the safety of user data.
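Why addition and shift operations suffice can be illustrated with a short Python sketch. When the root of unity of a small NTT is a power of two, every twiddle multiplication becomes a left shift, and each output is a sum of shifted inputs followed by a modular reduction. The sketch assumes the modulus P = 2^64 - 2^32 + 1, in which 2 has order 192, so that 2^6 is a 32nd root of unity and 2^3 a 64th root of unity; the modulus and the function names are assumptions, and Python integers stand in for the adder tree and reduction hardware.

```python
P = 2**64 - 2**32 + 1  # assumed prime modulus; 2**96 == -1 (mod P)


def ntt_shift_add(x, log2_root):
    """NTT whose root of unity is 2**log2_root: shifts and additions only."""
    n = len(x)
    out = []
    for k in range(n):
        acc = 0
        for j in range(n):
            shift = (log2_root * j * k) % 192  # 2**192 == 1 (mod P)
            acc += x[j] << shift               # shift, then accumulate
        out.append(acc % P)                    # one reduction per output
    return out


def ntt_multiply(x, w):
    """Reference NTT using general modular multiplications."""
    n = len(x)
    return [sum(x[j] * pow(w, j * k, P) for j in range(n)) % P
            for k in range(n)]


x32 = [(7 * i * i + 3) % 4096 for i in range(32)]
y32 = ntt_shift_add(x32, 6)   # root 2**6 = 64 has order 32

x64 = [(13 * i + 5) % 4096 for i in range(64)]
y64 = ntt_shift_add(x64, 3)   # root 2**3 = 8 has order 64
```

Because all outputs of one small transform depend only on shifted copies of the same inputs, they can be computed in parallel, which is the property the paragraph above refers to.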

Finally, it should be noted that: the above embodiments are only used to illustrate the technical solutions of the present application, and not to limit the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions in the embodiments of the present application.

Claims

1. A finite field large integer multiplier, wherein the large integer multiplier is a 768K-bit large integer multiplier based on the SSA algorithm, comprising:

A first input for receiving first input data;

A second input for receiving second input data;

An output end;

A first finite field processing module and a second finite field processing module, for performing NTT transform processing on the received input data;

A control module for sorting the first input data and the second input data, outputting the sorted first input data to the first finite field processing module, outputting the sorted second input data to the second finite field processing module, acquiring the NTT-transformed second input data from the second finite field processing module, and outputting it to the first finite field processing module; and

A carry processing module for performing carry processing on the data that has undergone inverse NTT processing, to generate a final calculation result and output it through the output end;

wherein the first finite field processing module is further configured to perform dot product modular multiplication on the NTT-transformed first input data and the NTT-transformed second input data, perform inverse NTT processing on the result of the dot product modular multiplication, and output the result to the carry processing module.

2. The finite field large integer multiplier of claim 1, wherein the first finite field processing module and the second finite field processing module each include a first-stage processing component for performing first-stage arithmetic processing on input data and a second-stage processing component for performing second-stage arithmetic processing on the result of the first-stage arithmetic processing, wherein

The first-stage processing component comprises sixty-four consecutive 1024-point NTT processing units, each 1024-point NTT processing unit being formed by two stages of base-32 point NTT processing units connected in series;

The second-stage processing component comprises a modular multiplier for performing modular multiplication operations and a base-64 point NTT processing unit.

3. The finite field large integer multiplier of claim 2, wherein the base-32 point NTT processing unit is implemented to include thirty-two shift units and a tree-shaped large sum processing unit, and the base-64 point NTT processing unit is implemented to include sixty-four shift units and a tree-shaped large sum processing unit;

The shift units are used for performing shift operations on input data and outputting the shifted data to the tree-shaped large sum processing unit;

The tree-shaped large sum processing unit is used for accumulating the shifted data to obtain a summation result and outputting the summation result.

4. The finite field large integer multiplier of claim 3, wherein the tree-shaped large sum processing unit comprises:

A 32-input carry-save adder with a four-stage series structure, for adding the shifted data;

A modular reducer for performing digit (bit-width) conversion on the addition result of the carry-save adder; and

A modulo adder for performing modulo addition on the data after the digit conversion to obtain a summation result and outputting the summation result.

5. The finite field large integer multiplier of any of claims 2 to 4, further comprising a storage module for storing operational data, the storage module comprising:

A first storage unit for storing the operational data of the 1024-point NTT processing unit and the base-64 point NTT processing unit; and

A second storage unit for storing the twiddle factors of the base-32 point NTT processing unit and the base-64 point NTT processing unit.

6. The finite field large integer multiplier of claim 5, wherein the first storage unit is implemented as sixty-four RAM storage groups, each RAM storage group comprising thirty-two RAM memories.

7. The finite field large integer multiplier of claim 6, wherein the second storage unit is implemented as a ROM memory.

8. The finite field large integer multiplier of claim 5, wherein the first finite field processing module and the second finite field processing module are arranged to share the second storage unit.

9. The finite field large integer multiplier of claim 6, wherein the control module comprises:

A preprocessing unit for grouping and sorting the acquired first input data and second input data respectively to generate 32K sample data groups, which are stored in the thirty-two corresponding RAM memories; and

A read-write unit for obtaining the read-write addresses of the input data by using in-place (same-address) operation and a conflict-free algorithm, performing read-write operations on the input data, and outputting the data to the first finite field processing module and the second finite field processing module.

10. A method for implementing finite field large number multiplication based on the SSA algorithm, comprising the following steps:

Inputting first input data and second input data to a large integer multiplier through a first input terminal and a second input terminal, respectively, wherein the large integer multiplier is the finite field large integer multiplier of any one of claims 1 to 9;

Grouping and sorting the first input data and the second input data respectively, and storing them in the corresponding RAM memories;

Reading the RAM to obtain first input data and second input data, and respectively outputting the first input data and the second input data to a first finite field processing module and a second finite field processing module for finite field transformation calculation;

The second finite field processing module obtains a transformation processing result and outputs the transformation processing result to the first finite field processing module, and the first finite field processing module performs dot product modular multiplication operation and inverse transformation processing on the transformed first input data and second input data;

Performing carry processing on the result of the inverse transformation to obtain a final processing result, and outputting the final processing result.

11. The method of claim 10, wherein the finite field transformation calculation performed by the first finite field processing module and the second finite field processing module comprises:

Configuring twiddle factors in a ROM memory;

Sequentially reading thirty-two bit input data from sixty-four RAM storage groups, outputting the thirty-two bit input data to a 1024-point NTT processing unit for carrying out first-stage NTT operation, and writing an operation result of each RAM storage group into a corresponding RAM storage group until the sixty-four RAM storage groups are all subjected to 1024-point NTT processing;

Reading the calculation results of the first-stage NTT operation from the sixty-four RAM storage groups respectively, and performing modular multiplication on the read calculation results and the twiddle factors;

Outputting the modular multiplication results to a base-64 point NTT processing unit for the second-stage NTT operation, to obtain and output the calculation result of the finite field transformation.