CN115459898A - Paillier homomorphic encryption and decryption calculation method and system based on GPU - Google Patents

Info

Publication number
CN115459898A
Authority
CN
China
Prior art keywords
gpu
calculation
message
decryption
homomorphic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211017789.2A
Other languages
Chinese (zh)
Inventor
朱辉
李临风
郑艳冬
王枫为
李晖
薛行策
黄煜坤
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xidian University
Original Assignee
Xidian University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xidian University
Priority to CN202211017789.2A
Publication of CN115459898A
Legal status: Pending

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/008 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols involving homomorphic encryption
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/08 Key distribution or management, e.g. generation, sharing or updating, of cryptographic keys or passwords
    • H04L9/0861 Generation of secret information including derivation or calculation of cryptographic keys or passwords
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L2209/00 Additional information or applications relating to cryptographic mechanisms or cryptographic arrangements for secret or secure communication H04L9/00
    • H04L2209/12 Details relating to cryptographic hardware or logic circuitry
    • H04L2209/125 Parallelization or pipelining, e.g. for accelerating processing of cryptographic operations

Abstract

The invention discloses a GPU-based Paillier homomorphic encryption and decryption calculation method, applied to a GPU device side, comprising the following steps: generating a second partial precomputation table in parallel through a plurality of GPU threads according to the cryptographic parameters, the sliding window parameters and the first partial precomputation table sent by the CPU device side, and obtaining a global precomputation table from the second partial precomputation table and the first partial precomputation table; generating a first ciphertext message in parallel through a plurality of GPU threads according to the global precomputation table, the message to be encrypted and the homomorphic encryption parameters, and sending the first ciphertext message to the CPU device side; generating a plaintext message in parallel through a plurality of GPU threads according to the message to be decrypted and the decryption parameters sent by the CPU device side, and sending the plaintext message to the CPU device side; and performing a homomorphic addition in parallel through a plurality of GPU threads according to the second ciphertext message and the third ciphertext message sent by the CPU device side to generate a fourth ciphertext message, and sending the fourth ciphertext message to the CPU device side.

Description

Paillier homomorphic encryption and decryption calculation method and system based on GPU
Technical Field
The invention belongs to the technical field of information security, and particularly relates to a Paillier homomorphic encryption and decryption calculation method and system based on a GPU.
Background
Currently, with the popularization of the outsourced computing model, outsourcing data storage and computing services has become an inevitable trend. However, while outsourced computing provides economical and reliable shared computing services, it also poses new security challenges for the data outsourcing process, so ensuring the confidentiality and integrity of data during outsourced computation has become an urgent problem in cloud computing.
As a special class of cryptosystem, homomorphic encryption has the major advantage that ciphertexts can be processed without revealing sensitive information. However, existing homomorphic encryption schemes involve complex operations and large ciphertext expansion, their computational efficiency remains low, and large volumes of homomorphic operations in a cloud environment place heavy computational pressure on both the client side and the server side. The Paillier homomorphic encryption algorithm is a public-key cryptosystem that supports additive homomorphism, and it has been widely used in the design of privacy-preserving schemes because it is relatively efficient and has a complete security proof. However, as the computing power of devices keeps improving, the Paillier algorithm must keep increasing its key length to meet security requirements, which noticeably reduces encryption and decryption efficiency.
In recent years, as the semiconductor industry has matured and hardware computing platforms have improved, the computing power of hardware devices has grown rapidly, and hardware architectures designed for compute-intensive tasks are well suited to efficient implementation of cryptographic algorithms. However, the Paillier algorithm has complex algorithmic logic, and its results and intermediate results occupy a large amount of memory; the classical optimization algorithms it relies on cannot be directly adapted to the characteristics of the various hardware platforms, which leads to poor efficiency in encryption, decryption and homomorphic operations.
Some solutions to the above problems exist, but one of their shortcomings is that they focus mainly on low-latency implementations on central processing unit (CPU) platforms; efficient implementations on graphics processing unit (GPU) platforms have not been studied sufficiently, so the computing capability of GPU platforms is difficult to exploit.
Disclosure of Invention
In order to solve the problems in the related art, the invention provides a Paillier homomorphic encryption and decryption calculation method and system based on a GPU. The technical problem to be solved by the invention is realized by the following technical scheme:
the invention provides a Paillier homomorphic encryption and decryption calculation method based on a GPU, which is applied to a GPU device side and comprises the following steps:
generating a second partial pre-calculation table in parallel through a plurality of first GPU threads of the GPU device side according to the cryptographic parameters, the sliding window parameters and the first partial pre-calculation table sent by the CPU device side, and obtaining a global pre-calculation table from the second partial pre-calculation table and the first partial pre-calculation table;
according to the global precomputation table, the message to be encrypted and the homomorphic encryption parameters which are sent by the CPU equipment end, generating a first ciphertext message in parallel through a plurality of second GPU threads of the GPU equipment end, and sending the first ciphertext message to the CPU equipment end;
according to the message to be decrypted and the decryption parameters sent by the CPU equipment end, generating a plaintext message in parallel through a plurality of third GPU threads of the GPU equipment end, and sending the plaintext message to the CPU equipment end;
according to the second ciphertext message and the third ciphertext message sent by the CPU device side, performing homomorphic addition operation in parallel through a plurality of fourth GPU threads of the GPU device side to generate a fourth ciphertext message, and sending the fourth ciphertext message to the CPU device side.
The invention also provides a Paillier homomorphic encryption and decryption calculation method based on the GPU, which is applied to a CPU device end and comprises the following steps:
acquiring a security parameter, a random value and a sliding window parameter;
generating a cryptographic parameter based on the security parameter and the random value;
calculating a first partial pre-calculation table based on the sliding window parameter;
sending the cryptographic parameters, the sliding window parameter and the first partial pre-calculation table to the GPU device side;
acquiring a message to be encrypted and generating homomorphic encryption parameters corresponding to the message to be encrypted;
sending the message to be encrypted and the homomorphic encryption parameter to the GPU equipment terminal;
acquiring a message to be decrypted and a decryption parameter;
sending the message to be decrypted and the decryption parameter to the GPU equipment end;
acquiring a second ciphertext message and a third ciphertext message;
and sending the second ciphertext message and the third ciphertext message to the GPU equipment terminal.
The invention also provides a Paillier homomorphic encryption and decryption computing system based on the GPU, which comprises:
homomorphic operation module and arithmetic operation module; the arithmetic operation module is used for being called by the homomorphic operation module;
the homomorphic operation module comprises:
the system initialization and pre-calculation module, in which the CPU device side acquires a security parameter, a random value and a sliding window parameter, generates cryptographic parameters based on the security parameter and the random value, calculates a first partial pre-calculation table based on the sliding window parameter, and sends the cryptographic parameters, the sliding window parameter and the first partial pre-calculation table to the GPU device side; and in which the GPU device side generates a second partial pre-calculation table in parallel through a plurality of first GPU threads of the GPU device side according to the cryptographic parameters, the sliding window parameter and the first partial pre-calculation table, and obtains a global pre-calculation table from the second partial pre-calculation table and the first partial pre-calculation table;
the encryption calculation module is used for acquiring a message to be encrypted by the CPU equipment end, generating homomorphic encryption parameters corresponding to the message to be encrypted, and sending the message to be encrypted and the homomorphic encryption parameters to the GPU equipment end; the GPU equipment end generates first ciphertext messages in parallel through a plurality of second GPU threads of the GPU equipment end according to the global precomputation table, the messages to be encrypted and the homomorphic encryption parameters, and sends the first ciphertext messages to the CPU equipment end;
the decryption calculation module is used for acquiring a message to be decrypted and a decryption parameter by the CPU equipment end and sending the message to be decrypted and the decryption parameter to the GPU equipment end; generating, by the GPU device side, a plaintext message in parallel by a plurality of third GPU threads of the GPU device side according to the message to be decrypted and the decryption parameter, and sending the plaintext message to the CPU device side;
the homomorphic addition calculation module is used for acquiring a second ciphertext message and a third ciphertext message from the CPU equipment end and sending the second ciphertext message and the third ciphertext message to the GPU equipment end; performing homomorphic addition operation in parallel by the GPU device end through a plurality of fourth GPU threads of the GPU device end according to the second ciphertext message and the third ciphertext message to generate a fourth ciphertext message, and sending the fourth ciphertext message to the CPU device end;
the arithmetic operation module includes:
the non-fixed base modular exponentiation operation module is used for generating a local pre-calculation table by performing pre-calculation window calculation when plaintext messages are generated in parallel through the plurality of third GPU threads, executing a modular multiplication calculation method based on a Montgomery algorithm and a Karatsuba algorithm by calling a modular multiplication calculation module based on the Montgomery algorithm and the Karatsuba algorithm, and performing non-fixed base modular exponentiation operation by inquiring the local pre-calculation table;
the Barrett algorithm-based modular operation module is used for performing modular processing when plaintext messages are generated in parallel through the plurality of third GPU threads;
the fixed base modular exponentiation operation module is used for executing a modular multiplication calculation method based on Montgomery algorithm and Karatsuba algorithm by calling a modular multiplication calculation module based on Montgomery algorithm and Karatsuba algorithm when the second part of pre-calculation table is generated by the plurality of first GPU threads; when first ciphertext messages are generated in parallel through the plurality of second GPU threads, fixed base modular exponentiation is carried out by inquiring the global precomputation table;
the modular multiplication calculation module based on the Montgomery algorithm and the Karatsuba algorithm is used for performing modular multiplication operation and modular square operation based on the Montgomery algorithm and performing multi-precision multiplication operation in the modular multiplication operation and the modular square operation based on the Karatsuba algorithm;
and the basic operation module of the assembly instruction based on the GPU hardware is used for executing addition operation, subtraction operation, multiplication operation and shift operation.
The invention has the following beneficial technical effects:
The GPU device side generates a second partial pre-calculation table in parallel through a plurality of GPU threads of the GPU device side according to the cryptographic parameters, the sliding window parameters and the first partial pre-calculation table sent by the CPU device side, and obtains a global pre-calculation table from the second partial pre-calculation table and the first partial pre-calculation table; according to the global pre-calculation table and the message to be encrypted and homomorphic encryption parameters sent by the CPU device side, it generates a first ciphertext message in parallel through a plurality of GPU threads of the GPU device side and sends the first ciphertext message to the CPU device side; according to the message to be decrypted and the decryption parameters sent by the CPU device side, it generates a plaintext message in parallel through a plurality of GPU threads of the GPU device side and sends the plaintext message to the CPU device side; and according to the second ciphertext message and the third ciphertext message sent by the CPU device side, it performs a homomorphic addition in parallel through a plurality of GPU threads of the GPU device side to generate a fourth ciphertext message and sends the fourth ciphertext message to the CPU device side. Homomorphic operations such as computation of the global pre-calculation table, encryption and decryption are thus moved to the GPU for parallel computation. First, generating the global pre-calculation table on the GPU side can improve the throughput of subsequent encryption; second, the strong computing power of the GPU allows longer keys to be used for encryption, which improves the security of the encrypted data; third, running the homomorphic operations in parallel improves their computational efficiency.
In other words, because the GPU has more computing units than the CPU and can process compute-intensive data efficiently in parallel, the invention takes the characteristics of the GPU many-core architecture into account: it moves homomorphic operations such as computation of the global pre-calculation table, encryption and decryption to the GPU, decomposes them, and executes them concurrently at fine granularity. This improves computational throughput and the security of the encrypted data, greatly improves the efficiency of Paillier encryption, decryption and other homomorphic operations, and can provide efficient outsourced homomorphic encryption and decryption services for privacy-protection scenarios.
The present invention will be described in further detail with reference to the accompanying drawings and examples.
Drawings
Fig. 1 is an optional flowchart of a Paillier homomorphic encryption and decryption calculation method based on a GPU applied to a GPU device end according to an embodiment of the present invention;
Fig. 2 is another optional flowchart of the GPU-based Paillier homomorphic encryption and decryption calculation method, applied to the CPU device side, according to an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of an exemplary GPU-based Paillier homomorphic encryption and decryption computing system according to an embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to specific examples, but the embodiments of the present invention are not limited thereto.
In the description of the present invention, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implying any number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of that feature. In the description of the present invention, "a plurality" means two or more unless specifically defined otherwise.
In the description herein, references to the terms "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, such terms do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples, and various embodiments or examples described in this specification can be combined by those skilled in the art.
While the invention has been described in connection with various embodiments, other variations to the disclosed embodiments can be understood and effected by those skilled in the art in practicing the claimed invention, from a review of the drawings, the disclosure, and the appended claims. In the claims, the word "comprising" does not exclude other elements or steps, and the word "a" or "an" does not exclude a plurality. A single processor or other unit may fulfill the functions of several items recited in the claims. The mere fact that certain measures are recited in mutually different dependent claims does not indicate that a combination of these measures cannot be used to advantage.
At present, the Paillier homomorphic encryption algorithm has complex algorithmic logic, and its results and intermediate results occupy a large amount of memory; the classical optimization algorithms it relies on cannot be directly adapted to the characteristics of the various hardware platforms, which leads to poor efficiency in encryption, decryption and homomorphic operations. Related solutions exist, but they have the following problems and shortcomings:
1. They focus mainly on low-latency implementations on CPU platforms; efficient implementations on GPU platforms have not been studied sufficiently, and some optimization algorithms cannot adapt to the platform characteristics of the GPU, so the computing power of the GPU platform is difficult to exploit.
2. Efficient GPU implementations of the arithmetic modules that cryptographic algorithms depend on have not been studied sufficiently. For example, modular reduction, modular multiplication and modular exponentiation all have complex algorithmic structures, and implementing them directly on a GPU platform causes severe warp divergence (threads in the same warp taking different branches), which results in high computational overhead.
Fig. 1 is an optional flowchart of a Paillier homomorphic encryption and decryption calculation method based on a GPU according to an embodiment of the present invention, and as shown in fig. 1, the method includes the following steps:
S101: according to the cryptographic parameters, the sliding window parameters and the first partial pre-calculation table sent by the CPU device side, a second partial pre-calculation table is generated in parallel through a plurality of first GPU threads of the GPU device side, and a global pre-calculation table is obtained from the second partial pre-calculation table and the first partial pre-calculation table.
In the embodiment of the invention, the CPU device side can acquire the security parameter, the random value and the sliding window parameters; generate the cryptographic parameters based on the security parameter and the random value; calculate the first partial pre-calculation table based on the sliding window parameters; and transmit the cryptographic parameters, the sliding window parameters and the first partial pre-calculation table to the GPU device side over a PCIe bus.
In some embodiments, each first GPU thread computes the second partial pre-computed table by a fixed base modular exponentiation method based on the pre-computed table.
Here, the security parameter is $\lambda$, and the random values are $p$ and $q$; the sliding window parameters comprise a global sliding window value $\omega_1$ and a host-side sliding window value $\omega_2$.
In some embodiments, the global pre-calculation table comprises a plurality of rows of pre-calculated values; the first part of the pre-calculation table is a first part of pre-calculation values in a plurality of rows of pre-calculation values; the second part of the precomputation table is a second part of precomputation values in a plurality of rows of precomputation values; the GPU equipment end determines a second part of pre-calculated values as values to be calculated according to the first part of pre-calculated values; determining the total times of loop calculation according to the sliding window parameters, and calculating a plurality of corresponding first GPU threads in each loop; calculating a plurality of pre-calculated values through a plurality of first GPU threads during each cycle calculation; and taking the pre-calculated value corresponding to the total number of the cycle calculation as a second part of pre-calculated value.
Illustratively, the process is as follows: the CPU device side generates the cryptographic parameters, the sliding window parameters and the first partial pre-calculation table and sends them to the GPU side; the GPU side then generates the second partial pre-calculation table in parallel through a plurality of first GPU threads according to these inputs, and obtains the global pre-calculation table from the second partial pre-calculation table and the first partial pre-calculation table:
a) The CPU side selects a system security parameter $\lambda$ according to a user input instruction, randomly generates two $\lambda/2$-bit primes $p$ and $q$, and computes $n = p \times q$ and $\mu = \mathrm{lcm}(p-1, q-1)$, where $\mathrm{lcm}(\cdot)$ denotes the least common multiple. The CPU side then selects $\alpha$, where $\alpha < \lambda$ and $|\alpha| = \Omega(|n|^{\zeta})$, $\zeta > 0$, and generates a group generator $g \in B_\alpha$, where $B_\alpha$ is the set of all elements of order $n\alpha$ in $\mathbb{Z}^*_{n^2}$. Finally, the CPU side computes $h = L(g^\alpha \bmod n^2)^{-1}$, where $g^\alpha \bmod n^2$ denotes $g^\alpha$ reduced modulo $n^2$ and $L(x) = (x-1)/n$.
b) The CPU side selects the global sliding window value $\omega_1$ and the host-side sliding window value $\omega_2$ according to a user input instruction, and computes the host-side pre-calculation table
Figure BDA0003811621270000083
where
Figure BDA0003811621270000091
and $g$ denotes the fixed base used in the fixed-base modular exponentiation method based on the pre-calculation table.
c) The CPU side transmits the generated cryptographic parameters $n$, $g$, $\alpha$, $h$ and the host-side pre-calculation table $P_x$ to the GPU side.
d) The GPU side receives $n$, $g$, $\alpha$, $h$ and the host-side pre-calculation table $P_x$ sent by the CPU side and stores them in the memory of the GPU device.
e) The GPU side launches
Figure BDA0003811621270000092
GPU streams to compute the remaining pre-calculation table in parallel, where each GPU thread computes the pre-calculated values of one row of the remaining pre-calculation table. Each GPU stream executes a number of loop iterations determined by $\omega_1$ and $\omega_2$, and each iteration launches
Figure BDA0003811621270000093
GPU threads to compute in parallel
Figure BDA0003811621270000094
pre-calculated values, where $i$ decreases from $l-1$ to $0$ over the loop. Each iteration first computes
Figure BDA0003811621270000095
and then substitutes the resulting value $c$ into
Figure BDA0003811621270000096
to obtain the pre-calculated values. The remaining pre-calculation table is thus obtained, and the global pre-calculation table is obtained from $P_x$ and the remaining pre-calculation table.
In some embodiments, in generating the second partial precomputation table by the plurality of first GPU threads, each first GPU thread calculates the second partial precomputation table by a fixed base modular exponentiation method based on the precomputation table.
Here, the fixed-base modular exponentiation method based on the pre-calculation table invokes the modular multiplication method based on the Montgomery algorithm and the Karatsuba algorithm, and performs fixed-base modular exponentiation by looking up the pre-calculation table generated in the initialization stage.
The Karatsuba algorithm is a fast multiplication algorithm used mainly for multiplying two large numbers. It splits each large number into two halves with fewer digits, then performs three multiplications on the halves together with a small number of additions and shifts.
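The three-multiplication recursion described above can be sketched in Python as follows. This is a minimal illustration only: the function name `karatsuba` and the `threshold_bits` cutoff are assumptions made for the example, and a GPU implementation would operate on fixed-width limbs rather than Python integers.

```python
def karatsuba(x: int, y: int, threshold_bits: int = 64) -> int:
    """Multiply two non-negative integers with the Karatsuba recursion."""
    # Base case: small operands use the built-in multiply.
    if x.bit_length() <= threshold_bits or y.bit_length() <= threshold_bits:
        return x * y
    # Split both operands at the same bit position into high and low halves.
    half = max(x.bit_length(), y.bit_length()) // 2
    mask = (1 << half) - 1
    x_hi, x_lo = x >> half, x & mask
    y_hi, y_lo = y >> half, y & mask
    # Three recursive multiplications instead of the schoolbook four.
    z_hi = karatsuba(x_hi, y_hi, threshold_bits)
    z_lo = karatsuba(x_lo, y_lo, threshold_bits)
    z_mid = karatsuba(x_hi + x_lo, y_hi + y_lo, threshold_bits) - z_hi - z_lo
    # Recombine the partial products with shifts and additions.
    return (z_hi << (2 * half)) + (z_mid << half) + z_lo
```

The middle term is obtained from one product of the half sums, which is where the fourth multiplication is saved at the cost of a few extra additions and subtractions.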
Illustratively, the process by which the CPU device side calculates the first partial pre-calculation table, and the plurality of first GPU threads calculate the second partial pre-calculation table using the fixed-base modular exponentiation method based on the pre-calculation table, is as follows:
(1) The CPU side computes the pre-calculation table
Figure BDA0003811621270000101
where
Figure BDA0003811621270000102
Figure BDA0003811621270000103
(2) The GPU device side executes a number of loop iterations determined by $\omega_1$ and $\omega_2$, where each iteration launches
Figure BDA0003811621270000104
threads to compute in parallel
Figure BDA0003811621270000105
with each thread computing 2 pre-calculated values at a time. For each thread $z$ in each iteration, it first computes
Figure BDA0003811621270000106
and then, according to
Figure BDA0003811621270000107
computes
Figure BDA0003811621270000108
Here $P$ denotes the values in the pre-calculation table.
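The general idea of fixed-base modular exponentiation with a pre-calculation table can be sketched as below. The table layout used here, `table[i][j] = g^(j * 2^(window*i)) mod modulus`, is the textbook fixed-base windowing layout and is an assumption of this sketch; the exact table definition and the CPU/GPU split in the patent are given by equations that are not reproduced in this text. The names `build_fixed_base_table` and `fixed_base_pow` are hypothetical.

```python
def build_fixed_base_table(g, modulus, window, exp_bits):
    """Assumed layout: table[i][j] = g^(j * 2^(window*i)) mod modulus."""
    rows = -(-exp_bits // window)  # ceil(exp_bits / window)
    table = []
    base = g % modulus  # base of the current row: g^(2^(window*i))
    for _ in range(rows):
        row = [1]
        for _ in range((1 << window) - 1):
            row.append(row[-1] * base % modulus)
        table.append(row)
        # The next row's base is the current base raised to 2^window.
        base = row[-1] * base % modulus
    return table


def fixed_base_pow(table, exponent, window, modulus):
    """Compute g^exponent mod modulus using only table lookups and multiplies."""
    if exponent >> (window * len(table)):
        raise ValueError("exponent exceeds the range covered by the table")
    mask = (1 << window) - 1
    result = 1
    for row in table:
        # One lookup and one modular multiplication per exponent window.
        result = result * row[exponent & mask] % modulus
        exponent >>= window
    return result
```

Because every row is independent once its base is known, rows can be filled by separate threads, which is consistent with the per-row parallelism described in step e). Each exponentiation then costs one modular multiplication per window and no squarings.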
S102: according to the global pre-calculation table and the message to be encrypted and homomorphic encryption parameters sent by the CPU device side, a first ciphertext message is generated in parallel through a plurality of second GPU threads of the GPU device side, and the first ciphertext message is sent to the CPU device side.
In the embodiment of the invention, the CPU device side can acquire a message to be encrypted and generate the homomorphic encryption parameters corresponding to the message to be encrypted, and then send the message to be encrypted and the homomorphic encryption parameters to the GPU device side.
In some embodiments, each second GPU thread performs the encryption calculation using a fixed-base modular exponentiation method based on the pre-calculation table, a modular multiplication method based on the Montgomery and Karatsuba algorithms, and a basic operation method based on assembly instructions of the GPU hardware.
Here, the message to be encrypted may be a plaintext message sequence to be encrypted, and the homomorphic encryption parameter may be a random value sequence. The CPU device end can acquire the message to be encrypted by receiving the plaintext message sequence input by the user.
In some embodiments, the GPU device side determines a plurality of parallel first encryption calculation tasks according to the global pre-calculation table, the message to be encrypted and the homomorphic encryption parameters; distributes the plurality of first encryption calculation tasks to a plurality of second GPU threads; executes the corresponding first encryption calculation task in each second GPU thread to obtain the corresponding first encryption calculation data; determines a plurality of second encryption calculation tasks according to the first encryption calculation data produced by the second GPU threads, and distributes each second encryption calculation task to one second GPU thread; and executes the corresponding second encryption calculation task in each second GPU thread to obtain a plurality of second encryption calculation data, from which the first ciphertext message is obtained.
For example, the GPU device side splits the plaintext sequence M to be encrypted and the random number sequence R to obtain multiple groups of split data, where each group of split data includes plaintext data and a random value, and distributes the multiple groups of split data to multiple GPU threads for execution, so that the multiple GPU threads execute corresponding computations to obtain multiple first encrypted computed data.
Exemplarily, let $M = \{m_1, m_2, \ldots, m_k\}$ and $R = \{r_1, r_2, \ldots, r_k\}$. After the GPU device side receives the plaintext message sequence $M$ and the random value sequence $R$, it starts $2k$ GPU threads to respectively calculate the first encryption calculation data $c_{11}, c_{12}, c_{21}, c_{22}, \ldots, c_{k1}, c_{k2}$, where the first encryption calculation data are:
Figure BDA0003811621270000111
Figure BDA0003811621270000112
Then, $k$ GPU threads are started to calculate the second encryption calculation data: $c_1 = c_{11} \times c_{12} \bmod n^2$, $c_2 = c_{21} \times c_{22} \bmod n^2$, …, $c_k = c_{k1} \times c_{k2} \bmod n^2$. The ciphertext message sequence $C = \{c_1, c_2, \ldots, c_k\}$ is then obtained.
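The two-stage split described above can be sketched in Python. This is only an illustration under stated assumptions: because the formula images for the first encryption calculation data are not reproduced, the sketch assumes the textbook Paillier form with $g = n + 1$, $c_{i1} = g^{m_i} \bmod n^2$ and $c_{i2} = r_i^{n} \bmod n^2$; the function name `encrypt_batch`, the toy primes and the sample values are hypothetical, and each list comprehension merely stands in for a batch of independent GPU threads.

```python
# Toy sketch of the two-stage encryption split (not the GPU kernels of the method).
# Assumption: textbook Paillier with g = n + 1; each list comprehension stands in
# for a batch of independent GPU threads.

def encrypt_batch(M, R, n, g):
    """Encrypt plaintexts M with random values R under modulus n^2."""
    n2 = n * n
    # Stage 1: 2k independent modular exponentiations (c_i1 and c_i2).
    stage1 = [(pow(g, m, n2), pow(r, n, n2)) for m, r in zip(M, R)]
    # Stage 2: k independent modular multiplications c_i = c_i1 * c_i2 mod n^2.
    return [(c_i1 * c_i2) % n2 for c_i1, c_i2 in stage1]

# Toy parameters, far too small for real use.
p, q = 11, 13
n = p * q
g = n + 1
M = [5, 42, 100]
R = [2, 3, 5]          # each r_i must be coprime to n
C = encrypt_batch(M, R, n, g)
```

The second stage cannot start until both factors of a given $c_i$ exist, which is why the text launches $2k$ threads first and $k$ threads afterwards.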
In some embodiments, each second GPU thread executes a corresponding first encryption calculation task by using a fixed base modular exponentiation method based on a pre-calculation table, a modular multiplication method based on the Montgomery algorithm and the Karatsuba algorithm, and a basic operation method based on assembly instructions of the GPU hardware, and calculates the corresponding first encryption calculation data.
In some embodiments, each second GPU thread executes a corresponding second encryption calculation task by using a fixed base modular exponentiation method based on a pre-calculation table, a modular multiplication method based on the Montgomery algorithm and the Karatsuba algorithm, and a basic operation method based on assembly instructions of the GPU hardware, and calculates the corresponding second encryption calculation data.
In the embodiment of the invention, the basic operation method of the assembly instruction based on the GPU hardware is used for optimizing the bit carry propagation process in the addition, subtraction, multiplication and shift operation process based on the PTX instruction of the GPU hardware, reducing unnecessary instruction overhead and quickly calculating the addition, subtraction, multiplication and shift operation of large integers.
In the embodiment of the invention, the Montgomery algorithm and Karatsuba algorithm-based modular multiplication method is used for carrying out modular multiplication operation and modular square operation based on the Montgomery algorithm and carrying out multi-precision multiplication operation in the modular multiplication operation and the modular square operation based on the Karatsuba algorithm.
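As an illustration of the multi-precision multiplication step, the following is a minimal Python sketch of Karatsuba multiplication. The function name `karatsuba`, the 64-bit cutoff and the bit-level split are assumptions chosen for readability; a GPU implementation would split on limb boundaries instead.

```python
# Sketch of Karatsuba multiplication on non-negative Python integers.
# Assumption: the 64-bit cutoff and the bit-level split are illustrative only.

def karatsuba(x, y, cutoff_bits=64):
    # Small operands: fall back to the built-in product.
    if x.bit_length() <= cutoff_bits or y.bit_length() <= cutoff_bits:
        return x * y
    half = max(x.bit_length(), y.bit_length()) // 2
    mask = (1 << half) - 1
    x_hi, x_lo = x >> half, x & mask
    y_hi, y_lo = y >> half, y & mask
    # Three recursive products instead of four.
    lo = karatsuba(x_lo, y_lo, cutoff_bits)
    hi = karatsuba(x_hi, y_hi, cutoff_bits)
    mid = karatsuba(x_lo + x_hi, y_lo + y_hi, cutoff_bits) - lo - hi
    return (hi << (2 * half)) + (mid << half) + lo
```

Replacing four half-size products with three is what reduces the cost of the large-integer product inside each modular multiplication and modular squaring.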
In the embodiment of the invention, the basic operation method of the GPU hardware-based assembly instruction comprises the following steps: an addition operation method, a subtraction operation method, a multiplication operation method and a shift operation method of the assembly instruction based on GPU hardware.
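The carry propagation that these basic operations optimize can be illustrated with a limb-wise addition. The sketch below is written in Python for readability and assumes 32-bit limbs stored least-significant first; the helper names are hypothetical. On GPU hardware the carry would typically be kept in the hardware carry flag through PTX extended-precision instructions such as `add.cc` and `addc`, rather than in an explicit variable.

```python
# Sketch of limb-wise big-integer addition with explicit carry propagation.
# Assumption: 32-bit limbs, least-significant limb first.
LIMB_BITS = 32
LIMB_MASK = (1 << LIMB_BITS) - 1

def to_limbs(x, count):
    """Split a non-negative integer into `count` 32-bit limbs."""
    return [(x >> (LIMB_BITS * i)) & LIMB_MASK for i in range(count)]

def from_limbs(limbs):
    """Reassemble an integer from its limbs."""
    return sum(limb << (LIMB_BITS * i) for i, limb in enumerate(limbs))

def add_limbs(a, b):
    """Return (sum_limbs, carry_out) for two equal-length limb arrays."""
    out, carry = [], 0
    for x, y in zip(a, b):
        s = x + y + carry
        out.append(s & LIMB_MASK)   # low 32 bits stay in this limb
        carry = s >> LIMB_BITS      # carry (0 or 1) moves to the next limb
    return out, carry
```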
Exemplarily, the modular multiplication calculation method based on the Montgomery algorithm and the Karatsuba algorithm includes the following steps:
(1) Montgomery multiplication algorithm:
Input: $a$, $b$, $n$, $r$, $n'$, where $a < n$, $b < n$, $r = 2^k$, $n < r$, and $n' = -n^{-1} \bmod r$.
Output: $\omega = a \times b \times r^{-1} \pmod n$.
Calculation: calculate $\omega = a \times b$, $m = \omega \times n' \bmod r$, and $\omega = (\omega + m \times n)/r$;
if $\omega \ge n$, calculate $\omega = \omega - n$;
output $\omega$.
(2) Modular multiplication operation method based on the Montgomery algorithm:
Input: $a, b < n$, $n < 2^k$, $n' = -n^{-1} \bmod r$, $r = 2^k$.
Output: $c = a \times b \bmod n$.
The calculation steps are, in sequence: $a' = \mathrm{MontMult}(a, r^2 \bmod n)$;
$b' = \mathrm{MontMult}(b, r^2 \bmod n)$;
$c' = \mathrm{MontMult}(a', b')$;
$c = \mathrm{MontMult}(c', 1)$;
output $c$.
Algorithm (1) serves as the MontMult(·) function used in the operation process of algorithm (2).
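Algorithms (1) and (2) can be sketched in Python as follows. The sketch assumes that $n$ is odd (so that $n^{-1} \bmod r$ exists) and that $k$ is the bit length of $n$; the function names `mont_setup`, `mont_mult` and `mod_mult` are hypothetical, and Python 3.8 or later is assumed for `pow(x, -1, r)`.

```python
# Sketch of Montgomery multiplication and the four-call modular multiplication.
# Assumptions: n is odd, r = 2**k > n, Python 3.8+ for pow(x, -1, r).

def mont_setup(n):
    k = n.bit_length()
    r = 1 << k
    n_prime = (-pow(n, -1, r)) % r       # n' = -n^{-1} mod r
    r2 = (r * r) % n                     # r^2 mod n, used to enter Montgomery form
    return k, r, n_prime, r2

def mont_mult(a, b, n, k, n_prime):
    """Return a * b * r^{-1} mod n for 0 <= a, b < n."""
    mask = (1 << k) - 1
    w = a * b
    m = ((w & mask) * n_prime) & mask    # m = w * n' mod r
    w = (w + m * n) >> k                 # exact division by r
    return w - n if w >= n else w        # single conditional subtraction

def mod_mult(a, b, n):
    """Return a * b mod n using only Montgomery multiplications."""
    k, r, n_prime, r2 = mont_setup(n)
    a_m = mont_mult(a, r2, n, k, n_prime)    # a' = a * r mod n
    b_m = mont_mult(b, r2, n, k, n_prime)    # b' = b * r mod n
    c_m = mont_mult(a_m, b_m, n, k, n_prime) # c' = a * b * r mod n
    return mont_mult(c_m, 1, n, k, n_prime)  # c  = a * b mod n
```

Because $r$ is a power of two, both the reduction modulo $r$ and the division by $r$ become a mask and a shift, which is why the method avoids large-integer division.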
For example, the GPU device side may split the plaintext sequence M to be encrypted and the random number sequence R, and look up each part obtained after the splitting in the pre-calculation table, so that the original fixed-base modular exponentiation is split into a series of multiplication operations. These multiplication operations are then distributed to a plurality of GPU calculation units for execution, and each GPU calculation unit executes a GPU thread that performs the multiplication by calling the modular multiplication calculation method based on the Montgomery algorithm and the Karatsuba algorithm and the multiplication method in the basic operation method based on assembly instructions of the GPU hardware.
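The idea of replacing a fixed-base modular exponentiation with table lookups and multiplications can be sketched as follows. The table layout `table[i][j] = g^(j * 2^(w*i)) mod N`, the window width `w` and the function names are assumptions made for illustration; the layout of the global pre-calculation table of the method may differ.

```python
# Sketch of fixed-base modular exponentiation with a pre-computed table.
# Assumption: table[i][j] = g^(j * 2^(w*i)) mod modulus, window width w.

def build_table(g, modulus, exp_bits, w):
    """Pre-compute one row of 2^w powers per w-bit window of the exponent."""
    rows = (exp_bits + w - 1) // w
    table = []
    base = g % modulus
    for _ in range(rows):
        row = [1]
        for _ in range((1 << w) - 1):
            row.append((row[-1] * base) % modulus)
        table.append(row)
        base = (row[-1] * base) % modulus    # base^(2^w) for the next row
    return table

def fixed_base_pow(table, e, modulus, w):
    """Compute g^e mod modulus with one lookup and one multiplication per window."""
    mask = (1 << w) - 1
    result = 1
    for row in table:
        result = (result * row[e & mask]) % modulus
        e >>= w
    return result
```

Once the table exists, an exponentiation with an exponent of `exp_bits` bits needs no squarings at all, only one multiplication per window, and those multiplications are independent lookups that can be spread over threads.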
It should be noted that, in the present invention, the same letter is used for different algorithms and different methods, and the letter in each algorithm or method is only used for indicating a numerical value in the method or algorithm in which the letter is located.
S103, according to the message to be decrypted and the decryption parameters sent by the CPU equipment end, plaintext messages are generated in parallel through a plurality of third GPU threads of the GPU equipment end, and the plaintext messages are sent to the CPU equipment end.
In the embodiment of the invention, the CPU equipment side can acquire the message to be decrypted and the decryption parameters and send the message to be decrypted and the decryption parameters to the GPU equipment side. The CPU equipment end can respectively acquire the message to be decrypted and the decryption parameters by receiving the message to be decrypted and the decryption parameters input by a user.
In some embodiments, each third GPU thread performs decryption calculations by a Barrett algorithm-based modular operation method, a sliding window algorithm-based non-fixed base modular exponentiation method, a modular multiplication calculation method based on the Montgomery algorithm and the Karatsuba algorithm, and a basic operation method based on assembly instructions of the GPU hardware.
In some embodiments, the GPU device side may determine a plurality of parallel first decryption computation tasks according to the message to be decrypted and the decryption parameter; distributing the plurality of first decryption computing tasks to a plurality of third GPU threads; executing the corresponding first decryption calculation tasks through each third GPU thread to obtain corresponding first decryption calculation data; determining a plurality of second decryption calculation tasks according to a plurality of first decryption calculation data corresponding to a plurality of third GPU threads, and distributing each second decryption calculation task to one third GPU thread; and executing the corresponding second decryption calculation tasks through each third GPU thread to obtain a plurality of second decryption calculation data, and obtaining the plaintext message according to the plurality of second decryption calculation data.
Here, the decryption operation process may be split by using the Chinese remainder theorem, so as to obtain a plurality of parallel first decryption calculation tasks.
For example, the GPU device may split the sequence C to be decrypted and the decryption key a to obtain multiple groups of split data, where each group of split data includes ciphertext data and a key value, and the multiple groups of split data are allocated to multiple GPU threads to be executed, so that the multiple GPU threads execute corresponding computations to obtain multiple first decryption computation data.
For example, the message to be decrypted may be a ciphertext sequence $C = \{c_1, c_2, \ldots, c_k\}$ and the decryption parameter may be a decryption key $\alpha$. The GPU device side may start $2k$ GPU threads to respectively calculate the first decryption calculation data, where the first decryption calculation data are: $m_{p1} = L_p(c_1^{\alpha} \bmod p^2) \pmod p$,
Figure BDA0003811621270000141
$m_{p2} = L_p(c_2^{\alpha} \bmod p^2) \pmod p$,
Figure BDA0003811621270000142
$m_{pk} = L_p(c_k^{\alpha} \bmod p^2) \pmod p$, $m_{qk} = L_q(c_k^{\alpha} \bmod q^2) \pmod q$. Then, $k$ GPU threads are started to calculate the second decryption calculation data: $m_1 = \mathrm{CRT}(m_{p1}, m_{q1}) \cdot h \pmod n$, $m_2 = \mathrm{CRT}(m_{p2}, m_{q2}) \cdot h \pmod n$, …, $m_k = \mathrm{CRT}(m_{pk}, m_{qk}) \cdot h \pmod n$. The plaintext message sequence $M = \{m_1, m_2, \ldots, m_k\}$ is then obtained, where CRT(·) denotes a decryption function determined according to the Chinese remainder theorem.
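The recombination performed by CRT(·) can be illustrated with a short Python sketch in Garner's form. Only the recombination step is shown; the functions $L_p$ and $L_q$ and the factor $h$ are defined elsewhere in the description and are not modelled here. The function name `crt` and the toy primes are assumptions, and Python 3.8 or later is assumed for `pow(q, -1, p)`.

```python
# Sketch of Chinese-remainder recombination (Garner's form).
# Assumptions: p and q are distinct primes, 0 <= m_p < p and 0 <= m_q < q.

def crt(m_p, m_q, p, q):
    """Return the unique m in [0, p*q) with m = m_p (mod p) and m = m_q (mod q)."""
    q_inv = pow(q, -1, p)              # q^{-1} mod p, can be pre-computed once
    t = ((m_p - m_q) * q_inv) % p
    return m_q + q * t

# Toy usage: recover 100 from its residues modulo 11 and 13.
m = crt(100 % 11, 100 % 13, 11, 13)
```

Working modulo $p^2$ and $q^2$ separately halves the operand size of the expensive exponentiations, and the two halves of each ciphertext are independent, which matches the $2k$-thread first stage followed by the $k$-thread recombination stage.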
In some embodiments, each third GPU thread executes a corresponding first decryption calculation task by using a Barrett algorithm-based modular operation method, a sliding window algorithm-based non-fixed base modular exponentiation method, a modular multiplication calculation method based on the Montgomery algorithm and the Karatsuba algorithm, and a basic operation method based on assembly instructions of the GPU hardware, to obtain corresponding first decryption calculation data.
In some embodiments, each third GPU thread executes a corresponding second decryption calculation task by using a Barrett algorithm-based modular operation method, a sliding window algorithm-based non-fixed base modular exponentiation method, a modular multiplication calculation method based on the Montgomery algorithm and the Karatsuba algorithm, and a basic operation method based on assembly instructions of the GPU hardware, to obtain corresponding second decryption calculation data.
Here, when each thread performs the decryption operation, it first performs the modular operation by using the modular operation method based on the Barrett algorithm, then performs the modular exponentiation by using the non-fixed base modular exponentiation method based on the sliding window algorithm, then performs the modular multiplication by using the modular multiplication calculation method based on the Montgomery algorithm and the Karatsuba algorithm, and performs the addition operation by using the basic operation method based on assembly instructions of the GPU hardware.
Here, the Barrett algorithm-based modulo method is used to simplify modulo arithmetic using a pre-calculation process, replacing large integer division operations with shift operations. The Barrett algorithm, also known as Barrett Reduction algorithm, uses shifts instead of divisions.
Illustratively, the Barrett algorithm-based modular operation method includes:
(1) Pre-calculation step
For the modular operation modulo $n$, the parameter $\mu$ is pre-calculated as
Figure BDA0003811621270000151
where $k = \lfloor \log_b n \rfloor + 1$.
(2) Modular operation step
Substitute the $\mu$ calculated in step (1) into
Figure BDA0003811621270000152
to calculate $q$, and then use $q$ to calculate $c = (a \bmod b^{k+1}) - (q \cdot n \bmod b^{k+1})$;
if $c < 0$, calculate $c = c + b^{k+1}$;
if $c \ge n$, calculate $c = c - n$.
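The two steps above can be sketched in Python. Because the formula images for $\mu$ and $q$ are not reproduced in the text, the sketch assumes the textbook form of Barrett reduction, $\mu = \lfloor b^{2k}/n \rfloor$ and $q = \lfloor \lfloor a / b^{k-1} \rfloor \cdot \mu / b^{k+1} \rfloor$, with base $b = 2^{32}$ and input $0 \le a < b^{2k}$; the function names are hypothetical. The final correction is written as a loop, since the textbook algorithm may need up to two subtractions.

```python
# Sketch of Barrett reduction in its textbook form.
# Assumptions: base b = 2**32, 0 <= a < b**(2k); mu and q follow the standard
# algorithm because the formula images are not reproduced in the text.
B_BITS = 32

def barrett_setup(n):
    k = (n.bit_length() + B_BITS - 1) // B_BITS   # number of base-b digits of n
    mu = (1 << (2 * k * B_BITS)) // n             # mu = floor(b^(2k) / n), done once
    return k, mu

def barrett_reduce(a, n, k, mu):
    """Return a mod n using only shifts, masks, multiplications and subtractions."""
    q = ((a >> ((k - 1) * B_BITS)) * mu) >> ((k + 1) * B_BITS)
    mask = (1 << ((k + 1) * B_BITS)) - 1
    c = (a & mask) - ((q * n) & mask)             # both terms taken mod b^(k+1)
    if c < 0:
        c += 1 << ((k + 1) * B_BITS)
    while c >= n:                                 # at most two subtractions
        c -= n
    return c
```

The only division is the one in `barrett_setup`, which is performed once per modulus; every later reduction uses shifts in place of division.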
In the invention, the non-fixed base modular exponentiation method based on the sliding window algorithm is used for generating a local pre-calculation table by performing pre-calculation window calculation, executing a modular multiplication calculation method based on a Montgomery algorithm and a Karatsuba algorithm, and performing the non-fixed base modular exponentiation by inquiring the generated local pre-calculation table.
Exemplarily, the non-fixed base modular exponentiation method based on the sliding window algorithm includes:
(1) Pre-calculation step
Set $t_0 = 1$, $t_1 = a$, $t_2 = a$, $c = 1$;
Express the exponent $b$ of the modular exponentiation in its windowed binary representation with window parameter $\omega_1$, i.e.
Figure BDA0003811621270000161
Successively calculate $t_j = t_{j-1} \cdot t_1$, where
Figure BDA0003811621270000162
Here, the non-fixed base modular exponentiation can be expressed as c = a ^ b (mod n ^ 2), where b is the exponent of the modular exponentiation; t represents a temporary pre-calculation table required to be used in the process of the non-fixed base modular exponentiation method.
(2) Loop calculation step
In step (2), $i$ takes the value $l-1$ in the first iteration and decreases until it reaches 0 in the last iteration, at which point the loop calculation ends. In each iteration, each thread performs the following steps:
a)
Figure BDA0003811621270000163
b) Substitute the $c$ calculated in step a) into
Figure BDA0003811621270000164
to continue calculating $c$, and take the calculated $c$ as the input value of step a) in the next iteration;
(3) At the end of the loop calculation, output the $c$ obtained in step b), where $c = a^b \bmod n$.
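The structure of these steps (a local table of powers of $a$, followed by a loop over the exponent digits from $i = l-1$ down to 0) can be sketched in Python. Since the formula images are not reproduced, the sketch uses the textbook fixed-width window ($2^w$-ary) variant rather than claiming to match the exact sliding-window formulas of the method; the function name `window_pow` and the default width `w = 4` are assumptions.

```python
# Sketch of windowed (2^w-ary) non-fixed-base modular exponentiation.
# Assumption: fixed-width windows and textbook digit decomposition; the exact
# formulas of the method are in images that are not reproduced in the text.

def window_pow(a, b, n, w=4):
    """Return a^b mod n for b >= 0 using a local table of 2^w powers."""
    # (1) Pre-calculation: local table t[j] = a^j mod n, j = 0 .. 2^w - 1.
    t = [1 % n, a % n]
    for _ in range(2, 1 << w):
        t.append((t[-1] * t[1]) % n)          # t_j = t_{j-1} * t_1
    # Split the exponent into l digits of w bits, least significant first.
    digits = []
    while b:
        digits.append(b & ((1 << w) - 1))
        b >>= w
    # (2) Loop from the most significant digit (i = l-1) down to i = 0.
    c = 1 % n
    for i in range(len(digits) - 1, -1, -1):
        for _ in range(w):
            c = (c * c) % n                   # a) c = c^(2^w) mod n
        c = (c * t[digits[i]]) % n            # b) c = c * t[b_i] mod n
    # (3) Output c = a^b mod n.
    return c
```

Unlike the fixed-base case, the base here is a ciphertext that changes with every call, so the table must be rebuilt locally for each exponentiation, which is why the text calls it a local pre-calculation table.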
S104, according to the second ciphertext message and the third ciphertext message sent by the CPU device side, performing homomorphic addition operations in parallel through a plurality of fourth GPU threads of the GPU device side to generate a fourth ciphertext message, and sending the fourth ciphertext message to the CPU device side.
In the embodiment of the invention, the CPU equipment terminal can obtain the second ciphertext message and the third ciphertext message; and sending the second ciphertext message and the third ciphertext message to the GPU equipment terminal.
Here, the CPU device may obtain the second ciphertext message and the third ciphertext message by receiving the second ciphertext message and the third ciphertext message input by the user, respectively.
In some embodiments, each fourth GPU thread performs a homomorphic addition operation by a modular multiplication calculation method based on the Montgomery algorithm and the Karatsuba algorithm, and a basic operation method based on assembly instructions of the GPU hardware.
In some embodiments, the GPU device determines a plurality of homomorphic addition calculation tasks in parallel according to the second ciphertext message and the third ciphertext message; distributing the plurality of homomorphic addition computing tasks to a plurality of fourth GPU threads; executing the corresponding homomorphic addition calculation task through each fourth GPU thread to obtain corresponding homomorphic addition calculation data; and obtaining a fourth ciphertext message according to the homomorphic addition calculation data corresponding to the homomorphic addition calculation tasks.
And executing the corresponding homomorphic addition calculation task by each fourth GPU thread by adopting a modular multiplication calculation method based on a Montgomery algorithm and a Karatsuba algorithm and a basic operation method based on an assembly instruction of GPU hardware to obtain corresponding homomorphic addition calculation data.
Here, the GPU device side may split the ciphertext sequence $C_1$ and the ciphertext sequence $C_2$ to obtain multiple groups of split data, where each group of split data includes one ciphertext from the ciphertext sequence $C_1$ and the corresponding ciphertext from the ciphertext sequence $C_2$, and distribute the multiple groups of split data to multiple GPU threads for execution, so that the GPU threads execute the corresponding calculations to obtain multiple homomorphic addition calculation data.
Illustratively, given the ciphertext sequences $C_1 = \{c_{11}, c_{12}, \ldots, c_{1k}\}$ and $C_2 = \{c_{21}, c_{22}, \ldots, c_{2k}\}$, the GPU device side may split the ciphertext sequence $C_1$ and the ciphertext sequence $C_2$, and then start $k$ GPU threads to calculate the homomorphic addition calculation data: $c'_1 = c_{11} \times c_{21} \bmod n^2$, $c'_2 = c_{12} \times c_{22} \bmod n^2$, …, $c'_k = c_{1k} \times c_{2k} \bmod n^2$, yielding the ciphertext sequence $C' = \{c'_1, c'_2, \ldots, c'_k\}$.
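The element-wise computation above can be sketched in a few lines of Python. The function name `homomorphic_add_batch` is hypothetical, and the list comprehension merely stands in for $k$ independent GPU threads, one per ciphertext pair.

```python
# Sketch of batched homomorphic addition: element-wise product modulo n^2.
# Assumption: the list comprehension stands in for k independent GPU threads.

def homomorphic_add_batch(C1, C2, n):
    """Return [c1 * c2 mod n^2] for each aligned pair of ciphertexts."""
    n2 = n * n
    return [(c1 * c2) % n2 for c1, c2 in zip(C1, C2)]
```

Each output depends on exactly one pair of inputs, so no synchronization between threads is needed; multiplying two Paillier ciphertexts modulo $n^2$ yields a ciphertext of the sum of the two plaintexts modulo $n$.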
In the embodiment of the present invention, the first GPU thread, the second GPU thread, the third GPU thread, and the fourth GPU thread may be the same GPU thread or different GPU threads, which is not limited in the embodiment of the present invention.
In the embodiment of the invention, the calculation of the global pre-calculation table and homomorphic operations such as encryption and decryption are transferred to the GPU for parallel calculation. First, generating the global pre-calculation table on the GPU side improves the throughput of subsequent encryption calculations. Second, because the GPU has strong computing power, more complex encryption calculations can be allowed, which improves the security of the encrypted data. Third, performing the homomorphic operations in parallel improves their calculation efficiency. That is, because the GPU has more computing units than the CPU and can efficiently process computation-intensive data in parallel, the invention takes the many-core hardware architecture of the GPU into account, transfers the calculation of the global pre-calculation table and the encryption, decryption and other homomorphic operations to the GPU, decomposes these operations, and performs them as fine-grained concurrent homomorphic operations. This improves the throughput during calculation, improves the security of the encrypted data, and greatly improves the efficiency of Paillier encryption, decryption and homomorphic addition, so that the invention can provide efficient homomorphic encryption, decryption and computation services for homomorphic encryption outsourced computing in privacy protection scenarios.
The embodiment of the invention also provides a Paillier homomorphic encryption and decryption calculation method based on the GPU, which is applied to a CPU (central processing unit) equipment end, and as shown in figure 2, the method comprises the following steps:
S201, obtaining a security parameter, a random value and a sliding window parameter.
S202, generating a password parameter based on the security parameter and the random value.
S203, calculating a first partial pre-calculation table based on the sliding window parameter.
S204, sending the password parameter, the sliding window parameter and the first partial pre-calculation table to the GPU device side.
S205, obtaining the message to be encrypted and generating homomorphic encryption parameters corresponding to the message to be encrypted.
S206, sending the message to be encrypted and the homomorphic encryption parameters to the GPU device side.
S207, obtaining the message to be decrypted and the decryption parameters.
S208, sending the message to be decrypted and the decryption parameters to the GPU device side.
S209, obtaining the second ciphertext message and the third ciphertext message.
S210, sending the second ciphertext message and the third ciphertext message to the GPU device side.
The embodiment of the present invention further provides a Paillier homomorphic encryption and decryption computing system based on a GPU, as shown in fig. 3, the system includes: homomorphic operation module 1 and arithmetic operation module 2; the arithmetic operation module 2 is used for being called by the homomorphic operation module 1; wherein, homomorphic operation module 1 includes:
the system initialization and pre-calculation module 11 is used for acquiring the security parameters, the random values and the sliding window parameters by the CPU equipment terminal, generating the password parameters based on the security parameters and the random values, calculating a first part pre-calculation table based on the sliding window parameters, and sending the password parameters, the sliding window parameters and the first part pre-calculation table to the GPU equipment terminal; generating a second partial pre-calculation table in parallel by a plurality of first GPU threads of the GPU equipment end according to the password parameters, the sliding window parameters and the first partial pre-calculation table by the GPU equipment end, and obtaining a global pre-calculation table according to the second partial pre-calculation table and the first partial pre-calculation table;
the encryption calculation module 12 is configured to obtain, by the CPU device side, a message to be encrypted, generate a homomorphic encryption parameter corresponding to the message to be encrypted, and send the message to be encrypted and the homomorphic encryption parameter to the GPU device side; the GPU equipment side generates a first ciphertext message in parallel through a plurality of second GPU threads of the GPU equipment side according to the global precomputation table, the message to be encrypted and the homomorphic encryption parameter, and sends the first ciphertext message to the CPU equipment side;
the decryption calculation module 13 is configured to obtain, by the CPU device side, a message to be decrypted and a decryption parameter, and send the message to be decrypted and the decryption parameter to the GPU device side; the GPU equipment side generates plaintext messages in parallel through a plurality of third GPU threads of the GPU equipment side according to the messages to be decrypted and the decryption parameters, and sends the plaintext messages to the CPU equipment side;
the homomorphic addition calculation module 14 is configured to obtain, by the CPU device side, the second ciphertext message and the third ciphertext message, and send the second ciphertext message and the third ciphertext message to the GPU device side; the GPU device side performs homomorphic addition operations in parallel through a plurality of fourth GPU threads of the GPU device side according to the second ciphertext message and the third ciphertext message to generate a fourth ciphertext message, and sends the fourth ciphertext message to the CPU device side;
the arithmetic operation module 2 includes:
the non-fixed base modular exponentiation operation module 21 based on the sliding window method is used for generating a local pre-calculation table by performing pre-calculation window calculation when plaintext messages are generated in parallel through a plurality of third GPU threads, executing a modular multiplication calculation method based on a Montgomery algorithm and a Karatsuba algorithm by calling a modular multiplication calculation module based on the Montgomery algorithm and the Karatsuba algorithm, and performing non-fixed base modular exponentiation operation by inquiring the local pre-calculation table;
the Barrett algorithm-based modular operation module 22 is used for performing modular processing when plaintext messages are generated in parallel through a plurality of third GPU threads;
the fixed base modular exponentiation operation module 23 based on the pre-calculation table is used for executing a modular multiplication calculation method based on a Montgomery algorithm and a Karatsuba algorithm by calling a modular multiplication calculation module based on a Montgomery algorithm and a Karatsuba algorithm when the second part of the pre-calculation table is generated by a plurality of first GPU threads; when the first ciphertext messages are generated in parallel through a plurality of second GPU threads, fixed base modular exponentiation is carried out by inquiring the global precomputation table;
the modular multiplication calculation module 24 based on the Montgomery algorithm and the Karatsuba algorithm is used for performing modular multiplication operation and modular square operation based on the Montgomery algorithm and performing multi-precision multiplication operation in the modular multiplication operation and the modular square operation based on the Karatsuba algorithm;
and a basic operation module 25 of the assembly instruction based on the GPU hardware, which is used for executing addition operation, subtraction operation, multiplication operation and shift operation.
As shown in fig. 3, the basic operation module 25 of the assembly instruction based on the GPU hardware may be called by the non-fixed base modular exponentiation operation module 21 based on the sliding window method, the modular operation module 22 based on the Barrett algorithm, the fixed base modular exponentiation operation module 23 based on the pre-calculation table, and the modular multiplication calculation module 24 based on the Montgomery algorithm and the Karatsuba algorithm.
The invention provides a Paillier homomorphic encryption and decryption calculation method based on a GPU (graphics processing unit), aiming at the defect of high computational overhead of homomorphic encryption, decryption and homomorphic operation in the privacy protection outsourcing calculation process based on the homomorphic encryption technology. The invention divides the Paillier homomorphic cryptographic algorithm into a homomorphic operation layer and an arithmetic operation layer, and respectively designs an optimization scheme to optimize each calculation module: aiming at a homomorphic operation layer, the characteristic of a GPU many-core hardware architecture is considered, a proper encryption algorithm variant is selected, homomorphic operations such as encryption and decryption are decomposed, fine-grained concurrent homomorphic operation is designed, and the homomorphic operation calculation efficiency is greatly improved; aiming at the arithmetic operation layer, the characteristics of strong calculation capability and weak logic processing capability of the GPU are fully considered, the basic arithmetic operation used by homomorphic operation is optimized and realized, and efficient operation building blocks are provided for the realization of the upper homomorphic operation layer. The invention fully considers the hardware characteristics of the GPU, carries out layering and modular optimization on the Paillier homomorphic encryption algorithm, and mainly comprises fine-grained concurrent computation design on a homomorphic operation layer and multiple bottom arithmetic operation optimization designs. 
The optimization methods utilize the GPU many-core characteristics to carry out concurrent and parallel optimization on the algorithm, greatly improve the operation efficiency of Paillier encryption, decryption and homomorphic addition, and can provide efficient homomorphic encryption, decryption and computation services for homomorphic encryption outsourcing computation services under the privacy protection scene.
The foregoing is a more detailed description of the invention in connection with specific preferred embodiments and it is not intended that the invention be limited to these specific details. For those skilled in the art to which the invention pertains, several simple deductions or substitutions can be made without departing from the spirit of the invention, and all shall be considered as belonging to the protection scope of the invention.

Claims (10)

1. A Paillier homomorphic encryption and decryption calculation method based on a GPU is characterized by being applied to a GPU device side and comprising the following steps:
generating a second partial pre-calculation table in parallel through a plurality of first GPU threads of a GPU (graphics processing unit) device end according to a password parameter, a sliding window parameter and a first partial pre-calculation table sent by a CPU (central processing unit) device end, and obtaining a global pre-calculation table according to the second partial pre-calculation table and the first partial pre-calculation table;
according to the global precomputation table, the message to be encrypted and the homomorphic encryption parameters which are sent by the CPU equipment end, generating a first ciphertext message in parallel through a plurality of second GPU threads of the GPU equipment end, and sending the first ciphertext message to the CPU equipment end;
according to the message to be decrypted and the decryption parameters sent by the CPU equipment end, generating a plaintext message in parallel through a plurality of third GPU threads of the GPU equipment end, and sending the plaintext message to the CPU equipment end;
according to the second ciphertext message and the third ciphertext message sent by the CPU equipment end, carrying out homomorphic addition operation in parallel through a plurality of fourth GPU threads of the GPU equipment end to generate a fourth ciphertext message, and sending the fourth ciphertext message to the CPU equipment end.
2. The method for Paillier homomorphic encryption and decryption computation based on GPU of claim 1, wherein the generating of the first ciphertext message in parallel by a plurality of second GPU threads of the GPU device end according to the global pre-computation table, the message to be encrypted and the homomorphic encryption parameter sent by the CPU device end comprises:
determining a plurality of parallel first encryption calculation tasks according to the global pre-calculation table, the message to be encrypted or the homomorphic encryption parameters;
assigning the plurality of first cryptographic computation tasks to the plurality of second GPU threads;
executing the corresponding first encryption calculation tasks through each second GPU thread to obtain corresponding first encryption calculation data;
determining a plurality of second encryption calculation tasks according to a plurality of first encryption calculation data corresponding to the plurality of second GPU threads, and distributing each second encryption calculation task to one second GPU thread;
and executing a corresponding second encryption calculation task through each second GPU thread to obtain a plurality of second encryption calculation data, and obtaining the first ciphertext message according to the plurality of second encryption calculation data.
3. The GPU-based Paillier homomorphic encryption and decryption computing method of claim 2, wherein the obtaining of the corresponding first encrypted computing data by each second GPU thread executing the corresponding first encrypted computing task comprises:
and executing the corresponding first encryption calculation task by each second GPU thread by adopting a fixed base modular exponentiation method based on a pre-calculation table, a modular multiplication method based on a Montgomery algorithm and a Karatsuba algorithm and a basic operation method based on an assembly instruction of GPU hardware, and calculating to obtain corresponding first encryption calculation data.
4. The method for Paillier homomorphic encryption and decryption computation based on GPU of claim 1, wherein the parallel generation of plaintext message by a plurality of third GPU threads of the GPU device end according to the message to be decrypted and the decryption parameter sent by the CPU device end comprises:
determining a plurality of parallel first decryption calculation tasks according to the message to be decrypted and the decryption parameters;
assigning the plurality of first decryption computing tasks to the plurality of third GPU threads;
executing the corresponding first decryption calculation tasks through each third GPU thread to obtain corresponding first decryption calculation data;
determining a plurality of second decryption calculation tasks according to a plurality of first decryption calculation data corresponding to the plurality of third GPU threads, and distributing each second decryption calculation task to one third GPU thread;
and executing the corresponding second decryption calculation tasks through each third GPU thread to obtain a plurality of second decryption calculation data, and obtaining the plaintext message according to the plurality of second decryption calculation data.
5. The GPU-based Paillier homomorphic encryption and decryption computing method of claim 4, wherein the step of executing the corresponding first decryption computing task through each third GPU thread to obtain the corresponding first decryption computing data comprises the steps of:
and executing a corresponding first decryption calculation task by each third GPU thread by adopting a Barrett algorithm-based modular operation method, a sliding window algorithm-based non-fixed base modular exponentiation operation method, a modular multiplication calculation method based on a Montgomery algorithm and a Karatsuba algorithm, and a basic operation method based on an assembly instruction of GPU hardware, so as to obtain corresponding first decryption calculation data.
6. The GPU-based Paillier homomorphic encryption and decryption calculation method of claim 1, wherein generating the fourth ciphertext message by performing the homomorphic addition operation in parallel through the plurality of fourth GPU threads of the GPU device end, according to the second ciphertext message and the third ciphertext message sent by the CPU device end, comprises:
determining a plurality of parallel homomorphic addition calculation tasks according to the second ciphertext message and the third ciphertext message;
assigning the plurality of homomorphic addition calculation tasks to the plurality of fourth GPU threads;
executing, by each fourth GPU thread, the corresponding homomorphic addition calculation task to obtain corresponding homomorphic addition calculation data;
and obtaining the fourth ciphertext message from the homomorphic addition calculation data corresponding to the plurality of homomorphic addition calculation tasks.
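In Paillier, homomorphic addition of two plaintexts is a modular multiplication of their ciphertexts modulo n², so each addition task is independent of the others. The sketch below is a minimal illustration with toy primes and g = n + 1 (both assumptions); each element of the list comprehension in `homomorphic_add` corresponds to the work one GPU thread would perform.

```python
from math import gcd

P, Q = 11, 13                                   # toy primes, illustration only
N = P * Q
N2 = N * N
G = N + 1
LAM = (P - 1) * (Q - 1) // gcd(P - 1, Q - 1)    # lcm(p-1, q-1)
MU = pow((pow(G, LAM, N2) - 1) // N, -1, N)

def encrypt(m, r):
    """c = g^m * r^n mod n^2, with r coprime to n."""
    return pow(G, m, N2) * pow(r, N, N2) % N2

def decrypt(c):
    """m = L(c^lambda mod n^2) * mu mod n, with L(x) = (x - 1) / n."""
    return (pow(c, LAM, N2) - 1) // N * MU % N

def homomorphic_add(batch_a, batch_b):
    """Element-wise ciphertext product: one independent task per pair."""
    return [a * b % N2 for a, b in zip(batch_a, batch_b)]
```

Encrypting `[3, 50, 100]` and `[4, 60, 42]`, adding the batches and decrypting yields `[7, 110, 142]`; sums wrap modulo n.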
7. The GPU-based Paillier homomorphic encryption and decryption calculation method of claim 6, wherein executing, by each fourth GPU thread, the corresponding homomorphic addition calculation task to obtain the corresponding homomorphic addition calculation data comprises:
executing, by each fourth GPU thread, the corresponding homomorphic addition calculation task using a modular multiplication method based on the Montgomery algorithm and the Karatsuba algorithm and a basic operation method based on assembly instructions of the GPU hardware, to obtain the corresponding homomorphic addition calculation data.
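A minimal sketch of how Montgomery reduction and Karatsuba multiplication can be combined is shown below: Montgomery reduction replaces division by the modulus with shifts and masks, while every wide product inside it is delegated to a recursive Karatsuba routine. The class and function names, the 64-bit recursion cutoff, and the use of Python integers instead of GPU limb arrays are all assumptions made for illustration.

```python
def karatsuba(x, y, cutoff_bits=64):
    """Multiply non-negative integers using three half-size products."""
    if x.bit_length() <= cutoff_bits or y.bit_length() <= cutoff_bits:
        return x * y
    half = max(x.bit_length(), y.bit_length()) // 2
    mask = (1 << half) - 1
    x1, x0 = x >> half, x & mask
    y1, y0 = y >> half, y & mask
    z0 = karatsuba(x0, y0, cutoff_bits)
    z2 = karatsuba(x1, y1, cutoff_bits)
    z1 = karatsuba(x0 + x1, y0 + y1, cutoff_bits) - z0 - z2
    return (z2 << (2 * half)) + (z1 << half) + z0

class Montgomery:
    """Montgomery arithmetic for an odd modulus, with R = 2^k > modulus."""

    def __init__(self, mod):
        if mod % 2 == 0:
            raise ValueError("Montgomery reduction needs an odd modulus")
        self.mod = mod
        self.k = mod.bit_length()
        self.mask = (1 << self.k) - 1
        self.n_prime = -pow(mod, -1, 1 << self.k) % (1 << self.k)
        self.r2 = (1 << (2 * self.k)) % mod

    def redc(self, t):
        """Return t * R^-1 mod modulus for 0 <= t < modulus * R."""
        m = karatsuba(t & self.mask, self.n_prime) & self.mask
        u = (t + karatsuba(m, self.mod)) >> self.k
        return u - self.mod if u >= self.mod else u

    def to_mont(self, a):
        return self.redc(karatsuba(a % self.mod, self.r2))

    def mul(self, a, b):
        """Product of two values already in Montgomery form."""
        return self.redc(karatsuba(a, b))

def mont_modmul(a, b, mod):
    ctx = Montgomery(mod)
    return ctx.redc(ctx.mul(ctx.to_mont(a), ctx.to_mont(b)))
```

Converting into and out of Montgomery form costs extra reductions, so the approach pays off when many multiplications share one modulus, as in a chain of squarings during exponentiation.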
8. The GPU-based Paillier homomorphic encryption and decryption calculation method of claim 1, wherein the global pre-computation table comprises a plurality of rows of pre-computed values; the first partial pre-computation table consists of first partial pre-computed values among the plurality of rows of pre-computed values; and the second partial pre-computation table consists of second partial pre-computed values among the plurality of rows of pre-computed values;
and wherein generating the second partial pre-computation table in parallel by the plurality of first GPU threads of the GPU device end, according to the cryptographic parameter, the sliding window parameter and the first partial pre-computation table sent by the CPU device end, comprises:
determining, according to the first partial pre-computed values, the second partial pre-computed values as the values to be calculated;
determining the total number of loop iterations according to the sliding window parameter, and determining the first GPU threads that calculate in each loop iteration;
calculating a plurality of pre-computed values through the plurality of first GPU threads during each loop iteration;
and taking the pre-computed values obtained over the total number of loop iterations as the second partial pre-computed values.
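The claim does not fix the layout of the pre-computation table. The sketch below assumes one common fixed-base layout, in which entry (i, j) holds base^(j·2^(w·i)) mod m for a window width w: the CPU supplies column j = 1 of every row as the first partial table, and the GPU fills the remaining columns in 2^w − 2 loop iterations, each iteration extending every row by one entry (one row per thread). All function names and the layout itself are hypothetical.

```python
def first_partial_table(base, mod, w, exp_bits):
    """CPU side: base**(2**(w*i)) % mod for each row i."""
    rows = -(-exp_bits // w)          # ceil(exp_bits / w)
    column = [base % mod]
    for _ in range(rows - 1):
        value = column[-1]
        for _ in range(w):
            value = value * value % mod
        column.append(value)
    return column

def global_table(first, mod, w):
    """GPU side: fill columns 2 .. 2**w - 1, one column per loop iteration."""
    total_loops = (1 << w) - 2        # derived from the sliding window parameter
    table = [[1, f] for f in first]   # column 0 is 1, column 1 is the CPU part
    for _ in range(total_loops):
        # Each row update is independent, so each maps to one GPU thread.
        for row, f in zip(table, first):
            row.append(row[-1] * f % mod)
    return table

def fixed_base_pow(table, exp, mod, w):
    """Fixed-base exponentiation by table lookup, one multiplication per row."""
    if exp >> (w * len(table)):
        raise ValueError("exponent exceeds the range covered by the table")
    mask = (1 << w) - 1
    result = 1
    for row in table:
        result = result * row[exp & mask] % mod
        exp >>= w
    return result
```

With this layout an exponentiation needs no squarings at all, at the cost of storing 2^w entries per row.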
9. A GPU-based Paillier homomorphic encryption and decryption calculation method, characterized in that the method is applied to a CPU device end and comprises:
acquiring a security parameter, a random value and a sliding window parameter;
generating a cryptographic parameter based on the security parameter and the random value;
calculating a first partial pre-computation table based on the sliding window parameter;
sending the cryptographic parameter, the sliding window parameter and the first partial pre-computation table to a GPU device end;
acquiring a message to be encrypted, and generating a homomorphic encryption parameter corresponding to the message to be encrypted;
sending the message to be encrypted and the homomorphic encryption parameter to the GPU device end;
acquiring a message to be decrypted and a decryption parameter;
sending the message to be decrypted and the decryption parameter to the GPU device end;
acquiring a second ciphertext message and a third ciphertext message;
and sending the second ciphertext message and the third ciphertext message to the GPU device end.
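The step of generating a cryptographic parameter from a security parameter and a random value corresponds, in textbook Paillier, to key generation. The sketch below treats the security parameter as the bit length of n and the random value as a seed; both interpretations are assumptions, as is the choice g = n + 1. It uses Python's `random.Random`, which is not cryptographically secure and is used here only to keep the example reproducible.

```python
import random
from math import gcd

SMALL_PRIMES = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29)

def is_probable_prime(n, rng, rounds=32):
    """Miller-Rabin primality test."""
    if n < 2:
        return False
    for sp in SMALL_PRIMES:
        if n % sp == 0:
            return n == sp
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(rounds):
        x = pow(rng.randrange(2, n - 1), d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = x * x % n
            if x == n - 1:
                break
        else:
            return False
    return True

def random_prime(bits, rng):
    while True:
        candidate = rng.getrandbits(bits) | (1 << (bits - 1)) | 1
        if is_probable_prime(candidate, rng):
            return candidate

def paillier_keygen(security_bits, seed):
    """Return ((n, g), (lam, mu)); demo only, NOT a secure generator."""
    rng = random.Random(seed)
    half = security_bits // 2
    while True:
        p, q = random_prime(half, rng), random_prime(half, rng)
        if p != q and gcd(p * q, (p - 1) * (q - 1)) == 1:
            break
    n = p * q
    lam = (p - 1) * (q - 1) // gcd(p - 1, q - 1)
    # With g = n + 1, L(g^lam mod n^2) equals lam mod n, so mu is its inverse.
    return (n, n + 1), (lam, pow(lam, -1, n))
```

A production implementation would draw primes from a cryptographically secure source such as the `secrets` module and use a modulus of at least 2048 bits.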
10. A GPU-based Paillier homomorphic encryption and decryption calculation system, characterized by comprising:
a homomorphic operation module and an arithmetic operation module, the arithmetic operation module being callable by the homomorphic operation module;
the homomorphic operation module comprises:
a system initialization and pre-computation module, configured such that the CPU device end acquires a security parameter, a random value and a sliding window parameter, generates a cryptographic parameter based on the security parameter and the random value, calculates a first partial pre-computation table based on the sliding window parameter, and sends the cryptographic parameter, the sliding window parameter and the first partial pre-computation table to the GPU device end; and the GPU device end generates a second partial pre-computation table in parallel through a plurality of first GPU threads according to the cryptographic parameter, the sliding window parameter and the first partial pre-computation table, and obtains a global pre-computation table from the second partial pre-computation table and the first partial pre-computation table;
an encryption calculation module, configured such that the CPU device end acquires a message to be encrypted, generates a homomorphic encryption parameter corresponding to the message to be encrypted, and sends the message to be encrypted and the homomorphic encryption parameter to the GPU device end; and the GPU device end generates a first ciphertext message in parallel through a plurality of second GPU threads according to the global pre-computation table, the message to be encrypted and the homomorphic encryption parameter, and sends the first ciphertext message to the CPU device end;
a decryption calculation module, configured such that the CPU device end acquires a message to be decrypted and a decryption parameter and sends them to the GPU device end; and the GPU device end generates a plaintext message in parallel through a plurality of third GPU threads according to the message to be decrypted and the decryption parameter, and sends the plaintext message to the CPU device end;
a homomorphic addition calculation module, configured such that the CPU device end acquires a second ciphertext message and a third ciphertext message and sends them to the GPU device end; and the GPU device end performs a homomorphic addition operation in parallel through a plurality of fourth GPU threads according to the second ciphertext message and the third ciphertext message to generate a fourth ciphertext message, and sends the fourth ciphertext message to the CPU device end;
the arithmetic operation module comprises:
a non-fixed-base modular exponentiation module, configured to, when the plaintext message is generated in parallel by the plurality of third GPU threads, generate a local pre-computation table by performing a pre-computation window calculation, execute the modular multiplication method based on the Montgomery algorithm and the Karatsuba algorithm by calling the modular multiplication module based on the Montgomery algorithm and the Karatsuba algorithm, and perform non-fixed-base modular exponentiation by querying the local pre-computation table;
a Barrett-algorithm-based modular reduction module, configured to perform modular reduction when the plaintext message is generated in parallel by the plurality of third GPU threads;
a fixed-base modular exponentiation module, configured to, when the second partial pre-computation table is generated by the plurality of first GPU threads, execute the modular multiplication method based on the Montgomery algorithm and the Karatsuba algorithm by calling the modular multiplication module based on the Montgomery algorithm and the Karatsuba algorithm; and, when the first ciphertext message is generated in parallel by the plurality of second GPU threads, perform fixed-base modular exponentiation by querying the global pre-computation table;
a modular multiplication module based on the Montgomery algorithm and the Karatsuba algorithm, configured to perform modular multiplication and modular squaring based on the Montgomery algorithm, and to perform the multi-precision multiplication within the modular multiplication and modular squaring based on the Karatsuba algorithm;
and a basic operation module based on assembly instructions of the GPU hardware, configured to perform addition, subtraction, multiplication and shift operations.
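Of the arithmetic modules listed in this claim, the Barrett-based reduction is the one that handles a general modulus without conversion into a special form. The sketch below is a generic textbook version in base 2, not the patented code: it precomputes mu = floor(2^(2k) / m) once per modulus and then reduces with two shifts, two multiplications and a short correction loop. The class name and interface are assumptions.

```python
class Barrett:
    """Barrett reduction for a fixed positive modulus."""

    def __init__(self, mod):
        if mod <= 0:
            raise ValueError("modulus must be positive")
        self.mod = mod
        self.k = mod.bit_length()
        # Precomputed once; replaces a division in every later reduction.
        self.mu = (1 << (2 * self.k)) // mod

    def reduce(self, x):
        """Return x % mod for 0 <= x < 2**(2k), e.g. a product of two residues."""
        if x < 0 or x >> (2 * self.k):
            raise ValueError("input out of range for this Barrett context")
        # Quotient estimate; it never exceeds the true quotient and is
        # short by at most 2, so r below is non-negative and below 3 * mod.
        q = ((x >> (self.k - 1)) * self.mu) >> (self.k + 1)
        r = x - q * self.mod
        while r >= self.mod:
            r -= self.mod
        return r
```

For example, `Barrett(10**30 + 57).reduce(a * b)` matches `a * b % (10**30 + 57)` for any residues `a` and `b` below the modulus. On a GPU the shifts and multiplications would operate on limb arrays using the add, subtract, multiply and shift primitives of the basic operation module.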
CN202211017789.2A 2022-08-23 2022-08-23 Paillier homomorphic encryption and decryption calculation method and system based on GPU Pending CN115459898A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211017789.2A CN115459898A (en) 2022-08-23 2022-08-23 Paillier homomorphic encryption and decryption calculation method and system based on GPU

Publications (1)

Publication Number Publication Date
CN115459898A true CN115459898A (en) 2022-12-09

Family

ID=84297943

Country Status (1)

Country Link
CN (1) CN115459898A (en)

Patent Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110161675A1 (en) * 2009-12-30 2011-06-30 Nvidia Corporation System and method for gpu based encrypted storage access
CN103532710A (en) * 2013-09-26 2014-01-22 中国科学院数据与通信保护研究教育中心 Implementation method and device for GPU (Graphics Processing Unit)-based SM2 algorithm
US20190036678A1 (en) * 2015-01-12 2019-01-31 Morphology, LLC Systems and methods for implementing an efficient, scalable homomorphic transformation of encrypted data with minimal data expansion and improved processing efficiency
CN108242994A (en) * 2016-12-26 2018-07-03 阿里巴巴集团控股有限公司 The treating method and apparatus of key
CN111832050A (en) * 2020-07-10 2020-10-27 深圳致星科技有限公司 Paillier encryption scheme based on FPGA chip implementation for federal learning
CN114124364A (en) * 2020-08-27 2022-03-01 国民技术股份有限公司 Key security processing method, device, equipment and computer readable storage medium
CN112199214A (en) * 2020-10-13 2021-01-08 中国科学院信息工程研究所 Candidate password generation and application cracking method on GPU
CN112199707A (en) * 2020-10-28 2021-01-08 支付宝(杭州)信息技术有限公司 Data processing method, device and equipment in homomorphic encryption
CN112988237A (en) * 2021-04-21 2021-06-18 深圳致星科技有限公司 Paillier decryption system, chip and method
CN113541921A (en) * 2021-06-24 2021-10-22 电子科技大学 Fully homomorphic encryption GPU high-performance implementation method
CN113628094A (en) * 2021-07-29 2021-11-09 西安电子科技大学 High-throughput SM2 digital signature computing system and method based on GPU
CN114124349A (en) * 2021-11-19 2022-03-01 北京数牍科技有限公司 Rapid decryption method for homomorphic encryption scheme

Non-Patent Citations (8)

* Cited by examiner, † Cited by third party
Title
HAI-BIN YANG ET AL.: "Efficiency Analysis of TFHE Fully Homomorphic Encryption Software Library Based on GPU", SPRINGERLINK, 15 March 2019 (2019-03-15) *
WEI WANG ET AL.: "Accelerating leveled fully homomorphic encryption using GPU", 2014 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS), 16 July 2014 (2014-07-16) *
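吴伟民 et al.: "Research on a GPU-based ciphertext block random chaining encryption mode", 计算机工程与科学, no. 01, 15 January 2015 (2015-01-15) *
唐天泽 et al.: "GPU-accelerated implementation of large-number multiplication", 计算机应用研究, no. 10, 10 October 2017 (2017-10-10) *
夏飞 et al.: "Research on parallelization of RNA secondary structure prediction algorithms on a CPU-GPU hybrid computing platform", 国防科技大学学报, no. 06, 28 December 2013 (2013-12-28) *
小人物的挣扎: "Detailed explanation of the principles of the Paillier encryption algorithm", Retrieved from the Internet <URL:https://www.cnblogs.com/sssssaylf/p/12398133.html> *
郑志蓉: "A parallel homomorphic encryption algorithm based on a CPU-GPU hybrid system", 船舶电子工程, 20 August 2019 (2019-08-20) *
颜容: "Research and implementation of a privacy-preserving multidimensional data aggregation scheme for smart grids", 中国优秀硕士学位论文全文数据库(工程科技II辑), no. 3, 15 March 2017 (2017-03-15) *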

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116090017A (en) * 2023-04-12 2023-05-09 东南大学 Paillier-based federal learning data privacy protection method
CN117527192A (en) * 2024-01-08 2024-02-06 蓝象智联(杭州)科技有限公司 Paillier decryption method based on GPU
CN117527192B (en) * 2024-01-08 2024-04-05 蓝象智联(杭州)科技有限公司 Paillier decryption method based on GPU

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination