CN114244510A - Hardware acceleration apparatus, method, device, and storage medium - Google Patents


Info

Publication number
CN114244510A
Authority
CN
China
Prior art keywords
encryption
aes
calculation unit
unit
calculation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111563372.1A
Other languages
Chinese (zh)
Inventor
莫雄
余桉
汤晓东
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Union Memory Information System Co Ltd
Original Assignee
Shenzhen Union Memory Information System Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Union Memory Information System Co Ltd filed Critical Shenzhen Union Memory Information System Co Ltd
Priority to CN202111563372.1A priority Critical patent/CN114244510A/en
Publication of CN114244510A publication Critical patent/CN114244510A/en
Pending legal-status Critical Current

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/06 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols the encryption apparatus using shift registers or memories for block-wise or stream coding, e.g. DES systems or RC4; Hash functions; Pseudorandom sequence generators
    • H04L9/0618 Block ciphers, i.e. encrypting groups of characters of a plain text message using fixed encryption transformation
    • H04L9/0631 Substitution permutation network [SPN], i.e. cipher composed of a number of stages or rounds each involving linear and nonlinear transformations, e.g. AES algorithms
    • H04L9/08 Key distribution or management, e.g. generation, sharing or updating, of cryptographic keys or passwords
    • H04L9/0861 Generation of secret information including derivation or calculation of cryptographic keys or passwords
    • H04L9/0863 Generation of secret information including derivation or calculation of cryptographic keys or passwords involving passwords or one-time passwords
    • H04L2209/00 Additional information or applications relating to cryptographic mechanisms or cryptographic arrangements for secret or secure communication H04L9/00
    • H04L2209/12 Details relating to cryptographic hardware or logic circuitry
    • H04L2209/125 Parallelization or pipelining, e.g. for accelerating processing of cryptographic operations

Abstract

The invention relates to a hardware acceleration apparatus, method, device, and storage medium. The apparatus supports four encryption modes: SM4, AES128, AES192, and AES256. SM4 encryption requires 16 calculations, performed by SM4 calculation units 1 to 16. AES128 encryption requires 11 calculations, performed by AES calculation units 1 to 10 and AES128 calculation unit 11. AES192 encryption requires 13 calculations, performed by AES calculation units 1 to 12 and AES192 calculation unit 13. AES256 encryption requires 15 calculations, performed by AES calculation units 1 to 14 and AES256 calculation unit 15. By integrating and multiplexing these units, the invention supports the four encryption modes, greatly improves encryption and decryption speed, and optimizes the throughput of the whole system.

Description

Hardware acceleration apparatus, method, device, and storage medium
Technical Field
The present invention relates to the field of hardware acceleration technologies, and in particular, to a hardware acceleration apparatus, a hardware acceleration method, a hardware acceleration device, and a storage medium.
Background
Encryption techniques play an important role in information security. Symmetric encryption algorithms are currently the main means of encrypting information because of their low computational overhead, high encryption speed, and strong confidentiality. As shown in Fig. 1, in the prior art SM4 encryption takes 32 rounds, about twice the number of AES rounds. As a result, the operation times differ widely, the per-unit operation speed is unbalanced, and the encryption and decryption speed of the system is not optimal, so requirements cannot be met.
Disclosure of Invention
The present invention is directed to overcoming the deficiencies of the prior art and providing a hardware acceleration apparatus, method, device and storage medium.
In order to solve the technical problems, the invention adopts the following technical scheme:
A hardware acceleration apparatus, comprising 16 encryption calculation units, namely encryption calculation unit 1 to encryption calculation unit 16. Encryption calculation units 1 to 14 each include the correspondingly numbered SM4 calculation unit and AES calculation unit (SM4 calculation unit 1 and AES calculation unit 1 through SM4 calculation unit 14 and AES calculation unit 14); encryption calculation unit 15 includes SM4 calculation unit 15, and encryption calculation unit 16 includes SM4 calculation unit 16. Encryption calculation unit 11 further includes AES128 calculation unit 11, encryption calculation unit 13 further includes AES192 calculation unit 13, and encryption calculation unit 15 further includes AES256 calculation unit 15. The apparatus has 4 encryption modes, namely SM4 encryption, AES128 encryption, AES192 encryption, and AES256 encryption, which are configured through SM4_en, AES128_en, and AES192_en. SM4 encryption requires 16 calculations in total, using SM4 calculation units 1 to 16. AES128 encryption requires 11 calculations, using AES calculation units 1 to 10 and AES128 calculation unit 11. AES192 encryption requires 13 calculations, using AES calculation units 1 to 12 and AES192 calculation unit 13. AES256 encryption requires 15 calculations, using AES calculation units 1 to 14 and AES256 calculation unit 15.
The further technical scheme is as follows: the AES calculation units 1 to 10 are used for AES128 encryption, AES192 encryption and AES256 encryption, and three different encryption modes are multiplexed; the AES calculation unit 11 and the AES calculation unit 12 are used for AES192 encryption and AES256 encryption, and two different encryption modes are multiplexed.
The further technical scheme is as follows: the processing calculation of the AES calculation unit 1 includes round key addition, and the processing calculation of the AES calculation units 2 to 14 includes byte substitution, row shift, column mixing, and round key addition.
The further technical scheme is as follows: the AES128 calculation unit 11, the AES192 calculation unit 13, and the AES256 calculation unit 15 perform the same processing, namely byte substitution, row shift, and round key addition, and respectively multiplex the corresponding processing logic in the AES calculation unit 11, the AES calculation unit 13, and the AES256 calculation unit 15.
The hardware acceleration method is based on the hardware acceleration device and comprises the following steps:
acquiring 128-bit plaintext;
according to the 128-bit plaintext, performing 16 iterative calculations through SM4 calculation units 1 to 16 to obtain an SM4 ciphertext; performing 11 iterative calculations through AES calculation units 1 to 10 and AES128 calculation unit 11 to obtain an AES128 ciphertext; performing 13 iterative calculations through AES calculation units 1 to 12 and AES192 calculation unit 13 to obtain an AES192 ciphertext; performing 15 iterative calculations through AES calculation units 1 to 14 and AES256 calculation unit 15 to obtain an AES256 ciphertext.
The further technical scheme is as follows: the AES calculation units 1 to 10 are used for AES128 encryption, AES192 encryption and AES256 encryption, and three different encryption modes are multiplexed; the AES calculation unit 11 and the AES calculation unit 12 are used for AES192 encryption and AES256 encryption, and two different encryption modes are multiplexed.
The further technical scheme is as follows: the processing calculation of the AES calculation unit 1 includes round key addition, and the processing calculation of the AES calculation units 2 to 14 includes byte substitution, row shift, column mixing, and round key addition.
The further technical scheme is as follows: the AES128 calculation unit 11, the AES192 calculation unit 13, and the AES256 calculation unit 15 perform the same processing, namely byte substitution, row shift, and round key addition, and respectively multiplex the corresponding processing logic in the AES calculation unit 11, the AES calculation unit 13, and the AES256 calculation unit 15.
The hardware acceleration device comprises a memory and a processor, wherein the memory stores a computer program, and the processor realizes the hardware acceleration method when executing the computer program.
A storage medium storing a computer program comprising program instructions which, when executed by a processor, implement the hardware acceleration method as described above.
Compared with the prior art, the invention has the beneficial effects that: by means of integration and multiplexing, 4 encryption modes are supported under the condition that hardware resources are not excessively occupied; the encryption processing flow of the traditional AES/SM4 is optimized, the encryption and decryption speed is greatly improved, and the throughput rate of the whole system is optimal.
The invention is further described below with reference to the accompanying drawings and specific embodiments.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed to be used in the embodiments or the prior art descriptions will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without inventive exercise.
Fig. 1 is a schematic diagram of an SM4 encryption flow of the prior art;
Fig. 2 is a schematic diagram of an AES + SM4 encryption device;
Fig. 3 is a schematic block diagram of a hardware acceleration apparatus provided by an embodiment of the present invention;
Fig. 4 is a flowchart illustrating a hardware acceleration method according to an embodiment of the present invention;
Fig. 5 is a schematic diagram of an AES encryption flow provided by an embodiment of the present invention;
Fig. 6 is a schematic diagram of an SM4 encryption flow provided by an embodiment of the present invention;
Fig. 7 is a schematic flow chart of single-round processing of SM4 according to an embodiment of the present invention;
Fig. 8 is a diagram illustrating AES256 encryption of a 4KB data stream according to an embodiment of the present invention;
Fig. 9 is a diagram illustrating SM4 encryption of a 4KB data stream according to an embodiment of the present invention;
Fig. 10 is a schematic block diagram of a hardware acceleration device according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It will be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It is also to be understood that the terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in the specification of the present invention and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should be further understood that the term "and/or" as used in this specification and the appended claims refers to and includes any and all possible combinations of one or more of the associated listed items.
As shown in Fig. 2, the AES (Advanced Encryption Standard) + SM4 (SM4 block cipher algorithm) encryption apparatus integrates the SM4 and AES encryption algorithms and supports two modes: ECB (Electronic Codebook mode) and XTS (XEX-based Tweaked-codebook mode with ciphertext Stealing). Because decryption mirrors encryption, the description below uses encryption as the example.
The encryption circuit mainly comprises two parts: a. a round key generation unit, which performs a key expansion operation on the input parameter key and a key encryption operation on the input parameter hash key, generating the round keys and the cipher hash key (the hash key ciphertext) respectively; b. an AES/SM4 encryption unit, which encrypts the plaintext to generate the ciphertext. The mode is selected by XTS_en (XTS enable). When XTS_en is 0, ECB mode is selected, which corresponds to the classical encryption processing flow: neither the plaintext input before the encryption operation nor the ciphertext generated after it receives any additional processing. When XTS_en is 1, XTS mode is selected, which corresponds to an encryption and decryption processing flow with a higher security level: both the plaintext input before the encryption operation and the ciphertext generated after it are XORed with the cipher hash key and alpha (the perturbation factor).
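The difference between the two modes can be sketched in Python. This is a minimal illustration, not the patented circuit: `toy_encrypt` is a hypothetical stand-in for the AES/SM4 encryption unit, and the sketch assumes that the cipher hash key acts as the per-block whitening value and is multiplied by alpha in GF(2^128) for each successive block, as in standard XTS. The text above does not spell out that update rule.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def mul_alpha(t: bytes) -> bytes:
    """Multiply a 128-bit little-endian value by alpha in GF(2^128)
    (standard XTS convention, reduction constant 0x87)."""
    n = int.from_bytes(t, "little") << 1
    if n >> 128:
        n = (n & ((1 << 128) - 1)) ^ 0x87
    return n.to_bytes(16, "little")

def toy_encrypt(block: bytes, key: bytes) -> bytes:
    """Placeholder block cipher (NOT AES or SM4): XOR with key, rotate one byte."""
    mixed = xor_bytes(block, key)
    return mixed[1:] + mixed[:1]

def encrypt_blocks(blocks, key, cipher_hash_key, xts_en):
    """ECB when xts_en is False; XOR whitening before and after when True."""
    out, t = [], cipher_hash_key
    for p in blocks:
        if xts_en:
            c = xor_bytes(toy_encrypt(xor_bytes(p, t), key), t)
            t = mul_alpha(t)  # assumed per-block update of the whitening value
        else:
            c = toy_encrypt(p, key)
        out.append(c)
    return out

key = bytes(range(16, 32))
hash_key_ct = bytes(range(16))          # hypothetical cipher hash key
same = [b"\xaa" * 16, b"\xaa" * 16]     # two identical plaintext blocks
ecb = encrypt_blocks(same, key, hash_key_ct, xts_en=False)
xts = encrypt_blocks(same, key, hash_key_ct, xts_en=True)
```

With two identical plaintext blocks, the ECB outputs are identical while the whitened outputs differ, which is the property that motivates the extra XOR steps.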
Referring to the embodiments shown in Figs. 3 to 10, and in particular to Fig. 3, the present invention discloses a hardware acceleration apparatus comprising 16 encryption calculation units, namely encryption calculation unit 1 to encryption calculation unit 16. Encryption calculation units 1 to 14 each include the correspondingly numbered SM4 calculation unit and AES calculation unit; that is, for k = 1, 2, …, 14, encryption calculation unit k includes SM4 calculation unit k and AES calculation unit k. Encryption calculation unit 15 includes SM4 calculation unit 15, and encryption calculation unit 16 includes SM4 calculation unit 16. Encryption calculation unit 11 further includes AES128 calculation unit 11, encryption calculation unit 13 further includes AES192 calculation unit 13, and encryption calculation unit 15 further includes AES256 calculation unit 15. The apparatus has 4 encryption modes, namely SM4 encryption, AES128 encryption, AES192 encryption, and AES256 encryption, which are configured through SM4_en, AES128_en, and AES192_en. SM4 encryption requires 16 calculations in total, using SM4 calculation units 1 to 16. AES128 encryption requires 11 calculations, using AES calculation units 1 to 10 and AES128 calculation unit 11. AES192 encryption requires 13 calculations, using AES calculation units 1 to 12 and AES192 calculation unit 13. AES256 encryption requires 15 calculations, using AES calculation units 1 to 14 and AES256 calculation unit 15.
Wherein, the AES calculation units 1 to 10 are used for AES128 encryption, AES192 encryption and AES256 encryption, and three different encryption modes are multiplexed; the AES calculation unit 11 and the AES calculation unit 12 are used for AES192 encryption and AES256 encryption, and two different encryption modes are multiplexed.
Wherein, the processing calculation of the AES calculation unit 1 comprises round key addition, and the processing calculation of the AES calculation units 2 to 14 comprises byte replacement, row shift, column mixing and round key addition.
The AES128 calculation unit 11, the AES192 calculation unit 13, and the AES256 calculation unit 15 perform the same processing, namely byte substitution, row shift, and round key addition, and respectively multiplex the corresponding processing logic in the AES calculation unit 11, the AES calculation unit 13, and the AES calculation unit 15.
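The unit assignment described above can be tabulated with a short Python sketch. The function and constant names (`aes_stage_plan`, `modes_sharing`, `STAGES`) are hypothetical and only restate the unit counts and per-unit operations given in the text; they are not part of the disclosed hardware.

```python
# Number of calculations (pipeline stages) per AES mode, as stated in the text.
STAGES = {"AES128": 11, "AES192": 13, "AES256": 15}

FIRST = ("round key addition",)
FULL = ("byte substitution", "row shift", "column mixing", "round key addition")
FINAL = ("byte substitution", "row shift", "round key addition")

def aes_stage_plan(mode):
    """List (unit name, operations) for every stage used by the given AES mode."""
    n = STAGES[mode]
    plan = [("AES calculation unit 1", FIRST)]
    for k in range(2, n):
        plan.append((f"AES calculation unit {k}", FULL))
    # The last stage omits column mixing and is the mode-specific unit.
    plan.append((f"{mode} calculation unit {n}", FINAL))
    return plan

def modes_sharing(unit_index):
    """AES modes that use the general AES calculation unit with this index."""
    return [mode for mode, n in STAGES.items() if unit_index < n]

for name, ops in aes_stage_plan("AES128"):
    print(name, "->", ", ".join(ops))
```

Under these assumptions, `modes_sharing(10)` lists all three AES modes and `modes_sharing(12)` lists AES192 and AES256, matching the three-way and two-way multiplexing described above.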
As shown in Fig. 5, the operations of the AES encryption unit include: 1. round key addition; 2. byte substitution; 3. row shift; 4. column mixing. The number of AES iterations depends on the bit width of the key, and each calculation must be supplied with the corresponding round key W[i].
When the bit width of the key is 128 bits, the total number of calculations for AES128 is 11, and the required round keys are W[0], W[1], …, W[43]; in the figure, i ranges over [1, 9] and j is 10.
When the bit width of the key is 192 bits, the total number of calculations for AES192 is 13, and the required round keys are W[0], W[1], …, W[51]; in the figure, i ranges over [1, 11] and j is 12.
When the bit width of the key is 256 bits, the total number of calculations for AES256 is 15, and the required round keys are W[0], W[1], …, W[59]; in the figure, i ranges over [1, 13] and j is 14.
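The three cases follow one arithmetic pattern, which the Python sketch below makes explicit. It assumes the standard AES relation in which the number of rounds equals the key length in 32-bit words plus 6, and that each calculation consumes four 32-bit round key words; the function name `aes_schedule_size` is hypothetical.

```python
def aes_schedule_size(key_bits):
    """Return (calculations, last round key word index, i range, j) for a key width."""
    nk = key_bits // 32          # key length in 32-bit words: 4, 6 or 8
    rounds = nk + 6              # standard AES round count: 10, 12 or 14
    calculations = rounds + 1    # initial round key addition plus all rounds
    words = 4 * calculations     # four 32-bit words W[...] per calculation
    i_range = (1, rounds - 1)    # full rounds, including column mixing
    j = rounds                   # final round, without column mixing
    return calculations, words - 1, i_range, j

for bits in (128, 192, 256):
    calcs, last_w, i_range, j = aes_schedule_size(bits)
    print(f"AES{bits}: {calcs} calculations, W[0]..W[{last_w}], i in {i_range}, j = {j}")
```

For a 256-bit key this gives 60 words of 32 bits, that is, 1920 bits of round key material.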
In the present embodiment, the processing of the AES128 calculation unit 11, the AES192 calculation unit 13, and the AES256 calculation unit 15 includes byte substitution, row shift, and round key addition.
As shown in Fig. 6, in the SM4 encryption process each round corresponds to state information as follows: F(round 1): $[X_0, X_1, X_2, X_3]$, F(round 2): $[X_2, X_3, X_4, X_5]$, …, F(round i): $[X_{2i}, X_{2i+1}, X_{2i+2}, X_{2i+3}]$, …, F(round 16): $[X_{32}, X_{33}, X_{34}, X_{35}]$. The number of iterations is 16. $X_i$ denotes the state information required before the SM4 encryption operation; F(round i) denotes the i-th round SM4 encryption calculation unit.
The encryption function of the single-round encryption processing flow of SM4 shown in Fig. 7 can be expressed as follows:

$$[X_{i+4}, X_{i+5}] = [F(X_i, X_{i+1}, X_{i+2}, X_{i+3}, rk_i),\ F(X_{i+1}, X_{i+2}, X_{i+3}, X_{i+4}, rk_{i+1})]$$

$$= [X_i \oplus T(X_{i+1} \oplus X_{i+2} \oplus X_{i+3} \oplus rk_i),\ X_{i+1} \oplus T(X_{i+2} \oplus X_{i+3} \oplus X_{i+4} \oplus rk_{i+1})]$$

Here the T transform is a composite transform consisting of a nonlinear transform $\tau$ (S-box table lookup substitution) and a linear transform $L$ (cyclic shift operations), i.e. $T(x) = L(\tau(x))$. Three points should be noted: a. a single round of encryption processing in this scheme corresponds to two rounds of the traditional SM4 encryption processing flow; b. a single round in this scheme generates 2 pieces of state information, 1 more than the single piece generated by a traditional SM4 round; c. in this scheme, the 4 pieces of state information before single-round processing are replaced as follows: $[X_0, X_1, X_2, X_3]$, $[X_2, X_3, X_4, X_5]$, …, $[X_{2i}, X_{2i+1}, X_{2i+2}, X_{2i+3}]$, …, $[X_{32}, X_{33}, X_{34}, X_{35}]$.
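The claim that 16 double rounds reproduce 32 traditional rounds can be checked with a small Python sketch. This is an illustration under stated assumptions: `TOY_SBOX` is a placeholder byte permutation and NOT the real SM4 S-box, the round keys are arbitrary test values rather than the output of the SM4 key expansion, and the final word reversal of SM4 is omitted. The linear transform uses the SM4 rotation amounts 2, 10, 18 and 24. All function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF

def rotl32(x, n):
    """Rotate a 32-bit word left by n bits."""
    return ((x << n) | (x >> (32 - n))) & MASK32

# Placeholder substitution table: NOT the real SM4 S-box, illustration only.
TOY_SBOX = [(7 * b + 3) & 0xFF for b in range(256)]

def tau(x):
    """Nonlinear transform: substitute each of the four bytes of x."""
    return sum(TOY_SBOX[(x >> s) & 0xFF] << s for s in (0, 8, 16, 24))

def lin(b):
    """Linear transform L: XOR of cyclic shifts by 0, 2, 10, 18 and 24 bits."""
    return b ^ rotl32(b, 2) ^ rotl32(b, 10) ^ rotl32(b, 18) ^ rotl32(b, 24)

def T(x):
    return lin(tau(x))

def F(x0, x1, x2, x3, rk):
    """Traditional single-round function: X_{i+4} = X_i ^ T(X_{i+1} ^ X_{i+2} ^ X_{i+3} ^ rk_i)."""
    return x0 ^ T(x1 ^ x2 ^ x3 ^ rk)

def sm4_stage(state, rk_a, rk_b):
    """One stage of the scheme: two traditional rounds, producing two new words."""
    x0, x1, x2, x3 = state
    x4 = F(x0, x1, x2, x3, rk_a)
    x5 = F(x1, x2, x3, x4, rk_b)
    return (x2, x3, x4, x5)

def encrypt_staged(state, rks):
    """16 stages, each consuming two round keys."""
    for k in range(16):
        state = sm4_stage(state, rks[2 * k], rks[2 * k + 1])
    return state

def encrypt_classic(state, rks):
    """32 traditional rounds, producing one new word per round."""
    x = list(state)
    for i in range(32):
        x.append(F(x[i], x[i + 1], x[i + 2], x[i + 3], rks[i]))
    return tuple(x[32:36])

rks = [(0x9E3779B9 * (i + 1)) & MASK32 for i in range(32)]  # arbitrary test keys
start = (0x01234567, 0x89ABCDEF, 0xFEDCBA98, 0x76543210)
print(encrypt_staged(start, rks) == encrypt_classic(start, rks))
```

Both paths end in the state $[X_{32}, X_{33}, X_{34}, X_{35}]$, so the stage count halves without changing the result; each stage simply does twice the work of a traditional round.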
As shown in Fig. 3, the hardware acceleration apparatus integrates the four encryption processing flows of SM4, AES128, AES192, and AES256, and has 16 stages of operation. Its 4 encryption modes are configured through SM4_en, AES128_en, and AES192_en; AES256 is enabled by default.
The configuration relationship between the 4 encryption modes and SM4_en, AES128_en, and AES192_en is shown in the following table:
SM4_en    AES128_en    AES192_en    Description
On        Off          Off          SM4 enabled
Off       On           Off          AES128 enabled
Off       Off          On           AES192 enabled
Off       Off          Off          AES256 enabled (default)
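The table can be read as a small decoding function. The Python sketch below is one possible interpretation: the function name `select_mode` is hypothetical, and because the table does not say what happens when more than one enable signal is on, the sketch treats that case as an error by assumption.

```python
# Number of pipeline stages used by each mode, as given in the description.
STAGE_COUNT = {"SM4": 16, "AES128": 11, "AES192": 13, "AES256": 15}

def select_mode(sm4_en=False, aes128_en=False, aes192_en=False):
    """Decode the three enable signals into (mode, stage count)."""
    flags = {"SM4": sm4_en, "AES128": aes128_en, "AES192": aes192_en}
    active = [name for name, on in flags.items() if on]
    if len(active) > 1:
        # Assumption: combinations not listed in the table are rejected.
        raise ValueError(f"conflicting enables: {active}")
    mode = active[0] if active else "AES256"  # AES256 is the default
    return mode, STAGE_COUNT[mode]

print(select_mode(sm4_en=True))
print(select_mode())
```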
Each single-round encryption calculation unit in the hardware acceleration apparatus integrates a single-round SM4 calculation unit and a single-round AES calculation unit. The processing flow of the single-round SM4 calculation unit is shown in Fig. 7, and the processing flow of the single-round AES calculation unit is shown in Fig. 5. The processing flow of AES calculation unit 1, included in encryption calculation unit 1, is round key addition, consistent with AES calculation unit 1 in Fig. 5. The processing flow of AES calculation units 2 to 14, included in encryption calculation units 2 to 14 respectively, consists of byte substitution, row shift, column mixing, and round key addition, consistent with AES calculation units 2 to 14 in Fig. 5. Encryption calculation units 11, 13, and 15 include AES128 calculation unit 11, AES192 calculation unit 13, and AES256 calculation unit 15 respectively. These denote the last calculation unit before the AES128, AES192, and AES256 ciphertext is generated; their processing flow is the same as that of AES calculation unit 15 in Fig. 5 and consists of byte substitution, row shift, and round key addition. Multiplexing can be adopted in the hardware design so that hardware resources are not consumed excessively. The numbers of stages required for SM4, AES128, AES192, and AES256 encryption are 16, 11, 13, and 15 respectively.
Fig. 8 shows AES256 encryption applied to a 4KB data stream, and Fig. 9 shows SM4 encryption applied to a 4KB data stream. Take AES256 as the encryption algorithm, ECB as the mode, and 4KB (32768 bits) of plaintext as an example. The 256-bit key is processed by key expansion in the Set AES256 Key step to generate 1920 bits of round keys; the AES256 encryption operation unit then processes the 4KB plaintext at a granularity of 128 bits, using the 1920 bits of round keys, to generate 4KB of ciphertext. Fig. 8 shows the data flow path during AES256 encryption. In the figure, data i,j denotes data block i being processed by encryption calculation unit j. The 15 encryption calculation units of AES256 correspond to the range [1:15] of j; the 4KB plaintext splits into 256 data blocks of 128 bits each, corresponding to the range [0:255] of i. Each plaintext block therefore yields its ciphertext after 15 rounds of encryption calculation. For example, the ciphertext blocks for the 1st, 2nd, …, 256th plaintext blocks are data 0,15, data 1,15, …, data 255,15 respectively. Because consecutively input plaintext blocks produce consecutive ciphertext blocks after passing through the encryption operation unit, the throughput of the hardware is improved.
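The data flow for the 4KB example can be modelled with a short Python sketch. It assumes an idealised pipeline in which every stage takes exactly one time step and a new block enters at every step; the cycle numbering and the names `pipeline_schedule` and `completion_cycle` are hypothetical and are not taken from Fig. 8.

```python
def pipeline_schedule(num_blocks, num_stages):
    """Map each cycle to the (block i, stage j) pairs active in that cycle."""
    schedule = {}
    for i in range(num_blocks):               # block index i in [0, num_blocks - 1]
        for j in range(1, num_stages + 1):    # stage index j in [1, num_stages]
            schedule.setdefault(i + j, []).append((i, j))
    return schedule

def completion_cycle(block, num_stages):
    """Cycle in which the ciphertext for the given block leaves the last stage."""
    return block + num_stages

blocks = 4 * 1024 * 8 // 128            # 4KB of plaintext in 128-bit blocks
sched = pipeline_schedule(blocks, 15)   # 15 stages for AES256
pipelined = max(sched)                  # total cycles with overlapping blocks
serial = blocks * 15                    # total cycles without overlap
print(blocks, pipelined, serial)
```

Under these assumptions the 256 blocks finish in 270 cycles instead of 3840, and one ciphertext block emerges per cycle once the pipeline is full, which is the consecutive-output behaviour described above.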
Various combinations of encryption modes integrated in the device also belong to the variation of the scheme, and belong to the protection scope of the invention.
Referring to fig. 4, the present invention also discloses a hardware acceleration method, based on the hardware acceleration apparatus, including the following steps:
S1, obtaining a 128-bit plaintext;
S2, according to the 128-bit plaintext, performing 16 iterative calculations through SM4 calculation units 1 to 16 to obtain an SM4 ciphertext; performing 11 iterative calculations through AES calculation units 1 to 10 and AES128 calculation unit 11 to obtain an AES128 ciphertext; performing 13 iterative calculations through AES calculation units 1 to 12 and AES192 calculation unit 13 to obtain an AES192 ciphertext; performing 15 iterative calculations through AES calculation units 1 to 14 and AES256 calculation unit 15 to obtain an AES256 ciphertext.
Wherein, the AES calculation units 1 to 10 are used for AES128 encryption, AES192 encryption and AES256 encryption, and three different encryption modes are multiplexed; the AES calculation unit 11 and the AES calculation unit 12 are used for AES192 encryption and AES256 encryption, and two different encryption modes are multiplexed.
Wherein, the processing calculation of the AES calculation unit 1 comprises round key addition, and the processing calculation of the AES calculation units 2 to 14 comprises byte replacement, row shift, column mixing and round key addition.
The AES128 calculation unit 11, the AES192 calculation unit 13, and the AES256 calculation unit 15 perform the same processing, namely byte substitution, row shift, and round key addition, and respectively multiplex the corresponding processing logic in the AES calculation unit 11, the AES calculation unit 13, and the AES calculation unit 15.
The processing flow of AES calculation unit 1, included in encryption calculation unit 1, is round key addition, consistent with AES calculation unit 1 in Fig. 5. The processing flow of AES calculation units 2 to 14, included in encryption calculation units 2 to 14 respectively, consists of byte substitution, row shift, column mixing, and round key addition, consistent with AES calculation units 2 to 14 in Fig. 5. Encryption calculation units 11, 13, and 15 include AES128 calculation unit 11, AES192 calculation unit 13, and AES256 calculation unit 15 respectively. These denote the last calculation unit before the AES128, AES192, and AES256 ciphertext is generated; their processing flow is the same as that of AES calculation unit 15 in Fig. 5 and consists of byte substitution, row shift, and round key addition.
The invention optimizes the round calculation in traditional AES/SM4 encryption and decryption by reducing the number of SM4 rounds from 32 to 16, bringing it close to the round counts of AES128/192/256. The data processing unit of the SM4 + AES algorithm has a one-way data flow, so each round of iteration can be treated as one pipeline stage; the 16 rounds of iteration form a 16-stage pipeline in which every stage takes the same time, which greatly improves the encryption and decryption speed. Through integration and multiplexing, 4 encryption modes are supported without excessive use of hardware resources, and the throughput of the whole system is optimized.
It should be noted that, as will be clear to those skilled in the art, for the specific implementation process of the foregoing method embodiment, reference may be made to the corresponding description of the foregoing hardware acceleration apparatus and its units; for convenience and brevity, it is not repeated here.
The hardware acceleration apparatus may be implemented in the form of a computer program, and the computer program may be run on a hardware acceleration device as shown in fig. 10.
Referring to fig. 10, fig. 10 is a schematic block diagram of a hardware acceleration device according to an embodiment of the present application; the hardware acceleration device 500 may be a terminal or a server, where the terminal may be an electronic device with a communication function, such as a smart phone, a tablet computer, a notebook computer, a desktop computer, a personal digital assistant, and a wearable device. The server may be an independent server or a server cluster composed of a plurality of servers.
Referring to fig. 10, the hardware acceleration device 500 includes a processor 502, a memory, and a network interface 505 connected by a system bus 501, wherein the memory may include a non-volatile storage medium 503 and an internal memory 504.
The non-volatile storage medium 503 may store an operating system 5031 and a computer program 5032. The computer program 5032 includes program instructions that, when executed, cause the processor 502 to perform the hardware acceleration method.
The processor 502 is used to provide computing and control capabilities to support the operation of the overall hardware acceleration device 500.
The internal memory 504 provides an environment for the execution of the computer program 5032 in the non-volatile storage medium 503, and when the computer program 5032 is executed by the processor 502, the processor 502 can be enabled to execute the hardware acceleration method.
The network interface 505 is used for network communication with other devices. Those skilled in the art will appreciate that the structure shown in fig. 10 is a block diagram of only the part of the structure related to the present application and does not limit the hardware acceleration device 500 to which the present application is applied; a particular hardware acceleration device 500 may include more or fewer components than those shown in the figure, combine some components, or have a different arrangement of components.
It should be understood that, in the embodiments of the present application, the processor 502 may be a Central Processing Unit (CPU), or it may be another general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general-purpose processor may be a microprocessor or any conventional processor.
It will be understood by those skilled in the art that all or part of the flow of the method implementing the above embodiments may be implemented by a computer program instructing associated hardware. The computer program includes program instructions, and the computer program may be stored in a storage medium, which is a computer-readable storage medium. The program instructions are executed by at least one processor in the computer system to implement the flow steps of the embodiments of the method described above.
Accordingly, the present invention also provides a storage medium. The storage medium may be a computer-readable storage medium. The storage medium stores a computer program, wherein the computer program comprises program instructions which, when executed by a processor, implement the hardware acceleration method described above.
The storage medium may be a USB disk, a removable hard disk, a Read-Only Memory (ROM), a magnetic disk, an optical disk, or any other computer-readable storage medium capable of storing program code.
Those of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein may be implemented in electronic hardware, computer software, or a combination of both. To clearly illustrate the interchangeability of hardware and software, the components and steps of the examples have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends upon the particular application and the design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
In the embodiments provided in the present invention, it should be understood that the disclosed apparatus and method may be implemented in other ways. The above-described apparatus embodiments are merely illustrative: the division into units is only a division by logical function, and other divisions are possible in an actual implementation. For example, several units or components may be combined or integrated into another system, or some features may be omitted or not implemented.
The steps in the method of the embodiment of the invention can be sequentially adjusted, combined and deleted according to actual needs. The units in the device of the embodiment of the invention can be merged, divided and deleted according to actual needs. In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a storage medium. Based on this understanding, the technical solution of the present invention, in essence, or the part of it that contributes over the prior art, or all or part of the technical solution, may be embodied in the form of a software product that is stored in a storage medium and includes instructions for enabling a hardware acceleration device (which may be a personal computer, a terminal, or a network device) to execute all or part of the steps of the methods according to the embodiments of the present invention.
The above embodiments are preferred implementations of the present invention, and the present invention can be implemented in other ways without departing from the spirit of the present invention.

Claims (10)

1. A hardware acceleration apparatus, comprising 16 encryption calculation units, namely encryption calculation unit 1 to encryption calculation unit 16; wherein the encryption calculation units 1 to 14 respectively include SM4 calculation unit 1 and AES calculation unit 1 through SM4 calculation unit 14 and AES calculation unit 14, the encryption calculation unit 15 includes an SM4 calculation unit 15, and the encryption calculation unit 16 includes an SM4 calculation unit 16; the encryption calculation unit 11 further includes an AES128 calculation unit 11, the encryption calculation unit 13 further includes an AES192 calculation unit 13, and the encryption calculation unit 15 further includes an AES256 calculation unit 15; the apparatus has 4 encryption modes, namely SM4 encryption, AES128 encryption, AES192 encryption, and AES256 encryption, and SM4_en, AES128_en, and AES192_en are used to configure the corresponding encryption mode; the SM4 encryption uses SM4 calculation unit 1 to SM4 calculation unit 16 and requires 16 calculations in total; the AES128 encryption requires 11 calculations, and the calculation units required for the AES128 encryption include AES calculation unit 1 to AES calculation unit 10 and the AES128 calculation unit 11; the AES192 encryption requires 13 calculations, and the calculation units required for the AES192 encryption include AES calculation unit 1 to AES calculation unit 12 and the AES192 calculation unit 13; the AES256 encryption requires 15 calculations, and the calculation units required for the AES256 encryption include AES calculation unit 1 to AES calculation unit 14 and the AES256 calculation unit 15.
2. The hardware acceleration apparatus of claim 1, characterized in that, the AES calculation units 1 to 10 are used for AES128 encryption, AES192 encryption, and AES256 encryption, three different encryption modes being multiplexed; the AES calculation unit 11 and the AES calculation unit 12 are used for AES192 encryption and AES256 encryption, and two different encryption modes are multiplexed.
3. The hardware acceleration apparatus of claim 2, wherein the processing computations of the AES computation unit 1 include round key addition, and the processing computations of the AES computation units 2 through 14 include byte substitution, row shift, column mix, and round key addition.
4. The hardware acceleration apparatus of claim 3, wherein the AES128 calculation unit 11, the AES192 calculation unit 13, and the AES256 calculation unit 15 perform the same processing calculations, respectively multiplexing the processing calculations in the AES calculation unit 11, the AES calculation unit 13, and the AES calculation unit 15, the processing calculations comprising: byte substitution, row shift, and round key addition.
5. A hardware acceleration method, characterized in that it is based on the hardware acceleration apparatus of claim 1 and comprises the following steps:
acquiring 128-bit plaintext;
according to the 128-bit plaintext, performing 16 iterative calculations through SM4 calculation unit 1 to SM4 calculation unit 16 to obtain an SM4 ciphertext; performing 11 iterative calculations through AES calculation unit 1 to AES calculation unit 10 and the AES128 calculation unit 11 to obtain an AES128 ciphertext; performing 13 iterative calculations through AES calculation unit 1 to AES calculation unit 12 and the AES192 calculation unit 13 to obtain an AES192 ciphertext; and performing 15 iterative calculations through AES calculation unit 1 to AES calculation unit 14 and the AES256 calculation unit 15 to obtain an AES256 ciphertext.
6. The hardware acceleration method of claim 5, characterized in that the AES calculation units 1 to 10 are used for AES128 encryption, AES192 encryption, and AES256 encryption, three different encryption modes being multiplexed; the AES calculation unit 11 and the AES calculation unit 12 are used for AES192 encryption and AES256 encryption, and two different encryption modes are multiplexed.
7. The hardware acceleration method of claim 6, wherein the processing computations of the AES computation unit 1 comprise round key addition, and the processing computations of the AES computation unit 2 through AES computation unit 14 comprise byte substitution, row shifting, column mixing, and round key addition.
8. The hardware acceleration method of claim 7, wherein the AES128 calculation unit 11, the AES192 calculation unit 13, and the AES256 calculation unit 15 perform the same processing calculations, respectively multiplexing the processing calculations in the AES calculation unit 11, the AES calculation unit 13, and the AES calculation unit 15, the processing calculations comprising: byte substitution, row shift, and round key addition.
9. A hardware acceleration device, characterized in that the hardware acceleration device comprises a memory on which a computer program is stored and a processor which, when executing the computer program, implements the hardware acceleration method according to any one of claims 5-8.
10. A storage medium storing a computer program comprising program instructions which, when executed by a processor, implement the hardware acceleration method of any one of claims 5-8.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111563372.1A CN114244510A (en) 2021-12-20 2021-12-20 Hardware acceleration apparatus, method, device, and storage medium

Publications (1)

Publication Number Publication Date
CN114244510A true CN114244510A (en) 2022-03-25

Family

ID=80759389

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111563372.1A Pending CN114244510A (en) 2021-12-20 2021-12-20 Hardware acceleration apparatus, method, device, and storage medium

Country Status (1)

Country Link
CN (1) CN114244510A (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107425976A (en) * 2017-04-26 2017-12-01 美的智慧家居科技有限公司 Key chip system and internet of things equipment
US20190044699A1 (en) * 2018-06-28 2019-02-07 Intel Corporation Reconfigurable galois field sbox unit for camellia, aes, and sm4 hardware accelerator
CN109379180A (en) * 2018-12-20 2019-02-22 湖南国科微电子股份有限公司 Aes algorithm implementation method, device and solid state hard disk
CN110138541A (en) * 2018-02-02 2019-08-16 英特尔公司 Uniform hardware accelerator for symmetric key cipher
US20190386815A1 (en) * 2018-06-15 2019-12-19 Intel Corporation Unified aes-sms4-camellia symmetric key block cipher acceleration
CN111767586A (en) * 2020-06-09 2020-10-13 北京智芯微电子科技有限公司 Microprocessor and safety chip with built-in hardware cryptographic algorithm coprocessor
CN111865560A (en) * 2020-06-23 2020-10-30 华中科技大学 AES password coprocessor and terminal equipment
CN113722702A (en) * 2021-09-01 2021-11-30 上海兆芯集成电路有限公司 Processor with block cipher algorithm and processing method thereof

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
HUAFENG CHEN: "An efficient hardware implementation of SM4", 2017 4TH INTERNATIONAL CONFERENCE ON INFORMATION TECHNOLOGY AND CAREER EDUCATION, 30 November 2017 (2017-11-30) *
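修佳鹏; 田超宇; 杨正球; 王志龙: "Research on an application scheme for Chinese national cryptographic algorithms in the SecOC security mechanism", Journal of Information Security Research (信息安全研究), no. 09, 5 September 2020 (2020-09-05) *
李秀滢; 吉晨昊; 段晓毅; 周长春: "Parallel implementation of the SM4 algorithm on GPU", Netinfo Security (信息网络安全), no. 06, 6 July 2020 (2020-07-06) *
费雄伟; 李肯立; 阳王东; 杜家宜: "Implementation and acceleration efficiency exploration of a CUDA-based parallel AES algorithm", Computer Science (计算机科学), no. 01, 15 January 2015 (2015-01-15) *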

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination