CN115883068A - SM4 encryption algorithm hardware architecture suitable for microprocessor chip - Google Patents

SM4 encryption algorithm hardware architecture suitable for microprocessor chip

Info

Publication number
CN115883068A
CN115883068A CN202211679820.9A CN202211679820A CN115883068A CN 115883068 A CN115883068 A CN 115883068A CN 202211679820 A CN202211679820 A CN 202211679820A CN 115883068 A CN115883068 A CN 115883068A
Authority
CN
China
Prior art keywords
encryption
data
computation
hardware architecture
microprocessor
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211679820.9A
Other languages
Chinese (zh)
Inventor
沙金
孟繁树
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University
Original Assignee
Nanjing University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University
Priority to CN202211679820.9A
Publication of CN115883068A
Legal status: Pending

Landscapes

  • Storage Device Security (AREA)

Abstract

Information transmission and interaction are increasing steadily. During transmission, the confidentiality and security of information must be protected; otherwise key information may be leaked or maliciously tampered with, leading to serious economic loss or potential safety hazards. To protect confidentiality and security, an encryption algorithm hardware module is usually embedded in a microprocessor chip so that information is encrypted securely inside the chip. The invention provides a low-complexity SM4 encryption algorithm hardware architecture suitable for a microprocessor chip. It comprises a control module, a round key calculation unit and encryption operation units. Several encryption operation units encrypt information inside the microprocessor chip, and the control module schedules them so that data is encrypted continuously, giving an encryption process with low complexity and high performance.

Description

SM4 encryption algorithm hardware architecture suitable for microprocessor chip
Technical Field
The invention relates to the field of information security, in particular to an SM4 encryption algorithm hardware architecture suitable for a microprocessor chip.
Background
With the development of information technology, information is transmitted and exchanged more and more frequently, and once information is tampered with or leaked in transit, the economic loss can be immeasurable. It is therefore important to ensure that information is transmitted and exchanged securely, and processing data with an encryption algorithm is an effective technical means of ensuring information security.
The State Cryptography Administration has successively released cryptographic algorithms such as SM1, SM2, SM3 and SM4, collectively called the national cryptographic algorithms. Among them, SM4 is the most prominent in commercial encryption and was adopted as a national commercial cryptographic standard in March 2012.
The SM4 algorithm is a block cipher whose block length and key length are both 128 bits. It consists mainly of a key expansion algorithm and an encryption (decryption) algorithm. The algorithm uses a 32-round nonlinear iterative structure; encryption and decryption have the same structure, and only the order in which the round keys are used is reversed. The key expansion algorithm takes a 128-bit initial key as input and outputs 32 round keys of 32 bits each. The data encryption algorithm takes 128 bits of plaintext and the 32 round keys as input and outputs 128 bits of ciphertext.
The encryption algorithm of SM4 is as follows. Let the 128-bit plaintext be $X = (x_0, x_1, x_2, x_3)$ and the 128-bit output ciphertext be $Y = (y_0, y_1, y_2, y_3)$, where $x_j$ and $y_j$ ($j = 0, 1, 2, 3$) are 32-bit binary words. Let the round key of round $i$ be $rk_i$ ($i = 0, 1, 2, \ldots, 31$), each a 32-bit binary word, and let the round function of each iteration be $F_1$. The encryption algorithm is then
$$x_{i+4} = F_1(x_i, x_{i+1}, x_{i+2}, x_{i+3}, rk_i), \quad i = 0, 1, 2, \ldots, 31$$
$$Y = (y_0, y_1, y_2, y_3) = (x_{35}, x_{34}, x_{33}, x_{32})$$
The key expansion algorithm of SM4 is as follows. Let the 128-bit initial key be $MK = (mk_0, mk_1, mk_2, mk_3)$, where each $mk_j$ ($j = 0, 1, 2, 3$) is a 32-bit word, and let the round function be $F_2$. The intermediate variables are $k_i$ and the round keys are $rk_i$ ($i = 0, 1, 2, \ldots, 31$), all 32-bit words. The system parameter is $FK = (fk_0, fk_1, fk_2, fk_3)$ and the fixed parameter is $CK = (ck_0, ck_1, ck_2, \ldots, ck_{31})$, where every $fk_j$ and $ck_i$ is a 32-bit word. The round keys can then be formulated as
$$(k_0, k_1, k_2, k_3) = (mk_0 \oplus fk_0,\ mk_1 \oplus fk_1,\ mk_2 \oplus fk_2,\ mk_3 \oplus fk_3)$$
$$rk_i = k_{i+4} = F_2(k_i, k_{i+1}, k_{i+2}, k_{i+3}, ck_i), \quad i = 0, 1, 2, \ldots, 31$$
Here, $F_1$ in the encryption algorithm and $F_2$ in the key expansion algorithm have essentially the same structure: each consists of three parts, namely XOR operations, a nonlinear transformation and cyclic shift operations. They differ only in the input parameters of the round function and in the shift values of the internal cyclic shifts.
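As a rough illustration of the two round functions and the iteration formulas, the following Python sketch may help. The shift values (2, 10, 18, 24 for the encryption round; 13, 23 for the key expansion round), the FK words and the rule used to generate CK are taken from the published SM4 standard rather than from this document. `PLACEHOLDER_SBOX` is a hypothetical stand-in and is not the real SM4 S-box, so the output is not valid SM4 ciphertext; all function names are illustrative.

```python
MASK32 = 0xFFFFFFFF

def rotl32(x, n):
    """Rotate a 32-bit word left by n bits (cyclic shift)."""
    return ((x << n) | (x >> (32 - n))) & MASK32

# Placeholder byte permutation, NOT the SM4 S-box (assumption for illustration).
# Substitute the 256-entry table from the SM4 standard for real use.
PLACEHOLDER_SBOX = [(7 * b + 3) % 256 for b in range(256)]

# System parameter FK and fixed parameter CK as given in the SM4 standard.
FK = (0xA3B1BAC6, 0x56AA3350, 0x677D9197, 0xB27022DC)
CK = [sum((((4 * i + j) * 7) % 256) << (24 - 8 * j) for j in range(4))
      for i in range(32)]

def tau(word, sbox):
    """Nonlinear transformation: apply the S-box to each of the four bytes."""
    out = 0
    for shift in (24, 16, 8, 0):
        out |= sbox[(word >> shift) & 0xFF] << shift
    return out

def lin_enc(b):
    """Cyclic-shift part of F1 (shift values 2, 10, 18, 24)."""
    return b ^ rotl32(b, 2) ^ rotl32(b, 10) ^ rotl32(b, 18) ^ rotl32(b, 24)

def lin_key(b):
    """Cyclic-shift part of F2 (shift values 13, 23)."""
    return b ^ rotl32(b, 13) ^ rotl32(b, 23)

def f1(x0, x1, x2, x3, rk, sbox):
    """Encryption round function: XOR, nonlinear transform, cyclic shifts."""
    return x0 ^ lin_enc(tau(x1 ^ x2 ^ x3 ^ rk, sbox))

def f2(k0, k1, k2, k3, ck, sbox):
    """Key expansion round function: same shape as f1, different shifts."""
    return k0 ^ lin_key(tau(k1 ^ k2 ^ k3 ^ ck, sbox))

def expand_key(mk, sbox):
    """Return the 32 round keys rk_i = k_{i+4}."""
    k = [m ^ f for m, f in zip(mk, FK)]
    for i in range(32):
        k.append(f2(k[i], k[i + 1], k[i + 2], k[i + 3], CK[i], sbox))
    return k[4:]

def crypt_block(x, round_keys, sbox):
    """Run 32 iterations of x[i+4] = F1(...) and output (x35, x34, x33, x32)."""
    x = list(x)
    for rk in round_keys:
        x.append(f1(x[-4], x[-3], x[-2], x[-1], rk, sbox))
    return (x[35], x[34], x[33], x[32])
```

Because encryption and decryption share one structure, decryption in this sketch is simply `crypt_block(ciphertext, round_keys[::-1], sbox)`, that is, the same function with the round keys in reverse order.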
To protect the confidentiality and security of information, an encryption algorithm hardware module is usually embedded in a microprocessor chip. Because the data bit width of the microprocessor is relatively small (32 bits, for example) and does not match the 128-bit block required by the SM4 algorithm, one of two situations arises when data is transferred between the microprocessor and the SM4 module. In the first, memory is used to buffer the data and pipeline processing starts only after buffering, which incurs a large storage overhead. In the second, the data is fed directly into the pipeline, in which case pipeline stages sit idle and resource utilization is low. An efficient encryption algorithm hardware architecture is therefore needed to solve the problems of data transfer and resource waste between the processor and the encryption module.
Disclosure of Invention
Purpose of the invention: the invention addresses the waste of hardware resources that arises when data transmitted continuously between a microprocessor and an SM4 encryption algorithm module is encrypted. It provides an SM4 encryption algorithm hardware architecture suitable for a microprocessor chip that can receive continuous data from the microprocessor for encryption, significantly reduces hardware resource consumption, and does not affect the encryption result.
Technical scheme: the invention discloses an SM4 encryption algorithm hardware architecture suitable for a microprocessor chip. The hardware architecture mainly comprises a control module, a round key calculation unit and four encryption operation units. The control module performs serial-parallel conversion and coordinates the encryption operation units so that large amounts of data transmitted by the microprocessor can be encrypted continuously. The specific architecture is shown in Fig. 1.
The main modules of the hardware architecture are:
S2P interface module: performs serial-to-parallel conversion. It receives the data to be encrypted from the microprocessor, combines several data words into one frame of initial key or plaintext, and sends the frame to the next-stage SM4_core module for the SM4 encryption algorithm operation.
P2S interface module: performs parallel-to-serial conversion. It receives ciphertext data from the SM4_core module, splits one frame of ciphertext into several data words that match the bit width of the microprocessor, and sends them to the microprocessor.
SM4_core module: the round key calculation unit receives the initial key and completes the key expansion operation, and the four encryption operation units receive multiple frames of plaintext data and complete the encryption operation.
Controller module: matches the bit width of the microprocessor interface to the data bit width of the SM4 algorithm module to realize serial/parallel conversion, and at the same time schedules the two types of calculation unit cores in the SM4_core module in a unified way, controlling the operation type and ensuring the efficiency of the encryption algorithm.
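The bit-width conversion done by the S2P and P2S interface modules can be sketched in Python as follows. This is only a behavioral model, not the hardware design: the function names are hypothetical, and the word order (first word received is the most significant part of the frame) is an assumption, since the document does not specify it.

```python
def s2p(words, width=32):
    """Serial-to-parallel: combine bus words into one frame.

    Assumption: the first word received becomes the most significant part.
    """
    mask = (1 << width) - 1
    frame = 0
    for w in words:
        frame = (frame << width) | (w & mask)
    return frame

def p2s(frame, width=32, frame_bits=128):
    """Parallel-to-serial: split one frame into frame_bits // width bus words,
    most significant word first (same ordering assumption as s2p)."""
    mask = (1 << width) - 1
    count = frame_bits // width
    return [(frame >> (width * (count - 1 - i))) & mask for i in range(count)]
```

For a 32-bit interface, `s2p` needs four words per 128-bit frame, which corresponds to the 1:4 ratio between the interface bit width and the SM4 data bit width; with `width=8` the same functions use sixteen words per frame.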
In the hardware architecture, the SM4_core module is the main innovation of the invention. It contains one round key calculation unit (key_core module) and four encryption operation units (enc_core modules). The SM4_core module first receives an initial key, computes the 32 round keys in the round key calculation unit and stores them in a register inside that unit; it then receives continuous plaintext data, which the four encryption operation units encrypt continuously. The specific architectures of the two kinds of calculation unit core are shown in Fig. 2 and Fig. 3. The original round key calculation and encryption processes each need 32 rounds of nonlinear iteration to produce a result. Here every two rounds of iteration are merged into one step and the pipeline structure is then folded, so that each calculation unit core in the SM4_core module contains only the hardware needed for two rounds of nonlinear iteration, and a complete 32-round iteration is finished after 16 passes through the core.
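The folding idea can be checked with a small Python model: running two rounds per step for 16 steps gives the same state as running one round per step for 32 steps. The round function `toy_f` below is a hypothetical stand-in (not the SM4 round function), chosen only so the sketch is self-contained; the equivalence holds for any round function of this shape.

```python
MASK32 = 0xFFFFFFFF

def rotl32(x, n):
    """Rotate a 32-bit word left by n bits."""
    return ((x << n) | (x >> (32 - n))) & MASK32

def toy_f(x0, x1, x2, x3, rk):
    """Hypothetical stand-in round function with the same inputs as F1."""
    return x0 ^ rotl32(x1 ^ x2 ^ x3 ^ rk, 13)

def rounds_32(state, round_keys, f):
    """Reference model: one round per step, 32 steps."""
    s = list(state)
    for rk in round_keys:
        s = [s[1], s[2], s[3], f(s[0], s[1], s[2], s[3], rk)]
    return s

def rounds_folded(state, round_keys, f):
    """Folded core model: each step evaluates two rounds back to back,
    so 32 rounds take 16 steps (one step per clock cycle in hardware)."""
    s = list(state)
    cycles = 0
    for i in range(0, 32, 2):
        a = f(s[0], s[1], s[2], s[3], round_keys[i])      # round i
        b = f(s[1], s[2], s[3], a, round_keys[i + 1])     # round i + 1
        s = [s[2], s[3], a, b]
        cycles += 1
    return s, cycles
```

In this model the core holds the logic for exactly two rounds and reuses it on every step, which is the resource saving the folded structure is aiming for.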
The implementation flow of the SM4 encryption algorithm hardware architecture is shown in Fig. 4 and comprises the following steps:
(1) The Controller module receives the configuration information, completes the initialization parameter configuration and determines the ratio between the microprocessor interface bit width and the data bit width of the SM4 algorithm module. This ratio is used for the serial-parallel conversion of subsequently received data and for controlling which calculation unit cores take part in the encryption calculation.
(2) The S2P module receives the initial key information from the microprocessor, combines it into one frame of initial key data and sends it to the SM4_core module.
(3) The SM4_core module receives the initial key data, and the Controller module directs the single round key calculation unit to iterate and obtain the 32 round keys.
(4) The S2P module receives plaintext information from the microprocessor, combines the continuous plaintext into multiple frames of plaintext data and sends the frames to the SM4_core module in the order received.
(5) The SM4_core module receives the plaintext data, and the Controller module schedules the four encryption operation units in a unified way. Each calculation unit core handles the iterative operation for one frame of plaintext, and the four cores work together to achieve continuous encryption.
(6) The P2S module receives ciphertext data from the SM4_core module, splits it into several data words matching the microprocessor interface bit width and sends them to the microprocessor in sequence.
Drawings
Fig. 1 is a hardware architecture of SM4 encryption algorithm suitable for a microprocessor chip according to the present invention.
Fig. 2 is a specific architecture of the SM4 algorithm round key calculation unit core provided by the present invention.
Fig. 3 is a specific architecture of the SM4 algorithm encryption arithmetic unit core provided in the present invention.
Fig. 4 is a flow chart of the SM4 encryption algorithm hardware architecture implementation of the present invention.
Detailed Description
In order to explain the technical solution disclosed in the present invention in detail, the following description is made with reference to specific examples.
In this example, the low-complexity SM4 encryption algorithm hardware architecture is embedded in a 32-bit microprocessor chip. Round key calculation is performed first, followed by continuous encryption, in the following steps:
Step 1: The Controller module receives the configuration information, determines that the ratio of the interface bit width to the SM4 algorithm data bit width is 1:4, generates the corresponding control signals and sends them to the other modules.
Step 2: The 32-bit microprocessor transmits 32 bits of data in each clock cycle. Over 4 clock cycles the S2P module receives four 32-bit words, combines them into 128-bit initial key data and sends it to the SM4_core module.
Step 3: The key_core module in the SM4_core module receives the initial key data, performs 16 clock cycles of iterative computation to obtain the 32 round keys (32 bits each), and stores them in an internal register.
Step 4: The 32-bit microprocessor transmits 32 bits of data in each clock cycle, so a series of 32-bit words arrives over successive clock cycles. Every 4 clock cycles the S2P module combines the four 32-bit words it has received into one 128-bit frame of plaintext and sends the frames in sequence to the SM4_core module.
Step 5: The 4 enc_core modules in the SM4_core module receive plaintext data in turn. Each encryption operation unit needs 16 clock cycles of iterative computation to produce one 128-bit ciphertext. By the time the 5th frame of plaintext is passed from the S2P module to the SM4_core module, the 1st enc_core module has finished encrypting the 1st frame, so it takes the 5th frame and encrypts it, and so on. The 4 enc_core modules can therefore always accept 128-bit plaintext from the S2P module, and the data transmitted by the 32-bit microprocessor is encrypted continuously.
Step 6: The P2S module receives 128-bit ciphertext from the SM4_core module every 4 clock cycles, splits it into four 32-bit words and sends them in sequence to the 32-bit microprocessor, completing the return of the encrypted data.
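The timing in Steps 4 to 6 can be checked with a small cycle-count model in Python. It is a sketch under stated assumptions, not the Controller's actual logic: frames are assigned to cores in round-robin order, one bus word arrives per clock cycle, and the function and field names are hypothetical.

```python
def simulate(num_frames, bus_width=32, num_cores=4, cycles_per_frame=16):
    """Return (frame, core, arrive, start, done) for each plaintext frame.

    arrive: cycle at which the S2P module has assembled the frame
    start:  cycle at which the assigned enc_core begins work on it
    done:   cycle at which the 128-bit ciphertext is available
    """
    arrival_interval = 128 // bus_width    # cycles needed to collect one frame
    free_at = [0] * num_cores              # cycle at which each core is idle again
    schedule = []
    for n in range(num_frames):
        arrive = (n + 1) * arrival_interval
        core = n % num_cores                 # round-robin assignment (assumed)
        start = max(arrive, free_at[core])   # stall only if the core is still busy
        free_at[core] = start + cycles_per_frame
        schedule.append((n + 1, core + 1, arrive, start, free_at[core]))
    return schedule

if __name__ == "__main__":
    for row in simulate(8):
        print("frame %d -> core %d: arrive %d, start %d, done %d" % row)
```

With four cores and a 32-bit bus, frame 1 finishes at cycle 20, which is exactly when frame 5 arrives for the same core, so no frame has to wait and one ciphertext is produced every 4 cycles. Running the same model with `num_cores=3` shows frames stalling, which is why four units are used for the 32-bit case.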
The invention is designed for the common application scenario of a 32-bit microprocessor chip, and the Controller module can automatically match the microprocessor interface bit width according to the configuration information. For microprocessor chips with lower bit widths, such as 16-bit and 8-bit chips, however, the four encryption operation units would spend time waiting for data during encryption, which wastes resources and lowers utilization. When the invention is embedded in a low-bit-width microprocessor chip, the number of encryption operation units can therefore be reduced as appropriate, so that once the interface bit width has been matched to the SM4 data bit width and the interface is sending continuous data, every encryption operation unit is kept busy. This further reduces the complexity of the SM4 encryption algorithm hardware architecture without affecting continuous data encryption.
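The relationship between interface bit width and the number of encryption operation units can be estimated with a short Python calculation. It assumes, as in the 32-bit example, that one bus word arrives per clock cycle and that each unit needs 16 cycles per frame; the function name is hypothetical and the document itself only says the number can be reduced "according to specific conditions".

```python
import math

def cores_needed(bus_width, frame_bits=128, cycles_per_frame=16):
    """Smallest number of encryption units that keeps up with a bus
    delivering one bus_width-bit word per clock cycle."""
    cycles_per_arrival = frame_bits // bus_width   # cycles to collect one frame
    return math.ceil(cycles_per_frame / cycles_per_arrival)

if __name__ == "__main__":
    for width in (32, 16, 8):
        print(width, "bit interface:", cores_needed(width), "unit(s)")
```

Under these assumptions a 32-bit interface needs four units, a 16-bit interface two, and an 8-bit interface one, since a new frame then arrives only every 16 cycles.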

Claims (6)

1. An SM4 encryption algorithm hardware architecture suitable for a microprocessor chip, mainly comprising a serial-parallel conversion module, a control module, a round key calculation unit and four encryption operation units, wherein the control module controls the serial-parallel conversion of the transmitted data and schedules the encryption operation units to work together, so that large amounts of data stored in the microprocessor chip can be encrypted continuously, hardware resource consumption is significantly reduced and encryption efficiency is improved.
2. The SM4 encryption algorithm hardware architecture suitable for microprocessor chips of claim 1, wherein the original key yields 32 round keys after 32 rounds of computation; the round key calculation unit merges the hardware resources of 2 rounds of computation into 1 round, and subsequent round key computation reuses the merged resources, so that a complete key expansion operation requires 16 rounds of computation in total, with the same computation resources reused in every round.
3. The SM4 encryption algorithm hardware architecture suitable for microprocessor chips of claim 1, wherein the encryption operation requires 32 rounds of computation to obtain the encryption result; like the round key calculation unit, the encryption operation unit merges the 32 rounds of computation into 16 rounds, with the same computation resources reused in every round, and the four encryption operation units cooperate so that multiple frames of data are encrypted simultaneously.
4. The SM4 encryption algorithm hardware architecture suitable for microprocessor chips of claim 1, wherein a serial-parallel conversion module is provided inside the hardware architecture for the microprocessor interface, so as to convert between the bit width of the microprocessor interface data and the bit width of the data required by the SM4 algorithm.
5. The SM4 encryption algorithm hardware architecture suitable for microprocessor chips of claim 1, wherein a dedicated control module is provided inside the hardware architecture, which adjusts the relevant control signals according to microprocessor chip interfaces of different data bit widths in order to control the serial-parallel conversion module and schedule the four encryption operation units, so that the microprocessor interface can continuously transmit multiple frames of data to be encrypted and the received data is encrypted continuously inside the encryption algorithm hardware architecture.
6. The SM4 encryption algorithm hardware architecture suitable for microprocessor chips of claim 1, wherein the round key calculation unit and the encryption operation unit inside the hardware architecture are both implemented with a non-pipelined structure, and a low-complexity SM4 encryption algorithm hardware architecture is achieved by appropriately scheduling the multi-core encryption operation units while maintaining the same encryption performance as a pipelined hardware structure.
CN202211679820.9A 2022-12-19 2022-12-19 SM4 encryption algorithm hardware architecture suitable for microprocessor chip Pending CN115883068A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211679820.9A CN115883068A (en) 2022-12-19 2022-12-19 SM4 encryption algorithm hardware architecture suitable for microprocessor chip

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211679820.9A CN115883068A (en) 2022-12-19 2022-12-19 SM4 encryption algorithm hardware architecture suitable for microprocessor chip

Publications (1)

Publication Number Publication Date
CN115883068A true CN115883068A (en) 2023-03-31

Family

ID=85755590

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211679820.9A Pending CN115883068A (en) 2022-12-19 2022-12-19 SM4 encryption algorithm hardware architecture suitable for microprocessor chip

Country Status (1)

Country Link
CN (1) CN115883068A (en)

Similar Documents

Publication Publication Date Title
US7688974B2 (en) Rijndael block cipher apparatus and encryption/decryption method thereof
US8625781B2 (en) Encrypton processor
CN101969376B (en) Self-adaptive encryption system and method with semantic security
CN110880967B (en) Method for parallel encryption and decryption of multiple messages by adopting packet symmetric key algorithm
US8385540B2 (en) Block cipher algorithm based encryption processing method
CN102035641A (en) Device and method for implementing AES encryption and decryption
CN103632104A (en) Parallel encryption and decryption method for dynamic data under large data environment
KR20090037366A (en) Aes encryption/decryption circuit
US11695542B2 (en) Technology for generating a keystream while combatting side-channel attacks
WO2012132621A1 (en) Encryption processing device, encryption processing method, and programme
CN114679252A (en) Resource sharing method for MACsec AES algorithm
CN103346878A (en) Secret communication method based on FPGA high-speed serial IO
CN109150495B (en) Round conversion multiplexing circuit and AES decryption circuit thereof
Hussain et al. Efficient video encryption using lightweight cryptography algorithm
Buell Modern symmetric ciphers—Des and Aes
CN115883068A (en) SM4 encryption algorithm hardware architecture suitable for microprocessor chip
CN111614457A (en) P replacement improvement-based lightweight packet encryption and decryption method, device and storage medium
CN108566271B (en) Multiplexing round conversion circuit, AES encryption circuit and encryption method thereof
CN112134691B (en) NLCS block cipher realization method, device and medium with repeatable components
CN101335741B (en) Acceleration method and apparatus for GHASH computation in authenticated encryption Galois counter mode
SK10382000A3 (en) Method for the cryptographic conversion of binary data blocks
CN103051443B (en) AES (Advanced Encryption Standard) key expansion method
CN109039608B (en) 8-bit AES circuit based on double S cores
CN109033023B (en) Ordinary round conversion operation unit, ordinary round conversion circuit and AES encryption circuit
Wang et al. Research on AES encryption algorithm based on timestamp in Wireless Sensor Networks

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination