CN100573540C - Design method of asynchronous block cipher algorithm coprocessor - Google Patents

Design method of asynchronous block cipher algorithm coprocessor

Info

Publication number
CN100573540C
Authority
CN
China
Application number
CN 200810143205
Other languages
Chinese (zh)
Other versions
CN101350038A (en
Inventor
任江春
葵 戴
勇 李
蕾 王
王志英
伟 石
童元满
坚 阮
陆洪毅
锐 龚
Original Assignee
中国人民解放军国防科学技术大学
Application filed by 中国人民解放军国防科学技术大学
Priority to CN 200810143205
Publication of CN101350038A
Application granted
Publication of CN100573540C

Abstract

The invention discloses a design method for an asynchronous block cipher algorithm coprocessor; the technical problem to be solved is to provide such a design method. The technical solution is as follows: treat each round of iteration in the block cipher algorithm as an independent submodule; design each submodule in an HDL and perform logic synthesis on it to obtain a static single-rail netlist; convert the static single-rail netlist into a composite logic netlist consisting only of complementary two-input AND and OR gates; perform delay matching on each submodule by adding a delay matching module whose delay equals that of the submodule, ensuring that the delay from every input signal to every output signal of each submodule is the same and that the inputs of any AND or OR gate arrive at the same time; connect the submodules in sequence to obtain the complete netlist; and perform back-end placement and routing to obtain the GDS layout. A coprocessor designed with this method has strong protection against power analysis attacks, together with high computational performance and low power consumption.

Description

Design method of an asynchronous block cipher algorithm coprocessor

Technical Field

The present invention relates to a design method for a microprocessor, and in particular to a design method for a cryptographic algorithm coprocessor.

Background Art

The security of a cryptographic algorithm has two aspects: the security of the algorithm in the mathematical sense, and the security of its implementation. Traditional cryptanalysis attacks the algorithm itself, for example through differential and linear cryptanalysis; conventional cryptanalysis and brute-force search are ineffective against the cryptographic algorithms in wide use today. A power analysis attack is an effective way to recover keys by exploiting weaknesses in a concrete implementation of a cryptographic algorithm, and it is the most serious security threat among side-channel attacks. Because there is a statistical correlation between the private key inside a security chip and the power consumed while the cryptographic algorithm runs, an attacker who collects a large number of power samples can derive the on-chip private key using statistical methods. Power analysis attacks fall into three classes: simple power analysis (SPA), differential power analysis (DPA), and high-order differential power analysis (HODPA). Power analysis attacks are cheap to mount and apply to almost all cryptographic algorithms, such as DES (Data Encryption Standard), AES (Advanced Encryption Standard), RSA (Rivest-Shamir-Adleman), and ECC (elliptic curve cryptography).

Security chips, typified by smart cards, are used very widely in many fields. Their main functions include secure data storage, data encryption and decryption, digital signature and authentication, and identity verification. These functions rely on modern cryptographic algorithms, including public-key, block cipher, and stream cipher algorithms. Public-key algorithms such as RSA and ECC are used mainly for digital signature and authentication to verify identity; block cipher algorithms such as DES and AES are used for data encryption and decryption; stream cipher algorithms such as RC4 (Rivest Cipher 4) are used mainly to encrypt and decrypt data streams. Cryptographic algorithm components of various types (software modules or hardware coprocessors) are indispensable parts of a security chip. Because attackers are driven by illicit profit, security chips operate in a hostile environment and are exposed to many kinds of attack. Power analysis attacks on the cryptographic components are an effective way to break a security chip; the literature reports that several different types of security chip have been broken with power analysis techniques. The cryptographic components in a security chip must therefore have effective protection against power analysis attacks.

Protection techniques can be divided into two classes according to their goal: those that eliminate the vulnerability that power analysis exploits in a concrete implementation, and those that make power analysis attacks harder. Eliminating the vulnerability means removing the correlation between the key and the power consumption; the most common technique is random masking. Random masking covers the intermediate results of the cryptographic computation with uniformly distributed random numbers, so that the masked intermediate results are themselves random with a probability distribution independent of the key; the power consumption then becomes independent of the key, which removes the vulnerability. From another angle, if the cost and time needed to mount a power analysis attack are so high that the attack is practically infeasible, resistance to power analysis can also be considered achieved in practice. Common techniques for raising the difficulty of an attack include the following. Randomization: inserting random redundant dummy operations during the computation, randomizing the operation flow, inserting random delays, and introducing random power noise. Constant-power techniques: making the power consumed by the security chip while executing the cryptographic algorithm almost constant, which greatly weakens the correlation between power and key, for example by using dynamic dual-rail logic cells with constant-power characteristics such as sense-amplifier-based logic (SABL). Power smoothing: keeping the chip's power consumption within a predetermined range during operation by adding extra circuitry that dynamically compensates the power of the whole chip.

Every protection technique achieves its protection at a cost, mainly in the following respects: 1) Reduced computational performance. Inserting redundant operations into the cryptographic computation inevitably lowers performance; in a constant-power cryptographic component based on dynamic dual-rail logic, only half of each clock cycle performs useful computation while the other half is needed for precharge, so performance drops by at least half. 2) Increased chip area. Dynamic compensation circuits for power smoothing, random power-noise generators, and dynamic dual-rail logic all inevitably increase chip area. 3) Increased power consumption. Power noise, constant-power, and power-smoothing techniques all raise power consumption substantially; for example, when a cryptographic algorithm is implemented in dynamic dual-rail logic, every logic cell uses the clock as its precharge signal, so the clock load rises sharply and clock-related power inevitably increases.

At present there is also research on implementing cryptographic coprocessors with asynchronous circuits to resist power analysis, mainly because asynchronous circuits also have a degree of constant-power behavior. This behavior comes mainly from dual-rail signal encoding and complementary circuit structures. However, compared with the constant-power logic cells mentioned above, such as sense-amplifier-based logic cells, the constant-power behavior of asynchronous circuits is relatively poor, so their protection is relatively weak. Compared with synchronous circuits, the most notable advantage of asynchronous circuits is low power consumption, and low-power design and implementation based on asynchronous circuits is a current research frontier in integrated circuits; asynchronous circuits, however, offer no advantage in computational performance or chip area. In addition, asynchronous circuits are difficult to implement and lack mature design-automation tools.

A block cipher is an algorithm for fast encryption and decryption of large volumes of data and a key technology for information system security. Typical block ciphers, including DES and AES, are multi-round iterative ciphers. Block cipher modules (implemented in software or hardware) are a necessary part of every kind of security chip, and their implementation must also have effective protection against power analysis attacks. To date there has been no public report of an asynchronous block cipher coprocessor, designed with asynchronous circuits, that has both strong protection against power analysis and high computational performance with low power consumption.

Summary of the Invention

The technical problem to be solved by the present invention is to provide, under the conditions of the existing technology, a design method for an asynchronous block cipher algorithm coprocessor. A coprocessor designed with this method has good constant-power characteristics, that is, strong protection against power analysis attacks, together with high computational performance and low power consumption.

To solve the above technical problem, the technical solution of the present invention is as follows: treat each round of iteration in the block cipher algorithm as an independent submodule; design each submodule in a hardware description language (HDL), with each submodule purely combinational; use an existing synthesis tool to perform logic synthesis on each submodule, obtaining a netlist that contains only inverters, two-input AND gates, and two-input OR gates; convert that netlist into one consisting only of complementary two-input AND and OR gates; perform delay matching on each submodule by adding a delay matching module whose delay equals that of the submodule, ensuring that the delay from any input signal to any output signal of each submodule is the same and that the two inputs of any two-input AND or OR gate in the circuit arrive at the same time; connect the submodules in sequence to obtain the complete netlist of the asynchronous block cipher coprocessor; and perform back-end placement and routing to obtain its GDS (Graphic Data System) layout. The specific technical solution is:

Step 1: partition the block cipher algorithm into submodules, treating each round of iteration as an independent submodule. A block cipher consists of several rounds; for example, DES consists of 16 rounds, and AES with a 128-bit key consists of 10 rounds. Each round is treated as an independent submodule whose main functions are the round transformation and the round key schedule. The submodules are denoted $S_1, S_2, \ldots, S_i, \ldots, S_n$ ($n \ge 1$), where $n$ is the number of rounds of the block cipher and $1 \le i \le n$. Let $M$ be the initial plaintext, $K$ the key, and $C$ the ciphertext; let $R_j$ denote the result of the $j$-th round transformation ($1 \le j \le n-1$) and $K_k$ the round key of the $k$-th round ($2 \le k \le n$). The connections between the submodules can be described as: $(R_1, K_2) = F_1(M, K)$, $(R_2, K_3) = F_2(R_1, K_2)$, ..., $(R_{n-1}, K_n) = F_{n-1}(R_{n-2}, K_{n-1})$, $C = F_n(R_{n-1}, K_n)$, where $F_i$ denotes the function of the $i$-th round.
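The chaining of round submodules described in this step can be sketched in Python. This is only an illustration under stated assumptions: `toy_round`, `rotl16`, the 16-bit block width, and the rotate-and-XOR arithmetic are hypothetical stand-ins, not the DES or AES round functions and not part of the patent. The sketch shows only the data flow, in which each $F_i$ consumes a state and a round key and produces the next state and the next round key.

```python
from typing import Callable, List, Tuple

# A round submodule S_i maps (state, round key) to (next state, next round key).
Round = Callable[[int, int], Tuple[int, int]]

MASK = 0xFFFF  # assumption: toy 16-bit block and key, far smaller than DES or AES


def rotl16(x: int, s: int) -> int:
    """Rotate a 16-bit value left by s bits (1 <= s <= 15)."""
    return ((x << s) | (x >> (16 - s))) & MASK


def toy_round(i: int) -> Round:
    """Hypothetical F_i: a round transformation plus one key-schedule step."""
    def f(r: int, k: int) -> Tuple[int, int]:
        r_next = rotl16(r ^ k, 3)   # stand-in for the round transformation
        k_next = rotl16(k, 1) ^ i   # stand-in for the round key schedule
        return r_next, k_next
    return f


def encrypt(m: int, k: int, rounds: List[Round]) -> int:
    """Chain S_1..S_n: (R_1, K_2) = F_1(M, K), ..., C = F_n(R_{n-1}, K_n)."""
    state, key = m, k
    for f in rounds:
        state, key = f(state, key)
    return state  # the key output of the last round is simply unused


ROUNDS = [toy_round(i) for i in range(1, 5)]  # n = 4 rounds for illustration
```

Because every round has the same signature, each one can be designed, synthesized, and delay-matched on its own before the rounds are connected in sequence.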

Step 2: submodule design. Perform the following steps in turn for each submodule $S_i$ ($1 \le i \le n$):

2.1 Design the submodule in a hardware description language (HDL) such as VHDL or Verilog, that is, describe its function. All arithmetic and logic operations of the submodule are implemented purely as combinational circuits, with no sequential circuits. The result is the HDL code of the submodule.

2.2 Use an existing synthesis tool to perform logic synthesis on the HDL code of each submodule, using only three standard cells (the inverter, the two-input AND gate, and the two-input OR gate), to obtain the logic netlist of the submodule. All logic cells in this netlist are static single-rail cells, so it is called the static single-rail netlist. Because $\{\neg, \wedge, \vee\}$ (logical NOT, AND, and OR) is a functionally complete set of connectives in Boolean algebra and can realize any Boolean function, inverters, two-input AND gates, and two-input OR gates suffice to implement any logic operation. This step introduces no special standard cells and no additional constraints, so mature commercial synthesis tools such as Synopsys Design Compiler can be used.
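As a small illustration of the completeness argument, the following Python sketch builds XOR and a 2:1 multiplexer from NOT, AND, and OR only. The function names are hypothetical and chosen for this example; a synthesis tool restricted to the three cells would derive equivalent structures automatically.

```python
from itertools import product


def inv(a: int) -> int:
    """Inverter on a 0/1 value."""
    return 1 - a


def and2(a: int, b: int) -> int:
    """Two-input AND gate."""
    return a & b


def or2(a: int, b: int) -> int:
    """Two-input OR gate."""
    return a | b


def xor2(a: int, b: int) -> int:
    """XOR expressed as (a AND NOT b) OR (NOT a AND b)."""
    return or2(and2(a, inv(b)), and2(inv(a), b))


def mux2(sel: int, a: int, b: int) -> int:
    """2:1 multiplexer: returns a when sel is 0 and b when sel is 1."""
    return or2(and2(inv(sel), a), and2(sel, b))


def truth_table(fn, arity: int):
    """List the outputs of fn over all 0/1 input combinations."""
    return [fn(*bits) for bits in product((0, 1), repeat=arity)]
```

XOR matters here because the round functions of DES and AES mix the round key into the state with XOR, so it appears throughout the synthesized netlist.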

2.3 Convert the static single-rail netlist into a composite logic netlist consisting only of complementary two-input AND and OR gates. A pair of complementary two-input AND and OR gates forms a composite logic cell, and the converted netlist is called the composite logic netlist. The conversion proceeds as follows:

2.3.1 For every signal in the static single-rail netlist (input signals, output signals, and internal interconnect signals), add the corresponding inverted signal, so that all signals are dual-rail encoded. If $w$ is any signal in the static single-rail netlist, add its inverse $\overline{w}$.

2.3.2 Delete all inverters from the static single-rail netlist. Because step 2.3.1 added the inverse of every signal, the composite logic netlist needs no inverters. Suppose an inverter in the static single-rail netlist is INV u1(a, z), where INV denotes an inverter, u1 is the instance name, a is the input signal, and z is the output signal, that is, z is the inverse of a. Delete inverter u1 from the netlist and replace signal z with $\overline{a}$ (the inverse of a).

2.3.3 For every two-input AND gate in the static single-rail netlist, add the complementary OR gate. Suppose a two-input AND gate in the static single-rail netlist is AND2 u2(x1, x2, y1), where AND2 denotes a two-input AND gate, u2 is the instance name, x1 and x2 are the input signals, and y1 is the output signal, that is, $y1 = x1 \wedge x2$. Add the two-input OR gate complementary to u2, OR2 $\overline{u2}$($\overline{x1}$, $\overline{x2}$, $\overline{y1}$), where OR2 denotes a two-input OR gate, $\overline{u2}$ is the instance name, $\overline{x1}$ and $\overline{x2}$ are the input signals, and $\overline{y1}$ is the output signal, that is, $\overline{y1} = \overline{x1} \vee \overline{x2}$.

2.3.4 For every two-input OR gate in the static single-rail netlist, add the complementary AND gate. Suppose a two-input OR gate in the static single-rail netlist is OR2 u3(x3, x4, y2), that is, $y2 = x3 \vee x4$. Add the two-input AND gate complementary to u3, AND2 $\overline{u3}$($\overline{x3}$, $\overline{x4}$, $\overline{y2}$), that is, $\overline{y2} = \overline{x3} \wedge \overline{x4}$.

2.4 Add a delay matching module whose delay equals that of the submodule and perform delay matching, ensuring that the delay from any input signal to any output signal of each submodule is the same and that the two inputs of any two-input AND or OR gate in the circuit arrive at the same time. The specific method is:

2.4.1 Add a delay matching module with the same delay as the submodule. The delay matching module consists of buffer cells BUF connected in series. A BUF is likewise a composite logic cell made of a complementary two-input AND gate and OR gate, and the number of BUF stages equals the number of logic-cell levels on the critical path of the submodule. In the delay matching module, both inputs of the AND gate in a BUF are $e$ and both inputs of the OR gate are $\overline{e}$; the dual-rail signal $(e, \overline{e})$ is called the operation trigger control signal. When a valid operation is to be performed, the trigger signal is set to (1, 0); when no valid operation is performed, it is set to (0, 0). The trigger signal propagates stage by stage through the delay matching module. When the trigger signal (1, 0) has propagated from the input of the delay matching module to its output, that is, when the output of the delay matching module is (1, 0), the corresponding submodule has also completed its valid logic operation and its output is the correct result.
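The behaviour of the delay matching module can be sketched with a unit-delay model in Python. The model assumes that every BUF stage takes exactly one gate delay, which is an idealisation; the names `buf`, `advance`, and `completion_time` are hypothetical and introduced only for this example.

```python
from typing import List, Tuple

Rail = Tuple[int, int]  # dual-rail pair (e, e_bar)

VALID: Rail = (1, 0)  # a valid operation is in flight
IDLE: Rail = (0, 0)   # no valid operation


def buf(sig: Rail) -> Rail:
    """BUF cell: an AND gate fed e twice and an OR gate fed e_bar twice."""
    e, e_bar = sig
    return (e & e, e_bar | e_bar)


def advance(stages: List[Rail], trigger: Rail) -> List[Rail]:
    """One unit gate delay: each BUF outputs what its predecessor held."""
    return [buf(trigger)] + [buf(s) for s in stages[:-1]]


def completion_time(levels: int) -> int:
    """Gate delays until the trigger (1, 0) reaches the end of the chain."""
    if levels < 1:
        raise ValueError("the chain needs at least one BUF stage")
    stages = [IDLE] * levels
    t = 0
    while stages[-1] != VALID:
        stages = advance(stages, VALID)
        t += 1
    return t
```

In this model a chain with as many stages as the submodule's critical path has levels reports completion after exactly that many gate delays, which is how the chain output can stand in for a completion-detection circuit.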

2.4.2 Add buffer cells BUF to output signals that are not on the critical path of the submodule, so that the delay from the input signals to every output signal is the same. Let the output signals of the submodule be $O_1, O_2, \ldots, O_m$ ($m \ge 1$), and let the number of logic-cell levels on the critical path from the inputs to each output be $H_1, H_2, \ldots, H_m$, with $H$ the largest of these. Add $(H - H_p)$ series-connected buffer stages to each output signal $O_p$ ($1 \le p \le m$). In each inserted buffer cell, one input pair is the buffered signal and the other comes from the output of the previous-stage buffer cell in the delay matching module. The critical paths from all inputs to the buffered outputs then contain the same number of logic-cell levels, and their maximum delays are equal.
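The padding rule for outputs reduces to simple arithmetic, sketched below in Python. The output names and level counts in `EXAMPLE_LEVELS` are hypothetical values chosen for illustration, not figures from the patent.

```python
from typing import Dict


def output_padding(levels: Dict[str, int]) -> Dict[str, int]:
    """Return the number of BUF stages (H - H_p) to append to each output."""
    if not levels:
        raise ValueError("at least one output signal is required")
    h_max = max(levels.values())  # H, the deepest critical path
    return {name: h_max - h_p for name, h_p in levels.items()}


# Hypothetical critical-path depths H_p for three outputs of one submodule.
EXAMPLE_LEVELS = {"O1": 7, "O2": 4, "O3": 6}

# O1 is already on the critical path; O2 needs 3 stages and O3 needs 1.
EXAMPLE_PADDING = output_padding(EXAMPLE_LEVELS)
```

After padding, every output sits exactly $H$ levels from the inputs, so all outputs of the submodule become valid together.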

2.4.3 Add buffer cells BUF to input signals that are not on the critical path of the submodule, so that the two inputs of every logic cell arrive at the same time. The logic computing any output signal of the submodule (equivalent to a logic expression) can be represented as a binary tree: each node represents a signal in the netlist, and its children are either empty (the signal is an input of the submodule) or the two inputs of the corresponding logic cell; the height of the tree is the number of logic-cell levels from the input signals to the output signal. Starting from the root (the output signal), traverse the tree in level order, ensuring that signals on the same level arrive at the same time, so that logic cells on the same level perform their valid operations synchronously. Let the tree for output signal $o$ have height $h$. The subtrees of the level-2 nodes (the two children $o_1$ and $o_2$ of $o$) should have height $h-1$; if $o_1$ or $o_2$ is an input signal of the submodule, add $h-2$ series-connected buffer stages to it directly. In general, the subtree of any node $x$ on level $d$ ($1 < d \le h$) should have height $h-d+1$, and if $x$ is an input signal, add $h-d$ series-connected buffer stages to it. Continue down to the nodes on level $h-1$: if a node on that level is an input signal, add one buffer stage to it.
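The level-order rule of step 2.4.3 can be sketched in Python as follows. The tree encoding (nested tuples with input names as strings) and the names `height` and `buffers_needed` are assumptions made for this example; a real flow would operate on the composite logic netlist rather than on an expression tree.

```python
from collections import deque
from typing import List, Tuple, Union

# A node is either an input signal name or a gate tuple (kind, left, right).
Node = Union[str, tuple]


def height(node: Node) -> int:
    """Number of levels in the tree; a bare input signal has height 1."""
    if isinstance(node, str):
        return 1
    return 1 + max(height(child) for child in node[1:])


def buffers_needed(root: Node) -> List[Tuple[str, int, int]]:
    """Level-order walk listing (input name, level d, h - d) where h - d > 0."""
    h = height(root)
    plan = []
    queue = deque([(root, 1)])  # the root (output signal) is on level 1
    while queue:
        node, d = queue.popleft()
        if isinstance(node, str):
            if h - d > 0:  # an input above the deepest level needs padding
                plan.append((node, d, h - d))
        else:
            for child in node[1:]:
                queue.append((child, d + 1))
    return plan


# Hypothetical expression o = ((a AND b) AND c) OR d, which has height 4.
EXAMPLE_TREE = ("OR2", ("AND2", ("AND2", "a", "b"), "c"), "d")

# Input d sits on level 2 and needs 2 stages; c sits on level 3 and needs 1.
EXAMPLE_PLAN = buffers_needed(EXAMPLE_TREE)
```

Inputs a and b already sit on the deepest level, so they need no buffers; once c and d are padded, every path from an input to the output passes through the same number of cells.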

Step 3: integrate the delay-matched submodules, that is, connect $S_1, S_2, \ldots, S_n$ in sequence. Submodule $S_1$ receives the initial inputs, namely the plaintext $M$ and the key $K$; $S_i$ ($2 \le i \le n-1$) receives the output produced by $S_{i-1}$, and its result is the input of $S_{i+1}$; $S_n$ produces the final result, the ciphertext $C$. The delay matching modules of the submodules are connected in sequence at the same time, so the number of logic-cell levels from $S_1$ to $S_n$ equals the number of logic levels of all the series-connected delay matching cells; delay matching is thus achieved for the whole coprocessor. Once all submodules are integrated, the complete netlist of the asynchronous block cipher coprocessor is obtained.

Step 4: perform back-end placement and routing to obtain the GDS layout of the asynchronous block cipher coprocessor. In back-end design, all dual-rail signals must have the same load so that the complementary two-input AND and OR gates have input-independent constant-power behavior. Under current integrated-circuit processes, parasitics caused by interconnect dominate in the whole circuit; as long as the interconnect loads of a dual-rail signal pair are equal, the loads of the pair are equal. All complementary two-input AND and OR gates are therefore placed with axial symmetry, giving the dual-rail signals symmetric routing and equal interconnect length, so that the loads of the two rails are almost identical.

In essence, an asynchronous block cipher coprocessor designed by the above steps is a combinational circuit. Unlike a conventional combinational circuit, however, each set of data passes stage by stage (in pipelined fashion) through the levels of logic cells, and the next set of data can be input without waiting for the current set to finish processing: as soon as the first set of inputs has passed from the first level of logic cells to the next level, the second set can be input; and so on, so that when the first set has been fully processed and its result is available, the second set has reached the second-to-last level of logic cells. In other words, the coprocessor can be viewed as a pipeline that works as follows. When all input signals, including the operation trigger control signal, are 0, the logic cells in the coprocessor enter a non-switching state level by level and consume no dynamic power. When the trigger signal is set to (1, 0) and valid data is input, the data propagates level by level through the logic cells; when the output of the delay matching module corresponding to submodule $S_n$ becomes (1, 0), the output of $S_n$ is the valid result. Provided the minimum time interval is respected, multiple sets of data can be input in succession, and the coprocessor can process several sets of data at the same time; the minimum interval between adjacent inputs is determined by the maximum delay of a single logic cell in the coprocessor. To achieve better constant-power behavior, all-zero inputs and valid input data enter the coprocessor alternately.

The coprocessor is equivalent to a multi-level interconnection network: each level of logic cells accepts only the outputs of the previous level; only local interconnect between adjacent levels exists, with no interconnect across levels; and logic cells on the same level perform their valid operations synchronously. At the macro level each submodule is one pipeline stage; at the micro level the logic cells on one level form one pipeline stage. As long as the interval between adjacent inputs is no less than the delay of the most heavily loaded logic cell in the pipeline (denoted $\Delta t$), signals belonging to different inputs do not race through one another in the pipeline, which guarantees that the logic cells on one level process, at any moment, only one set of data that was input together. From this point of view, the delay of the most heavily loaded logic cell determines the highest equivalent clock frequency the pipeline can reach, $1/\Delta t$, the reciprocal of the maximum delay. In practice the maximum delay of a single logic cell is far smaller than the delay of one round, and because the coprocessor contains no registers or latches it also avoids the delay they introduce. Compared with a block cipher coprocessor implemented conventionally (as a synchronous or asynchronous circuit with one round per pipeline stage), an asynchronous block cipher coprocessor designed according to the invention can reach a far higher maximum equivalent clock frequency. When the coprocessor runs at full load, the maximum encryption and decryption throughput it can reach is $1/\Delta t$.
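The pipelined behaviour described above can be illustrated with an idealised Python model. It assumes that every level takes exactly one $\Delta t$, which is the intended outcome of delay matching rather than a measured property; `wave_pipeline` and the token labels are hypothetical names for this sketch, and `None` stands for the all-zero input.

```python
from typing import List, Optional, Sequence, Tuple


def wave_pipeline(tokens: Sequence[Optional[str]],
                  levels: int) -> List[Tuple[int, str]]:
    """Return (arrival time, token) pairs, with time in units of delta_t.

    None models the all-zero input. Every level is assumed to take exactly
    one delta_t, so a data set advances one level per time step.
    """
    if levels < 1:
        raise ValueError("the pipeline needs at least one level")
    stages: List[Optional[str]] = [None] * levels
    arrivals = []
    stream = list(tokens) + [None] * levels  # trailing zeros flush the pipeline
    for t, tok in enumerate(stream, start=1):
        stages = [tok] + stages[:-1]         # every level hands its data onward
        if stages[-1] is not None:
            arrivals.append((t, stages[-1]))
    return arrivals


# Three data sets fed back to back into a hypothetical 4-level pipeline.
BACK_TO_BACK = wave_pipeline(["A", "B", "C"], levels=4)

# The same data sets interleaved with all-zero inputs.
WITH_SPACERS = wave_pipeline(["A", None, "B", None, "C"], levels=4)
```

In this model, back-to-back inputs yield one result per $\Delta t$ after an initial latency equal to the number of levels, and several data sets are in flight at once; interleaving all-zero inputs spaces the valid results two delays apart.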

The present invention achieves the following technical effects:

1. In the submodule design of step 2, the asynchronous block cipher coprocessor is implemented using only complementary two-input AND and OR gates. When all-zero inputs and valid data are applied alternately to a complementary AND/OR pair, the composite logic formed by the pair has good constant-power behavior. A block cipher coprocessor built from complementary two-input AND and OR gates therefore also has good constant-power behavior: the correlation between the key and the coprocessor's power consumption approaches 0, and the coprocessor has strong protection against power analysis attacks.
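The logic-level basis of this effect can be checked with a short Python sketch of the conversion in step 2.3. The netlist encoding, the `_n` suffix for the complementary rail, and the names `rail`, `to_dual_rail`, and `evaluate` are assumptions made for this example. The sketch only counts nets at logic 1; it does not model wire loads or timing, which the method addresses through delay matching and symmetric layout.

```python
from itertools import product
from typing import Dict, List, Tuple

# Single-rail gate: (kind, input..., output), listed in topological order.
Gate = Tuple[str, ...]


def rail(sig: str, positive: bool, alias: Dict[str, str]) -> str:
    """Name the requested rail of sig, looking through deleted inverters."""
    while sig in alias:  # z = NOT a, so use the opposite rail of a
        sig, positive = alias[sig], not positive
    return sig if positive else sig + "_n"


def to_dual_rail(netlist: List[Gate]):
    """Drop inverters and pair every AND2/OR2 with its complementary gate."""
    alias: Dict[str, str] = {}
    gates = []
    for kind, *ins, out in netlist:
        if kind == "INV":
            alias[out] = ins[0]  # the inverter output becomes an alias
            continue
        dual = "OR2" if kind == "AND2" else "AND2"
        gates.append((kind, [rail(s, True, alias) for s in ins], out))
        gates.append((dual, [rail(s, False, alias) for s in ins], out + "_n"))
    return gates


def evaluate(gates, primary: Dict[str, int], valid: bool = True) -> Dict[str, int]:
    """Evaluate every net; valid=False applies the all-zero input instead."""
    nets = {}
    for name, bit in primary.items():
        nets[name] = bit if valid else 0
        nets[name + "_n"] = (1 - bit) if valid else 0
    for kind, ins, out in gates:
        a, b = (nets[i] for i in ins)
        nets[out] = (a & b) if kind == "AND2" else (a | b)
    return nets


# Hypothetical single-rail netlist for y = a XOR b.
XOR_NETLIST = [
    ("INV", "a", "na"),
    ("INV", "b", "nb"),
    ("AND2", "a", "nb", "t1"),
    ("AND2", "na", "b", "t2"),
    ("OR2", "t1", "t2", "y"),
]
DUAL = to_dual_rail(XOR_NETLIST)

# Number of nets at logic 1 for each valid input (a, b).
HIGH_COUNTS = [sum(evaluate(DUAL, {"a": a, "b": b}).values())
               for a, b in product((0, 1), repeat=2)]
```

For every valid input exactly one rail of each signal pair is high, and for the all-zero input every net is low, so alternating the two makes the same number of nets rise and fall regardless of the data values.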

2. 在第二步子模块设计时为各子模块进行延时匹配,使得同一层次的逻辑单元同步执行有效运算,同一层次的逻辑单元构成了流水线的一段;只要满足一定的时间间隔, 就可以连续向协处理器输入多组数据;且相邻输入数据的最小时间间隔取决于协处理器中单个逻辑单元的最大延时,由于单个逻辑单元的最大延时远远小于一轮迭代的延时, 协处理器不包括寄存器和锁存器,也就避免了由寄存器和锁存器引起的延时;采用本发明设计的异步分组密码算法协处理器的最高等价时钟频率能够远远高于采用常规实现方式(包括同步电路和异步电路, 一轮迭代作为流水线的一段)设计的协处理器,运算性能大大提高。 2. In the second sub-module designed for the sub-module delay matching, so that the same logic level synchronous execution unit effective operation, the same level of a logical unit constituting the pipeline stage; as long as a certain time interval, can be continuously RA is input to the processor a plurality of sets of data; and the minimum time interval of the adjacent input data depends on the maximum delay of a single logic unit coprocessor, since the maximum delay is much less than a single logical unit delay of one iteration, does not include the coprocessor register and latch, thus avoiding the delays caused by the register and latch; asynchronous block cipher algorithm according to the present invention, coprocessor designed maximum clock frequency can be much higher than the equivalent employed conventional implementations (including synchronous and asynchronous circuits, as an iterative pipeline stage) design of the coprocessor arithmetic performance greatly improved.

3. 采用本发明设计的异步分组密码算法协处理器无时钟信号和寄存器等时序逻辑, 无长互联线;协处理器不工作时,所有输入信号保持为0,协处理器中无信号翻转,仅存在静态功耗,因此采用本发明设计的异步分组密码算法协处理器具有低功耗特性。 3. The asynchronous packet cipher algorithm of the present invention designed coprocessor clock signal and a non-sequential logic registers and the like, no long interconnect lines; the coprocessor does not work, all input signal remains 0, no signal is inverted coprocessor, there is only the static power consumption, the present invention is designed asynchronous block cipher coprocessor having low power consumption.

4. 采用本发明不需要设计任何特殊的逻辑单元,可直接使用现有的标准单元;最大限度的使用现有的成熟集成电路设计方法,包括HDL设计、逻辑综合以及后端设计等; 采用延时匹配(即在各子模块的复合逻辑网表中插入必要的缓冲单元)的方法使异步流水线正确,无须设计常规的异步电路中的握手电路(握手电路必须检测电路中的组合逻辑单元是否完成运算进而到达稳态,并向前一级发出应答信号,向后一级发出请求信号); 因此本发明充分利用了现有技术和工具,简单易行。 4. The design of the present invention does not require any special logic unit, may be used as a conventional standard cell; maximize the use of existing mature integrated circuit design method, comprising HDL design, logic synthesis design, and a rear end; Sustain the method of matching (i.e., insert the necessary buffer unit in composite logical netlist each sub-module) causes the asynchronous pipeline correctly, without the design of conventional asynchronous circuit handshake circuit (handshake circuit must detect a combinational logic unit circuit is completed Further reach steady state operation, and a response signal sent forward, backward request signal a); thus full advantage of the present invention and the prior art tools, easy.

本发明适用于可能受到功耗攻击的各种安全芯片中分组密码算法协处理器的设计与实现,可以达到抗功耗攻击防护能力和运算性能及功耗的良好折衷。 Design and implementation of various security chip of the present invention may be applied to power challenged block cipher coprocessor, can achieve a good trade-off protection and power attacks anti-computing performance and power consumption.
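The constant-power argument of the first technical effect can be checked with a small logic-level sketch. It assumes that a valid bit x is presented on two rails as (x, NOT x), that the idle state is all zeros, and that the number of gate outputs that rise is a crude stand-in for dynamic power. The function name `rising_outputs` is illustrative only and does not appear in the patent.

```python
from itertools import product


def rising_outputs(x1: int, x2: int) -> int:
    """Count the outputs of a complementary AND/OR pair that rise from the
    all-zero idle state when the valid dual-rail inputs arrive."""
    nx1, nx2 = 1 - x1, 1 - x2   # complement rails of the valid inputs
    y = x1 & x2                 # two-input AND gate on the true rails
    ny = nx1 | nx2              # complementary two-input OR gate
    return y + ny               # both outputs were 0 while idle


# Number of rising outputs for every possible pair of input bits.
counts = {bits: rising_outputs(*bits) for bits in product((0, 1), repeat=2)}
```

Under these assumptions every entry of `counts` is 1: exactly one of the two gate outputs rises whatever the data, which is the data-independence the text relies on. Real power behaviour also depends on load capacitance and routing, which this sketch ignores.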

Brief Description of the Drawings

Figure 1 is the overall flowchart for designing an asynchronous block cipher coprocessor with the invention.
Figure 2 is a schematic of the netlist conversion performed in the third sub-step of the second step.
Figure 3 is a schematic of the delay matching performed in the fourth sub-step of the second step.
Figure 4 is the overall structure of an asynchronous block cipher coprocessor designed with the invention.
Figure 5 is a schematic of how the asynchronous block cipher coprocessor operates.

Detailed Description

Figure 1 is the design flowchart for an asynchronous block cipher coprocessor according to the invention. It mainly comprises the following steps:

1. Partition the block cipher algorithm into sub-modules.

2. Design each sub-module, as follows:

2.1 HDL design, yielding the HDL code of the sub-module.

2.2 Logic synthesis of the sub-module's HDL code, yielding the sub-module's static single-rail netlist.

2.3 Conversion of the static single-rail netlist into a netlist consisting only of complementary two-input AND and OR gates, yielding the composite logic netlist.

2.4 Addition of a delay-matching module whose delay equals that of the sub-module, ensuring that the delay from any input signal of the sub-module to its output signals is the same and that the two inputs of every two-input AND or OR gate in the circuit arrive at the same time.

3. Integrate the delay-matched netlists of the sub-modules, yielding the complete netlist of the asynchronous block cipher coprocessor.

4. Back-end placement and routing, yielding the GDS layout of the asynchronous block cipher coprocessor.

Figure 2 is a schematic of the netlist conversion performed in the third sub-step of the second step (step 2.3). Signals $i_1, i_2, \ldots, i_8$ are inputs, $o_1$ and $o_2$ are outputs, and $n_1, n_2, \ldots, n_5$ are internal interconnect signals. The left of the arrow in Figure 2 shows the logic circuit before conversion (the static single-rail netlist), whose output expressions are

$$o_1 = ((i_1 \wedge i_2) \wedge (i_3 \vee i_4)) \vee (i_5 \wedge i_6), \qquad o_2 = \overline{(i_7 \wedge i_8)}.$$

The right of the arrow shows the logic circuit after conversion, whose output expressions are

$$\overline{o_1} = ((\overline{i_1} \vee \overline{i_2}) \vee (\overline{i_3} \wedge \overline{i_4})) \wedge (\overline{i_5} \vee \overline{i_6}), \qquad o_2 = (\overline{i_7} \vee \overline{i_8}), \qquad \overline{o_2} = (i_7 \wedge i_8).$$

The conversion proceeds as follows:

1. Add the corresponding complementary signal for every signal; these are the overlined signals on the right of the arrow in Figure 2.

2. Delete all inverters in the static single-rail netlist and replace each inverter's output signal with the complement of that inverter's input signal. As shown on the right of the arrow in Figure 2, inverter 6 on the left of the arrow is deleted and $o_2$ is replaced with the complement of the inverter's input signal.

3. For every two-input AND gate in the static single-rail netlist, add a complementary OR gate. As shown on the right of the arrow in Figure 2, the first AND gate 1, third AND gate 3, fourth AND gate 4 and fifth AND gate 5 on the left of the arrow are given the complementary eighth OR gate 8, tenth OR gate 10, eleventh OR gate 11 and twelfth OR gate 12, respectively.

4. For every two-input OR gate in the static single-rail netlist, add a complementary AND gate. As shown on the right of the arrow in Figure 2, the second OR gate 2 and seventh OR gate 7 on the left of the arrow are given the complementary ninth AND gate 9 and thirteenth AND gate 13, respectively.
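The four conversion steps above can be sketched in Python as follows. This is a minimal illustration and not the patent's tooling. The tuple format `(kind, name, inputs, output)`, the `~` prefix for the complement rail, the `_c` suffix for added gates, and all function names are assumptions made for this sketch. The example wiring is one netlist consistent with the stated expressions for $o_1$ and $o_2$; the assignment of the internal names `n1` to `n5` to particular gates is likewise assumed.

```python
from itertools import product


def bar(sig: str) -> str:
    """Name of the complement rail: '~x' for 'x', and 'x' for '~x'."""
    return sig[1:] if sig.startswith("~") else "~" + sig


def to_dual_rail(netlist):
    """Convert a single-rail netlist of INV/AND2/OR2 gates into a netlist of
    complementary AND2/OR2 pairs. Returns (gates, alias), where alias maps
    each deleted inverter output to the signal that replaces it."""
    # Step 2: an inverter output z is replaced by the complement of its input a.
    alias = {out: bar(ins[0]) for kind, _, ins, out in netlist if kind == "INV"}

    def resolve(sig):
        # Follow aliases on either rail; this also handles chained inverters.
        neg = sig.startswith("~")
        base = sig[1:] if neg else sig
        if base in alias:
            target = resolve(alias[base])
            return bar(target) if neg else target
        return sig

    dual = {"AND2": "OR2", "OR2": "AND2"}
    gates = []
    for kind, name, ins, out in netlist:
        if kind == "INV":
            continue  # inverters are deleted
        ins = tuple(resolve(s) for s in ins)
        gates.append((kind, name, ins, out))
        # Steps 1, 3 and 4: add the complementary gate on the complement rails.
        gates.append((dual[kind], name + "_c", tuple(bar(s) for s in ins), bar(out)))
    return gates, {z: resolve(z) for z in alias}


def evaluate(gates, values):
    """Evaluate AND2/OR2 gates in list order (assumed to be topological)."""
    values = dict(values)
    for kind, _, (a, b), out in gates:
        values[out] = values[a] & values[b] if kind == "AND2" else values[a] | values[b]
    return values


# Assumed wiring for the left-hand circuit of Figure 2.
single_rail = [
    ("AND2", "g1", ("i1", "i2"), "n1"),
    ("OR2", "g2", ("i3", "i4"), "n2"),
    ("AND2", "g3", ("n1", "n2"), "n3"),
    ("AND2", "g4", ("i5", "i6"), "n4"),
    ("AND2", "g5", ("i7", "i8"), "n5"),
    ("INV", "g6", ("n5",), "o2"),
    ("OR2", "g7", ("n3", "n4"), "o1"),
]
dual_rail, alias = to_dual_rail(single_rail)


def matches_reference(gates, alias):
    """Exhaustively compare the converted netlist with the single-rail expressions."""
    for bits in product((0, 1), repeat=8):
        vals = {}
        for k, b in enumerate(bits, start=1):
            vals[f"i{k}"], vals[f"~i{k}"] = b, 1 - b
        out = evaluate(gates, vals)
        i1, i2, i3, i4, i5, i6, i7, i8 = bits
        o1 = ((i1 & i2) & (i3 | i4)) | (i5 & i6)
        if (out["o1"], out["~o1"], out[alias["o2"]]) != (o1, 1 - o1, 1 - (i7 & i8)):
            return False
    return True
```

In this sketch the six non-inverter gates become twelve gates, and `alias` records that `o2` is now carried by the complement rail of the inverter's input. `matches_reference(dual_rail, alias)` checks all 256 input combinations against the single-rail expressions.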

Figure 3 is a schematic of the delay matching performed in the fourth sub-step of the second step (step 2.4). Delay matching mainly comprises the following steps:

1. Add a delay-matching module whose delay equals that of the sub-module. In the circuit on the left of the arrow in Figure 3, the critical path is the data path from input $i_1$ (or $i_2$) to $o_1$, which contains 3 levels of logic cells. The delay-matching module for this circuit therefore contains 3 serially connected buffer cells (BUF). As shown on the right of the arrow in Figure 3, the fourteenth AND gate 14 with the fifteenth OR gate 15, the sixteenth AND gate 16 with the seventeenth OR gate 17, and the eighteenth AND gate 18 with the nineteenth OR gate 19 form the 3 serially connected BUFs. The input of the delay-matching module, i.e. the operation trigger control signal, is denoted $(e_1, \overline{e_1})$, its output $(e_4, \overline{e_4})$, and the intermediate results $(e_2, \overline{e_2})$ and $(e_3, \overline{e_3})$. Within the delay-matching module, both inputs of each AND gate are the same signal ($e_1$, $e_2$ and $e_3$ respectively), and both inputs of each OR gate are likewise the same signal ($\overline{e_1}$, $\overline{e_2}$ and $\overline{e_3}$ respectively).

2. Add buffer cells (BUF) for output signals that are not on the sub-module's critical path, so that the delay from the input signals to every output signal is the same. In the circuit on the left of the arrow in Figure 3, the number of logic levels from the inputs to the outputs $o_2$ and $\overline{o_2}$ is 1, which is less than the number of levels from the inputs to $o_1$, so two buffer stages are added for $o_2$ and $\overline{o_2}$. As shown on the right of the arrow in Figure 3, the twentieth AND gate 20 with the twenty-first OR gate 21, and the twenty-second AND gate 22 with the twenty-third OR gate 23, form two serially connected BUFs. The former buffers $(n_4, \overline{n_4})$ to give $(n_6, \overline{n_6})$, and the latter buffers $(n_6, \overline{n_6})$ to give $(\overline{o_2}, o_2)$. The two inputs of AND gate 20 are $n_4$ and $e_2$; those of OR gate 21 are $\overline{n_4}$ and $\overline{e_2}$; those of AND gate 22 are $n_6$ and $e_3$; and those of OR gate 23 are $\overline{n_6}$ and $\overline{e_3}$.

3. Add buffer cells for input signals that are not on the sub-module's critical path, so that the two inputs of every logic cell arrive at the same time. In the circuit on the left of the arrow in Figure 3, $(i_5, \overline{i_5})$ and $(i_6, \overline{i_6})$ are not on the critical path, so one buffer stage is added for each. As shown on the right of the arrow in Figure 3, the BUF formed by the twenty-fourth AND gate 24 and the twenty-fifth OR gate 25 buffers $(i_5, \overline{i_5})$ to give $(n_7, \overline{n_7})$, and the BUF formed by the twenty-sixth AND gate 26 and the twenty-seventh OR gate 27 buffers $(i_6, \overline{i_6})$ to give $(n_8, \overline{n_8})$. The two inputs of AND gate 24 are $i_5$ and $e_1$; those of OR gate 25 are $\overline{i_5}$ and $\overline{e_1}$; those of AND gate 26 are $i_6$ and $e_1$; and those of OR gate 27 are $\overline{i_6}$ and $\overline{e_1}$.
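The buffer counts in the three delay-matching steps above can be derived mechanically. The sketch below is an illustration under stated assumptions and not the patent's implementation: it models only the true rail (the complement rail mirrors it), it assumes each signal feeds a single gate so the logic cone of every output is a tree, and the dictionary format, the wiring and the names `depth`, `buffer_plan` and `o2_bar` are all hypothetical.

```python
def depth(sig, drivers):
    """Number of logic levels between the primary inputs and `sig`."""
    if sig not in drivers:
        return 0  # primary input
    return 1 + max(depth(s, drivers) for s in drivers[sig])


def buffer_plan(drivers, outputs):
    """Return (input_buffers, output_buffers, trigger_stages).

    drivers maps each gate output to the pair of signals feeding that gate.
    Assumes every signal has a single consumer (tree-shaped logic cones).
    """
    levels = {o: depth(o, drivers) for o in outputs}
    deepest = max(levels.values())  # levels on the sub-module's critical path

    input_buffers = {}

    def walk(sig, remaining):
        # `remaining` is the number of logic levels that must lie below `sig`
        # for its arrival time to match the critical path of this output.
        if sig not in drivers:
            if remaining > 0:
                input_buffers[sig] = remaining  # pad a shallow primary input
            return
        for s in drivers[sig]:
            walk(s, remaining - 1)

    for o in outputs:
        walk(o, levels[o])

    # Outputs shallower than the critical path get the missing stages.
    output_buffers = {o: deepest - lv for o, lv in levels.items() if lv < deepest}
    return input_buffers, output_buffers, deepest


# Assumed true-rail wiring for the circuit on the left of the arrow in Figure 3.
drivers = {
    "n1": ("i1", "i2"),
    "n2": ("i3", "i4"),
    "n3": ("n1", "n2"),
    "n4": ("i5", "i6"),
    "o1": ("n3", "n4"),
    "o2_bar": ("i7", "i8"),
}
input_buffers, output_buffers, trigger_stages = buffer_plan(drivers, ["o1", "o2_bar"])
```

For this assumed wiring the plan asks for one buffer on each of `i5` and `i6`, two buffers on the shallow output, and a three-stage trigger chain, which agrees with the counts given in the text for Figure 3.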

Figure 4 is the overall structure of an asynchronous block cipher coprocessor designed with the invention; it also illustrates the sub-module integration of the third step. The sub-modules $S_1, S_2, \ldots, S_n$ are connected in sequence. $S_1$ accepts the initial plaintext $M$ and the key $K$, and $S_n$ produces the result, i.e. the ciphertext $C$. $S_i$ ($1 < i < n$) accepts the output of $S_{i-1}$ and supplies its result to $S_{i+1}$. There are no latches between adjacent sub-modules. Each sub-module has a corresponding delay-matching module made of serially connected buffer cells, and the number of buffer stages equals the number of logic levels on the sub-module's critical path. The input of a delay-matching module is the operation trigger control signal. The trigger inputs of the delay-matching modules of the sub-modules are denoted $(e_{i1}, \overline{e_{i1}}), (e_{i2}, \overline{e_{i2}}), \ldots, (e_{in}, \overline{e_{in}})$, and their outputs are denoted $(e_{o1}, \overline{e_{o1}}), (e_{o2}, \overline{e_{o2}}), \ldots, (e_{on}, \overline{e_{on}})$. A group of data entered at the same time is passed from sub-module to sub-module, while the operation trigger control signal is passed from one delay-matching module to the next. When the plaintext, the key and the operation trigger control signal are all 0, the coprocessor enters the non-toggling state level by level and consumes only static power. When $(e_{i1}, \overline{e_{i1}})$ is $(1, 0)$ and the plaintext and key hold valid values, the coprocessor starts valid processing; after the signals have propagated level by level and $(e_{on}, \overline{e_{on}})$ becomes $(1, 0)$, the output of $S_n$ is the corresponding ciphertext $C$.
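The behaviour of the buffer cell and of the trigger chain can be illustrated at the logic level. The sketch below ignores gate delays entirely and only shows which values propagate; the function names are assumptions made for illustration and do not come from the patent.

```python
def buf(data, trigger):
    """One BUF cell: an AND gate on the true rails and the complementary
    OR gate on the complement rails."""
    (x, nx), (e, ne) = data, trigger
    return (x & e, nx | ne)


def trigger_stage(trigger):
    """BUF inside a delay-matching module: both AND inputs are e and both
    OR inputs are the complement rail of e."""
    e, ne = trigger
    return (e & e, ne | ne)


def run_chain(trigger, stages):
    """Pass a trigger pair through `stages` serially connected BUF cells."""
    for _ in range(stages):
        trigger = trigger_stage(trigger)
    return trigger
```

With the trigger at (1, 0), `buf` passes a valid dual-rail pair through unchanged, and `run_chain((1, 0), 3)` delivers (1, 0) at the end of a three-stage chain. With all inputs at 0, every output stays at 0, which corresponds to the non-toggling idle state described above.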

Figure 5 is a schematic of how an asynchronous block cipher coprocessor designed with the invention operates. Suppose the coprocessor contains $N$ levels of logic cells and the maximum delay of a single logic cell is $\Delta t$, so that the minimum interval between adjacent input data entering the coprocessor is $\Delta t$. Let the interval between different input data be $a \times \Delta t$, where $a$ is any value not less than 1. One valid cipher operation takes $N \times \Delta t$ to complete, and several groups of data may be in flight at the same instant. When all inputs of the coprocessor are 0, the coprocessor enters the non-toggling state level by level. For better constant-power characteristics, all-zero input and valid data can be fed into the coprocessor alternately; for higher computing performance, i.e. throughput, valid data can instead be fed in continuously. At full load the throughput is $1/(2\Delta t)$ if all-zero input and valid data alternate, and $1/\Delta t$ if valid data enters the coprocessor pipeline continuously. The maximum delay of a single logic cell, $\Delta t$, is far smaller than the delay of one round, and the coprocessor contains no registers or latches, which avoids the delay they introduce. Compared with a block cipher coprocessor implemented conventionally (synchronous or asynchronous circuits with one round per pipeline stage), the computing performance of an asynchronous block cipher coprocessor designed with the invention is therefore far higher.
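The timing relations for Figure 5 can be collected in a small helper. This is a sketch under assumptions: the function name, the parameter names and the example numbers are hypothetical, and extending the alternating mode to spacing factors above 1 (each valid group followed by one all-zero slot of the same spacing) goes beyond the full-load case stated in the text.

```python
def pipeline_timing(n_levels, dt, a=1.0, alternate_zero=False):
    """Return (latency, throughput) of the asynchronous pipeline.

    n_levels       -- number of logic levels N in the coprocessor
    dt             -- maximum delay of a single logic cell
    a              -- spacing factor between inputs, must be at least 1
    alternate_zero -- True if an all-zero input separates valid inputs
    """
    if a < 1:
        raise ValueError("a must be at least 1")
    latency = n_levels * dt      # one operation takes N * dt
    interval = a * dt            # time between consecutive inputs
    if alternate_zero:
        interval *= 2            # every second slot carries the all-zero input
    return latency, 1.0 / interval


# Hypothetical numbers: 40 logic levels and a cell delay of 0.5 time units.
continuous = pipeline_timing(40, 0.5)
alternating = pipeline_timing(40, 0.5, alternate_zero=True)
```

With these assumed numbers both modes have a latency of 20 time units, while the full-load throughput is 2 results per time unit for continuous input and 1 result per time unit when all-zero inputs alternate with valid data, matching the $1/\Delta t$ and $1/(2\Delta t)$ figures above.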

Claims (2)

1. A design method for an asynchronous block cipher coprocessor, characterized by comprising the following steps:

First step: partition the block cipher algorithm into sub-modules, taking each round of the block cipher algorithm as an independent sub-module. The sub-modules are denoted $S_1, S_2, \ldots, S_i, \ldots, S_n$ ($n \ge 1$), where $n$ is the number of rounds of the block cipher algorithm. Let $M$ be the initial plaintext, $K$ the key and $C$ the ciphertext; let $R_j$ denote the result of the $j$-th round transformation and $K_k$ the round key of the $k$-th round transformation, with $1 \le j \le n-1$ and $2 \le k \le n$. The sub-modules are connected as follows: $(R_1, K_2) = F_1(M, K)$, $(R_2, K_3) = F_2(R_1, K_2)$, ..., $(R_{n-1}, K_n) = F_{n-1}(R_{n-2}, K_{n-1})$, $C = F_n(R_{n-1}, K_n)$, where $F_i$ denotes the function of the $i$-th round transformation, $1 \le i \le n$.

Second step: sub-module design. For each sub-module $S_i$, perform the following steps in turn:

Step one: describe the function of the sub-module in a hardware description language (HDL), implementing all arithmetic and logic operations of the sub-module entirely as combinational circuits, to obtain the HDL code of the sub-module.

Step two: perform logic synthesis on the HDL code of each sub-module with an existing synthesis tool, using only three kinds of standard cell (inverters, two-input AND gates and two-input OR gates), to obtain the static single-rail netlist of the sub-module.

Step three: convert the static single-rail netlist into a composite logic netlist consisting only of complementary two-input AND and OR gates, as follows:

Step 1: add the corresponding complementary signal for every signal in the static single-rail netlist. Let $w$ be any signal in the static single-rail netlist; add its complement $\overline{w}$.

Step 2: delete all inverters in the static single-rail netlist. Let an inverter in the static single-rail netlist be INV u1(a, z), where INV denotes an inverter, u1 is the name of the inverter, $a$ is the input signal and $z$ is the output signal, i.e. $z$ is the complement of $a$. Delete inverter u1 from the netlist and replace signal $z$ in the netlist with $\overline{a}$.

Step 3: for every two-input AND gate in the static single-rail netlist, add a complementary OR gate. Let a two-input AND gate in the static single-rail netlist be AND2 u2(x1, x2, y1), where AND2 denotes a two-input AND gate, u2 is the name of the gate, $x_1$ and $x_2$ are the input signals and $y_1$ is the output signal, i.e. $y_1 = (x_1 \wedge x_2)$. Add the two-input OR gate complementary to u2, OR2 ui2($\overline{x_1}$, $\overline{x_2}$, $\overline{y_1}$), where OR2 denotes a two-input OR gate, ui2 is the name of the gate, $\overline{x_1}$ and $\overline{x_2}$ are the input signals and $\overline{y_1}$ is the output signal, i.e. $\overline{y_1} = (\overline{x_1} \vee \overline{x_2})$.

Step 4: for every two-input OR gate in the static single-rail netlist, add a complementary AND gate. Let a two-input OR gate in the static single-rail netlist be OR2 u3(x3, x4, y2), i.e. $y_2 = (x_3 \vee x_4)$. Add the two-input AND gate complementary to u3, AND2 ui3($\overline{x_3}$, $\overline{x_4}$, $\overline{y_2}$), i.e. $\overline{y_2} = (\overline{x_3} \wedge \overline{x_4})$.

Step four: perform delay matching by adding a delay-matching module whose delay equals that of the sub-module, as follows:

Step 1): add a delay-matching module whose delay equals that of the sub-module. The delay-matching module consists of serially connected buffer cells (BUF); a BUF is a composite logic cell consisting of a complementary two-input AND gate and OR gate, and the number of BUF stages equals the number of logic levels on the critical path of the sub-module. In a BUF both inputs of the AND gate are $e$ and both inputs of the OR gate are $\overline{e}$; the dual-rail signal $(e, \overline{e})$ is called the operation trigger control signal. When a valid operation is performed, the operation trigger control signal is set to $(1, 0)$; when no valid operation is performed, it is set to $(0, 0)$. The operation trigger control signal propagates stage by stage through the delay-matching module. When the operation trigger control signal $(1, 0)$ has propagated from the input of the delay-matching module to its output, i.e. when the output of the delay-matching module is $(1, 0)$, the corresponding sub-module has also completed a valid logic operation and its output is the correct result.

Step 2): add buffer cells (BUF) for output signals that are not on the critical path of the sub-module. Let the output signals of the sub-module be $O_1, O_2, \ldots, O_m$, $m \ge 1$, let the numbers of logic levels on the critical paths from the input signals to the respective output signals be $H_1, H_2, \ldots, H_m$, and let the largest of these be $H$. Add $H - H_p$ serially connected buffer stages for each output signal, $1 \le p \le m$. In each inserted buffer cell, the inputs other than the buffered signal come from the output of the preceding buffer stage in the delay-matching module.

Step 3): add buffer cells (BUF) for input signals that are not on the critical path of the sub-module. When the circuit computing any output signal of the sub-module is represented as a binary tree, each node of the tree represents a signal in the netlist, and its children are either empty (the signal is an input signal of the sub-module) or the two inputs of the corresponding logic cell. The height of the binary tree represents the number of logic levels traversed from the input signals to the output signal. Starting from the root of the tree, i.e. the output signal, traverse the tree in level order, ensuring that signals at the same level arrive at the same time and that logic cells at the same level perform valid operations synchronously. Let the height of the binary tree for output signal $o$ be $h$; the subtrees of the level-2 nodes, i.e. the two children $o_1$ and $o_2$ of $o$, have height $h-1$. If $o_1$ or $o_2$ is an input signal of the sub-module, add $h-2$ serially connected buffer stages directly to it. The subtree of any node $x$ at level $d$ should have height $h-d+1$, $1 < d < h$; if $x$ is an input signal, add $h-d$ serially connected buffer stages directly to it. Continue until the nodes at level $h-1$ have been traversed; if a node at that level is an input signal, add one buffer stage directly to it.

Third step: integrate the delay-matched sub-modules, i.e. connect $S_1, S_2, \ldots, S_n$ in sequence. Sub-module $S_1$ accepts the initial input signals, i.e. the plaintext $M$ and the key $K$; $S_l$ accepts the output produced by $S_{l-1}$ and its result is the input of $S_{l+1}$, $2 \le l \le n-1$; $S_n$ produces the final result, i.e. the ciphertext $C$. At the same time, connect the delay-matching modules of the sub-modules in sequence. After all sub-modules have been integrated, the complete netlist of the asynchronous block cipher coprocessor is obtained.

Fourth step: perform back-end placement and routing to obtain the GDS layout of the asynchronous block cipher coprocessor.

2. The design method for an asynchronous block cipher coprocessor according to claim 1, characterized in that, during back-end placement and routing, every complementary pair of two-input AND and OR gates is placed axisymmetrically, so that the dual-rail signals have symmetric routing and equal interconnect length.
CN 200810143205 2008-09-16 2008-09-16 Design method of asynchronous block cipher algorithm coprocessor CN100573540C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 200810143205 CN100573540C (en) 2008-09-16 2008-09-16 Design method of asynchronous block cipher algorithm coprocessor

Publications (2)

Publication Number Publication Date
CN101350038A CN101350038A (en) 2009-01-21
CN100573540C true CN100573540C (en) 2009-12-23

Family

ID=40268828

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 200810143205 CN100573540C (en) 2008-09-16 2008-09-16 Design method of asynchronous block cipher algorithm coprocessor

Country Status (1)

Country Link
CN (1) CN100573540C (en)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102447556A (en) * 2010-10-14 2012-05-09 上海华虹集成电路有限责任公司 DES (data encryption standard) encryption method of resisting differential power analysis based on random offset
CN102546160B (en) * 2010-12-08 2016-03-02 上海华虹集成电路有限责任公司 Method for Elliptic Curve Cryptography differential power attack defense
CN102609556A (en) * 2011-01-25 2012-07-25 深圳市证通电子股份有限公司 Method and circuit for designing function of resisting power consumption attack for AES (advanced encryption standard) module
CN103384197B (en) * 2012-05-03 2016-08-31 国家电网公司 A kind of defence circuit, chip and method to grouping algorithm Attacks
CN103986571B (en) * 2014-01-15 2018-04-20 上海新储集成电路有限公司 A kind of smart card multi-core processor system and its method for defending differential power consumption analysis
JP2015191106A (en) * 2014-03-28 2015-11-02 ソニー株式会社 Encryption processing device, encryption processing method, and program
CN104158651B (en) * 2014-07-15 2017-05-24 南京航空航天大学 All-unfolded-structured AES encryption/decryption circuit based on data redundancy real-time error detection mechanism
CN104158652B (en) * 2014-07-15 2017-05-24 南京航空航天大学 Circulating-unfolded-structured AES encryption/decryption circuit based on data redundancy real-time error detection mechanism
CN105069215A (en) * 2015-07-31 2015-11-18 中国人民解放军国防科学技术大学 Double-track signal wiring method based on wide line
CN107241324A (en) * 2017-06-01 2017-10-10 东南大学 Cryptochannel power consumption compensation anti-bypass attack method and circuit based on machine learning

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6651228B1 (en) 2000-05-08 2003-11-18 Real Intent, Inc. Intent-driven functional verification of digital designs
CN1835207A (en) 2005-03-17 2006-09-20 联想(北京)有限公司 Method of preventing energy analysis attack to RSA algorithm
CN1858722A (en) 2006-03-31 2006-11-08 清华大学 System for improving SRAM process EPGA design safety by asynchronous circuit
CN1983245A (en) 2005-12-14 2007-06-20 戴尔产品有限公司 System and method for configuring information handling system integrated circuits

Also Published As

Publication number Publication date
CN101350038A (en) 2009-01-21

Legal Events

Date Code Title Description
C06 Publication
C10 Request of examination as to substance
C14 Granted
C17 Cessation of patent right