CN1750459A - Method and system architecture for accelerating public-key cryptographic operations

Info

Publication number: CN1750459A
Application number: CN 200510061070
Authority: CN
Inventors: 沈海斌; 严晓浪
Original assignee: Zhejiang University ZJU
Current assignee: Zhejiang University ZJU
Priority date: 2005-10-12
Filing date: 2005-10-12
Publication date: 2006-03-22
Anticipated expiration: 2025-10-12
Also published as: CN100518058C

Abstract

The invention discloses a method and system architecture for accelerating public-key cryptographic operations. Parameters are written over the SoC bus to a register module to define the public-key algorithm. A sequence instruction generator module generates the instructions that implement point multiplication or modular exponentiation, and sends an operation-complete flag to the register module when the operation ends. An arithmetic processor executes the instructions in a four-stage pipeline and accesses its data in a static memory unit.

Description

A method and architecture for accelerating public-key cryptographic operations

Technical field

The present invention relates to a method and architecture for accelerating public-key cryptographic operations, and is particularly suited to accelerating ECC point multiplication and RSA modular exponentiation.

Background technology

At present, with the progress of informatization in China's national economy and society, and the rapid development of financial informatization, e-commerce and e-government, there is an urgent need to solve the information security problems of information systems in key areas such as the economy and culture, to raise the level of security protection, and to strengthen active defense and response capabilities in information security. Cryptography is the core technology of information security, and public-key cryptosystems in particular have played a major role in the field since their introduction and have good application prospects.

Public-key cryptographic operations are mostly implemented in software running on a CPU. In practice, however, many encryption applications run not on a computer but in embedded systems of various kinds. If the embedded processor must perform all cryptographic computation while also running other applications, its real-time performance may suffer. Hardware implementations are also far more secure than software ones. When a cryptographic algorithm runs on a general-purpose computer without physical protection, anyone can use various tracing tools to modify the algorithm secretly without being detected. Hardware encryption, by contrast, can be securely packaged: a tamper-resistant enclosure can prevent others from modifying a hardware encryption device, and a VLSI chip can be coated with a chemical layer so that any attempt to access its interior destroys the chip logic. Given these application scenarios and the real-time and security considerations, public-key cryptosystems also need to be implemented directly in hardware.

At present, the usual way to accelerate computation in integrated circuit design is as follows: the instructions to be run are produced by a compiler, and the processor uses a five-stage pipeline consisting of instruction fetch (IF), instruction decode/register read (ID), execute/effective address (EX), memory access (MEM) and write back (WB). This approach usually concentrates on improving the processor's computing units or on refining the scheduling strategy, and it can improve computational efficiency to some extent. Because the instructions are generated by a compiler, however, the performance of the whole system depends largely on compiler technology, which also affects code quality and can therefore reduce computational efficiency.

Summary of the invention

The object of the invention is to address the deficiencies of the prior art by providing a method and architecture for accelerating public-key cryptographic operations in an integrated circuit.

The method of the invention for accelerating public-key cryptographic operations comprises the following steps:

1) the register module is configured with parameters over the SoC bus, in order to define the public-key algorithm;

2) according to the configuration of the register module, the sequence instruction generator module generates the instructions that implement point multiplication or modular exponentiation and sends them to the arithmetic processor;

3) the arithmetic processor accesses its data in the static memory unit and executes the instructions in a four-stage pipeline: instruction decode/memory read, instruction execute I, instruction execute II, and memory store;

4) when the operation ends, the arithmetic processor sends an operation-complete flag to the register module.

The sequence instruction generator module may be implemented as a hierarchical finite state machine.

Instruction decoding is performed by a decoding circuit, and data is read from the on-chip memory within one cycle; instruction execute I mainly performs the dual-field multiply-add operation, and instruction execute II mainly performs the shift operation and the dual-field addition.
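The text does not define the dual-field operations. The following Python sketch assumes the usual meaning in ECC hardware, namely that one datapath serves both a prime field GF(p) and a binary field GF(2^m): addition is integer addition in the first case and XOR in the second, and multiplication is integer or carry-less multiplication respectively. The function names are hypothetical, and modular reduction is deliberately left out.

```python
def clmul(a: int, b: int) -> int:
    """Carry-less multiplication: multiply two polynomials over GF(2)."""
    result = 0
    while b:
        if b & 1:
            result ^= a  # add (XOR) the shifted multiplicand
        a <<= 1
        b >>= 1
    return result


def dual_field_add(a: int, b: int, binary_field: bool) -> int:
    """Unreduced addition: XOR for GF(2^m), integer add for GF(p)."""
    return a ^ b if binary_field else a + b


def dual_field_mul_add(a: int, b: int, c: int, binary_field: bool) -> int:
    """Unreduced a*b + c in the arithmetic of the selected field type."""
    if binary_field:
        return clmul(a, b) ^ c
    return a * b + c
```

In this reading, a single mode bit selects the field type, which is why one multiply-adder and one adder can serve both ECC over binary fields and RSA or ECC over prime fields.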

The architecture for implementing the above method comprises: a register module consisting of a register bank; a sequence instruction generator module for generating the instructions that implement point multiplication or modular exponentiation; an arithmetic processor module consisting of a dual-field multiply-adder, a shifter and a dual-field adder; a static memory module supporting two reads and one write; and a SoC bus. The register module and the static memory module are each connected to the SoC bus, which connects to the external main controller; the register module is connected to the sequence instruction generator module; and the sequence instruction generator module and the static memory module are each connected to the arithmetic processor module.

Operating principle of the invention:

The external main controller configures the register module with parameters over the SoC bus in order to define the public-key algorithm. By polling the operation-complete flag in the register module, the main controller can read the result once the operation has finished.

The sequence instruction generator module is implemented as a hierarchical finite state machine. According to the configuration of the register module, it generates the instructions that implement point multiplication or modular exponentiation, and when the operation ends it sends the operation-complete flag to the register module; the main controller can read the result after polling this flag.
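As an illustration of how an instruction sequence can be produced without a compiler, the sketch below uses a Python generator as a stand-in for the hierarchical finite state machine. The outer loop plays the upper-level machine (one step per exponent bit) and the yields inside it play the lower-level machine (square, then conditionally multiply). The opcodes, register names and the left-to-right square-and-multiply schedule are assumptions for illustration; the patent does not specify the instruction set.

```python
def modexp_instruction_stream(exponent: int):
    """Yield (opcode, dst, src1, src2) tuples for base**exponent mod n."""
    yield ("LOAD_ONE", "acc", None, None)
    for bit in bin(exponent)[2:]:                  # upper level: scan bits, MSB first
        yield ("MODMUL", "acc", "acc", "acc")      # lower level: square
        if bit == "1":
            yield ("MODMUL", "acc", "acc", "base")  # lower level: multiply by base
    yield ("DONE", None, None, None)               # raise the operation-complete flag


def execute(stream, base: int, modulus: int) -> int:
    """Toy arithmetic processor: run the instruction stream to completion."""
    regs = {"base": base % modulus, "acc": 0}
    for opcode, dst, src1, src2 in stream:
        if opcode == "LOAD_ONE":
            regs[dst] = 1 % modulus
        elif opcode == "MODMUL":
            regs[dst] = (regs[src1] * regs[src2]) % modulus
        elif opcode == "DONE":
            return regs["acc"]
    raise RuntimeError("stream ended without DONE")
```

For example, `execute(modexp_instruction_stream(13), 4, 497)` gives the same value as Python's built-in `pow(4, 13, 497)`.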

The arithmetic processor receives the instructions issued by the sequence instruction generator module and executes them in a four-stage pipeline (instruction decode/memory read, instruction execute I, instruction execute II, and memory store), accessing its data in the static memory module. If a data dependency occurs, it sends a feedback signal to the sequence instruction generator module when necessary, so that the generator stops issuing instructions.
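The stall behaviour can be sketched with a small issue-cycle model. It assumes one instruction is issued per cycle, that operands are read in stage 1 and the result is stored in stage 4, and that there is no forwarding, so a stored value is readable only from the cycle after the store. These timing details and the instruction format are assumptions; the patent states only that the generator stops issuing when a dependency requires it.

```python
PIPELINE_DEPTH = 4  # decode/read, execute I, execute II, store


def schedule(instructions):
    """Return the issue cycle of each (dst, src1, src2) instruction.

    An instruction is held back until both of its source addresses
    have been stored by every earlier instruction that writes them.
    """
    ready_at = {}       # address -> first cycle in which a read sees the stored value
    issue_cycles = []
    cycle = 0
    for dst, src1, src2 in instructions:
        earliest = max(ready_at.get(src1, 0), ready_at.get(src2, 0))
        cycle = max(cycle, earliest)           # stall the generator if needed
        issue_cycles.append(cycle)
        ready_at[dst] = cycle + PIPELINE_DEPTH  # stored in cycle+3, readable after
        cycle += 1
    return issue_cycles
```

With independent instructions the model issues back to back; when the third instruction of `[("t0", "a", "b"), ("t1", "c", "d"), ("t2", "t0", "t1")]` needs both earlier results, it is delayed until the second store has completed.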

The static memory module performs data accesses as required by the arithmetic processor. It must support reading two data items and storing one data item within a single clock cycle.
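A minimal behavioural model of this requirement follows. The class name is hypothetical, and the read-before-write ordering for a read and a write to the same address in the same cycle is an assumption, since the patent does not say how that case is handled.

```python
class TwoReadOneWriteMemory:
    """Behavioural model: two reads and at most one write per clock cycle."""

    def __init__(self, size: int):
        self.cells = [0] * size

    def cycle(self, read_a: int, read_b: int, write=None):
        """Simulate one clock cycle.

        read_a, read_b: addresses of the two read ports.
        write: optional (address, value) pair for the write port.
        Reads return the values held before this cycle's write (assumed).
        """
        data_a = self.cells[read_a]
        data_b = self.cells[read_b]
        if write is not None:
            address, value = write
            self.cells[address] = value
        return data_a, data_b
```

Two read ports match the two source operands an arithmetic instruction fetches in the decode/memory-read stage, while the single write port serves the memory-store stage of an older instruction in the same cycle.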

The invention has the following technical effects:

1) Introducing the sequence instruction generator module means that instruction generation no longer depends on compiler technology; this is simple to use and also allows the algorithm to be optimized.

2) The instruction sequence executes strictly in order, so the arithmetic processor needs only a purpose-built four-stage pipeline, which simplifies the processor design and improves computational efficiency.

Description of drawings

Fig. 1 is a block diagram of the architecture for accelerating public-key cryptographic operations.

Embodiment

The invention is further described below with reference to the accompanying drawing.

With reference to Fig. 1, the architecture for accelerating public-key cryptographic operations comprises: a register module 1 consisting of a register bank; a sequence instruction generator module 2 for generating the instructions that implement point multiplication or modular exponentiation; an arithmetic processor module 3 consisting of a dual-field multiply-adder, a shifter and a dual-field adder; a static memory module 4 supporting two reads and one write; and a SoC bus 5. Register module 1 and static memory module 4 are each connected to SoC bus 5, which connects to the external main controller; register module 1 is connected to sequence instruction generator module 2; and sequence instruction generator module 2 and static memory module 4 are each connected to arithmetic processor module 3.

Elliptic curve cryptography (ECC) point multiplication and RSA modular exponentiation are taken as examples to illustrate how the invention accelerates public-key cryptographic operations.

For both ECC and RSA, whether in encryption and decryption or in fields such as digital signatures, the core operations are point multiplication and modular exponentiation respectively. The main function of the invention is to perform ECC point multiplication and RSA modular exponentiation. Completing a full public-key cryptographic scheme still depends on the control of the main controller. In other words, this architecture works subordinate to the main controller; its role is to accelerate the core operations of ECC and RSA.
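For readers unfamiliar with the ECC core operation, the following Python sketch computes a scalar multiple of a point with the textbook double-and-add method. The toy curve y^2 = x^3 + 2x + 2 over GF(17) with base point (5, 1) is an illustrative assumption and far too small for real use; the patent does not state which curve, coordinate system or scalar-multiplication algorithm the hardware uses.

```python
# Toy curve parameters (illustrative only): y^2 = x^3 + A*x + B over GF(P).
P, A, B = 17, 2, 2
G = (5, 1)  # base point; None represents the point at infinity


def point_add(p1, p2):
    """Add two points on the toy curve in affine coordinates."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    x1, y1 = p1
    x2, y2 = p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return None  # p2 is the negative of p1
    if p1 == p2:
        slope = (3 * x1 * x1 + A) * pow((2 * y1) % P, -1, P) % P
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (slope * slope - x1 - x2) % P
    y3 = (slope * (x1 - x3) - y1) % P
    return (x3, y3)


def scalar_mult(k: int, point):
    """Compute k * point by right-to-left double-and-add."""
    result = None
    addend = point
    while k:
        if k & 1:
            result = point_add(result, addend)
        addend = point_add(addend, addend)
        k >>= 1
    return result
```

Each point addition or doubling reduces to a handful of modular multiplications, additions and one inversion, which is the kind of field arithmetic a multiply-add datapath is built to accelerate. The code uses `pow(x, -1, P)` for modular inversion, which requires Python 3.8 or later.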

Initialization of the public-key cryptographic operation acceleration system: the external main controller writes initial data such as the plaintext to static memory module 4 over SoC bus 5; once the data has been written, it writes the configuration parameters to register module 1.

Operation of the public-key cryptographic operation acceleration system: after initialization is complete, the system starts working. Sequence instruction generator module 2 generates the instructions that perform ECC point multiplication or RSA modular exponentiation and sends them to arithmetic processor module 3. The arithmetic processor module decodes each instruction, reads the required data from the static memory module, performs the computation, and stores the result back to the static memory module. If a data dependency arises, it feeds this back to sequence instruction generator module 2 as required, so that instruction generation is temporarily suspended. When the whole operation has finished, the operation-complete flag is sent to register module 1.

Reading data from the public-key cryptographic operation acceleration system: after the operation has finished, the main controller learns of its completion either by polling the relevant flag in register module 1 or through an interrupt signal that the register module of the invention sends directly to the main controller. The main controller can then read the result from the static memory over SoC bus 5.
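The three phases above (initialization, operation, data read) can be sketched from the main controller's point of view. The `AcceleratorModel` class, its method names, the string memory keys and the use of Python's built-in `pow` in place of the hardware are all hypothetical stand-ins; the patent does not give a register map or address layout. Only the polling variant is shown.

```python
class AcceleratorModel:
    """Toy stand-in for the accelerator as seen over the SoC bus."""

    def __init__(self):
        self.memory = {}     # static memory module
        self.config = None   # register module: configuration parameters
        self.done = False    # register module: operation-complete flag

    def write_memory(self, key, value):
        self.memory[key] = value

    def write_config(self, algorithm):
        self.config = algorithm  # writing the configuration starts the run
        self.done = False

    def step(self):
        """Stand-in for the hardware; here the whole operation takes one step."""
        if self.config == "modexp" and not self.done:
            m = self.memory
            m["result"] = pow(m["base"], m["exponent"], m["modulus"])
            self.done = True


def run_modexp(acc, base, exponent, modulus, max_polls=10):
    # Initialization: operands first, configuration parameters second.
    acc.write_memory("base", base)
    acc.write_memory("exponent", exponent)
    acc.write_memory("modulus", modulus)
    acc.write_config("modexp")
    # Data read: poll the operation-complete flag, then fetch the result.
    for _ in range(max_polls):
        acc.step()
        if acc.done:
            return acc.memory["result"]
    raise TimeoutError("accelerator did not signal completion")
```

Writing the operands before the configuration mirrors the order given in the initialization paragraph, so the accelerator never starts on incomplete data.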

Claims

1. A method for accelerating public-key cryptographic operations, comprising the following steps:

2. The method for accelerating public-key cryptographic operations according to claim 1, characterized in that the sequence instruction generator module is implemented as a hierarchical finite state machine.

3. The method for accelerating public-key cryptographic operations according to claim 1, characterized in that the instruction decoding is performed by a decoding circuit, and data is read from the on-chip memory within one cycle; instruction execute I mainly performs the dual-field multiply-add operation, and instruction execute II mainly performs the shift operation and the dual-field addition.

4. An architecture for the method for accelerating public-key cryptographic operations according to claim 1, characterized in that it comprises: a register module (1) consisting of a register bank; a sequence instruction generator module (2) for generating the instructions that implement point multiplication or modular exponentiation; an arithmetic processor module (3) consisting of a dual-field multiply-adder, a shifter and a dual-field adder; a static memory module (4) supporting two reads and one write; and a SoC bus (5); wherein the register module (1) and the static memory module (4) are each connected to the SoC bus (5), which connects to the external main controller, the register module (1) is connected to the sequence instruction generator module (2), and the sequence instruction generator module (2) and the static memory module (4) are each connected to the arithmetic processor module (3).