JPH05324277A

JPH05324277A - Encrypted communication method

Info

Publication number: JPH05324277A
Application number: JP4124982A
Authority: JP
Inventors: Keiichi Iwamura; 恵市岩村; Takahisa Yamamoto; 貴久山本
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 1992-05-18
Filing date: 1992-05-18
Publication date: 1993-12-07
Anticipated expiration: 2017-07-15
Also published as: JP3302043B2

Abstract

PURPOSE: To provide a circuit that performs modular exponentiation and modular multiplication at high speed with a smaller circuit scale, by repeatedly executing a modular multiplication that uses both the modulus N and an integer R coprime to N. CONSTITUTION: In the modular arithmetic circuit, the outputs for the input pairs (A, R_R), (B, R_R), (A_R, B_R), and (T_R, 1) are A_R, B_R, T_R, and Q, respectively. Modular exponentiation and modular multiplication are thus carried out by repeating the operation Z = X·Y·R^{-1} mod N, so all of the required operations run on the same, or the same type of, arithmetic circuit. When the Montgomery modular multiplication Z = X·Y·R^{-1} mod N = (X·Y + S·N)/R, where S = X·Y·N' mod R, is used, modular multiplication and modular exponentiation can be executed by simply repeating the Montgomery modular multiplication, provided the input values satisfy the stated condition.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to encryption technology used in various communication services over computer networks, such as home banking, firm banking, electronic mail, and electronic conferencing. In particular, it relates to systems that perform cryptographic communication using cryptosystems based on modular exponentiation and modular multiplication (the RSA cipher, the ElGamal cipher, etc.), key sharing schemes (DH key sharing, ID-based key sharing, etc.), zero-knowledge proof schemes, and the like.

[0002]

[Prior Art] In recent years, with the rapid development of information communication systems built on computer networks, cryptographic technology for protecting data content has become increasingly important. In particular, as computer networks grow faster and carry more traffic, high-speed cryptographic technology is becoming indispensable.

[0003] In such cryptographic technology, modular exponentiation and modular multiplication are important operations used in a wide range of schemes. Examples of their use include the following.

[0004] First, cryptosystems are known to be divided into secret-key cryptosystems and public-key cryptosystems. In a public-key cryptosystem the encryption key and the decryption key differ: the encryption key is made public, the decryption key is kept secret by the recipient, and it is difficult to infer the decryption key from the published encryption key. Ciphers based on modular exponentiation and modular multiplication, such as the RSA cipher and the ElGamal cipher, are widely used as public-key cryptosystems. These ciphers have also attracted attention for a second use besides secret communication, namely authentication. Authentication is the function of verifying that the sender of a message is genuine, and is also called a digital signature. A digital signature based on these ciphers can be produced only with the secret key known to the sender and cannot be forged, so it is secure, and it is widely used for authenticated communication by financial institutions and others.

[0005] As a secret-key cryptosystem, in which the sender and the recipient secretly share the same key, the Vernam cipher, which adds random numbers to the data, is known. Random numbers called quadratic-residue random numbers, which are based on modular exponentiation and modular multiplication, are known for use as these random numbers.

[0006] The secret-key and public-key cryptosystems above are often used together with a technique called key distribution or key sharing. The DH key distribution scheme of Diffie and Hellman is a well-known key distribution scheme, and it too computes with modular exponentiation and modular multiplication. ID-based key sharing has also attracted attention as a key sharing scheme, and modular exponentiation and modular multiplication are used in various key sharing schemes including this one.

[0007] Cryptographic technology also includes what is called a zero-knowledge proof. This is a method of convincing another party (proof) that one possesses certain knowledge without revealing anything about its content (zero knowledge). Here too, there are various techniques based on modular exponentiation and modular multiplication.

[0008] These cryptographic techniques are described in detail in Shinichi Ikeno and Kenji Koyama, "Modern Cryptography Theory" (現代暗号理論), The Institute of Electronics, Information and Communication Engineers (1986), and in Shigeo Tsujii and Masao Kasahara, "Cryptography and Information Security" (暗号と情報セキュリティ), Shokodo (1990), among others.

[0009] Accordingly, efficient modular exponentiation and modular multiplication circuits have been sought in order to build various cryptographic systems efficiently. Moreover, if high-speed modular exponentiation and modular multiplication circuits can be built, various cryptographic systems can be made faster.

[0010] One way to compute a modular multiplication modulo N is to use an integer R that is coprime to N. For example, the method proposed by Montgomery (the Montgomery method; Montgomery, P. L.: "Modular multiplication without trial division," Math. of Computation, Vol. 44, 1985, pp. 519-521) computes Q = A·B·R^{-1} mod N instead of Q = A·B mod N, which allows the modular multiplication to be computed without performing division.

[0011] Meanwhile, parallel processing is one approach to speeding up computation, and the systolic array is known as a representative architecture. A systolic array achieves high-speed processing by executing the computation as a pipeline over a few types of processing elements (hereinafter PEs). Control is also simple, because it is local to each PE. A systolic array therefore combines regularity of overall structure with locality at the PE level, and is known as an architecture that makes large-scale processing easy to implement in hardware such as VLSI. Such parallel-processing techniques are thought to be well suited to speeding up modular exponentiation and modular multiplication on large integers, which require large-scale processing, but hardly any conventional technique has applied a parallel-processing approach such as a systolic array to these operations.

[0012] The present applicant previously proposed a modular multiplication circuit using a systolic array in Japanese Patent Application No. 3-225986, but that circuit does not use the Montgomery method. An array that does use the Montgomery method has been proposed by Even (Shimon Even: "Systolic modular multiplication," Advances in Cryptology - CRYPTO '90, pp. 619-624, Springer-Verlag).

[0013]

[Problems to Be Solved by the Invention] The integers used in the modular exponentiation and modular multiplication of the cryptographic systems described above are required to be 512 bits or longer to ensure sufficient security. It has been difficult to perform modular exponentiation and modular multiplication on such large integers at high speed on an ordinary computer.

[0014] Furthermore, when modular exponentiation is performed by repeating the Montgomery method, the maximum bit length of the output grows with each modular multiplication, which has made it difficult to perform the exponentiation with a single circuit. In this respect, Even's array does not specify a PE that handles the case where the bit length of the modular multiplication output exceeds the bit length of the input values, and it is therefore insufficient for modular exponentiation.

[0015] In addition, as described later, the conventional Montgomery method requires separate operations on A, B, and Q before and after the computation of Q = A·B·R^{-1} mod N, so several kinds of arithmetic means were needed.

[0016] In particular, Even's array described above consists of an array that performs the multiplication T = A·B and an array that performs the modular operation Q = T·R^{-1} mod N for R treated as a constant. Even's systolic array was therefore inefficient, because it requires two kinds of array, one to compute T and one to compute Q. Moreover, it proposes only bit-by-bit operations within each PE, and so lacked flexibility.

[0017]

[Means for Solving the Problems] An object of the present invention is therefore to eliminate the above drawbacks and to provide a method of performing the modular exponentiation and modular multiplication of cryptographic communication solely by repeating a modular multiplication that uses an integer R coprime to the modulus N.

[0018] Another object of the present invention is to realize, using the Montgomery method, a circuit that performs modular exponentiation and modular multiplication at high speed with a smaller circuit scale.

[0019] To solve these problems, the present invention provides an encrypted communication method that encrypts or decrypts communication content using the modular multiplication Q = A·B mod N of integers A and B modulo N, and that comprises one or more arithmetic units which, for input data U and V, compute and output Z = U·V·R^{-1} mod N using an integer R coprime to N.

[0020] According to another aspect of the present invention, an encrypted communication method that encrypts or decrypts communication content using the modular exponentiation C = M^e mod N of integers M and e modulo N comprises one or more arithmetic units which, for input data U and V, compute and output Z = U·V·R^{-1} mod N using an integer R coprime to N.

[0021] According to another aspect of the present invention, an encrypted communication method that encrypts or decrypts communication content using the modular multiplication Q = A·B mod N, modulo N, of input integers A and B uses an integer R coprime to N and comprises: a step of computing A·R mod N from the input A and R and taking the result as A_R; a step of computing B·R mod N from the input B and R and taking the result as B_R; a step of obtaining A_R·B_R·R^{-1} mod N from the results A_R and B_R and from R, and taking the result as T_R; and a step of computing T_R·R^{-1} mod N from T_R and R to obtain Q as the result. The step of obtaining T_R is executed, with A_i denoting the v-bit segments of A_R for an arbitrary integer v and with Y = 2^v, by sequentially computing
T_i = (T_{i-1} + A_i·B_R·Y + M_{i-1}·N) / Y
M_{i-1} = (T_{i-1} mod Y)·(-N^{-1} mod Y) mod Y.

[0022] According to another aspect of the present invention, an encrypted communication method that encrypts or decrypts communication content using the modular multiplication Q = A·B mod N, modulo N, of input integers A and B uses an integer R coprime to N and comprises: a step of computing A·R mod N from the input A and R and taking the result as A_R; a step of computing B·R mod N from the input B and R and taking the result as B_R; a step of obtaining A_R·B_R·R^{-1} mod N from the results A_R and B_R and from R, and taking the result as T_R; and a step of computing T_R·R^{-1} mod N from T_R and R to obtain Q as the result. The step of obtaining T_R is executed, with A_i denoting the v-bit segments of A_R for an arbitrary integer v and with Y = 2^v, by sequentially computing
T_i = (T_{i-1}/Y + A_i·B_R) + M_i·N
M_{i-1} = ((T_{i-1}/Y + A_i·B_R) mod Y)·(-N^{-1} mod Y) mod Y.

[0023]

[Operation] One or more arithmetic units that, for input data U and V, compute and output Z = U·V·R^{-1} mod N using an integer R coprime to N are used as follows. A and the constant R_R = R^2 mod N are input, and A_R = A·R_R·R^{-1} mod N is output. B and the constant R_R are input, and B_R = B·R_R·R^{-1} mod N is output. The outputs A_R and B_R are input, and T_R = A_R·B_R·R^{-1} mod N is output. The output T_R and the constant 1 are input, and T_R·1·R^{-1} mod N is output as Q. The modular multiplication Q = A·B mod N is thereby executed.

[0024] One or more arithmetic units that, for input data U and V, compute and output Z = U·V·R^{-1} mod N using an integer R coprime to N are used as follows. M and the constant R_R = R^2 mod N are input, and M_R = M·R_R·R^{-1} mod N is output. Let the binary representation of e be e = [e_t, e_{t-1}, ..., e_1], and let the initial value of C_R be C_R = R_R·R^{-1} mod N. The bits e_i are then processed in order from the most significant bit. When e_i = 1, C_R and M_R are input to the arithmetic unit and C_R·M_R·R^{-1} mod N is output as the new C_R. Further, when i is greater than 1, C_R is supplied as both inputs and C_R·C_R·R^{-1} mod N is output as the new C_R. After all bits e_i have been processed, C_R and the constant 1 are input, and C = C_R·1·R^{-1} mod N is output. The modular exponentiation C = M^e mod N is thereby executed.

[0025] One or more arithmetic units that, for input data U and V, compute and output Z = U·V·R^{-1} mod N using an integer R coprime to N are used as follows. M and the constant R_R = R^2 mod N are input, and M_R = M·R_R·R^{-1} mod N is output. Let the binary representation of e be e = [e_t, e_{t-1}, ..., e_1], and let the initial value of C_R be C_R = R_R·R^{-1} mod N. The bits e_i are then processed in order from the least significant bit. When e_i = 1, C_R and M_R are input to the arithmetic unit and C_R·M_R·R^{-1} mod N is output as the new C_R. Further, when i is less than t, M_R is supplied as both inputs and M_R·M_R·R^{-1} mod N is output as the new M_R. After all bits e_i have been processed, C_R and the constant 1 are input, and C = C_R·1·R^{-1} mod N is output. The modular exponentiation C = M^e mod N is thereby executed.
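
The least-significant-bit-first procedure of paragraph [0025] can be sketched in Python as follows. This is an illustrative sketch only: the function name, the parameter `k` (with R = 2^k), the odd modulus, and the final `% n` normalization are assumptions introduced here and are not taken from the patent text.

```python
def montgomery_pow_lsb_first(m: int, e: int, n: int, k: int) -> int:
    """Compute m**e mod n using only Z = X*Y*R^-1 mod n, with R = 2**k and n odd."""
    mask = (1 << k) - 1
    n_prime = (-pow(n, -1, 1 << k)) & mask      # N' = -N^-1 mod R (Python 3.8+)

    def mont(x: int, y: int) -> int:
        t = x * y
        return (t + ((t * n_prime) & mask) * n) >> k

    r_r = pow(2, 2 * k, n)                      # constant R_R = R^2 mod N
    m_r = mont(m, r_r)                          # M_R = M*R_R*R^-1 mod N
    c_r = mont(1, r_r)                          # initial C_R = R_R*R^-1 mod N
    t_bits = e.bit_length()
    for i in range(1, t_bits + 1):              # i = 1 .. t; e_i is bit i-1 of e
        if (e >> (i - 1)) & 1:
            c_r = mont(c_r, m_r)                # multiply step when e_i = 1
        if i < t_bits:
            m_r = mont(m_r, m_r)                # squaring step while i < t
    return mont(c_r, 1) % n                     # leave the R-domain; % n is an added normalization
```

For example, `montgomery_pow_lsb_first(65, 17, 3233, 14)` should agree with Python's built-in `pow(65, 17, 3233)`.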

[0026] The modular multiplication Q = A·B mod N, modulo N, of input integers A and B is carried out using an integer R coprime to N as follows. A·R mod N is computed from the input A and R, and the result is taken as A_R. B·R mod N is computed from the input B and R, and the result is taken as B_R. A_R·B_R·R^{-1} mod N is obtained from the results A_R and B_R and from R, and the result is taken as T_R. T_R·R^{-1} mod N is computed from T_R and R, and Q is obtained as the result. The computation of T_R is executed, with A_i denoting the v-bit segments of A_R for an arbitrary integer v and with Y = 2^v, by sequentially computing
T_i = (T_{i-1} + A_i·B_R·Y + M_{i-1}·N) / Y
M_{i-1} = (T_{i-1} mod Y)·(-N^{-1} mod Y) mod Y.
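
The digit-serial recurrence of paragraph [0026] can be tried out with the Python sketch below. Several details are assumptions made here for illustration, because this passage does not state them: the running value starts at T = 0, the v-bit segments A_i are consumed least significant first, one extra step with a zero segment is appended, and R is taken as Y^k for a k-segment operand. The function and parameter names are hypothetical.

```python
def mont_mul_digit_serial(a_r: int, b_r: int, n: int, v: int, k: int) -> int:
    """Return a value congruent to a_r*b_r*Y**(-k) mod n, with Y = 2**v and n odd.

    a_r is split into k segments of v bits each. The result is not reduced
    below n; callers may apply % n if a canonical residue is needed.
    """
    Y = 1 << v
    if a_r >= Y ** k:
        raise ValueError("a_r must fit in k segments of v bits")
    n_prime = (-pow(n, -1, Y)) % Y              # -N^-1 mod Y (Python 3.8+)
    t = 0                                       # assumed initial value
    for i in range(k + 1):                      # k segments plus one zero segment
        a_i = (a_r >> (v * i)) & (Y - 1)        # i-th v-bit segment of A_R
        m = ((t % Y) * n_prime) % Y             # M_{i-1} = (T_{i-1} mod Y)*(-N^-1 mod Y) mod Y
        t = (t + a_i * b_r * Y + m * n) // Y    # T_i; the numerator is divisible by Y
    return t
```

Because only the low v bits of the running value enter the computation of M, each step needs a v-bit multiplication rather than a full-width one, which is what makes the recurrence suitable for a chain of identical PEs.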

[0027] The modular multiplication Q = A·B mod N, modulo N, of input integers A and B is carried out using an integer R coprime to N as follows. A·R mod N is computed from the input A and R, and the result is taken as A_R. B·R mod N is computed from the input B and R, and the result is taken as B_R. A_R·B_R·R^{-1} mod N is obtained from the results A_R and B_R and from R, and the result is taken as T_R. T_R·R^{-1} mod N is computed from T_R and R, and Q is obtained as the result. The computation of T_R is executed, with A_i denoting the v-bit segments of A_R for an arbitrary integer v and with Y = 2^v, by sequentially computing
T_i = (T_{i-1}/Y + A_i·B_R) + M_i·N
M_{i-1} = ((T_{i-1}/Y + A_i·B_R) mod Y)·(-N^{-1} mod Y) mod Y.

[0028]

[Embodiments] In the following, the method proposed by Montgomery (the Montgomery method) is used as an example of performing a modular multiplication modulo N as a modular multiplication on values formed with an integer R coprime to N. First, cryptographic systems that use modular exponentiation and modular multiplication are described. Next, the pre- and post-processing required when performing these operations with the Montgomery method, and the consistency between the inputs and outputs of the Montgomery modular multiplication, are described. Finally, a PE that executes the Montgomery method is presented, together with a circuit that uses several such PEs in parallel to perform modular exponentiation and modular multiplication efficiently.

[0029] [Cryptographic System] First, a cryptographic system in the n-to-n communication system shown in FIG. 1 is described. The connections in FIG. 1 represent a local communication network such as a local area network (LAN), or a wide-area communication network such as a telephone line. A to Z are users, each of whom is assigned a communication device (communication terminal) T for connecting to the network. The encryption device encrypts the information input to it and outputs the result. It may, for example, be built into the communication device T so that T outputs encrypted information, or be inserted between T and the network so that the output of T is encrypted before being sent to the network. It may also be built into a device that is connected to the communication device and outputs information to it. The encryption device need not be permanently connected to the communication device: it may be built into a portable device such as an IC card and connected, when needed, to the communication device or to a device connected to it. With such an encryption device, cryptographic communication such as secret communication, authenticated communication, key sharing, and zero-knowledge proof can be carried out.

[0030] An example of the modular multiplication circuit or modular exponentiation circuit required in such an encryption device is a modular exponentiation circuit that takes a plaintext M as input, takes e and N as further inputs or as stored values, and outputs the ciphertext C = M^e mod N. In this case the modular exponentiation circuit is itself the encryption device. For secret communication, a similar encryption device may also perform decryption by the inverse operation M = C^d mod N. Alternatively, the modular multiplication circuit or modular exponentiation circuit may form part of an encryption device: an input from outside the encryption device, or the result from another processing unit within it, is fed to the circuit, the operation is executed, and the result is output from the encryption device or passed to another processing unit within it.
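
As a small numerical illustration of the encryption C = M^e mod N and the inverse operation M = C^d mod N, the following Python sketch runs a toy RSA round trip. The primes, exponent, and message are illustrative assumptions chosen here, far too small for real use, and Python's built-in `pow` stands in for the modular exponentiation circuit.

```python
# Toy RSA parameters (assumed for illustration only; not from the patent).
p, q = 61, 53
N = p * q                       # modulus N
phi = (p - 1) * (q - 1)
e = 17                          # public exponent, coprime to phi
d = pow(e, -1, phi)             # private exponent d = e^-1 mod phi (Python 3.8+)

def encrypt(M: int) -> int:
    """C = M^e mod N."""
    return pow(M, e, N)

def decrypt(C: int) -> int:
    """M = C^d mod N."""
    return pow(C, d, N)

M = 65
C = encrypt(M)
recovered = decrypt(C)          # equals M when the key pair is consistent
```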

[0031] If access to a storage medium is regarded as communication, a storage medium such as a magnetic disk together with its access device corresponds to the communication device, and the cryptographic system can be used in a storage system, just as in a communication system, by means of an encryption device that uses the circuit according to the present invention.

[0032] [Montgomery Modular Multiplication] The following theorem was derived by Montgomery.

[0033] Theorem 1: Let N and R be coprime integers, let T be an arbitrary integer, and let N' = -N^{-1} mod R and M = T·N' mod R. Then the following relation holds.

[0034]
(T + M·N)/R = T·R^{-1} mod N   (1)
Proof: omitted.
Therefore, the modular multiplication Q = A·B mod N can be performed as follows, using an integer R coprime to N.

[0035]
A_R = A·R mod N   (2)
B_R = B·R mod N   (3)
T = A_R·B_R   (4)
T_R = T·R^{-1} mod N = (T + M·N)/R   (5)
Q = T_R·R^{-1} mod N   (6)
If the operations of equations (4) and (5) are together called the Montgomery modular multiplication, it can be expressed as follows.

[0036]
T_R = A_R·B_R·R^{-1} mod N = (A_R·B_R + M·N)/R   (7)
where M = A_R·B_R·N' mod R   (8)
In the Montgomery modular multiplication, if N is odd and R is chosen as R = 2^n (n an arbitrary integer), R is coprime to N. Division by R then reduces to a bit shift, so the Montgomery modular multiplication of equation (5) or (7) can be executed using multiplications only.
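
Equations (7) and (8) with R a power of two can be sketched in Python as below. This is a minimal sketch under stated assumptions: the function name and the parameter `k` (so that R = 2^k) are introduced here for illustration, N is assumed odd, and no final subtraction is applied, so the result is congruent to X·Y·R^{-1} mod N but is not necessarily smaller than N.

```python
def montgomery_multiply(x: int, y: int, n: int, k: int) -> int:
    """Return (x*y + m*n) / R for R = 2**k, which is congruent to x*y*R^-1 mod n."""
    if n % 2 == 0:
        raise ValueError("n must be odd so that R = 2**k is coprime to n")
    r_mask = (1 << k) - 1
    n_prime = (-pow(n, -1, 1 << k)) & r_mask    # N' = -N^-1 mod R (Python 3.8+)
    t = x * y                                   # T = X*Y
    m = (t * n_prime) & r_mask                  # M = T*N' mod R, equation (8)
    return (t + m * n) >> k                     # division by R is a right shift

n, k = 239, 8                                   # example values: R = 256 > n
z = montgomery_multiply(100, 200, n, k)
```

The mask and the shift replace the two reductions modulo R and the division by R, so no division by N appears anywhere in the routine.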

[0037] The pre- and post-processing of equations (2), (3), and (6) can also be executed by the Montgomery modular multiplication, as follows.

[0038]
A_R = A·R mod N = A·R_R·R^{-1} mod N
B_R = B·R mod N = B·R_R·R^{-1} mod N
Q = T_R·R^{-1} mod N = T_R·1·R^{-1} mod N
where R_R = R^2 mod N.
R_R is uniquely determined by N, so it is fixed once N is chosen and can be treated as a constant. Therefore, as shown in FIG. 2, the operations of equations (2) to (6) can all be executed with a common arithmetic circuit that computes Z = X·Y·R^{-1} mod N, and the desired modular multiplication Q = A·B mod N is obtained. FIG. 2 shows that the outputs for the input pairs (A, R_R), (B, R_R), (A_R, B_R), and (T_R, 1) are A_R, B_R, T_R, and Q, respectively.
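
The four passes through the common circuit described for FIG. 2 can be mimicked in Python as follows. The helper names, the parameter `k` (R = 2^k), and the final `% n` normalization are assumptions added for this sketch; the text itself only specifies the four input pairs and their outputs.

```python
def make_mont(n: int, k: int):
    """Build the common operation Z = X*Y*R^-1 mod n (up to a multiple of n), R = 2**k, n odd."""
    mask = (1 << k) - 1
    n_prime = (-pow(n, -1, 1 << k)) & mask      # N' = -N^-1 mod R (Python 3.8+)

    def mont(x: int, y: int) -> int:
        t = x * y
        m = (t * n_prime) & mask
        return (t + m * n) >> k

    return mont

def modmul_via_montgomery(a: int, b: int, n: int, k: int) -> int:
    """Q = A*B mod N obtained from four uses of the same operation."""
    mont = make_mont(n, k)
    r_r = pow(2, 2 * k, n)      # constant R_R = R^2 mod N
    a_r = mont(a, r_r)          # input pair (A, R_R)   -> A_R
    b_r = mont(b, r_r)          # input pair (B, R_R)   -> B_R
    t_r = mont(a_r, b_r)        # input pair (A_R, B_R) -> T_R
    q = mont(t_r, 1)            # input pair (T_R, 1)   -> Q
    return q % n                # added normalization, not part of the source text
```

For instance, `modmul_via_montgomery(1234, 5678, 9973, 16)` should equal `(1234 * 5678) % 9973`.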

[0039] [Montgomery Modular Exponentiation 1] Using the Montgomery method, the modular exponentiation C = M^e mod N is also executed as follows.

[0040]
INPUT M, e, N, R_R
M_R = M·R_R·R^{-1} mod N   (9)
C_R = 1·R_R·R^{-1} mod N   (10)
FOR i = t TO 1
  IF e_i = 1 THEN C_R = C_R·M_R·R^{-1} mod N   (11)
  IF i > 1 THEN C_R = C_R·C_R·R^{-1} mod N   (12)
NEXT
C = C_R·1·R^{-1} mod N   (13)
Thus modular exponentiation can also be executed using the Montgomery modular multiplication alone. The initial value of C_R given by equation (10) is determined by R_R and N, so it too can be treated as a constant. Hereinafter, modular exponentiation that uses only the Montgomery modular multiplication in this way is called Montgomery modular exponentiation.
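
The procedure of equations (9) to (13) translates into Python roughly as follows. This is a sketch rather than the patented circuit: the function name, the parameter `k` (R = 2^k), the odd modulus, and the trailing `% n` are assumptions made for illustration.

```python
def montgomery_pow(m: int, e: int, n: int, k: int) -> int:
    """Most-significant-bit-first exponentiation built only from Z = X*Y*R^-1 mod n."""
    mask = (1 << k) - 1
    n_prime = (-pow(n, -1, 1 << k)) & mask      # N' = -N^-1 mod R (Python 3.8+)

    def mont(x: int, y: int) -> int:
        t = x * y
        return (t + ((t * n_prime) & mask) * n) >> k

    r_r = pow(2, 2 * k, n)                      # constant R_R = R^2 mod N
    m_r = mont(m, r_r)                          # equation (9)
    c_r = mont(1, r_r)                          # equation (10)
    t_bits = e.bit_length()
    for i in range(t_bits, 0, -1):              # i = t down to 1; e_i is bit i-1 of e
        if (e >> (i - 1)) & 1:
            c_r = mont(c_r, m_r)                # equation (11)
        if i > 1:
            c_r = mont(c_r, c_r)                # equation (12)
    return mont(c_r, 1) % n                     # equation (13) plus an added normalization
```

Every arithmetic step inside the loop is the same `mont` call, which mirrors the point of the text: one multiplier type suffices for the whole exponentiation.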

[0041] When Montgomery modular exponentiation is executed as above, each result is fed back as an input to the next multiplication. If every multiplication is to be realized with the same circuit configuration, implementation therefore becomes difficult once the maximum bit length of the output exceeds the maximum bit length of the input.

[0042] The conditions under which the maximum bit lengths of the input and the output are equal in the Montgomery modular multiplication of Expression (7) are therefore considered below.

[0043] Theorem 2: In Expressions (7) and (8), let A_R < 2^{n+u}, B_R < 2^{n+u}, N < 2^n, and R = 2^{n+r}. Then, for T_R < 2^{n+u} to hold, it is sufficient that u = 1 and r > 1, or that u > 1 and r = u + 1.

[0044] Proof: With R = 2^{n+r}, Expression (8) gives M < 2^{n+r}. With A_R < 2^{n+u}, B_R < 2^{n+u}, and N < 2^n, we have A_R·B_R < 2^{2(n+u)} and M·N < 2^{2n+r}. Allowing for the carry, A_R·B_R + M·N < max(2^{2(n+u)+1}, 2^{2n+r+1}). Hence T_R < max(2^{n+2u+1-r}, 2^{n+1}). Therefore:
If 2^{n+2u+1-r} ≦ 2^{n+1}: T_R < 2^{n+1}. ∴ u = 1, r > 1 (14)
If 2^{n+2u+1-r} > 2^{n+1}: T_R < 2^{n+2u+1-r}. ∴ u > 1, r = u + 1 (15)
Here max(A, B) is the function that selects the larger of A and B.
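The bound of Theorem 2 can be checked numerically with a short Python sketch. Expression (8) is not reproduced in this section, so the sketch assumes the standard Montgomery quotient M = A_R·B_R·N' mod R with N·N' ≡ -1 (mod R) and an odd N. The names `mont_step` and `bound_holds`, the trial count, and the seed are hypothetical choices made for illustration.

```python
import random


def mont_step(a, b, mod, r_bits):
    """One Montgomery multiplication (a*b + M*mod)/R without any final subtraction."""
    r = 1 << r_bits
    n_prime = (-pow(mod, -1, r)) % r     # assumed: N*N' ≡ -1 (mod R), Python 3.8+
    m = (a * b * n_prime) % r            # assumed form of the quotient M
    return (a * b + m * mod) >> r_bits   # the sum is an exact multiple of R


def bound_holds(n_bits, u, r, trials=2000, seed=0):
    """Sample inputs below 2^(n+u) and report whether every output stays below 2^(n+u)."""
    rng = random.Random(seed)
    limit = 1 << (n_bits + u)
    for _ in range(trials):
        mod = rng.randrange(1 << (n_bits - 1), 1 << n_bits) | 1  # odd N < 2^n
        a, b = rng.randrange(limit), rng.randrange(limit)
        if mont_step(a, b, mod, n_bits + r) >= limit:
            return False
    # Also try the largest admissible inputs.
    return mont_step(limit - 1, limit - 1, mod, n_bits + r) < limit
```

Calling `bound_holds(16, 1, 2)` exercises condition (14), and `bound_holds(16, 3, 4)` exercises condition (15). A sampled check of this kind supports the theorem but does not replace the proof.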

[0045] When the conditions of Expressions (14) and (15) are satisfied, the entire Montgomery modular exponentiation can be realized by simply repeating the Montgomery modular multiplication. Therefore, as shown in FIG. 3, the modular exponentiation of Expressions (9) to (13) can be executed merely by selecting the inputs with the selectors S.

[0046] In the circuit of FIG. 3, the two selectors S are assumed to include memories that temporarily store the selectable feedback inputs: C_R for one selector, and C_R and M_R for the other. Such a memory may of course be placed ahead of the two selectors S so that both can share it. To switch the inputs of the selectors S, it suffices to provide, for example, a control unit (which can be built from logic circuits, a counter, and the like) that stores e in a shift register, outputs e_i sequentially from the most significant bit, receives that output, determines whether e_i = 1 and whether i > 1, and outputs a switching signal.

[0047] When the conditions of Expressions (14) and (15) are satisfied, the entire Montgomery modular exponentiation can be realized by simply repeating the Montgomery modular multiplication. However, since u > 0 follows from Expressions (14) and (15), the final result C alone must be corrected so that C < N.

[0048] In the conventional method of Even, this correction must be performed every time a Montgomery modular multiplication is executed, whereas in the present method it needs to be performed only once, after the Montgomery modular exponentiation has finished. Moreover, because the correction is a simple process, it has almost no effect on the circuit scale or processing speed of the Montgomery modular exponentiation circuits described below.

[0049] [Montgomery modular exponentiation 2] The modular exponentiation C = M^e mod N can also be executed as follows.

[0050]
INPUT M, e, N, R_R
M_R = M·R_R·R^{-1} mod N
C_R = 1·R_R·R^{-1} mod N
FOR i = 1 TO t
    IF e_i = 1 THEN C_R = C_R·M_R·R^{-1} mod N
    IF i < t THEN M_R = M_R·M_R·R^{-1} mod N
NEXT
C = C_R·1·R^{-1} mod N
In this case as well, it is clear that C can be computed by simple repetition of the Montgomery modular multiplication if the conditions of Expressions (14) and (15) are used. It is also clear that the circuit of FIG. 3 can execute this modular exponentiation in the same way, merely by allowing the two selectors S to select C_R and M_R respectively and also allowing both selectors S to select M_R.
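The second procedure scans the exponent from its least significant bit and squares M_R instead of C_R. A minimal Python sketch under the same assumptions as before (odd N, standard quotient M = X·Y·N' mod R, hypothetical names `mont_mul` and `mont_exp_lsb_first`, `r_bits` standing for n + r) could look like this:

```python
def mont_mul(x, y, mod, r_bits):
    """Return x*y*R^{-1} modulo `mod`, possibly not fully reduced (R = 2**r_bits)."""
    r_mask = (1 << r_bits) - 1
    n_prime = (-pow(mod, -1, 1 << r_bits)) & r_mask  # assumed N' (Python 3.8+)
    t = x * y
    m = ((t & r_mask) * n_prime) & r_mask
    return (t + m * mod) >> r_bits


def mont_exp_lsb_first(m, e, mod, r_bits):
    """Compute m**e mod `mod` using only mont_mul, scanning e from its lowest bit."""
    r_r = pow(2, 2 * r_bits, mod)               # R_R = R^2 mod N
    m_r = mont_mul(m, r_r, mod, r_bits)         # M_R = M*R_R*R^{-1} mod N
    c_r = mont_mul(1, r_r, mod, r_bits)         # C_R = 1*R_R*R^{-1} mod N
    t = e.bit_length()
    for i in range(1, t + 1):                   # i = 1 up to t; e_i is bit i-1 of e
        if (e >> (i - 1)) & 1:
            c_r = mont_mul(c_r, m_r, mod, r_bits)
        if i < t:
            m_r = mont_mul(m_r, m_r, mod, r_bits)
    c = mont_mul(c_r, 1, mod, r_bits)           # leave the Montgomery domain
    return c - mod if c >= mod else c           # single final correction
```

The only structural difference from the first procedure is which operand is squared, which matches the remark that both selectors must be able to select M_R.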

[0051] The above shows that modular exponentiation and modular multiplication can be executed solely by an arithmetic circuit that computes Expression (16).

[0052]
Z = X·Y·R^{-1} mod N (16)
It has also been shown that, when this is computed using the Montgomery modular multiplication of Expression (7), modular multiplication and modular exponentiation can be executed by simple repetition of the Montgomery modular multiplication, provided that input values satisfying the conditions of Expressions (14) and (15) are used.

[0053] Since Expression (16) or Expression (7) is an integer operation, the arithmetic circuit and method can be realized in various ways. For example, it can clearly be implemented easily using a CPU or the like.

[0054] Therefore, various cryptographic systems that use modular multiplication and modular exponentiation can be constructed efficiently from a common arithmetic circuit and method that executes Expression (16) or Expression (7).

[0055] [Embodiment 1 of the Montgomery modular multiplication and modular exponentiation circuit] Consider the modular multiplication T_R = A_R·B_R·R^{-1} mod N, where A_R, B_R < 2^{n+u}, R = 2^{n+r}, N < 2^n, all values are integers, and u and r satisfy the conditions of Expressions (14) and (15). If A_R is divided into v-bit digits and B_R, N, and T_R are divided into d-bit digits, they can be written as follows, where n + r ≦ m·d, n + r ≦ k·v, X = 2^d, and Y = 2^v (v ≦ d).

[0056]
A_R = A_{k-1}·Y^{k-1} + A_{k-2}·Y^{k-2} + ... + A_1·Y + A_0
B_R = B_{m-1}·X^{m-1} + B_{m-2}·X^{m-2} + ... + B_1·X + B_0
N = N_{m-1}·X^{m-1} + N_{m-2}·X^{m-2} + ... + N_1·X + N_0
T_R = T_{m-1}·X^{m-1} + T_{m-2}·X^{m-2} + ... + T_1·X + T_0 (17)
Here A_i (i = 0, ..., k-1, with A_i = 0 for n + u < i) denotes the bit sequences obtained by dividing A_R into v-bit digits from the least significant end, and B_j, N_j, and T_j (j = 0, ..., m-1) denote the bit sequences obtained by dividing B_R, N, and T_R, respectively, into d-bit digits from the least significant end. In this case the Montgomery modular multiplication is obtained by repeating the following operation from i = 0 to k. Note that T^{(i)} denotes the value of T_R at the i-th iteration and is different from the T_i in Expression (16).

[0057]
T^{(i)} = (T^{(i-1)} + A_i·B_R·Y + M_{i-1}·N)/Y (18)
where M_{i-1} = (T^{(i-1)} mod Y)·N_0' mod Y, T^{(-1)} = 0, and N_0' = N' mod Y. To realize this operation with parallel processing, B_R and N are expressed using B_j and N_j as follows.
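The recurrence of Expression (18) can be followed on whole integers before it is split into d-bit digits. The Python sketch below is a software illustration, not the circuit of the embodiment. The name `digit_serial_mont_mul` is hypothetical, N is assumed odd, N' is assumed to satisfy N·N' ≡ -1 (mod Y), and `a` is assumed to be smaller than Y^k so that A_k = 0. Under these assumptions the value after the last iteration is congruent to A·B·Y^{-k} modulo N.

```python
def digit_serial_mont_mul(a, b, mod, v, k):
    """Iterate T = (T + A_i*B*Y + M*N)/Y for i = 0..k, with Y = 2**v."""
    y_mask = (1 << v) - 1
    # N_0' = N' mod Y, assuming N*N' ≡ -1 (mod Y); requires Python 3.8+ and odd N.
    n0_prime = (-pow(mod, -1, 1 << v)) & y_mask
    t = 0                                              # initial value, T^(-1) = 0
    for i in range(k + 1):
        a_i = (a >> (i * v)) & y_mask                  # i-th v-bit digit of A
        m_prev = ((t & y_mask) * n0_prime) & y_mask    # M_{i-1}, from the previous T only
        # A_i*B is pre-multiplied by Y, so it does not disturb the low v bits
        # and the sum below is an exact multiple of Y.
        t = (t + ((a_i * b) << v) + m_prev * mod) >> v
    return t
```

Because M_{i-1} depends only on the previous T, the quotient digit can be formed before the product A_i·B_R is available, which appears to be what lets the partial products be pipelined in the circuits that follow.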

[0058] Algorithm 1:
FOR i = 0 TO k
    M_{i-1} = dw_v(dw_v(T_{i-1,0})·N_0')
    FOR j = 0 TO m-1
        R_{i,j} = T_{i-1,j} + L_{i-2,j+1}·X/Y^2 + Y·A_i·B_j + M_{i-1}·N_j
        L_{i,j} = dw_v(R_{i,j})
        T_{i,j} = (R_{i,j} - L_{i,j})/Y
    NEXT
NEXT
where dw_d(Z) = Z mod 2^d and up_d(Z) = (Z - dw_d(Z))/2^d, and the initial values of T_{i,j} and L_{i,j} are all 0.
In Algorithm 1, multiplications and divisions by the constants X = 2^d and Y = 2^v, as in Y·A_i·B_j, L_{i-2,j+1}·X/Y^2, and T_{i,j} = (dw_{d+v}(R_{i,j}) - L_{i,j})/Y, are realized by shifting bits relative to the other values. Accordingly, the operation for T_{i,j} means taking the bits of R_{i,j} from bit v through bit d+v-1, counted from the LSB, as T_{i,j}, while L_{i,j} is the value of the bits of R_{i,j} from the LSB through bit v-1. Because the 1/Y operation that yields T_{i,j} is realized in this way as a downward bit shift within each R_{i,j}, L_{i-2,j+1} is used when R_{i,j} is computed and is aligned to the correct digit position by the factor X/Y^2.
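The statement that these multiplications and divisions reduce to bit selection can be checked with a few lines of Python. The helper names `dw`, `up`, and `t_digit` are hypothetical; they mirror dw_d, up_d, and the expression (dw_{d+v}(R_{i,j}) - L_{i,j})/Y from the text.

```python
def dw(z, d):
    """dw_d(Z) = Z mod 2^d: keep the low d bits."""
    return z & ((1 << d) - 1)


def up(z, d):
    """up_d(Z) = (Z - dw_d(Z)) / 2^d: drop the low d bits."""
    return z >> d


def t_digit(r_val, d, v):
    """T = (dw_{d+v}(R) - L)/Y with L = dw_v(R) and Y = 2^v."""
    low = dw(r_val, v)
    return (dw(r_val, d + v) - low) // (1 << v)
```

For any non-negative `r_val`, `t_digit(r_val, d, v)` equals `(r_val >> v) & ((1 << d) - 1)`, that is, bits v through d+v-1 of R, so no arithmetic divider is needed.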

[0059] FIG. 2 shows a circuit that executes the algorithm. In Algorithm 1, i corresponds to the clock and j corresponds to the position of the register (R) in FIG. 3; the registers R_{i,0} to R_{i,m-1} are shown from right to left.

[0060] For simplicity, the circuit of FIG. 2 and its operation are described below for the case v = 1. In FIG. 2, B_j, N_j (j = 0, ..., m-1), and N_0' denote d-bit multipliers that take the indicated value as the multiplier; each can be realized with d AND gates. If N is odd, N_0' = 1, so the multiplier that computes M_{i-1} can be omitted and the LSB of T_{i-1,0} is output as M_{i-1}. The inputs and output of the adder marked + are as follows. The output M_{i-1}·N_j of the lower multiplier is d bits. The output A_i·B_j of the upper multiplier is also d bits, but to double its value it is input shifted one bit toward the upper digits relative to M_{i-1}·N_j. The register input T_{i-1,j} is taken from the second bit of R_{i-1,j} upward, counted from the LSB, shifted one bit toward the lower digits, and input aligned with M_{i-1}·N_j. The term L_{i-2,j+1}·2^{d-2} means that the 1-bit output L_{i-2,j+1} of the PE two stages earlier is input at bit d-1, counted from the LSB of M_{i-1}·N_j. In this case, if T_{i-1,j} < 2^{d+2}, the output of the adder has d + 3 bits. The registers that receive the adder output are therefore d + 3-bit registers.

[0061] As described above, the circuit of FIG. 4 can execute the operation of Expression (18), and the Montgomery modular multiplication can be executed by inputting A_0 through A_k.

[0062] Although FIG. 2 has been described for v = 1, it is clear that the Montgomery modular multiplication can be executed by the same technique for any v with v ≦ d.

[0063] The Montgomery modular multiplication circuit of this embodiment achieves high-speed processing with a very small circuit scale.

[0064] [Embodiment 2 of the Montgomery modular multiplication and modular exponentiation circuit] To realize this operation with a systolic array, B_R and N are expressed using B_j and N_j as follows.

[0065] Algorithm 2:
FOR i = 0 TO k
    M_{i-1} = dw_v(dw_v(T_{i-1,0})·N_0')
    FOR j = 0 TO m-1
        R_{i,j} = T_{i-1,j} + C_{i,j-1} + L_{i-2,j+1}·X/Y^2 + Y·A_i·B_j + M_{i-1}·N_j
        L_{i,j} = dw_v(R_{i,j})
        T_{i,j} = (dw_{d+v}(R_{i,j}) - L_{i,j})/Y
        C_{i,j} = up_{d+v}(R_{i,j})
    NEXT
NEXT
where dw_d(Z) = Z mod 2^d and up_d(Z) = (Z - dw_d(Z))/2^d, and the initial values of T_{i,j}, C_{i,j}, and L_{i,j} are all 0.
In Algorithm 2, C_{i,j-1} is used as the carry when R_{i,j} is computed. Operations that have X and Y as constants, such as Y·A_i·B_j, L_{i-2,j+1}·X/Y^2, and T_{i,j} = (dw_{d+v}(R_{i,j}) - L_{i,j})/Y, are realized by shifting bits relative to the other values. Accordingly, the operation for T_{i,j} means taking the bits of R_{i,j} from bit v through bit d+v-1, counted from the LSB, as T_{i,j}.

[0066] Here L_{i,j} is the value of the bits of R_{i,j} from the LSB through bit v-1. Because the 1/Y operation that yields T_{i,j} is realized in this way as a downward bit shift within each R_{i,j}, L_{i-2,j+1} is used when R_{i,j} is computed and is aligned to the correct digit position by the factor X/Y^2.

[0067] FIG. 5 shows a circuit that computes R_{i,j}, L_{i,j}, T_{i,j}, and C_{i,j} in Algorithm 2. FIG. 6 shows a systolic array in which the circuit of FIG. 5 is treated as one PE (processing element) and the PEs are connected in cascade. In Algorithm 2, j corresponds to the clock and i corresponds to the position of the PE in FIG. 6, which shows the PEs for i = 0 (#1) through i = k (#k+1) from left to right.

[0068] In FIG. 6, the value A_i (i = 0, ..., k) is set in the internal register of PE #i+1, and between adjacent PEs the terminals B_in and B_out, D_in and D_out, T_in and T_out, L_in and L_out, M_in and M_out, and N_in and N_out are connected to each other. B_j and N_j (j = 0, ..., m-1) are input to B_in and N_in of PE #1 in order from the least significant digit, and 0 is set on each of its D_in, T_in, L_in, and M_in inputs.

[0069] For simplicity, the circuit of FIG. 5 and its operation are described below for the case v = 1. In FIG. 5, × denotes a multiplier, which can be realized with d AND gates. R1 to R3 are 1-bit registers that hold A_i, M_{i-1}, and N_0', respectively. If N is odd, N_0' = 1, so the multiplier that computes M_{i-1} and the register R3 that holds N_0' are omitted, and R2 holds the LSB of T_{i-1,0} as M_{i-1}. R4 and R5 are d-bit registers that delay the inputs from B_in and N_in by one clock before sending them to the next PE. The inputs and output of the adder marked + are as follows. The output M_{i-1}·N_j of the lower multiplier is d bits. The output A_i·B_j of the upper multiplier is also d bits, but to double its value it is input shifted one bit toward the upper digits relative to M_{i-1}·N_j. The input T_{i-1,j} from the preceding PE consists of the d bits of R_{i-1,j} from the second bit through bit d+1, counted from the LSB, shifted one bit toward the lower digits and input aligned with M_{i-1}·N_j. The term L_{i-2,j+1}·2^{d-2} means that the 1-bit output L_{i-2,j+1} of the PE two stages earlier is input at bit d-1, counted from the LSB of M_{i-1}·N_j. In this case, if the carry C_{i,j-1} satisfies C_{i,j-1} < 2^{d+2}, the output of the adder has d + 3 bits, so C_{i,j-1} is a 2-bit value. Accordingly, R6, which receives the adder output, is a d + 3-bit register.

[0070] From the above, a single PE of FIG. 5 can execute the operation of Expression (18). By connecting k + 1 of these PEs in a pipeline as in FIG. 6 and operating them in synchronization with the clock, the Montgomery modular multiplication can be executed at high speed.

[0071] However, the final output of the array of FIG. 6 is split into the outputs T_{k,j} and L_{k,j} of the (k+1)-th PE and the output L_{k-1,j+1} of the k-th PE, so the following operation is performed after the processing of Algorithm 2.

[0072] Algorithm 3:
FOR j = 0 TO m-1
    R_{k+1,j} = T_{k,j} + C_{k+1,j-1} + L_{k-1,j+1}·X/Y^2
    T_{k+1,j} = dw_d(R_{k+1,j})
    C_{k+1,j} = up_d(R_{k+1,j})
    T_{k+2,j} = T_{k+1,j} + C_{k+2,j-1} + L_{k,j+1}·X/Y
    C_{k+2,j} = up_d(T_{k+2,j})
NEXT
Here T_{k+2,j} is the bit sequence T_j (j = 0, ..., m-1) into which T_R is divided. The computation of R_{k+1,j} is the same as in Algorithm 2 with A_i = M_{i-1} = 0, so it is realized by the PE of FIG. 5. The computation of T_{k+2,j} is almost the same as that of R_{k+1,j}, except that the 1/2 operation is not applied to T_{k+1,j} and C_{k+1,j}, and L_{k,j+1} is added one bit higher than L_{k-1,j+1}. Therefore, as shown in FIG. 7, a 1-bit half adder (HA) and a register (R7) are provided below the LSB of the adder and of register R6 of the PE of FIG. 5. The LSB of the output R_{i-1,j} of the preceding PE and C_{k+2,j-1} (which is at most a 1-bit value here) are input to the HA; its sum is sent to the newly provided register, and its carry bit is input to the adder as a carry. In this way L_{k,j+1} is automatically added as a digit one bit higher. The PE that computes T_{k+2,j} therefore differs from the other PEs, but if a 1-bit selector is added that selects the carry from the half adder as the carry into the adder only when i = k + 2, all the PEs can be realized with a single PE design.

[0073] Therefore, to obtain T_R, the PE shown in FIG. 7 is used as the PE of FIG. 6, and two more PEs are appended after the array of FIG. 6, so that an array of k + 3 PEs in total computes the Montgomery modular multiplication at high speed.

[0074] Although FIGS. 5 and 7 have been described for v = 1, it is clear that the Montgomery modular multiplication can be executed by the same technique for any v with v ≦ d.
[0075] [Embodiment 3 of the Montgomery modular multiplication and modular exponentiation circuit] The following systolic array can also be constructed. The operation of Expression (17) is executed by the following algorithm.

[0076] Algorithm 4:
FOR i = 0 TO k
    M_{i-1} = dw_v(dw_v(T_{i-1,1})·N_0')
    FOR j = 0 TO m
        R_{i,j} = T_{i-1,j+1} + C_{i,j-1} + A_i·B_{j-1} + M_{i-1}·N_j
        T_{i,j} = dw_v(R_{i,j})
        C_{i,j} = up_v(R_{i,j})
    NEXT
NEXT
The computation of R_{i,j} in Algorithm 4 differs from that in Algorithm 2 in that the difference in digit position of A_i·B_j and T_{i-1,j} relative to M_{i-1}·N_j is realized not by a bit shift but by shifting the index j of the B_j and T_{i,j} that are used. Consequently, the value L_{i,j} that arises in Algorithm 2 because of the bit shift does not arise in Algorithm 4.

[0077] FIGS. 8 and 9 show the PE and the systolic array that execute the computation of R_{i,j}, T_{i,j}, and C_{i,j} in Algorithm 4. In Algorithm 4 as well, j and i correspond to the clock and the position of the PE, respectively. In FIG. 9 as well, the value A_i (i = 0, ..., k) is set in the internal register of PE #i+1, and between adjacent PEs the terminals B_in and B_out, T_in and T_out, M_in and M_out, and N_in and N_out are connected to each other. 0 is set on each of the T_in and M_in inputs of PE #1, while B_j and N_j (j = 0, ..., m-1) are input to B_in and N_in in order from the least significant digit. Unlike the case of Algorithm I, however, B_j is input one clock later than N_j.

[0078] For simplicity, the case v = d is described below. In FIG. 8, × denotes a d-bit by d-bit multiplier and + denotes an adder. The inputs and output of the adder are as follows. The output A_i·B_{j-1} of the upper multiplier and the output M_{i-1}·N_j of the lower multiplier are each 2·d bits, and the output T_{i-1,j+1} of the preceding PE is a d-bit value. Therefore, if C_{i,j-1} < 2^{2·d+1}, the output of the adder is a 2·d + 2-bit value. R1 to R7 are d-bit registers, and R8, which receives the adder output, is a 2·d + 2-bit register. Register R8 outputs the bits from the LSB through the d-th bit to the next PE as T_{i-1,j+1} and feeds the upper d + 2 bits back to the adder as C_{i,j-1}.

[0079] In FIG. 8, N_j is input one clock before B_j, so A_i·B_{j-1} and M_{i-1}·N_j are computed simultaneously. In addition, so that T_{i-1,j+1} is processed at the same time as A_i·B_{j-1} and M_{i-1}·N_j, the values B_j and N_j input from B_in and N_in are delayed by two clocks before being output to the next PE.

[0080] Thus the PE of FIG. 8 can also perform the operation of Expression (17), and the systolic array of FIG. 9 realizes the Montgomery modular multiplication at high speed. In this case no processing corresponding to the algorithm is needed, and the array can be built from m + 1 PEs as shown in FIG. 9.

[0081] Although FIG. 8 has been described for v = d, it is clear that the Montgomery modular multiplication can be executed by the same technique for any v with v < d.

[0082] [Embodiment 4 of the Montgomery modular multiplication and modular exponentiation circuit] When Algorithm 4 of Embodiment 3 is executed, A_i need not be set in the PEs in advance. Instead, as shown in FIG. 10, A_i (i = 0, ..., k-1) may be input from A_in, starting with the least significant digit and synchronized with N_j, and, as shown in FIG. 11, A_in and A_out may be connected so that the values are passed on to the next PE. In this case A_i (i = 0, ..., k-1) is input one clock ahead of B_j (j = 0, ..., m-1), so if PE #1 holds A_0 in register R1 at the moment A_0 is input, it can compute A_0·B_j (j = 0, ..., m-1) for all j. Furthermore, B_j is delayed by two clocks before entering the next PE, whereas A_i is delayed by only one clock. Hence, if A_i can be input and held in PE #i-1 before B_j (j = 0, ..., m-1) arrives, then A_{i+1} can be input and held in PE #i before B_j (j = 0, ..., m-1) arrives. PE #i can therefore compute A_{i+1}·B_j (j = 0, ..., m-1) without difficulty. Accordingly, Algorithm 4 can also be realized by the PE and systolic array of FIGS. 10 and 11 without changing the circuit scale or processing speed.

[0083] [Embodiment 5 of the Montgomery modular multiplication and modular exponentiation circuit] The Montgomery modular exponentiation can be executed by repeating the operation of Expression (17). Since the PEs shown in FIG. 10 and FIGS. 8 and 10 can perform the operation of Expression (17), combining a PE with a memory as in FIG. 12 allows a single PE to be used (3·t/2 + 2)·q times to process a Montgomery modular exponentiation. Here q is the number of PEs needed to form the Montgomery modular multiplication array: k + 3 for the PE of FIG. 7, and k - 2 for those of FIGS. 8 and 10. If p PEs are used, the Montgomery modular exponentiation can be processed in (3·t/2 + 2)·q/p iterations. Because the processing speed is inversely proportional to the number of iterations, this scheme makes the processing speed proportional to the number of PEs, and a modular exponentiation circuit of the same efficiency can be built whether the number of PEs is chosen for higher speed or for smaller circuit scale.
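The trade-off between the number of PEs and the number of iterations follows directly from the expression (3·t/2 + 2)·q/p. The small Python helper below evaluates it; the name `pe_passes`, the rounding up to a whole number of passes, and the sample values are illustrative assumptions rather than figures from the patent.

```python
def pe_passes(t, q, p):
    """Iterations needed with p PEs: ceil((3*t/2 + 2) * q / p), in integer arithmetic."""
    total = (3 * t + 4) * q          # equals 2 * (3*t/2 + 2) * q, avoids fractions
    return -(-total // (2 * p))      # ceiling division by 2*p
```

With hypothetical values t = 512 and q = 515, a single PE needs `pe_passes(512, 515, 1)` = 396550 passes, while five PEs need `pe_passes(512, 515, 5)` = 79310, one fifth as many.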

[0084] A device such as that shown in FIG. 13 therefore becomes possible. In FIG. 13, SYMC (Systolic Modular Exponentiation Chip) denotes a chip containing p PEs connected in cascade. Since p may be chosen freely as long as 1 ≦ p ≦ (3·t/2 + 2)·q, a chip of any circuit scale can be built. Moreover, because the SYMC has a regular circuit structure, it is very easy to implement as a device or as a chip. Since the processing speed increases in proportion to the number of SYMCs, it suffices to connect SYMCs in cascade as shown in FIG. 13. In that case the number of processing iterations of each SYMC must be changed, which is easily done by building the control circuit from an externally programmable ROM or the like.

[0085] [Other embodiments of the Montgomery modular multiplication and modular exponentiation circuit] In the algorithms of the above embodiments, the processing performed by one PE is simple integer arithmetic, so the Montgomery modular exponentiation can easily be realized with an ordinary DSP, CPU, or the like, without fabricating the PEs as a dedicated chip.

[0086] Since the above embodiments are based on a systolic array, the circuit configuration is regular and both control and delay are local, which makes them well suited to practical implementation in VLSI.

[0087] It is also straightforward to realize modular multiplication and modular exponentiation by using the PEs of the above embodiments as independent arithmetic elements, not connected in cascade, and controlling them with well-known microprogramming techniques.

[0088] The PEs of FIGS. 5 and 7 and of FIGS. 8 and 10 execute the operation of equation (18) in a single step. The present invention also covers the case where the operation of equation (18) is ultimately executed by arithmetic elements into which equation (18) has been decomposed in various ways.

[0089] When the present invention is implemented with a systolic array, control signals can be input together with the data, and a PE may be defined to include the transfer registers for the control signals.

[0090] The above shows how to construct modular exponentiation and modular multiplication circuits and methods using a systolic array. This scheme provides circuits and methods that resolve all the drawbacks of Even's method. It thereby makes it possible to construct an efficient cryptographic system with the following effects.

[0091] When a high-speed cryptographic system is required, the modular exponentiation and modular multiplication circuits according to the present invention may be implemented in VLSI or the like. These circuits have a regular structure built from simple PEs, and both the control of each PE and the delay within each PE are local, so they are well suited to VLSI. A high-speed cryptographic system is thereby obtained.

[0092] When a cryptographic system built from a small circuit is required rather than high speed, the modular exponentiation and modular multiplication circuits according to the present invention may be built from only a few PEs. In this case too, the regular PE-based structure and the locality of control and delay are retained, so the circuit is easy to implement. Moreover, since the operations performed inside a PE are simple integer operations, the computation procedure of the present invention can also be realized as a simple cryptographic system in software on a CPU, DSP, or the like.

[0093] Even if higher speed becomes necessary after a cryptographic device has been built from a small circuit of several PEs (such as SYMC), connecting such small circuits in cascade as in FIG. 10 yields a speedup proportional to the circuit scale. A cryptographic system can therefore be realized whose speed is raised as needed simply by adding circuits, without rebuilding the cryptographic device.

[0094] When the number of bits of the integers being processed is increased after the cryptographic system has been built, in order to strengthen it, the same circuit, or a similar circuit with more PEs, can be used. This is because the modular exponentiation and modular multiplication circuits of the present invention allow circuit scale to be traded off easily against the number of processing iterations, so a difference in operand bit length reduces to a difference in the number of iterations. Consequently, the cryptographic device need not be rebuilt even when the cryptographic strength of the system is increased. Likewise, a cryptographic system can be realized that does not require rebuilding the device when the operand bit length is reduced.

[0095] As also stated in Japanese Patent Application No. 3-225986, these effects cannot be achieved by cryptographic devices based on conventional modular exponentiation and modular multiplication. By using the modular exponentiation and modular multiplication circuits and methods according to the present invention, a flexible and extensible cryptographic system can therefore be constructed.

[0096] Next, the systolic array of Japanese Patent Application No. 3-225986 (hereinafter array 0) is compared with the systolic array shown in Embodiment 1 of the present invention (array 1) and the systolic array shown in Embodiments 2 and 3 (array 2).

[0097] Because array 0 involves division, it performs the remainder operation with a remainder table (ROM or the like). Arrays 1 and 2 compute modular multiplication entirely by multiplication, so the processing time required per clock is shorter than for array 0 and processing is faster. Arrays 1 and 2 also need no remainder table such as a ROM and can be built from small circuits such as AND gates. Furthermore, array 0 requires PEs for carry-bit processing, whereas arrays 1 and 2 do not, so they need fewer PEs than array 0. For a comparable circuit scale, arrays 1 and 2 can therefore process several times faster than array 0.

[0098] Array 1 has fewer types of PE than array 0, and its PEs are easy to unify as shown in FIG. 7. Array 2 can be built from a single type of PE. Arrays 1 and 2 can therefore be implemented as circuits more easily than array 0, and with less wasted circuitry.

[0099] When array 0 uses a remainder table, arrays 1 and 2 are more flexible than array 0 with respect to increases in operand bit length. This is because, when array 0 uses a remainder table such as a ROM, the maximum capacity of the table limits the bit length of each integer. Arrays 1 and 2, in contrast, compute the remainder by multiplication, need no remainder table, and place no limit on operand bit length. When the operand bit length decreases (for example, when the Chinese remainder theorem is used), arrays 0 to 2 are equally flexible and unrestricted. Thus, unlike array 0, arrays 1 and 2 are unrestricted with respect to changes in operand bit length, and circuits such as SYMC need not be rebuilt at all for operations on integers of different bit lengths.

[0100] From the above, a cryptographic device using the modular multiplication and modular exponentiation schemes of the above embodiments provides a cryptographic system that achieves the effects described above with the smallest circuit, at high speed, and flexibly.

[0101] [Embodiment 6 of the Montgomery Modular Multiplication and Modular Exponentiation Circuits] The value T_i of T_R in the i-th operation is defined, differently from T_i in equation (18), as

    T_i = (T_{i-1}/Y + A_i·B_R) + M_i·N    (19)

where M_i = ((T_{i-1}/Y + A_i·B_R) mod Y)·N_0' mod Y, T_{-1} = 0, and N_0' = N' mod Y.

To realize this operation by parallel processing with multiple PEs, B_R and N are decomposed into B_j and N_j and the operation is expressed as follows.

Algorithm 5:

    FOR i = 0 TO k−1
      FOR j = 0 TO m−1
        S_{i,j} = T_{i-1,j}/Y + A_i·B_j + C_{i,j-1}
        M_i     = dw_v(dw_v(S_{i,0})·N_0')
        R_{i,j} = S_{i,j} + M_i·N_j + L_{i-1,j+1}·X
        L_{i,j} = dw_v(R_{i,j})
        T_{i,j} = dw_{d+v}(R_{i,j}) − L_{i,j}
        C_{i,j} = up_{d+v}(R_{i,j})
      NEXT
    NEXT

where dw_d(Z) = Z mod 2^d and up_d(Z) = (Z − dw_d(Z))/2^d, and the initial values of T_{i,j}, C_{i,j}, and L_{i,j} are all 0.

In Algorithm 5, C_{i,j-1} is used as a carry when S_{i,j} is computed. Operations that have X or Y as a constant, such as L_{i-1,j+1}·X and T_{i-1,j}/Y, can be realized by shifting bits relative to the other values. The operation on T_{i,j} therefore means taking bits v through d+v−1 of R_{i,j}, counted from the LSB, as T_{i,j}, while L_{i,j} is the value of bits 0 through v−1 of R_{i,j}. Because the 1/Y operation needed to obtain T_{i,j} is realized in this way as a downward bit shift of each R_{i,j}, L_{i-1,j+1} is used when R_{i,j} is computed and is aligned to the correct digit position by X.
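The recurrence of equation (19) can be checked at word level before it is split into digits B_j, N_j. The following Python sketch is only an illustration under stated assumptions: the function names `dw`, `up`, and `montgomery_eq19` are hypothetical, Y is taken as 2^v, N is assumed odd, N_0' is computed as −N^{-1} mod Y, and the carry and L/T/C field handling of Algorithm 5 is not modelled.

```python
def dw(z, d):
    """dw_d(Z) = Z mod 2^d: the low d bits of z."""
    return z & ((1 << d) - 1)


def up(z, d):
    """up_d(Z) = (Z - dw_d(Z)) / 2^d: z with its low d bits shifted out."""
    return z >> d


def montgomery_eq19(a, b, n, v, k):
    """Word-level sketch of equation (19) with Y = 2^v and k digits of a.

    Returns a value congruent to a * b * Y^(-k) modulo n (n must be odd).
    The result is not necessarily reduced below n.
    """
    n0_prime = (-pow(n, -1, 1 << v)) % (1 << v)  # assumed N0' = -N^{-1} mod Y
    t = 0                                        # T_{-1} = 0
    for i in range(k):
        a_i = dw(up(a, v * i), v)                # i-th v-bit digit of A
        s = up(t, v) + a_i * b                   # T_{i-1}/Y + A_i * B
        m_i = dw(dw(s, v) * n0_prime, v)         # M_i as in Algorithm 5
        t = s + m_i * n                          # T_i, always a multiple of Y
    return up(t, v)                              # final division by Y


if __name__ == "__main__":
    # 5 * 7 * 16^(-1) mod 13, computed with 1-bit and with 2-bit digits.
    print(montgomery_eq19(5, 7, 13, 1, 4), montgomery_eq19(5, 7, 13, 2, 2))
```

The choice of M_i makes each T_i divisible by Y, which is why the division in the next iteration can be a plain right shift.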

[0102] In Algorithm 5, if i is regarded as the iteration count and j as the clock, the only quantities that must be computed for each j are S_{i,j} and R_{i,j} (L_{i,j}, T_{i,j}, and C_{i,j} are obtained by bit shifts alone), and S_{i,j} and R_{i,j} are both realized by the following operation of the same form.

[0103]

    f = d/y + a·b + c·x    (20)

where y is 2^v or 1, and x is 2^d or 1. Since x and y only indicate whether a bit shift is applied, equation (20) can be computed by the PE shown in FIG. 14.
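As a rough software model of equation (20), the following Python sketch treats the two selectors as boolean flags. The function name `pe_eq20` and its parameter names are hypothetical; `d_in` stands for the input d of equation (20) and `d_bits` for the digit width d that appears in x = 2^d, which the original notation does not distinguish.

```python
def pe_eq20(d_in, a, b, c, v, d_bits, shift_y, shift_x):
    """Compute f = d/y + a*b + c*x with y in {2^v, 1} and x in {2^d, 1}."""
    y = (1 << v) if shift_y else 1        # selector for the right shift of d
    x = (1 << d_bits) if shift_x else 1   # selector for the left shift of c
    return d_in // y + a * b + c * x


if __name__ == "__main__":
    # One reading of Algorithm 5: S uses the shift by Y, R uses the shift by X.
    s = pe_eq20(12, 1, 5, 3, v=1, d_bits=4, shift_y=True, shift_x=False)
    r = pe_eq20(s, 1, 9, 1, v=1, d_bits=4, shift_y=False, shift_x=True)
    print(s, r)
```

Under this reading, S_{i,j} corresponds to (d, a, b, c) = (T_{i-1,j}, A_i, B_j, C_{i,j-1}) with y = Y and x = 1, and R_{i,j} to (S_{i,j}, M_i, N_j, L_{i-1,j+1}) with y = 1 and x = X.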

[0104] For simplicity, the circuit of FIG. 14 and its operation are described below for the case v = 1. In FIG. 14, × denotes a multiplier, which can be realized with d AND gates. R1 is a 1-bit register that holds a. S1 and S2 are selectors that choose whether the inputs d and c are bit-shifted by y and x, respectively. The adder, denoted +, adds the multiplier output a·b to the selector outputs d/y and c·x to compute f. R2 is a register that holds the adder output. It follows that the PE of FIG. 14 can compute equation (20).

[0105] Each operation of Algorithm 5 can therefore be obtained by combining two of the PEs of FIG. 14 as shown in FIG. 15. Here B_j and N_j (j = 0, …, m−1) are input to B_in and N_in of FIG. 15, respectively, in order from the lowest digit. The left PE computes S_{i,j} and the right PE computes R_{i,j}. Since v = 1, N_0' = 1 whenever N is odd, so the LSB of S_{i,0} becomes M_i and is held in register R1 of the right PE. R_{i,j} is shown divided into C_{i,j}, T_{i,j}, and L_{i,j}. It is thus clear that Algorithm 5 can be executed with k copies of the circuit of FIG. 15, that is, with 2·k PEs of FIG. 14. From the above, the PE of FIG. 14 allows equation (19) to be processed efficiently in parallel.
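For v = 1 the multiplier-free form of M_i can be illustrated with a bit-serial sketch. This is a word-level Python illustration under assumptions, not the two-PE circuit of FIG. 15: the function name `montgomery_bit_serial` is hypothetical, N is assumed odd, and k is the number of bits of A that are scanned.

```python
def montgomery_bit_serial(a, b, n, k):
    """Bit-serial Montgomery product for v = 1 (Y = 2).

    Returns a value congruent to a * b * 2^(-k) modulo n, for odd n.
    """
    t = 0
    for i in range(k):
        a_i = (a >> i) & 1        # i-th bit of A; a_i * b is a row of AND gates
        s = (t >> 1) + a_i * b    # T_{i-1}/2 + A_i * B
        m_i = s & 1               # with N0' = 1, M_i is simply the LSB of S
        t = s + m_i * n           # adding N when S is odd makes T even
    return t >> 1


if __name__ == "__main__":
    # N0' = -N^{-1} mod 2 is 1 for any odd N.
    print((-pow(13, -1, 2)) % 2, montgomery_bit_serial(5, 7, 13, 4))
```

Because M_i is a single bit, the product M_i·N also reduces to AND gates, which matches the description of the multiplier in FIG. 14.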

[0106] FIGS. 14 and 15 have been described for v = 1, but it is clear that Montgomery modular multiplication can be executed by the same technique for any v with v < d.

[0107] [Embodiment 7 of the Montgomery Modular Multiplication and Modular Exponentiation Circuits] The following systolic array can also be constructed. The operation of equation (14) is executed by the following algorithm.

Algorithm 6:

    FOR i = 0 TO k
      FOR j = 0 TO m
        S_{i,j} = T_{i-1,j+1} + up_v(S_{i,j-1}) + A_i·B_j
        M_i     = dw_v(dw_v(S_{i,0})·N_0')
        R_{i,j} = dw_v(S_{i,j}) + up_v(R_{i,j-1}) + M_i·N_j
        T_{i,j} = dw_v(R_{i,j})
      NEXT
    NEXT

Algorithm 6 differs from Algorithm 5 in that T_{i-1}/Y in equation (19) is realized by a clock offset rather than a bit offset. The value L_{i,j}, which arises in Algorithm 5 because of the bit offset, therefore does not arise in Algorithm 6.

[0108] The PE and the systolic array that execute Algorithm 6 are shown in FIGS. 16 and 17. The indices j and i of Algorithm 6 again correspond to the clock and the iteration count, respectively. B_j and N_j (j = 0, …, m−1) are input to B_in and N_in of FIG. 17, respectively, in order from the lowest digit.

[0109] For simplicity, the case v = d is described below. In FIG. 16, × denotes a d-bit by d-bit multiplier and + denotes an adder. R1 is a register that holds A_i or M_i. R2 is a register that holds the adder output; its bits at position v and above are fed back to the adder as a carry, delayed by one clock. It follows that in FIG. 17 the left PE computes S_{i,j} and the right PE computes R_{i,j}. At this point M_i is multiplied by N_0' in an external multiplier and output to the right PE. The PE of FIG. 7 is thus an efficient basic operator for realizing Algorithm 6 by parallel processing.

[0110] FIGS. 16 and 17 have been described for v = d, but it is clear that Montgomery modular multiplication can be executed by the same technique for any v with v < d.

[0111] [Embodiment 8 of the Montgomery Modular Multiplication and Modular Exponentiation Circuits] Montgomery modular exponentiation can be executed by repeating the operation of equation (19). Since the PEs shown in FIG. 14 and FIG. 16 can perform the operation of equation (19), combining one PE with a memory as shown in FIG. 18 allows Montgomery modular exponentiation to be processed by using that single PE (3·t/2 + 2)·q times. Here q is the number of PEs needed to form a Montgomery modular multiplication array: 2·k for the PE of FIG. 14 and 2·(k + 1) for the PE of FIG. 16. If p PEs are used, Montgomery modular exponentiation can be processed in (3·t/2 + 2)·q/p repetitions. Since the processing speed is inversely proportional to the number of repetitions, this scheme makes the processing speed proportional to the number of PEs, and a modular exponentiation circuit of the same efficiency can be configured whether the number of PEs is chosen for higher processing speed or for smaller circuit scale.
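The repetition count can be tabulated with a few lines of Python. This is only an arithmetic sketch: the function name `pe_passes` is hypothetical, t is taken to be the bit length of the exponent and assumed even, and rounding up when p does not divide the total is an added assumption. Reading 3·t/2 + 2 as t squarings, about t/2 multiplications on average, and two conversion steps is likewise an interpretation that this passage does not spell out.

```python
def pe_passes(t, q, p=1):
    """Number of repetitions (3*t/2 + 2) * q / p, rounded up to an integer."""
    mont_mults = 3 * t // 2 + 2   # Montgomery multiplications per exponentiation
    total = mont_mults * q        # passes through a single PE
    return -(-total // p)         # ceiling division by the number of PEs


if __name__ == "__main__":
    k = 512                       # digits per operand for v = 1, 512-bit N
    q = 2 * k                     # PEs per multiplication array (FIG. 14 type)
    print(pe_passes(512, q), pe_passes(512, q, p=64))
```

With these illustrative numbers, one PE needs 788480 passes and 64 PEs need 12320, which shows the proportional trade between circuit scale and repetition count.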

[0112] Accordingly, a device as shown in FIG. 19 can be realized. In FIG. 19, MEC (Modular Exponentiation Chip) denotes a chip formed by combining p PEs. Since p may be any value satisfying 1 ≤ p ≤ (3·t/2 + 2)·q, a chip of arbitrary circuit scale can be configured. Because MEC has a regular circuit structure, it is also very easy to implement as a device and as a chip. Since the processing speed increases in proportion to the number of MECs, multiple MECs may be used as shown in FIG. 7. In this case the control of each MEC must be changed according to the number of chips, but this is easily achieved by building the control circuit from an externally programmable ROM or the like.

[0113] [Other Embodiments of the Montgomery Modular Multiplication and Modular Exponentiation Circuits] In the algorithms according to the present invention, the processing performed by a single PE is simple integer arithmetic, so Montgomery modular exponentiation can also be implemented easily on an ordinary DSP, CPU, or the like, without fabricating the PE as a separate chip.

[0114] The present invention also has a regular circuit configuration, and its control and delay are local, so it is well suited to practical implementation in VLSI.

[0115] It is also possible to combine the PEs of FIGS. 14 and 16 and configure the combination as a single PE.

[0116] The above shows how to construct modular exponentiation and modular multiplication circuits and methods using PEs. It thereby makes it possible to construct an efficient cryptographic system with the following effects.

[0117] When a high-speed cryptographic system is required, the modular exponentiation and modular multiplication circuits according to the present invention may be implemented in VLSI or the like. These circuits have a regular structure built from simple PEs, so they are well suited to VLSI. A high-speed cryptographic system is thereby obtained.

[0118] When a cryptographic system built from a small circuit is required rather than high speed, the modular exponentiation and modular multiplication circuits according to the present invention may be built from only a few PEs. In this case too, the regular PE-based structure is retained, so the circuit is easy to implement. Moreover, since the operations performed inside a PE are simple integer operations, the computation procedure of the present invention can also be realized as a simple cryptographic system in software on a CPU, DSP, or the like.

[0119] Even if higher speed becomes necessary after a cryptographic device has been built from a small circuit of several PEs (such as MEC), using several such small circuits as in FIG. 19 yields a speedup proportional to the circuit scale. A cryptographic system can therefore be realized whose speed is raised as needed simply by adding circuits, without rebuilding the cryptographic device.

[0120] When the number of bits of the integers being processed is increased after the cryptographic system has been built, in order to strengthen it, the same circuit, or a similar circuit with more PEs, can be used. This is because the modular exponentiation and modular multiplication circuits of the present invention allow circuit scale to be traded off easily against the number of processing iterations, so a difference in operand bit length reduces to a difference in the number of iterations. Consequently, the cryptographic device need not be rebuilt even when the cryptographic strength of the system is increased. Likewise, a cryptographic system can be realized that does not require rebuilding the device when the operand bit length is reduced.

[0121] These effects cannot be achieved by conventional cryptographic devices based on modular exponentiation and modular multiplication that do not use parallel processing efficiently. By using the modular exponentiation and modular multiplication circuits and methods according to the present invention, a flexible and extensible cryptographic system can therefore be constructed.

【０１２２】[0122]

[Effects of the Invention] As described above, according to the present invention, modular exponentiation and modular multiplication can be executed by repeating the operation

    Z = X·Y·R^{-1} mod N    (16)

The required operations can therefore be executed by identical or same-type arithmetic circuits.

[0123] When this is computed using Montgomery modular multiplication,

    Z = X·Y·R^{-1} mod N = (X·Y + S·N)/R, where S = X·Y·N' mod N,

modular multiplication and modular exponentiation can be executed by simple repetition of Montgomery modular multiplication, provided that input values satisfying the conditions are used.
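The repetition described here can be sketched in Python. The sketch rests on several assumptions that are not taken from this passage: R is a power of two larger than N, N is odd, N' = −N^{-1} mod R, and S is reduced modulo R, which is the usual Montgomery convention and is what makes X·Y + S·N exactly divisible by R. The function names `mont`, `modmul`, and `modexp` are hypothetical. The four-step multiplication and the initial value C_R = R_R·R^{-1} mod N follow the input pairs given in the abstract and claims.

```python
def mont(x, y, n, r_bits):
    """Return x * y * R^(-1) mod n with R = 2**r_bits, for odd n < R."""
    r = 1 << r_bits
    n_prime = (-pow(n, -1, r)) % r   # assumed N' = -N^{-1} mod R
    s = (x * y * n_prime) % r        # assumed reduction modulo R
    z = (x * y + s * n) >> r_bits    # exact division by R
    return z - n if z >= n else z    # one conditional subtraction


def modmul(a, b, n, r_bits):
    """A * B mod N from four Montgomery multiplications."""
    rr = pow(1 << r_bits, 2, n)      # constant R_R = R^2 mod N
    a_r = mont(a, rr, n, r_bits)     # (A, R_R)   -> A_R
    b_r = mont(b, rr, n, r_bits)     # (B, R_R)   -> B_R
    t_r = mont(a_r, b_r, n, r_bits)  # (A_R, B_R) -> T_R
    return mont(t_r, 1, n, r_bits)   # (T_R, 1)   -> Q


def modexp(m, e, n, r_bits):
    """M^e mod N by square-and-multiply on Montgomery residues."""
    rr = pow(1 << r_bits, 2, n)
    m_r = mont(m, rr, n, r_bits)
    c_r = mont(rr, 1, n, r_bits)     # C_R = R_R * R^(-1) mod N
    for bit in bin(e)[2:]:           # exponent bits, most significant first
        c_r = mont(c_r, c_r, n, r_bits)
        if bit == "1":
            c_r = mont(c_r, m_r, n, r_bits)
    return mont(c_r, 1, n, r_bits)


if __name__ == "__main__":
    print(modmul(5, 7, 13, 4), modexp(65, 17, 3233, 12))
```

Every step is a call to the same `mont` routine, which is the sense in which one arithmetic unit suffices for both operations.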

[0124] Since the required operations are integer operations, the arithmetic circuit can be realized easily.

[0125] Accordingly, various cryptographic systems that use modular multiplication and modular exponentiation can be constructed efficiently with a common arithmetic circuit and method.

[0126] The Montgomery modular multiplication circuit according to the present invention achieves high-speed processing with a very small circuit scale.

[0127] The modular exponentiation and modular multiplication circuits using a systolic array according to the present invention make it possible to construct an efficient cryptographic system with the following effects.

[0128] When a high-speed cryptographic system is required, the modular exponentiation and modular multiplication circuits according to the present invention may be implemented in VLSI or the like. These circuits have a regular structure built from simple PEs, and both the control of each PE and the delay within each PE are local, so they are well suited to VLSI. A high-speed cryptographic system is thereby obtained.

[0129] When the cryptographic system requires a small circuit scale rather than high speed, the modular exponentiation and modular multiplication circuits according to the present invention may be built from only a few PEs. In this case too, the regular PE-based structure and the locality of control and delay are retained, so the circuit is easy to implement. Moreover, since the operations performed inside a PE are simple integer operations, the computation procedure of the present invention can also be realized as a simple cryptographic system in software on a CPU, DSP, or the like.

[0130] Even if higher speed becomes necessary after a cryptographic device has been built from a small circuit of several PEs (such as SYMC), connecting such small circuits in cascade yields a speedup proportional to the circuit scale. A cryptographic system can therefore be realized whose speed is raised as needed simply by adding circuits, without building an entirely new cryptographic device.

[0131] Because circuit scale can be traded off easily against the number of processing iterations, a difference in operand bit length reduces to a difference in the number of iterations. Therefore, when the operand bit length is increased after the cryptographic system has been built, in order to raise its strength (its security against cryptanalysis), the same circuit, or a similar circuit with more PEs, can be used, and the cryptographic device need not be rebuilt. A reduction in operand bit length can likewise be handled without rebuilding the cryptographic device.

[0132] By using the modular exponentiation and modular multiplication circuits and methods according to the present invention, a flexible and extensible cryptographic system can therefore be constructed.

[Brief description of drawings]

FIG. 1 is a diagram showing a configuration example of a communication system using a cryptographic system.

FIG. 2 is a diagram showing an example of a modular multiplication circuit according to the present invention.

FIG. 3 is a diagram showing an example of a modular exponentiation circuit according to the present invention.

FIG. 4 is a block diagram of the modular multiplication circuit of an embodiment.

FIG. 5 is a diagram showing the PE (processing element) of an embodiment.

FIG. 6 is a diagram showing a systolic array using the PE of FIG. 5.

FIG. 7 is a diagram showing a unified PE.

FIG. 8 is a diagram showing the PE of another embodiment.

FIG. 9 is a diagram showing a systolic array using the PE of FIG. 8.

FIG. 10 is a diagram showing the PE of another embodiment.

FIG. 11 is a diagram showing a systolic array using the PE of FIG. 10.

FIG. 12 shows a circuit combining a PE and a memory.

FIG. 13 is a diagram showing a modular exponentiation and modular multiplication circuit using SYMC.

FIG. 14 is a diagram showing the PE of Embodiment 2.

FIG. 15 is a diagram showing a circuit using the PE of FIG. 14.

FIG. 16 is a diagram showing the PE of Embodiment 2.

FIG. 17 is a diagram showing a circuit using the PE of FIG. 16.

FIG. 18 is a diagram showing a circuit combining the circuit of FIG. 17 with a memory.

FIG. 19 is a diagram showing a modular exponentiation and modular multiplication circuit using MEC.

[Explanation of symbols]

T: 通信端末 (communication terminal)
S, SL: セレクタ (selector)
R, R_i: レジスタ (register)
H: ハーフアダー (half adder)
+: 加算器 (adder)
B_i: 乗算器 (multiplier)
N_i: 乗算器 (multiplier)
PE: プロセッシング・エレメント (processing element)
MEC: べき乗剰余演算チップ (modular exponentiation chip)
SYMC: シストリックべき乗剰余演算チップ (systolic modular exponentiation chip)

Claims

[Claims]

1. A cryptographic communication method for encrypting or decrypting communication contents by using a modular multiplication $Q = A \cdot B \bmod N$ of integers A and B modulo N, wherein one or more arithmetic units are provided, each of which, using an integer R that is relatively prime to N, calculates and outputs $Z = U \cdot V \cdot R^{-1} \bmod N$ for input data U and V; A and a constant $R_R$ such that $R_R = R^2 \bmod N$ are input to the arithmetic unit to output $A_R = A \cdot R_R \cdot R^{-1} \bmod N$; B and the constant $R_R$ are input to output $B_R = B \cdot R_R \cdot R^{-1} \bmod N$; the outputs $A_R$ and $B_R$ are input to output $T_R = A_R \cdot B_R \cdot R^{-1} \bmod N$; and the output $T_R$ and a constant 1 are input to output $T_R \cdot 1 \cdot R^{-1} \bmod N$ as Q, whereby the modular multiplication $Q = A \cdot B \bmod N$ is executed.
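
As an illustration only, the four passes through the arithmetic unit recited in claim 1 can be modeled in a few lines of Python. The function names `mont_unit` and `modmul_via_unit` are hypothetical, and the unit is modeled functionally with a modular inverse (Python 3.8 or later) rather than as the claimed circuit; the sketch assumes R is relatively prime to N.

```python
def mont_unit(u, v, n, r):
    """Functional model (an assumption, not the circuit) of the arithmetic
    unit: returns u * v * r^{-1} mod n, with r relatively prime to n."""
    r_inv = pow(r, -1, n)  # modular inverse of r modulo n
    return (u * v * r_inv) % n


def modmul_via_unit(a, b, n, r):
    """Compute a * b mod n using only calls to mont_unit, as in claim 1."""
    r_r = (r * r) % n                 # constant R_R = R^2 mod N
    a_r = mont_unit(a, r_r, n, r)     # A_R = A * R_R * R^{-1} mod N
    b_r = mont_unit(b, r_r, n, r)     # B_R = B * R_R * R^{-1} mod N
    t_r = mont_unit(a_r, b_r, n, r)   # T_R = A_R * B_R * R^{-1} mod N
    return mont_unit(t_r, 1, n, r)    # Q = T_R * 1 * R^{-1} mod N
```

The two conversions multiply A and B by R, the third pass leaves a single factor of R in $T_R$, and the final pass with the constant 1 removes it, so the result equals $A \cdot B \bmod N$.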

2. A cryptographic communication method for encrypting or decrypting communication contents by using a modular exponentiation $C = M^e \bmod N$ of integers M and e modulo N, wherein one or more arithmetic units are provided, each of which, using an integer R that is relatively prime to N, calculates and outputs $Z = U \cdot V \cdot R^{-1} \bmod N$ for input data U and V; M and a constant $R_R$ such that $R_R = R^2 \bmod N$ are input to the arithmetic unit to output $M_R = M \cdot R_R \cdot R^{-1} \bmod N$; with the binary representation of e being $e = [e_t, e_{t-1}, \ldots, e_1]$, the initial value of $C_R$ is set to $C_R = R_R \cdot R^{-1} \bmod N$; the bits $e_i$ are processed in order from the most significant bit, such that when $e_i = 1$, $C_R$ and $M_R$ are input to the arithmetic unit and $C_R \cdot M_R \cdot R^{-1} \bmod N$ is output as a new $C_R$, and further, when $i$ is greater than 1, $C_R$ is input to the arithmetic unit as both input data and $C_R \cdot C_R \cdot R^{-1} \bmod N$ is output as a new $C_R$; and after the processing for all $e_i$ is completed, $C_R$ and a constant 1 are input to the arithmetic unit to output $C = C_R \cdot R^{-1} \bmod N$, whereby the modular exponentiation $C = M^e \bmod N$ is executed.
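
The bit-by-bit procedure of claim 2 (most significant bit first) can be sketched as follows. This is a minimal model under stated assumptions: `mont_unit` and `modexp_msb_first` are hypothetical names, the arithmetic unit is modeled with a modular inverse (Python 3.8 or later) rather than as hardware, and R is assumed relatively prime to N.

```python
def mont_unit(u, v, n, r):
    """Functional model (an assumption) of the unit: u * v * r^{-1} mod n."""
    return (u * v * pow(r, -1, n)) % n


def modexp_msb_first(m, e, n, r):
    """Compute m^e mod n by scanning the bits of e from the top, as in claim 2."""
    r_r = (r * r) % n
    m_r = mont_unit(m, r_r, n, r)     # M_R = M * R_R * R^{-1} mod N
    c_r = mont_unit(r_r, 1, n, r)     # initial C_R = R_R * R^{-1} mod N
    bits = bin(e)[2:]                 # e_t ... e_1, most significant first
    t = len(bits)
    for pos, bit in enumerate(bits):
        i = t - pos                   # index of the current bit e_i
        if bit == "1":
            c_r = mont_unit(c_r, m_r, n, r)   # multiply step
        if i > 1:
            c_r = mont_unit(c_r, c_r, n, r)   # squaring step, skipped for e_1
    return mont_unit(c_r, 1, n, r)    # C = C_R * R^{-1} mod N
```

Every step, including both conversions, is a call to the same unit, which is the point of the claimed arrangement.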

3. A cryptographic communication method for encrypting or decrypting communication contents by using a modular exponentiation $C = M^e \bmod N$ of integers M and e modulo N, wherein one or more arithmetic units are provided, each of which, using an integer R that is relatively prime to N, calculates and outputs $Z = U \cdot V \cdot R^{-1} \bmod N$ for input data U and V; M and a constant $R_R$ such that $R_R = R^2 \bmod N$ are input to the arithmetic unit to output $M_R = M \cdot R_R \cdot R^{-1} \bmod N$; with the binary representation of e being $e = [e_t, e_{t-1}, \ldots, e_1]$, the initial value of $C_R$ is set to $C_R = R_R \cdot R^{-1} \bmod N$; the bits $e_i$ are processed in order from the least significant bit, such that when $e_i = 1$, $C_R$ and $M_R$ are input to the arithmetic unit and $C_R \cdot M_R \cdot R^{-1} \bmod N$ is output as a new $C_R$, and when $i$ is smaller than $t$, $M_R$ is input to the arithmetic unit as both input data and $M_R \cdot M_R \cdot R^{-1} \bmod N$ is output as a new $M_R$; and after the processing for all $e_i$ is completed, $C_R$ and a constant 1 are input to the arithmetic unit to output $C = C_R \cdot R^{-1} \bmod N$, whereby the modular exponentiation $C = M^e \bmod N$ is executed.
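
The least-significant-bit-first variant of claim 3 differs from claim 2 only in the scan direction and in which register is squared. A minimal sketch, with the hypothetical names `mont_unit` and `modexp_lsb_first`, a functional model of the unit (Python 3.8 or later), and R assumed relatively prime to N:

```python
def mont_unit(u, v, n, r):
    """Functional model (an assumption) of the unit: u * v * r^{-1} mod n."""
    return (u * v * pow(r, -1, n)) % n


def modexp_lsb_first(m, e, n, r):
    """Compute m^e mod n by scanning the bits of e from the bottom, as in claim 3."""
    r_r = (r * r) % n
    m_r = mont_unit(m, r_r, n, r)     # M_R = M * R_R * R^{-1} mod N
    c_r = mont_unit(r_r, 1, n, r)     # initial C_R = R_R * R^{-1} mod N
    t = max(e.bit_length(), 1)        # number of bits e_t ... e_1
    for i in range(1, t + 1):
        if (e >> (i - 1)) & 1:        # bit e_i
            c_r = mont_unit(c_r, m_r, n, r)   # multiply step
        if i < t:
            m_r = mont_unit(m_r, m_r, n, r)   # square M_R, skipped after e_t
    return mont_unit(c_r, 1, n, r)    # C = C_R * R^{-1} mod N
```

Because the multiply step and the squaring step in one iteration do not depend on each other's outputs, this ordering lends itself to running them on two arithmetic units at once.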

4. The cryptographic communication method described in [...], wherein the constant $R_R$ and the constant 1 are input to the arithmetic unit, and the output $R_R \cdot 1 \cdot R^{-1} \bmod N$ is used as the initial value of $C_R$.

5. The cryptographic communication method according to claim 1, wherein, when n is a value such that $N < 2^n$, the constant R and the input data U and V of the arithmetic unit satisfy $R = 2^{n+r}$, $U < 2^{n+u}$, and $V < 2^{n+u}$ for u and r such that u = 1 and r > 1, or u > 1 and r = u + 1.

6. A cryptographic communication method for encrypting or decrypting communication contents by using a modular multiplication $Q = A \cdot B \bmod N$ of input integers A and B modulo N, wherein, using an integer R that is relatively prime to N, $A \cdot R \bmod N$ is calculated from the input A and R and the result is taken as $A_R$; $B \cdot R \bmod N$ is calculated from the input B and R and the result is taken as $B_R$; $A_R \cdot B_R \cdot R^{-1} \bmod N$ is obtained from the calculation results $A_R$, $B_R$ and R and the result is taken as $T_R$; and $T_R \cdot R^{-1} \bmod N$ is calculated from $T_R$ and R and the result is taken as Q; and wherein the operation for obtaining $T_R$ is performed, with $A_i$ denoting the pieces obtained by dividing $A_R$ into v-bit units for an arbitrary integer v and with $Y = 2^v$, by sequentially executing the operations

$$T_i = (T_{i-1} + A_i \cdot B_R \cdot Y + M_{i-1} \cdot N) / Y$$

$$M_{i-1} = (T_{i-1} \bmod Y) \cdot (-N^{-1} \bmod Y) \bmod Y.$$
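
The recurrence in claim 6 can be exercised with the following sketch. Several details are assumptions that the claim text does not state: the starting value $T_0 = 0$, the pieces $A_i$ taken least significant first, the choice $R = Y^k$ with one extra iteration in which $A_i = 0$, odd N, and a final reduction modulo N. The names `mont_mul_digit_serial` and `modmul_digit_serial` are hypothetical, and `pow(n, -1, y)` needs Python 3.8 or later.

```python
def mont_mul_digit_serial(a_r, b_r, n, v, k):
    """Return a_r * b_r * R^{-1} mod n with R = Y**k and Y = 2**v,
    using T_i = (T_{i-1} + A_i*B_R*Y + M_{i-1}*N) / Y.
    Assumes n is odd and a_r < Y**k."""
    y = 1 << v
    neg_n_inv = (-pow(n, -1, y)) % y          # -N^{-1} mod Y
    t = 0                                     # assumed initial value T_0
    for i in range(1, k + 2):                 # k pieces plus one step with A_i = 0
        a_i = (a_r >> (v * (i - 1))) & (y - 1)   # i-th v-bit piece of A_R
        m_prev = ((t % y) * neg_n_inv) % y    # M_{i-1}, depends only on T_{i-1}
        total = t + a_i * b_r * y + m_prev * n
        assert total % y == 0                 # the division by Y is exact
        t = total // y
    return t % n                              # assumed final reduction


def modmul_digit_serial(a, b, n, v, k):
    """Q = a * b mod n following the four stages recited in claim 6."""
    r = 1 << (v * k)
    a_r = (a * r) % n
    b_r = (b * r) % n
    t_r = mont_mul_digit_serial(a_r, b_r, n, v, k)
    return mont_mul_digit_serial(t_r, 1, n, v, k)
```

Since $M_{i-1}$ is derived from $T_{i-1}$ alone and the product $A_i \cdot B_R$ enters already multiplied by Y, the low v bits of the sum always cancel, which is what makes the division by Y a plain shift.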

7. The cryptographic communication method according to claim 6, wherein each single operation of the sequential operation is performed by one arithmetic element, and the sequential operation as a whole is executed by pipeline processing.

8. The cryptographic communication method according to claim 6, wherein, in the sequential operation, the multiplication or division by Y is executed by shifting the bit positions at which the addition is performed.

9. The cryptographic communication method according to claim 6, wherein, in the sequential operation, with $B_j$ and $N_j$ denoting the pieces obtained by dividing $B_R$ and N, respectively, into d-bit units for an arbitrary integer d, $A_i \cdot B_{j-1}$ and $M_{i-1} \cdot N_j$ are calculated, and the calculation results and the previous sequential operation result $T_{i-1}$ are added.

10. A cryptographic communication method for encrypting or decrypting communication contents by using a modular multiplication $Q = A \cdot B \bmod N$ of input integers A and B modulo N, wherein, using an integer R that is relatively prime to N, $A \cdot R \bmod N$ is calculated from the input A and R and the result is taken as $A_R$; $B \cdot R \bmod N$ is calculated from the input B and R and the result is taken as $B_R$; $A_R \cdot B_R \cdot R^{-1} \bmod N$ is obtained from the calculation results $A_R$, $B_R$ and R and the result is taken as $T_R$; and $T_R \cdot R^{-1} \bmod N$ is calculated from $T_R$ and R and the result is taken as Q; and wherein the operation for obtaining $T_R$ is performed, with $A_i$ denoting the pieces obtained by dividing $A_R$ into v-bit units for an arbitrary integer v and with $Y = 2^v$, by sequentially executing the operations

$$T_i = (T_{i-1} / Y + A_i \cdot B_R) + M_i \cdot N$$

$$M_{i-1} = ((T_{i-1} / Y + A_i \cdot B_R) \bmod Y) \cdot (-N^{-1} \bmod Y) \bmod Y.$$