DE3631992C2

Info

Publication number: DE3631992C2
Application number: DE19863631992
Authority: DE
Inventors: Holger Sedlak (3300 Braunschweig, DE)
Original assignee: Individual
Current assignee: Individual
Priority date: 1986-03-05
Filing date: 1986-09-20
Publication date: 1988-12-08
Also published as: DE3631992A1

Description

With the ever-growing spread of electronic methods of communication and information storage, the demand for confidentiality, in particular the confidentiality of important documents such as bank transfers, contracts and the like, has become indispensable.

While the problem of data protection has already received some consideration in legislation, the technical problems of implementing data protection by keeping transmitted data confidential have so far been solved only very unsatisfactorily. Data transmission via radio or broadband cable takes place more or less publicly. In any case, no guarantee is given for the confidentiality of the transmission. The danger of misuse can by no means be ruled out.

With regard to the transmission of data via radio or cable, this danger cannot be eliminated by technical means. The user himself must provide the necessary security. This also includes securing the authenticity of the sender and protecting the message against manipulation.

For security reasons it is therefore necessary to encrypt the information, data, texts, etc. to be transmitted, i.e. to transform them in such a way that an unauthorized person cannot understand them. In general it can be said that an encryption is the more secure, the more complicated its underlying operations are.

The encryption methods that can be described as classic are symmetric methods, in which the encryption key and the decryption key are of the same kind, i.e. identical or inverse. As long as this key is secret, the correspondingly encrypted message can be transmitted publicly. For the recipient to understand the message, however, the secret key must be delivered to him by a trustworthy messenger. This way of delivering the secret key is cumbersome and time-consuming, especially when several recipients are to receive a confidential message. Moreover, it seems anachronistic in the electronic age to use couriers to deliver secret encryption keys.

In contrast, the encryption methods based on the so-called public-key code principle represent a great conceptual advance. These public-key code methods are characterized by asymmetric encryption. This means that two different keys are used for encrypting and decrypting. The asymmetric methods ensure that one key cannot be calculated from the other without additional information. One of the two keys can therefore be published. For this reason these methods have been given the name "public-key code methods".

If a user of the public networks wants to exchange messages with other subscribers by means of a public-key code method, he has to generate two keys E and D once. He makes the encryption key E accessible to all other users via a public register and keeps the decryption key D secret. In some methods the general computing rules of the encryption are also disclosed, without endangering the confidentiality of the content of the encrypted messages. Nor is the authenticity of the message a problem. The security of the asymmetric method rests on the fact that it is practically impossible to calculate D from E.

Anyone who wants to send a message to another user obtains the key E from the published register, encrypts the message with it and transmits the resulting code over the insecure (possibly digital) network, for example the public telephone network. The addressed user (recipient) decrypts the received code with his secret key D and thus recovers the original message. A secure transmission channel is therefore unnecessary both for the transmission of a key and for the transmission of the message itself. The addressed user receives only messages that have been encrypted with his own key. He therefore only needs to access his own key D.

These methods thus make the keys easily available. The user is also relieved of managing an extensive personal key register. Key management takes place only once, centrally, in a register accessible to everyone, e.g. in the manner of electronic telephone directories. With this transmission procedure all kinds of transmission networks (e.g. "ISDN") can be made secure.

The embodiment of the public-key code method described so far does not yet guarantee the authenticity of the sender or protect the message against manipulation. In principle, however, it is possible to transmit forgery-proof "signatures" in digital form, provided that the order in which the keys E and D are applied is interchangeable. The sender can then generate a signature that is transmitted together with the encrypted message. The signature is an "extract" of the message encrypted with the sender's secret key D. To check the authenticity of the sender, the recipient likewise generates the extract from the reconstructed message, decrypts the signature with the sender's public key E and compares the two. If they are identical, the message must originate from the stated sender, since only the sender knows the key D that matches the sender key E and with which the signature was encrypted.
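The signature procedure described above can be sketched as follows. This is only an illustration under stated assumptions, not the method claimed in the patent: RSA is assumed as the public-key scheme (the text at this point speaks of public-key methods in general), the toy primes 61 and 53 and the exponent 17 are chosen purely for readability, and the "extract" is assumed to be a SHA-256 digest reduced modulo N. The function names `extract`, `sign` and `verify` are hypothetical.

```python
import hashlib

# Toy key material, assumed for illustration only (far too small to be secure).
P, Q = 61, 53
N = P * Q                    # public modulus
PHI = (P - 1) * (Q - 1)
E = 17                       # sender's public key
D = pow(E, -1, PHI)          # sender's secret key (needs Python 3.8+)


def extract(message: bytes) -> int:
    """Hypothetical 'extract' of the message: SHA-256 digest reduced into [0, N)."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """Sender: encrypt the extract with the secret key D."""
    return pow(extract(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Recipient: decrypt the signature with the public key E and compare extracts."""
    return pow(signature, E, N) == extract(message)


signature = sign(b"transfer 100 DM to account 42")
print(verify(b"transfer 100 DM to account 42", signature))
```

With a modulus this small, two different messages can share the same extract by chance, so the sketch shows only the data flow, not the security level of a realistic key length.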

The signature also protects the message against manipulation. The sender cannot deny the transmitted message, because the recipient holds a signature of this message. Conversely, the recipient cannot alter the message, because he cannot generate a signature for the falsified message. Because of the extract, the signature depends not only on the sender but also on the message. This provides greater protection than a handwritten signature under a document.

Probably the best-known public-key code method is the RSA method, named after the initials of its inventors Rivest, Shamir and Adleman. The security of the RSA method rests on the fact that it is practically impossible to factorize large numbers (e.g. 200 decimal digits), i.e. to find all the prime numbers by which the large number can be divided without remainder.

The RSA method works as follows: first, each user of the RSA system chooses two large prime numbers p and q and a further large number E. The numbers can be generated, for example, with the random number generator of a computer. Algorithms are available to decide whether a given number is prime (see e.g. Pomerance, C., "Recent Developments in Primality Testing", Department of Mathematics, University of Georgia, in "The Mathematical Intelligencer", pp. 97-104, Vol. 3, No. 3, 1981).
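The text cites a survey of primality tests without naming a particular one. As a hedged illustration, the sketch below assumes the Miller-Rabin test with a fixed set of small bases, which is one common choice and not necessarily what the inventor had in mind. The names `is_probable_prime` and `random_prime` are hypothetical, and Python's `random` module stands in for "the random number generator of a computer"; it is not suitable for real key generation.

```python
import random

SMALL_PRIMES = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37)


def is_probable_prime(n: int) -> bool:
    """Miller-Rabin test using the fixed bases in SMALL_PRIMES (assumed choice)."""
    if n < 2:
        return False
    for p in SMALL_PRIMES:
        if n % p == 0:
            return n == p
    # Write n - 1 as d * 2**s with d odd.
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for a in SMALL_PRIMES:
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False      # base a proves that n is composite
    return True               # for very large n this is only a probable prime


def random_prime(bits: int, rng: random.Random) -> int:
    """Draw random odd numbers of the given bit length until one passes the test."""
    while True:
        candidate = rng.getrandbits(bits) | (1 << (bits - 1)) | 1
        if is_probable_prime(candidate):
            return candidate


p = random_prime(64, random.Random(1))
print(p.bit_length())
```

The inner `pow(a, d, n)` calls are exactly the kind of modular exponentiation that RSA itself relies on, so prime generation and encryption share the same arithmetic core.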

The RSA method does not prescribe a particular minimum length for the prime numbers. Short numbers make the algorithm faster but increase the danger that the product of the primes can be factorized. For long numbers the reverse is true. 100 decimal digits are generally regarded as a good compromise. N is the product of the primes p and q. The pair (E, N) is the public key, while the primes p and q are known only to the recipient of a message.

To encrypt the message, the sender first converts the text into a string of decimal digits. This string is then split into blocks P_i of equal length with P_i < N. The blocks are encrypted individually by raising each one to the E-th power and reducing the result modulo N, which yields the numbers C_i = P_i^E mod N. These numbers C_i are then sent over an insecure channel. To recover the blocks, the exponent has to be computed modulo Φ(N), where Φ(N) = (p - 1) · (q - 1). Since only the recipient knows the primes p and q, only he can calculate the decryption key D = E^-1 mod Φ(N). The recipient raises every received number C_i to the D-th power and reduces modulo N. Since C_i^D mod N = P_i^(E·D) mod N and E·D mod Φ(N) = 1, the operation P_i^(E·D) mod N again yields the number blocks of the plaintext.
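The encryption and decryption steps just described can be traced with small numbers. The sketch below is a toy illustration only: the primes 61 and 53 and the exponent 17 are assumptions chosen for readability, and each byte of the text is used as one block P_i, which is a simplification of the decimal-digit blocks described in the text. The function names are hypothetical.

```python
# Toy parameters, assumed for illustration; real keys use primes of about 100 digits.
p, q = 61, 53
N = p * q                       # public modulus
phi = (p - 1) * (q - 1)         # Φ(N), computable only with knowledge of p and q
E = 17                          # public exponent, must be coprime to Φ(N)
D = pow(E, -1, phi)             # D = E^-1 mod Φ(N), needs Python 3.8+


def encrypt_blocks(blocks, E, N):
    """C_i = P_i^E mod N for every plaintext block P_i < N."""
    return [pow(P_i, E, N) for P_i in blocks]


def decrypt_blocks(cipher, D, N):
    """P_i = C_i^D mod N for every received number C_i."""
    return [pow(C_i, D, N) for C_i in cipher]


plain_blocks = list(b"RSA")                 # one block per byte (simplification)
cipher_blocks = encrypt_blocks(plain_blocks, E, N)
recovered = bytes(decrypt_blocks(cipher_blocks, D, N))
print(recovered)
```

Python's built-in three-argument `pow` hides the modular exponentiation; how that operation can be broken down into hardware-friendly steps is the actual subject of the patent.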

Besides the encryption methods described as "classic", the public-key code methods known so far also show serious shortcomings. Both software and hardware implementations have so far always failed because of the immense effort and the high costs associated with it.

Given the current state of chip development, it is not possible to program a general-purpose computer in software with the RSA algorithm (assuming 200 decimal digits) in such a way that acceptable encryption speeds or encryption rates are achieved.

Nor can the RSA function (exponentiation followed by a modulo operation) be converted directly into a VLSI layout (VLSI = Very Large Scale Integration), because there are no direct exponentiation circuits. For several years this insight has led to the desire for special hardware solutions that break the exponentiation down into individual steps so that a sufficient encryption speed or rate becomes possible.

The implementations of public-key code methods known today, however, require a great deal of computing time. Software solutions achieve encryption and decryption rates of 10 to 20 bit/s. Even the first known hardware solutions do not reach more than 1200 bit/s. The only single-chip solution realized so far comes from Rivest (Rivest, R. L., "A Description of a Single-Chip Implementation of the RSA Cipher", Laboratory for Computer Science, MIT, Cambridge, Massachusetts, in "LAMBDA Magazine 1", pp. 14-18, No. 3, 1980). This proposal chose a relatively simple design method, which consists in particular of constructing a 512-bit-wide arithmetic logic unit (ALU) from a conventional arithmetic-logic elementary cell. This ALU is built so that it can carry out very different operations. The redundancy accepted as a result is reflected in an encryption rate of 1200 bit/s. A 4 µm NMOS technology was used (extrapolated to a 2 µm CMOS technology, a key length of 660 bits would yield an encryption rate in the range of 2500 bit/s).

In practice, however, this solution is not acceptable, because the interfaces of the digital networks work at very much higher data rates; the ISDN interfaces, for example, work at 64,000 bit/s.

A cryptography processor consisting of two chips has also been proposed, with which 336-bit numbers of the RSA algorithm can be processed (Rieden, R. F., J. B. Snyder, R. J. Widman and W. J. Barnard, "A Two-Chip Implementation of the RSA Public-Key Encryption Algorithm", Digest of Papers for the 1982 Government Microcircuit Applications Conference (November 1982), 24-27).

The fastest known RSA processor is the proposal by NEC/Miyaguchi (Miyaguchi, S., "Fast Encryption Algorithm for the RSA Cryptographic System", Proceedings COMPCON 82). It processes 8 bits of the multiplier per cycle and reaches a speed of 29,000 bit/s. Since its practical implementation requires as many as 333 chips, however, it is out of the question from an economic point of view.

A general disadvantage of multi-chip implementations lies not only in hardware costs that rise in proportion to the number of chips required, but above all in the lack of any guarantee of security. If the signals transmitted from one chip to another are accessible, the secret code can be broken on the basis of these signals. For reasons of cryptographic security it is therefore essential that all cryptographic algorithms be executed, as far as possible, on a single chip, which could then be protected by a safe.

DE-PS 32 28 018 describes a key system for RSA cryptography which likewise has an extremely high hardware requirement. Compared with the original RSA algorithm, the known key system is intended to increase the encryption rate by a factor of 4, which is only a modest improvement. For this purpose a fixed number of four bits is processed simultaneously, which requires several multipliers and for which a total of 14 adders are specified.

In theory it is conceivable that the processor for the known key system according to DE-PS 32 28 018 is four times as fast as a direct use of the original RSA algorithm. Since the signal paths are very much longer than when a single bit is scanned at a time, however, hardly any effective time saving can be expected in practice.

Incidentally, even a hypothetical universal chip with 100 times the computing density would not be able to process the known RSA algorithm. For this reason only a special cryptography chip could be considered, if anything at all. Such a chip would, however, run into considerable cooling problems, because in such highly specialized chips, unlike in universal chips, almost all transistor functions would be in use almost constantly. This entails considerable power dissipation, and the assumed 100-fold computing density would also result in a 100-fold (loss) power density.

That the associated cooling problems are not inconsiderable is shown by the elaborate proposal to cool the chip by means of liquefied noble gases flowing through bores in the silicon chip. On the other hand, it must be taken into account that insufficient cooling results in a considerably shortened chip lifetime and an increased error rate in the encryption operations. In addition, the assumed universal chip with the high computing density would be so large in its physical dimensions that a practicable application would have to be ruled out.

This is where the invention comes in. Its object is to specify a cryptography method which, with sufficient security, permits a computing speed high enough to make a commercially practicable application of the RSA method, which is known per se, possible. In addition, the invention is intended to provide a cryptography processor for carrying out the method which meets these requirements with a small, handy design and small chip dimensions.

In terms of the method, the invention achieves this aim, for the cryptography method named in the preamble of claim 1, by the features named in the characterizing part of claim 1, the requirements of a digital interface of an ISDN network being met.

An essential aspect of the invention is the novel application of a look-ahead algorithm to the division. It is this step that makes it possible to apply a look-ahead method to the multiplication as well. As a result, only simple additions and subtractions are required when several bits are processed, i.e. no additional multiplication is needed.

Look-ahead algorithms have become generally known through the reference "Automatic Digital Calculators" by Booth, A. D. and Booth, K. H., Academic Press, Inc., New York, 1956, and through the book "Logischer Entwurf digitaler Systeme" by W. Giloi and H. Liebig, Springer Verlag, 1980, in particular pages 177-178.
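To make the idea of look-ahead multiplication concrete, the following sketch uses classic Booth-style recoding: runs of identical multiplier bits are skipped with shifts alone, and an addition or subtraction is needed only where a run of ones begins or ends. This is an assumption chosen for illustration; the specific look-ahead algorithms of the invention are not spelled out in this section, and the function name `booth_multiply` is hypothetical.

```python
def booth_multiply(a: int, m: int):
    """Multiply a by a non-negative multiplier m using only shifts, adds and subtracts.

    Returns (product, number of additions/subtractions actually performed).
    """
    product, operations = 0, 0
    previous_bit = 0                       # imagined bit to the right of the LSB
    # One extra position covers the implicit leading 0 above the top bit of m.
    for i in range(m.bit_length() + 1):
        bit = (m >> i) & 1
        if bit == 1 and previous_bit == 0:     # a run of ones starts: subtract
            product -= a << i
            operations += 1
        elif bit == 0 and previous_bit == 1:   # a run of ones ends: add
            product += a << i
            operations += 1
        previous_bit = bit                     # inside a run: shift only
    return product, operations


# The multiplier 0b1111110000 holds six ones but costs only two operations.
print(booth_multiply(123, 0b1111110000))
```

For multipliers with alternating bits, such as 0b101, this simple recoding needs more operations than plain shift-and-add, so practical look-ahead schemes combine several rules; the sketch shows only the basic principle.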

In detail, the advantages of the new method rest on the fact that the entire cryptography algorithm is successively broken down into smaller steps until each computing step corresponds directly and simply to a hardware design.

Advantageous developments and expedient refinements of the invention are specified in the subclaims.

A comparison of the individual operations already shows the advantages of the arrangement according to the invention over the prior art. Converting the exponentiation into a sequence of multiplications, with a modulo operation carried out after every single multiplication, prevents the intermediate results from growing to astronomical size, as they otherwise would in an exponentiation (D and E each have 200 decimal digits).
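The effect of reducing after every multiplication can be sketched with the binary square-and-multiply scheme. That scheme is assumed here as one common way of turning an exponentiation into a sequence of multiplications; the text itself does not name it, and the function name `mod_exp` is hypothetical.

```python
def mod_exp(base: int, exponent: int, modulus: int) -> int:
    """Compute base**exponent mod modulus as a chain of modular multiplications."""
    result = 1 % modulus
    base %= modulus
    for bit in bin(exponent)[2:]:              # exponent bits, most significant first
        result = (result * result) % modulus   # square, then reduce immediately
        if bit == "1":
            result = (result * base) % modulus  # multiply, then reduce immediately
    return result


# Every intermediate value stays below modulus**2, however large the exponent is.
print(mod_exp(65, 17, 3233) == pow(65, 17, 3233))
```

An exponent of 200 decimal digits has roughly 664 bits, so this scheme needs on the order of a thousand modular multiplications rather than a single astronomically large power.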

Because, in addition, the multiplication is broken down into individual steps in which it is converted into a sequence of additions, the calculation can be performed faster. The implementation also requires less chip area.

The reduction of the modulo operation to a sequence of subtractions, explained further below, is computed with the same addition logic, because a subtraction can be treated as an addition with the sign reversed.
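The reuse of the adder for subtraction can be illustrated with a minimal sketch. It assumes a two's-complement representation and a register width of 16 bits; both are assumptions made here for illustration, since the text does not state the number representation, and the names `WIDTH`, `MASK`, `negate` and `subtract_via_add` are hypothetical.

```python
WIDTH = 16                    # assumed register width (illustrative only)
MASK = (1 << WIDTH) - 1       # keeps results inside the register width


def negate(x):
    """Two's-complement negation: invert all bits, then add 1."""
    return (~x + 1) & MASK


def subtract_via_add(a, b):
    """Compute a - b using only an addition of the negated operand."""
    return (a + negate(b)) & MASK


print(subtract_via_add(1000, 1))
```

The same adder therefore serves both operations; only the second operand is negated before it enters the adder.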

Since the multiplication is carried out in the ring over N, a modulo operation can be carried out after each addition. This too avoids large numbers and saves considerable computing time. All numbers in every step are now smaller than N. In this way the maximum required memory size is reduced to the length of N, which halves the required chip area.

The advantageous use of look-ahead algorithms is of considerable importance for the cryptography method, because it further reduces considerably the maximum number of additions required for the multiplications and of subtractions required for the modulo operations. The introduction of look-ahead algorithms pays off in computing speed, however, only through the introduction of the new look-ahead algorithm according to the invention for the modulo operation, because merely applying known look-ahead algorithms to the multiplications would yield no time saving at all. Only when the average reduction in computing effort for the modulo operation matches the reduction achievable for the multiplication does an optimal algorithm result, which reduces the computing time to about one third. This advantageous saving in computing time is connected with the "swimming" ("Schwimmen") explained further below: while Z is shifted absolutely, N is shifted relative to Z, so that the two shift rates are decoupled from one another.

The last advantage in the chain of method steps consists in combining the addition resulting from the multiplication and the subtraction resulting from the modulo operation into a single operation, the 3-operand addition. With its help the cycle time need not be extended, because the 3-operand addition takes the same time as a simple addition. This doubles the computing speed.

On the one hand, a time saving results from the advantageous application of mathematical transformations to the individual method steps; on the other hand, the architecture of the cryptography processor according to the invention yields further improvements. Through organisation into tree-like structures, the individual elements can process a larger amount of information simultaneously. This too shortens the computing time considerably.

With the block structure according to the invention, the computing time for a 660-bit addition is reduced to that of a 20-bit addition.

In summary, the cryptography processor according to the invention performs encryption and decryption at a rate of 64,000 bits per second. This also holds in the worst case, in which the key has the maximum length of 660 bits.

The advantages achievable with the cryptography method according to the invention, and with the corresponding cryptography processor, become evident when its overall efficiency ("Über-alles-Effizienz") is compared with that of a software implementation, e.g. on a 16-bit-wide bit-slice processor (BSP) specially tailored to the encryption task. BSPs available today have a clock frequency of about 10 MHz. In one cycle they can add or subtract two 16-bit words and simultaneously shift the result by 1 bit. They have no barrel shifter, so a shift operation must be carried out serially. Under these conditions a look-ahead has a negative effect on timing. For the number A of cycles needed to execute one operation, it is assumed that the BSP needs only one cycle for each step of its main loop. The loop consists of (cf. Fig. 3):

1. Z[i] := Z[i] + P[i], and
2. Z[i] := Z[i] + (-N[i]), shift Z[i] by 1 bit.

The microprogram of the BSPs can be designed so that the loop reduces to the second step when the first can be omitted on the basis of the test of the corresponding bit in the multiplier. Since the first step must be executed with a probability of 1/2, A has a value of 1.5. The comparison then yields:

The comparison shows that the cryptography processor according to the invention is superior by more than two orders of magnitude even to specialised, mutually tuned combinations of hardware and software. If the RSA algorithm is implemented purely in software on ordinary computers, differences of the order of a factor of 10,000 can be assumed, since the main loop cannot be executed nearly as efficiently.

Compared with the Rivest processor mentioned at the outset, the encryption time is reduced by a factor of 50 despite a key length that is about 30% greater. This enormous increase in overall efficiency was made possible both by the application of new method steps according to the invention and by a special chip architecture.

The cryptography processor according to the invention operates as a so-called coprocessor. It has two DMA channels for data I/O and an 8-bit-wide data bus. The coding unit and the I/O unit operate in parallel. It constitutes a "cryptography box" in which encryption and decryption cannot be influenced from outside, in which a signature is generated, and to which the keys need only be transmitted in encrypted form.

In the following, the cryptography method according to the invention and the cryptography processor based on it, with which data are encrypted and decrypted according to the public-key code method of Rivest, Shamir and Adleman (RSA), are described in more detail with reference to an exemplary embodiment.

The first part of the description, which deals with the cryptography method according to the invention, explains the modifications made relative to the original RSA method. Frequent reference is therefore made to the original RSA algorithm. Since the further aim of the invention is to realise the method in an efficient VLSI layout, it is unavoidable to point out the corresponding hardware possibilities already while describing the method. In particular, complex operations are broken down into basic operations (adding, subtracting, shifting, etc.) until each step can be translated directly into the VLSI design that follows later. Considering the algorithm from the hardware point of view therefore plays an important role. Many step sequences of the RSA algorithm are replaced by more advantageous ones.

Since decryption is mathematically identical to encryption, the two processes are not treated separately in the following.

The drawings show:

Fig. 1a a flow diagram of an exponentiation algorithm for decrypting or encrypting a datum according to the original RSA method,

Fig. 1b a flow diagram of an exponentiation algorithm for decrypting and encrypting the same datum as in Fig. 1a according to the method of the invention,

Fig. 2 a flow diagram of the serial algorithm for the multiplication required in Fig. 1, the factors being elements of the natural numbers,

Fig. 3a a flow diagram of the multiplication algorithm with an additional modulo step, whereby the factors here are elements of a residue class ring over N,

Fig. 3b a flow diagram as in Fig. 3a, with the modulo calculation reduced to a subtraction,

Fig. 4 a flow diagram of a look-ahead algorithm for the multiplication which computes the look-ahead parameters serially,

Fig. 5 a flow diagram of a look-ahead algorithm for the modulo operation which computes the look-ahead parameters serially,

Fig. 6a a flow diagram as in Fig. 1b, with the multiplication and the subsequent modulo operation combined into one MultMod step,

Fig. 6b a flow diagram of the MultMod method according to the invention used in Fig. 6a, carried out with look-ahead,

Fig. 7 an elementary cell for realising a MultMod loop in one step,

Fig. 8 the combination of four elementary cells as in Fig. 7 into a 4-cell block with a hierarchical carry look-ahead (CLA) element,

Fig. 9 the hierarchical combination of five 4-cell blocks as in Fig. 8 into a 20-cell block,

Fig. 10 a complete encryption unit with several 20-cell blocks as in Fig. 9 and with a control unit,

Fig. 11 the block diagram of a cryptography processor,

Fig. 12 a block diagram of a control unit as in Fig. 10, designed according to the look-ahead algorithms of Figs. 4 and 5,

Fig. 13 a hierarchical carry look-ahead element as used in the 4-cell blocks of Fig. 8,

Fig. 14 the interconnection of the carry look-ahead elements of Fig. 13 within the hierarchy of the 20-cell blocks,

Fig. 15 a state diagram for some step sequences of the uppermost 20-cell block, which is designed as a buffer in Fig. 10,

Fig. 16 a schematic block structure illustrating the information flow, and

Fig. 17 a floor plan of the arrangement of elementary cells on a chip.

With reference to Fig. 1, the method step for exponentiation is explained first. In the decomposition into simple basic operations, the exponentiation is broken down into an average of 1.5 · L(E) multiplications. E is the exponent, and L(x) is defined as

L(x) := number of binary digits of x.

The time complexity of the algorithm is O(L(E)). Its basic idea is to represent the exponent in binary, that is, to convert it into a sum of powers of two. Using the laws of exponents, the sum in the exponent is transformed into a product of powers of the number P that is to be exponentiated. The e-th factor has as its exponent either the e-th power of two or zero, depending on whether the e-th position of the original exponent holds a 1 or a 0. The factors are thus squares or the number 1.

$$P^{2^{e+1}} = P^{2 \cdot 2^{e}} = \left(P^{2^{e}}\right)^{2}$$
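As a worked illustration of this decomposition, assume the exponent E = 13 (a value chosen here only as an example, not taken from the source). Its binary form has a 1 at positions 3, 2 and 0 and a 0 at position 1, so the factor belonging to position 1 is the number 1:

```latex
E = 13 = (1101)_2 = 2^{3} + 2^{2} + 2^{0}
\quad\Longrightarrow\quad
P^{13} = P^{2^{3}} \cdot P^{2^{2}} \cdot 1 \cdot P^{2^{0}}
```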

By definition, the position of the least significant bit is designated zero. The most significant bit is therefore at position L(E) - 1.

The (e + 1)-th square is easily computed by squaring the e-th. It is therefore advantageous to reserve a separate register C for the product. The content of register P is then squared in each step and stored back in P. After the squaring, P contains the e-th square, since the (e - 1)-th square had been entered in P after the (e - 1)-th step.

The intermediate register C initially holds 1. If position e of the exponent E holds a 1, C is multiplied by P in the e-th step and the result is stored back in C; otherwise C is not changed. Since register P contains the e-th square at this point, the product of squares described above, which equals P to the power E, is computed. After the last step the result is in register C.
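The register scheme described above can be sketched as follows. This is a minimal illustration without any modular reduction, not the patent's flow diagram itself; the function name `power_lsb_first` and the multiplication counter are hypothetical additions for illustration.

```python
def bit_length(x):
    """L(x): number of binary digits of x."""
    return x.bit_length()


def power_lsb_first(p, e):
    """Return (p**e, multiplication count) using registers C and P."""
    c = 1                              # register C initially holds 1
    mults = 0
    for i in range(bit_length(e)):     # position 0 is the least significant bit
        if (e >> i) & 1:               # a 1 at position i: fold the square into C
            c = c * p
            mults += 1
        p = p * p                      # P now holds the next square
        mults += 1
    return c, mults


result, mults = power_lsb_first(3, 13)
print(result, mults)
```

Each pass always squares P and multiplies C only when the tested bit is 1, which is where the average of 1.5 · L(E) multiplications comes from when about half of the exponent bits are set.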

The RSA method shown in the flow diagram of Fig. 1a contains the exponentiation algorithm in its upper part. In the lower part of Fig. 1a, C mod N is computed in the final step, i.e. the last step of the RSA algorithm. Because residue class arithmetic was dispensed with during the exponentiation, C has taken on an astronomical number of digits for large numbers.

This is prevented by the algorithm of the method according to the invention shown in Fig. 1b. It uses the congruence law

(a mod c) · (b mod c) ≡ (a · b) mod c.

Each product that arises is mapped by the modulo calculation onto the representative of its residue class. The representative is the element of the residue class that is also an element of the ring. This element is unique, i.e. each residue class contains only one element that satisfies the condition.

With respect to the algorithm,

a, b are products from step e - 1, and
c is the modulus.

The statement of the congruence is: the result of the calculation in step e falls into the same residue class

a) if the products from step e - 1 are first mapped onto their representatives and the representatives are then multiplied together in step e, or
b) if the products from step e - 1 are multiplied together in step e and this product is then mapped onto its representative.

Case a is applied to the products in every loop pass of the right-hand algorithm (Fig. 1b). Case b is realised in the left-hand algorithm (Fig. 1a), but only once, as the last step of the algorithm. Because of the constant mapping, the registers used in the right-hand algorithm have a predictable size. The numbers they must store are at most twice as long as the modulus. This is the case in the interval between the multiplication and the modulo calculation.
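The bounded register size of case a can be made visible with a short sketch that reduces after every multiplication and records the widest intermediate value. It is an illustration under assumptions, not the flow diagram of Fig. 1b: the name `modexp_reduced`, the bookkeeping variable `widest`, the initial reduction of P and the sample numbers are all introduced here.

```python
def modexp_reduced(p, e, n):
    """Exponentiate with a modulo step after every multiplication.

    Returns (p**e mod n, widest intermediate value in bits).
    """
    c = 1 % n
    p %= n                                   # assumed: input already lies in the ring
    widest = 0
    for i in range(e.bit_length()):
        if (e >> i) & 1:
            prod = c * p                     # at most twice the length of n
            widest = max(widest, prod.bit_length())
            c = prod % n                     # map back onto the representative
        square = p * p
        widest = max(widest, square.bit_length())
        p = square % n
    return c, widest


n = 1000003
c, widest = modexp_reduced(1234567, 65537, n)
print(c == pow(1234567, 65537, n), widest <= 2 * n.bit_length())
```

The unreduced products exist only between a multiplication and the following modulo step, which is why twice the length of the modulus suffices as register width.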

The individual steps of the transformed algorithm cannot be decomposed further at the level of exponentiation. Nor is there any longer a need to do so. The chip area required by the exponentiation has acquired an upper bound through the constant modulo calculation.

The method step for the multiplication is now explained with reference to Fig. 2. In the method according to the invention, the multiplication is performed by a serial algorithm. It breaks the multiplication down into L(M) shift operations and an average of 0.5 · L(M) additions. In the following, M denotes the multiplier.

The space requirement depends linearly on L(M), because an arithmetic logic unit (ALU) of size L(M) is provided for this algorithm, so that the addition takes place in one step. The same then applies to the shift operation. Both therefore require a constant time for their operation. In addition, the addition can be carried out in parallel with the shift operation. For the time complexity it follows that

T_Mul = L(M) · max(T_Shift, T_Add) = c · L(M).

Like the space requirement, it depends linearly on the length of M. However, if the space required exceeds what can be integrated, i.e. if M and hence L(M) exceeds a certain value, this algorithm can be used only in modified form. Since this is not the case for the envisaged size of 660 binary digits (200 decimal digits), this problem is not discussed in this exemplary embodiment.

The serial algorithm for the multiplication, shown as a flow diagram in Fig. 2, is based on the binary representation of an input parameter, in a similar way to the exponentiation algorithm described above with reference to Fig. 1. Here that parameter is the multiplier M.

The multiplication is broken down into additions. In step m, P is added to the intermediate result Z if the (L(M) - m)-th position of the multiplier holds a 1; otherwise Z remains unchanged. The loop is then executed another (L(M) - m) times. Because Z is doubled at the start of each loop pass, the sum Z + P of the m-th step is doubled (L(M) - m) times. This corresponds to multiplication by the power of two.

In summary, the algorithm exploits the fact that multiplication by a binary digit yields either the multiplicand itself or zero. It also reduces the multiplication by a power of two required in each step to a doubling of Z. In binary representation, doubling is a simple shift operation by one bit to the left (by definition, the least significant bit is on the right).
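The serial multiplication summarised above can be sketched in a few lines. This is a minimal illustration over the natural numbers, with no modulo step; the function name `multiply_serial` and the variable names are hypothetical and not taken from Fig. 2.

```python
def multiply_serial(p, m):
    """Multiply p by m using only one-bit left shifts and additions."""
    z = 0                                   # intermediate result Z
    length = m.bit_length()                 # L(M)
    for step in range(1, length + 1):       # step m = 1 .. L(M)
        z <<= 1                             # double Z at the start of the pass
        if (m >> (length - step)) & 1:      # test position L(M) - m of the multiplier
            z += p                          # add the multiplicand P
    return z


print(multiply_serial(13, 11))
```

Scanning the multiplier from its most significant bit means that a sum formed in step m is doubled in each of the remaining L(M) - m passes, which supplies the required power of two without any explicit multiplication.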

The modulo method step is shown schematically in Fig. 3. During the exponentiation, a modulo operation must be carried out after each multiplication in order to obtain a number from the residue class ring that is congruent to the product. The algorithm described in Fig. 2 treats the two factors as elements of the natural numbers, not of the residue class ring over N. For this reason a modulo step is carried out in the exponentiation algorithm after each multiplication.

According to the method of the invention, the multiplication too is carried out in this residue class ring. For this purpose the conventional algorithm was changed at one point: at the end of the loop, the intermediate result Z is mapped onto its representative.

This is necessary because Z was first doubled and then, in the worst case, P was added to it. Z can therefore have a value at the end of the loop that is greater than or equal to the modulus N.

If, however, a modulo step is added at the end, Z always has a value within the permitted number range of the ring after leaving the loop. The congruence law that allows the modulo step to be moved from the exponentiation algorithm into the multiplication algorithm is

$$(a \bmod c) + (b \bmod c) \equiv (a + b) \bmod c.$$

As with exponentiation, the congruence states that the result of the calculation in step m falls into the same residue class,

a) if the sum obtained in step m - 1 is mapped to its representative and the calculation in step m continues with that representative, or
b) if in step m something is added to the sum from step m - 1 and only then is this sum mapped to its representative.

Since the multiplication has been converted into a sequence of sums, the conclusion is: multiplying two numbers and then performing the modulo calculation gives the same result as reducing modulo immediately after each addition in the decomposed multiplication.
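
The equivalence can be checked with a short sketch. The Python below is illustrative only; the function names `mult_then_mod` and `mult_with_mod_each_step` are hypothetical and do not come from the patent. Both functions process the multiplier bit by bit from the most significant end, doubling Z and conditionally adding P; one reduces once at the end, the other after every pass.

```python
def mult_then_mod(multiplier: int, p: int, n: int) -> int:
    """Shift-and-add product, reduced modulo n once at the very end."""
    z = 0
    for bit in bin(multiplier)[2:]:   # multiplier bits, most significant first
        z = 2 * z                     # doubling is a shift left by one bit
        if bit == "1":
            z = z + p                 # P times a binary digit is P or 0
    return z % n


def mult_with_mod_each_step(multiplier: int, p: int, n: int) -> int:
    """Same loop, but Z is mapped to its representative after every pass."""
    z = 0
    for bit in bin(multiplier)[2:]:
        z = 2 * z
        if bit == "1":
            z = z + p
        z = z % n                     # reduction moved inside the loop
    return z
```

Both variants return the same residue, which is what the congruence law above asserts for every intermediate sum.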

The intermediate result Z in the flow chart of Fig. 3a cannot take arbitrarily large values within the loop if it was smaller than N on entering the loop:

$$N > Z,\ P \;\Rightarrow\; 3 \cdot N > Z := 2Z + P.$$

In the method according to the invention, the conventional modulo calculation is replaced by a single subtraction.

If Z is greater than or equal to N at the end of the loop, N or 2N is simply subtracted from Z, and the value of Z is again smaller than N. These steps are included in the flow chart of Fig. 3b.
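
A minimal sketch of this combined step, assuming 0 <= P < N so that 2Z + P stays below 3N. The function name `mult_mod` and the argument checks are illustrative additions and are not taken from Fig. 3b.

```python
def mult_mod(multiplier: int, p: int, n: int) -> int:
    """Return (multiplier * p) mod n using one subtraction per loop pass.

    Assumes 0 <= p < n; then Z < N on loop entry implies 2Z + P < 3N.
    """
    if not 0 <= p < n:
        raise ValueError("p must satisfy 0 <= p < n")
    z = 0
    for bit in bin(multiplier)[2:]:   # most significant bit first
        z = 2 * z                     # shift left by one bit
        if bit == "1":
            z += p
        # Z is now below 3N, so subtracting N or 2N restores Z < N.
        if z >= 2 * n:
            z -= 2 * n
        elif z >= n:
            z -= n
    return z
```

The division hidden in a general modulo operation never appears; each pass needs at most one comparison-controlled subtraction.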

No additional logic is needed for the subtraction, because after negating the subtrahend it becomes an addition and can be computed with the adder logic:

$$a - b = a + (-b)$$

A number is negated by inverting every single bit and then adding 1. In a VLSI design the inversion can be realized with one inverter per bit. However, since the memory cell holds both the bit and the inverted bit, no additional inverter is needed.

When two numbers are added, no carry bit is passed to the least significant bit position. For a subtraction, the inverted memory bits are applied to the adder logic and, at the same time, a carry bit is signalled to the least significant bit position.
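
The carry-in trick can be modelled in a few lines. This is a behavioural sketch of a fixed-width adder, not a description of the actual circuit; the names `add_with_carry` and `subtract` and the `width` parameter are assumptions made for the example.

```python
def add_with_carry(a: int, b: int, carry_in: int, width: int) -> int:
    """Model of a width-bit adder: (a + b + carry_in) truncated to width bits."""
    mask = (1 << width) - 1
    return (a + b + carry_in) & mask


def subtract(a: int, b: int, width: int) -> int:
    """Compute a - b on the adder: apply the inverted bits of b and set carry-in to 1."""
    mask = (1 << width) - 1
    inverted_b = ~b & mask            # bitwise inversion, available from the memory cell
    return add_with_carry(a, inverted_b, 1, width)
```

For example, `subtract(13, 5, 8)` yields 8, and a negative difference appears in two's complement form within the chosen width.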

This method step, which combines the multiplication with the modulo operation in the manner according to the invention, is called MultMod in the following.

The increase in computing speed obtained by look-ahead methods can be explained with reference to Fig. 4. Analyzing the MultMod algorithm with the possibilities of parallelization in mind shows that many steps are carried out needlessly. More precisely, entire loop passes (cycles) can be omitted if, apart from the first two unavoidable steps, none of the conditional steps has to be carried out.

If a cycle is dropped, no step of the loop is executed, not even the unconditional ones. This must be taken into account in the next cycle that is not dropped. First, however, the number of cycles that can be skipped must be calculated. Let sz - 1 be the number of skipped cycles ("sz" is the shift amount of the multiplication and keeps this meaning in what follows). With this information, the first two steps of the skipped cycles can be carried out together in the current cycle (sz - 1 skipped cycles plus the current cycle give sz cycles):

1. Z is shifted to the left not by 1 bit (doubling) but by sz bits.
2. m is increased not by 1 but by sz.

Shifting by sz bits can be done in one step with a barrel shifter (Conway, L., Mead, C., Introduction to VLSI Systems, Addison-Wesley Publishing Company, Inc., 1980).

Methods that make it possible to skip superfluous steps are called look-ahead methods. Such methods must be designed separately for each algorithm after careful analysis. Above all, it must be checked whether the expected time saving is greater than the time needed to compute the skippable cycles. In the intended hardware implementation of the entire algorithm, nothing diminishes the time saving, because the look-ahead parameters are computed in parallel with the longest step, the addition.

A look-ahead algorithm for multiplication has long been known. It has two states:

1. LA = 0, zeros in the multiplier are ignored and
2. LA = 1, ones in the multiplier are ignored.

The sequence of steps of the algorithm is:

1. Set sz := 1.
2. Set m := m + 1.
3. Set a := 1 - 2 LA.
4. Consider the 3-bit string M[L(M) - m .. L(M) - m - 2]. Until finished, execute the rule given in the table row selected by the 3-bit string and the value of LA.
5. The following must be carried out in the MultMod algorithm:
   a. Shift Z left by sz bits.
   b. Set Z := Z + a P.

The Roman numerals in parentheses after the "Done" instruction name the rules of that row; the first numeral applies to LA = 0 and the second to LA = 1. The 3-bit strings marked "Impossible" cannot occur, because rule II or III would already have been applied in the preceding step. The variable "a" serves only as temporary storage for the information whether or not P is negated in the addition step. No multiplication takes place in this step in the implementation, since a can only take the values +1 and -1. The shifting of Z and the incrementing of m have already been explained.

The look-ahead rules are easy to understand with the help of the decomposition of the multiplication into a sum. It reads

In the following calculations, "s" denotes the position, relative to L(M) - m, at which the first bit not equal to LA occurs in the multiplier, counted from position L(M) - m - 1 onward.

Rule I states: if an isolated 1 occurs in a 0-string, add P to Z at this position. Expressed mathematically:

The summands of the second sum are zero except for λ = m + s, because the binary digits L(M) - m - 1 .. L(M) - m - s of the multiplier read 00...01.

Rule II is easiest to understand in conjunction with rule III. Rule II switches from a 0-string to a 1-string if the first 1 is followed by at least one more 1. P is added to Z at the position of the last 0. Rule III is the dual of rule II. It switches from a 1-string to a 0-string if the first 0 is followed by at least one more 0. P is subtracted from Z at the position of the last 1. Let s1 be the position of the first 1 and s0 the position of the first 0 following it, both counted relative to m; then:

Rules II and III follow directly from the last line. The positions between the two switching points need not be considered, provided they are all 1. An example makes this clearer.

Two additions (and thus two cycles) are saved if the bits of the multiplier are not processed one after another but a 1-string is instead regarded as a geometric sum, the sum is computed, and the insignificant bits are skipped.
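
The geometric-sum argument can be illustrated as follows. The example multiplier (a run of four 1-bits) and the function names are chosen here for illustration and are not taken from the patent's own example. A run of ones from bit position `lo` up to `hi` contributes P·(2^lo + ... + 2^hi), which equals P·2^(hi+1) - P·2^lo.

```python
def run_bit_by_bit(p: int, hi: int, lo: int) -> int:
    """Contribution of a run of 1-bits, one addition per bit."""
    total = 0
    for i in range(lo, hi + 1):
        total += p << i               # one addition for every 1 in the run
    return total


def run_as_geometric_sum(p: int, hi: int, lo: int) -> int:
    """Same contribution with one addition above the run and one subtraction at its end."""
    return (p << (hi + 1)) - (p << lo)
```

For the multiplier 011110 in binary (`hi = 4`, `lo = 1`), the bit-by-bit version performs four additions, while the second version needs one addition and one subtraction, which saves two operations.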

That leaves rule IV. It states: if an isolated 0 occurs in a 1-string, subtract P from Z at this position. This gives:

From the point of view of the look-ahead method, after P has been subtracted at position L(M) - (m + s), it looks as if the 1-string were uninterrupted. The look-ahead can continue.

The look-ahead algorithm for multiplication shown in the flow chart of Fig. 4 computes the parameters serially and therefore offers no time advantage over the version without look-ahead, since only one bit of the multiplier can be tested per cycle. The flow chart thus serves only to turn the rules into a working algorithm. The algorithm is nevertheless intended for hardware implementation; there, the look-ahead parameters are computed in a single step, in parallel with the operations of the other arithmetic units, so that the parameters for the next cycle are available immediately at the end of a cycle.

In this algorithm, the shift amount sz can be at most cur k ("cur k" is the name of a variable). The shift amount specifies the number of positions by which a register is shifted. A maximum shift amount is not required by theory, but it is required in practice. The barrel shifter, which moves Z to the computed position in one step, can do so only up to a maximum amount "k", which must be fixed in the design. k is the maximum value that cur k can take. The value of cur k is set by the modulo look-ahead algorithm, which remains to be designed.

If no rule has been applied by the time sz = cur k, Z is shifted to the left by k bits and a is given the value 0, i.e. P is neither added nor subtracted. Work that could be done in one step must then be split into several steps for reasons of cost and space. Which value of k is a good compromise between increasing the speed and reducing the required area is clarified in one of the following sections.
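
The effect of the cap on the shift amount can be seen in a deliberately simplified sketch. It models only the state LA = 0 (zeros are skipped, P is only ever added) and omits rules II to IV, the modulo step and the dynamic value cur k; the function name and the cycle counter are assumptions made for this illustration.

```python
def multiply_skip_zeros(multiplier: int, p: int, k: int):
    """Return (multiplier * p, cycles), skipping runs of 0-bits up to k positions per cycle."""
    if k < 1:
        raise ValueError("k must be at least 1")
    bits = [int(c) for c in bin(multiplier)[2:]]   # most significant bit first
    length = len(bits)
    z = 0
    done = 0                     # bits processed so far (the counter m in the text)
    cycles = 0
    while done < length:
        limit = min(k, length - done)
        sz = 1
        while sz < limit and bits[done + sz - 1] == 0:
            sz += 1              # look ahead past zeros, capped by the shifter width
        a = bits[done + sz - 1]  # 0 if the cap was reached without finding a 1
        z = (z << sz) + a * p    # barrel shift by sz, then conditional addition
        done += sz
        cycles += 1
    return z, cycles
```

With the multiplier 1001 in binary, a shifter width of k = 8 finishes in two cycles, while k = 2 needs three because one cycle ends at the cap with a = 0.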

The look-ahead extension to the modulo step now follows (Fig. 5). Since the multiplication was moved into the residue class ring for reasons of efficiency, the improvement of the multiplication by look-ahead found in the previous section is of little use unless a method for generating look-ahead parameters can also be introduced for the modulo step. Otherwise the modulo step slows down the whole process, because it can still process only one bit per cycle. The method is also subject to a constraint: the expected value of the shift amount per cycle should approximately match that of the multiplication algorithm. If it does not, one look-ahead algorithm slows down the other.

The algorithm according to the invention, which satisfies the required conditions and combines the rules mentioned into one algorithm, is shown in the flow chart of Fig. 5. As before, and for the same reasons, this algorithm is described serially. In the hardware implementation it has the same properties as the look-ahead algorithm for multiplication.

The boundary conditions for the modulo look-ahead algorithm require that the expected values of the two per-cycle shift amounts match. So far N has not been shifted: in each cycle Z was shifted one bit to the left and N, relative to Z, one bit to the right, i.e. N kept its position. The look-ahead method, in contrast, generates a parameter sn that specifies by how many bits N is to be shifted to the right relative to Z. In the following, "sn" is the current shift amount of the modulo look-ahead algorithm, and "n" indicates by how many binary places N is shifted to the left in absolute terms at a given time. Register N therefore holds multiples of the modulus N. For that reason, Z will usually not be a residue class representative at the end of the MultMod loop. Three cases can occur:

1. sn > sz: N is shifted to the right in absolute terms by sn - sz places. Care must be taken that n does not become less than 0, since the calculation would otherwise use fractions of N rather than multiples of N, which would destroy the congruence. sn is therefore computed so that n is at least 0.
2. sn = sz: nothing needs to be taken into account.
3. sn < sz: N is shifted to the left in absolute terms by sz - sn places. If n could exceed a fixed value MAX in the next step, the multiplication look-ahead algorithm must be slowed down so that n in any case takes a value less than or equal to MAX. This prevents n from becoming arbitrarily large.

The limitation of the maximum shift amount of Z can be seen at the end of the algorithm of Fig. 5. cur k is set so that in the worst case of the next step (sz = cur k and sn = 1), n just reaches the value MAX.
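
Read together with the three cases above, this rule suggests a simple bound. The sketch below assumes that n changes by sz - sn per cycle and that cur k is additionally capped by the shifter width k; the function name `next_cur_k` and the parameter name `max_n` (standing for MAX) are hypothetical.

```python
def next_cur_k(n: int, max_n: int, k: int) -> int:
    """Largest shift sz allowed in the next cycle so that n + sz - 1 <= max_n.

    Worst case assumed: sz = cur_k and sn = 1, so n grows by cur_k - 1.
    """
    if not 0 <= n <= max_n:
        raise ValueError("n must lie between 0 and max_n")
    return min(k, max_n - n + 1)
```

When n already equals MAX, the result is 1, so a worst-case cycle with sn = 1 leaves n unchanged.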

Regardless of the rules used, a look-ahead for the modulo calculation requires that calculations can be carried out with multiples of N. The advantage is that the multiplication and the modulo calculation generally do not impede each other. An impediment arises only when, in case 1 or case 3 above, the value of sn or sz is capped for one step. A disadvantage is that the registers must store numbers larger than the modulus. A buffer must be provided for this overflow in the VLSI design.

This disadvantage applies to only two registers, however: Z and N. In addition, the buffer size is bounded above by MAX. The VLSI design can therefore still assume constant sizes for all registers and now also for the buffer. How large the buffer should be so that the expected values do not drop too much is discussed in a later section.

The rules for computing the look-ahead parameters remain to be described. They require the number ZDN, which is defined as two thirds of the value in register N:

$$ZDN = \tfrac{2}{3} \cdot N$$

The algorithm is:

1. Set sn := 0.
2. Set b := 0.
3. As long as the absolute value of Z is less than or equal to ZDN, do:
   a. set sn := sn + 1,
   b. set n := n - 1 and
   c. shift ZDN to the right by 1 bit, i.e. divide ZDN by 2.
4. Set b := 2 Z[sign] - 1. If the sign bit has the value 0, Z is positive; otherwise Z is negative.
5. The following must be carried out in the MultMod algorithm:
   a. Shift N to the right by sn bits relative to Z.
   b. Set Z := Z + b N, i.e. if Z is positive, N is subtracted from Z; otherwise N is added to Z.
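
The steps above can be sketched in Python under several stated assumptions: the comparison with ZDN is done exactly (as 3·|Z| <= 2·N) rather than with a narrow comparator, the loop stops when n reaches 0 as required by case 1, b is left at 0 when Z is still at most ZDN after the loop, and the shift of Z by sz is left out. The function name and the returned tuple are illustrative.

```python
def modulo_lookahead_step(z: int, modulus: int, n: int):
    """One reduction step; returns (z_new, n_new, sn, b).

    Register N is modelled as modulus * 2**n, so every value added to or
    subtracted from Z is a multiple of the modulus.
    """
    sn = 0
    b = 0
    reg = modulus << n                        # current content of register N
    # ZDN = 2/3 of register N; 3*|Z| <= 2*reg keeps the comparison in integers.
    while n > 0 and 3 * abs(z) <= 2 * reg:
        sn += 1
        n -= 1
        reg >>= 1                             # N and ZDN are shifted right together
    if 3 * abs(z) > 2 * reg:
        sign_bit = 1 if z < 0 else 0
        b = 2 * sign_bit - 1                  # -1 for positive Z, +1 for negative Z
    z = z + b * reg                           # applies -N, 0 or +N
    return z, n, sn, b
```

For instance, `modulo_lookahead_step(100, 7, 5)` shifts the register once (from 224 to 112) and returns Z = -12, which is congruent to 100 modulo 7 and at most one third of the shifted register in absolute value.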

In the last step, no multiplication takes place in the hardware implementation, since b can only take the values -1, 0 and +1: the value -N, 0 or +N is applied to the adder logic. Computing ZDN poses no difficulty either, because ZDN is not recomputed every time; it is computed only once, at key transfer, and is then subjected to the same shift operations as N. The ratio of N to ZDN is thus preserved.

After key transfer, however, ZDN must be computed. Two thirds in binary is 0.101010101... . ZDN is therefore computed as follows:

1. Set Z := 0.
2. Set Z := Z + N.
3. Shift Z 2 bits to the left.
4. Jump back to step 2 if ZDN has not yet been computed precisely enough.

The last step contains a fuzzy termination condition. ZDN is determined exactly when every bit of the multiplier "two thirds" has been processed. The number of bits of "two thirds" that still influence the comparison of Z with ZDN equals the number of bits of the comparator that performs the comparison. The width of the comparator is in turn determined by the required accuracy of the comparison. As shown in the next section, 10 bits are more than sufficient. It follows that ZDN can be computed in a few steps.

As just mentioned, only a few of the most significant bits of ZDN are used for the comparison with Z. As a result, the comparator will occasionally deliver a wrong result, since a fully reliable comparison would have to take all L(N) bits into account. That is hard to implement for reasons of space. What weighs far more heavily, however, is that the comparison time would then be about as long as the normal addition time. A correct comparison would thus be a Pyrrhic victory.

But what are the consequences if the comparator has made a wrong decision? In that case sn has the value 1 in the next cycle, whereas it would otherwise have had a value greater than 1. The current shift amount of the next cycle thus deteriorates to 1. Proof:

If the comparator has delivered correct results, sn has been determined such that

holds. N is now shifted to the right by sn bits, i.e. N is divided by 2 to the power sn. Then N is added to Z if Z is negative; otherwise N is subtracted from Z. It follows that N is subtracted from the absolute value of Z. The result is stored in Z again.

Since the absolute value of Z is now less than or equal to one third of N, sn > 1 must hold in the next cycle:

This proves part 2.

If, on the other hand, the comparator makes a wrong decision, inequality I is not satisfied. A wrong decision is made, for example, when rounding errors in the calculation have made ZDN slightly smaller than two thirds of N. If Z then lies close to, but still below, two thirds of N, the comparison reports that ZDN is smaller than Z. In fact, this result should only have been reported one bit position later. It follows:

Precondition (1) is no longer satisfied. It follows

The effects of a comparator error are thus relatively harmless. This gives the designer wide latitude in choosing the comparator width, because a narrow comparator only increases the number of cycles through faulty comparisons; it does not falsify the calculation. A narrow comparator can therefore be designed that errs more often but still yields a speed gain because its result is available very quickly.

For this trade-off the error probability must be known. An error arises either from rounding in the last bit of ZDN or from bits that are less significant than the least significant comparator bit. Let "d" be the comparator width; the error rate ε is then

Put plainly, the expression says that an error can occur only if none of the more significant bits could decide the comparison. This is the case for one in 2^d numbers.

Since only the probabilistic expected values of the look-ahead methods coincide, the methods had to be decoupled from one another by the MultMod algorithm. In each cycle, Z is shifted absolutely and N is shifted relative to Z. This decoupling is called "swimming" here and is explained in more detail below in connection with Fig. 15.

The state of the cryptography processor at each point is shown by steps a to e in Fig. 15. Transitions a, c and e illustrate the shift operation, while b and d denote additions or subtractions that are not explained further. The rectangles (towers) shown represent the registers C (14), N (18) and Z (20, 22, 24) of the cryptography processor. The height of the rectangles is 660 bits + 20 bits: 660 bits is the maximum word length, and 20 bits is the size of the buffer that makes the decoupling possible.

If, for example, the shift amount of the multiplication is greater than the shift amount of the modulo operation, register N is shifted toward the upper end by the difference between the two shift amounts (cf. steps a and e); the uppermost bits of register N are thus partially pushed into the buffer.

In the opposite case, N is shifted toward the lower end (step c). How N is monitored so that it does not run beyond the buffer limits is explained further below.

Suppose that, before step a, N has already been shifted 10 bits into the buffer. The shift amount sz then takes the values "3", "1" and "2" in succession, as can be seen at the bottom of Fig. 15. The shift amount si, which defines the shift of N relative to Z, takes the values "2", "3" and "1" in succession. In absolute terms, N is therefore shifted by sn = sz - si, so sn is successively "1", "-2" and "1". After steps a, c and e, N has been shifted into the buffer by 11, 9 and 10 bits, respectively. This process is what was referred to above as "swimming".
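The arithmetic of this example can be retraced with a short Python sketch. The function name `buffer_offsets` and its parameters are illustrative assumptions and do not appear in the patent; only the numbers are taken from the example above.

```python
def buffer_offsets(start, sz_values, si_values):
    """Return how far N sits in the buffer after each shift step.

    start      -- bits N is already shifted into the buffer (10 in the example)
    sz_values  -- absolute shift amounts of Z, one per shift step
    si_values  -- shift amounts of N relative to Z, one per shift step
    """
    offsets = []
    offset = start
    for sz, si in zip(sz_values, si_values):
        sn = sz - si          # absolute shift of N in this step
        offset += sn          # N "swims" up or down inside the buffer
        offsets.append(offset)
    return offsets


# Values from the example for steps a, c and e.
print(buffer_offsets(10, [3, 1, 2], [2, 3, 1]))  # [11, 9, 10]
```

The sketch only tracks the offset; it does not model the limiters that keep N inside the buffer.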

Step c illustrates the influence of the look-ahead limit k for the case k = 3. Although the algorithm could have performed a shift of 4 for si, N has been shifted relative to Z by only 3 bits, and then by 1 bit in step e. Consequently, N is neither added to nor subtracted from Z in step d; that is, the sign b in Fig. 5 has taken the value "0".
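The effect of the limit k can be illustrated with a deliberately simplified Python sketch. It is not the algorithm of Fig. 5, which is not reproduced here; the function `split_shift` and the boolean flag are assumptions made only to show how a requested shift of 4 breaks into 3 + 1 with an idle addition in between.

```python
def split_shift(requested, k=3):
    """Split a requested relative shift into per-cycle shifts of at most k bits.

    Returns a list of (shift, b_is_zero) pairs. b_is_zero is True when the
    shift is still unfinished, so that N is neither added nor subtracted
    in the following addition step.
    """
    steps = []
    remaining = requested
    while remaining > 0:
        step = min(remaining, k)
        remaining -= step
        steps.append((step, remaining > 0))
    return steps


# A requested shift of 4 with k = 3: first 3 bits (then b = 0), then 1 bit.
print(split_shift(4))  # [(3, True), (1, False)]
```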

The method step of 3-operand addition is explained below. Fig. 6 contrasts the complete RSA method (Fig. 6a) with the now complete exemplary embodiment of the cryptography method according to the invention (Fig. 6b).

Here, the queries that were still contained in the previous version of the MultMod algorithm (Fig. 3) are replaced by calls to the two look-ahead algorithms. The look-ahead parameters are calculated in parallel, which is expressed by the parallel branches in which the calls take place.

In this version of the algorithm, Z may hold a negative value after the loop has been processed. A result correction must therefore be performed at the end of the MultMod algorithm with look-ahead: if Z is negative, then Z + N is positive. This additional step is included in the flow chart of Fig. 6b.
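As a minimal Python sketch of this final correction, assuming that Z lies strictly between -N and N when the loop ends (the function name `correct_result` is illustrative):

```python
def correct_result(z, n):
    """Map a possibly negative intermediate result into the range 0..n-1.

    Assumes -n < z < n, so that a single addition of n is sufficient.
    """
    return z + n if z < 0 else z


print(correct_result(-3, 11))  # 8
print(correct_result(5, 11))   # 5
```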

The multiplication step and the modulo step have also been combined into a single step, the 3-operand addition. In each step, the logic adds not two but three operands simultaneously, as shown below.

The 3-operand addition is divided into two stages. In the first stage, the sum of the three bits of operands A, B and C is formed at each binary position. The sum of A[i], B[i] and C[i] can take the values 0..3, so it can be represented in binary by the two bits S[1] and S[0]. Since the sum is formed at every position, two new numbers X and Y can be assembled from the two sum bits (i = 0 to max):

Y[i] := S[0], Y[max+1] := 0 and X[i+1] := S[1], X[0] := 0.

In the second stage, the two numbers are added in the usual way. The extension by one bit causes no problems, since the result is at least one bit shorter than the longest operand.
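The two stages can be sketched in Python as follows. This is an illustration for non-negative operands only; the function name `three_operand_add` and the explicit `width` parameter are assumptions, and the signs with which the operands enter the addition in the processor are not modelled.

```python
def three_operand_add(a, b, c, width):
    """Add three non-negative integers below 2**width in two stages."""
    x = 0
    y = 0
    # Stage 1: per-position bit sum S = A[i] + B[i] + C[i], value 0..3.
    for i in range(width):
        s = ((a >> i) & 1) + ((b >> i) & 1) + ((c >> i) & 1)
        y |= (s & 1) << i                # Y[i]   := S[0]
        x |= ((s >> 1) & 1) << (i + 1)   # X[i+1] := S[1], X[0] stays 0
    # Stage 2: ordinary two-operand addition of X and Y.
    return x + y


print(three_operand_add(5, 6, 7, width=4))  # 18
```

No carry travels between positions in stage 1, so all bit sums can be formed at once; only stage 2 needs carry handling.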

So that the addition logic does not consume too much power, the pull-up transistors have been omitted at several points. The logic is therefore in a metastable state. If it flips into a stable state during the addition, it can no longer leave that state on its own. At the end of a cycle, the logic must therefore be returned to the metastable initial state by an external precharge signal. The bit addition is inserted during this period.

The cryptography processor

This second part of the description of the invention deals with the block diagram and the resulting floor plan of the processor.

One sub-task of the invention is to present the structure of a specialized elementary cell (10) that optimally supports the RSA algorithm. This structure determines the block diagram of the processor, which contains enough information to design a matching floor plan of the processor.

How is the RSA algorithm supported efficiently? To answer this question, each step of the algorithm must be examined to determine whether it is executed rarely and/or takes little time, or whether the opposite is true. In the first case, it makes more sense to implement the step in a microprogram. The time-critical steps, by contrast, must be implemented in hardware within the elementary cell. The following steps are carried out at the level of the RSA algorithm (see the flow chart in Fig. 6):

1. The initialization; the start values are assigned to the two variables C and e.
2. The loop; it contains two queries and the jumps they induce, two calls to the MultMod algorithm and the incrementation of the variable e.

Each of the listed steps is either executed only once during the calculation, or the operation is limited to a few bits, e.g. the binary query of an exponent bit. They are also simple operations and therefore require hardly any computing time. The steps are therefore implemented by a microprogram. A simple control unit 42 performs this task (Fig. 11).

The variables denoted by a lowercase letter all have a counter function. They work closely with the control unit 42, since its decisions depend on the counters and, conversely, the counters are decremented or incremented depending on the decisions of the control unit. They must therefore be placed close to the control unit. This is possible because their length is ld(L(M)) bits (ld = logarithmus dualis, the base-2 logarithm). With a key length of 660 bits, they are 10 bits long.
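The counter length can be checked with a small Python sketch. Rounding the logarithm up to a whole number of bits is an assumption here, as is the function name `counter_bits`.

```python
def counter_bits(key_length):
    """Return ceil(log2(key_length)) using exact integer arithmetic."""
    return (key_length - 1).bit_length()


print(counter_bits(660))  # 10, since 2**9 = 512 < 660 <= 1024 = 2**10
```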

The same arguments apply to the queries and counters of the MultMod algorithm. These tasks are, however, handled by a separate control unit 36 (Fig. 10), since the calculation of the look-ahead parameters is highly time-critical for the course of the MultMod calculation. The timely generation of the shift amounts sz and sn, and of the information as to whether and with which sign P and N enter the 3-operand addition, has a significant influence on the cycle time.

The operations performed on the L(N)-bit numbers do not belong in this control unit. An elementary cell (Fig. 7) was designed both for storing these numbers in the immediate vicinity of the executing logic and for the logic itself.

The required registers and the logic of the elementary cell 10 follow from Fig. 7. The elementary cell 10 comprises a register 12, which contains a multiplier M, as well as a code register 14 and a data register 16. These are followed by a UD shift register 18, which contains N and shifts by -2 ... +2 bits.

Further components of the elementary cell 10 are a barrel shifter 20, a bit adder 22 and a full adder 24, which is followed by a carry-look-ahead element 26.

The RSA and MultMod algorithms together require five registers:

1. The register 12 for the respective multiplier M; the length of the register is L(N).
2. The register 14 for the encrypted datum; during the calculation it contains the variable C, whose value is the result of the encryption once the RSA algorithm has been processed. The length of the register is L(N).
3. The register 16 for the datum to be encrypted; during the calculation it contains the variable P, which is assigned the value of the datum to be encrypted at the start of the RSA algorithm. The length of the register is L(N).
4. The register 18 for the modulus N; during the calculation it contains a multiple of the modulus. The register therefore has the length L(N) + MAX. In addition to its memory function, this register can shift the variable N by several places in one step. In each cycle, N is shifted to the right by sn positions relative to Z. At the same time, Z is shifted to the left by sz places, i.e. N is shifted absolutely by sn - sz places to the right. sz and sn can take the values 1 ... 3 (the maximum look-ahead k has been set to 3). The absolute shift amount by which N is shifted to the right thus takes the values -2 ... 2. A negative shift amount to the right means that N is shifted to the left. Based on an LR shift register, the required function is implemented with the UD (Up/Down) shift register 18, which can shift by 1 bit in either direction in each half cycle, i.e. by 2 bits in a full cycle.
5. The register Z (comprising 20, 22, 24) for the intermediate result Z of the MultMod algorithm; this register is read at the beginning of each cycle and written with the new intermediate result at the end. The register therefore needs to store the variable Z only briefly. The simplest way to do this is to store each bit of Z dynamically on the input gate of an inverter. The length of the register is L(N) + MAX, because register 18 has this length and Z can take on its value. The register Z is to be understood as part of the full adder 24.
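The range of absolute shifts stated for register 18 can be confirmed by enumerating all combinations, as in the following Python sketch. The function name `absolute_shifts` is illustrative; sn here denotes the shift of N relative to Z, as in item 4 above.

```python
from itertools import product


def absolute_shifts(k=3):
    """Return all absolute right-shift amounts sn - sz for sn, sz in 1..k."""
    values = range(1, k + 1)
    return sorted({sn - sz for sn, sz in product(values, repeat=2)})


shifts = absolute_shifts(3)
print(shifts)  # [-2, -1, 0, 1, 2]

# The largest magnitude is 2 bits, which matches a register that moves
# 1 bit per half cycle in either direction.
print(max(abs(s) for s in shifts))  # 2
```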

The first three registers 12, 14 and 16 are designed as static memories, since they must store information over longer periods.

In addition to the registers discussed, the elementary cell design includes:

1. A barrel shifter 20, which can shift the result of the addition, the intermediate result Z, by 0 ... 3 bits. The 0-bit shift must be included among the barrel shifter's capabilities because the multiplication finishes before the modulo calculation, after which Z must not (!) be shifted any further.
2. A bit adder 22 without a carry bit, which converts the 3 operand bits into a sum that can be represented with two bits.
3. A full adder 24; this denotes a 2-bit adder that processes the carry bit of the next lower position and itself generates a carry bit for the next higher position.

Multiplication in the exemplary embodiment is explained below. The addition logic has read access to registers 16, 18 and 20, 22, 24, and it can store a number in each register. P (register 16) is one of the two factors in every MultMod call of the RSA algorithm. If P is always chosen as the multiplicand and the other factor as the multiplier, it suffices for the elementary cell logic to be able to read only registers 16, 18 and Z (20, 22 and 24) in order to perform the required shift operations and the 3-operand addition.

The MultMod control unit 36 (Fig. 10) handles the detailed task of generating the look-ahead parameters. For this reason, the multiplier M in register 12 is shifted in parallel with register Z. The MultMod control unit 36 has access to the uppermost bits of register 12 (cf. Fig. 12). Depending on these bits, the MultMod control unit 36 generates the shift parameter sz, which is not output directly but converted into corresponding control signals. These operations take place in the multiplication shift logic 50 in Fig. 12.

Correspondingly, the shift parameter sn is generated in the modulo shift logic 52. A comparator 38 compares the upper bits of sz with 1/3, 1/6, 1/12 ... of N. In accordance with the algorithm of Fig. 5, the comparison result is delivered to the modulo shift logic 52. This value indicates the relative amount by which register 18 is shifted to the right relative to Z. From this relative value and the shift amount of the multiplication shift logic 50, the modulo shift logic 52 generates the absolute shift parameter sn. Again, sn is not output but immediately converted into corresponding control signals.

The limiters 54 and 56 shown in Fig. 12 have the task of limiting the shift amount sz or sn if register 18 would otherwise exceed the buffer limits, as mentioned above. The signals of the first limiter 54 and the second limiter 56 are processed by the multiplication shift logic 50 and the modulo shift logic 52 as well.

The first counter 58 in Fig. 12 contains the variable m (cf. Fig. 6b), which indicates how many bits of the multiplier in register 12 remain to be processed. The second counter 60 contains the variable n (cf. Fig. 6b and Fig. 5), which indicates by how many bits register 18 (N) has been shifted into the buffer.

Whereas Fig. 7 shows a single elementary cell 10, Fig. 8 shows how a 4-cell block 28 with a hierarchical carry-look-ahead element 30 is built up hierarchically from several elementary cells 10. According to Fig. 9, five 4-cell blocks 28 are combined in a further stage into a 20-cell block 32. As Fig. 10 shows, several 20-cell blocks 32, the uppermost of which is designed as a buffer 34, are combined in a further hierarchical stage into an encryption unit 40 with a MultMod control unit 36.

An elementary cell 10 constructed according to Fig. 7 is able, in cooperation with the MultMod control unit 36, to complete all steps of the MultMod loop according to Fig. 6b in one cycle, since it contains dedicated logic for each step of the loop: shifting the modulus N by several bits in register 18 (N); shifting register Z (20, 22, 24) by several bits in the barrel shifter 20; and executing the 3-operand operation by means of the bit adder 22, the full adder 24 and the carry-look-ahead unit 26. In parallel with the work of the elementary cell 10, the MultMod control unit 36 calculates the parameters of the next cycle (Fig. 12). This constitutes the direct implementation of the method according to the invention in the VLSI design of the cryptography processor.

According to Figs. 13 and 14, a carry-look-ahead element 30 was developed that detects whether carry bits influence one another over larger distances. Since, with the chosen solution, this is the case in only one of 30,000 instances in the cryptography processor, the cycle time of the addition logic is no longer determined by the duration of the longest addition (the least significant carry bit influences the most significant one) but by the average addition time.

The carry-look-ahead element (CLA) 30 has a hierarchical structure. It processes the CLA signals of the subordinate stage (left side) and generates a CLA signal for the superordinate stage (right side).

A propagate signal at a position means that the carry of this position is determined by the carry of the next lower position. If all propagate signals are active, the hierarchical carry-look-ahead element 30 generates the propagate output signal of this element. A kill signal indicates that this position has no carry. The kill output signal is activated when it can be decided within the subordinate elements that the most significant position of this element has no carry.
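The combination rule described above can be sketched in Python as follows. Modelling each subordinate signal as a (propagate, kill) pair of booleans, ordered from least to most significant, is an assumption made for illustration; the gate-level design of element 30 is not given in this passage.

```python
def combine(children):
    """Combine subordinate (propagate, kill) pairs into one output pair.

    children -- list of (propagate, kill) tuples, least significant first.
    """
    # Propagate output: every subordinate position propagates.
    propagate_out = all(p for p, _ in children)

    # Kill output: walking down from the most significant position, the
    # carry is known to be absent as soon as a kill is reached through an
    # unbroken chain of propagating positions.
    kill_out = False
    for p, k in reversed(children):
        if k:
            kill_out = True
            break
        if not p:
            break  # neither kill nor propagate: absence cannot be decided
    return propagate_out, kill_out


print(combine([(True, False)] * 4))                           # (True, False)
print(combine([(False, True), (True, False), (True, False)])) # (False, True)
```

Because the output pair has the same shape as the inputs, the function can be applied again to the outputs of several elements, which mirrors the tree structure of the CLA.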

The carry-look-ahead elements 30 can be plugged together into a tree structure on a modular principle. Each element then represents a larger number of carry bits. The advantage of the CLA is that it replaces the serial processing of the carry bits with parallel-serial (tree-like) processing. This reduces the addition time considerably.

In reality, however, the length of the signal paths multiplies with each additional stage required, so that beyond a certain tree depth, combining neighbouring trees yields no further gain. At the roots of these trees, the carry is then processed serially again.

The latter can be seen in Fig. 14, which shows the interconnection of the carry-look-ahead elements 30 together with an interrupter 62. Within a block of 20 bits, the carries are processed by the CLA tree. From block to block, the carry is passed on serially.

In the cryptography processor 48, this concept is extended slightly. The block propagate signal, which would otherwise be useless, is used to activate the interrupter 62, which suppresses the clock signals for the duration of 8 clock periods. This is the time a carry needs to pass through the entire serial block chain.

If the carry of a block is thus determined by the next less significant block, the block CLA automatically activates the block propagate signal. The interrupter 62 is switched on, and the block chain is given enough time to settle correctly.

The cycle time can therefore be set so that it is just sufficient to process the carry between directly adjacent blocks. The advantage is considerable, because only the time of one block carry needs to be taken into account, regardless of the number of digits. Computing a 660-bit addition therefore takes no longer than computing a 20-bit addition. Only in about one of 30,000 cases do the carry bits influence one another across more than two blocks. The CLA detects this. The addition then requires eight cycles instead of one.
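The block-wise carry handling described above can be sketched as a small software model. This is only an illustration under simplifying assumptions, not the circuit itself: the names `blockwise_add` and `stall_probability` are hypothetical, the stall is modeled as firing whenever any 20-bit block is in the all-propagate state, and the probability estimate assumes uniformly random, independent operand bits. Under that assumption 33 blocks give roughly 33 · 2^-20, or about 1 in 31,775, which is in line with the figure of about one in 30,000 quoted above.

```python
WIDTH = 660        # operand length in bits, as stated in the text
BLOCK_BITS = 20    # block width, as stated in the text
STALL_CYCLES = 8   # clock cycles granted when the interrupter fires


def blockwise_add(x, y, width=WIDTH, block_bits=BLOCK_BITS):
    """Add two non-negative integers below 2**width block by block.

    Returns (sum, carry_out, cycles). Within a block the carry is assumed
    to be resolved in parallel; between blocks it is passed on serially.
    """
    mask = (1 << block_bits) - 1
    carry, total, stall = 0, 0, False
    for lo in range(0, width, block_bits):
        a = (x >> lo) & mask
        b = (y >> lo) & mask
        # Block propagate: every bit position forwards an incoming carry,
        # so this block's carry-out depends on the less significant block.
        stall |= (a ^ b) == mask
        s = a + b + carry
        total |= (s & mask) << lo
        carry = s >> block_bits
    return total, carry, (STALL_CYCLES if stall else 1)


def stall_probability(width=WIDTH, block_bits=BLOCK_BITS):
    """Approximate chance that at least one block is all-propagate (random bits)."""
    return (width // block_bits) * 2.0 ** -block_bits
```

A carry that is generated in one block and absorbed in the next, as in `blockwise_add((1 << 20) - 1, 1)`, still completes in a single cycle in this model; only an all-propagate block triggers the eight-cycle case.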

From the decomposition of the RSA algorithm into basic operations and their implementation in a circuit according to the invention, the absolute number of steps needed to encrypt a datum (message) can be calculated. The encryption rate in its general form V_{RSA, allg.}, expressed in encoded bits per second, can be derived directly from this:

It is

- proportional to the frequency f of the processor,
- proportional to the number of bits L(N) that are encrypted together,
- inversely proportional to the number 3/2 · L(N) of MultMod calls,
- inversely proportional to the number L(N) / min(Erw(sz), Ers(sn)) of additions or subtractions per MultMod call, and
- inversely proportional to the number L(N) · A/B of individual steps into which the addition or subtraction of a large number is broken down. B is the width of the ALU (the wider the ALU, the fewer operations are needed to add two long numbers), and A is the number of cycles needed to perform one operation.
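Taken together, the factors listed above suggest the following closed form. It is a reconstruction from the listed proportionalities rather than a formula quoted from the source, and it assumes that the three counts simply multiply to give the total number of clock cycles needed per encrypted block of L(N) bits:

```latex
V_{\mathrm{RSA,\,allg.}}
  = \frac{f \cdot L(N)}
         {\dfrac{3}{2}\,L(N)
          \cdot \dfrac{L(N)}{\min\bigl(\mathrm{Erw}(sz),\,\mathrm{Ers}(sn)\bigr)}
          \cdot L(N)\,\dfrac{A}{B}}
  = \frac{2\,f\,B\,\min\bigl(\mathrm{Erw}(sz),\,\mathrm{Ers}(sn)\bigr)}
         {3\,A\,L(N)^{2}}
```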

For the new cryptography processor, Erw(sz) = Ers(sn) and A/B = 1/L(N), because the data width of the ALU equals the length of the data to be encrypted and the ALU operation requires only one cycle. In this case it follows for the encryption rate of the cryptography processor V_{RSA, KP}:
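With A/B = 1/L(N), each addition or subtraction takes L(N) · A/B = 1 step, so one block of L(N) bits costs 3/2 · L(N) MultMod calls of L(N)/Erw(sz) cycles each. Under the assumption that these counts simply multiply, the rate can be reconstructed as follows (a sketch derived from the stated proportionalities, not the formula as printed in the source):

```latex
V_{\mathrm{RSA,\,KP}}
  = \frac{f \cdot L(N)}
         {\dfrac{3}{2}\,L(N) \cdot \dfrac{L(N)}{\mathrm{Erw}(sz)} \cdot 1}
  = \frac{2\,f\,\mathrm{Erw}(sz)}{3\,L(N)}
```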

The frequency of 30 MHz results from extrapolating a cycle time of 100 ns achieved in 5 µm NMOS technology (a prototype produced as a laboratory sample consists of approx. 5000 transistors in 5 µm NMOS technology) to the 2 µm CMOS technologies common today.

The described embodiment of the cryptography processor according to the invention is built from approx. 80,000 transistors in 2 µm CMOS technology. The chip area is then 5.2 mm × 5.2 mm. At the maximum key length of 660 bits, it still encrypts and decrypts data at a high speed of 64,000 bit/s even in the worst case.

The logical block structure and the floor plan can be seen in Fig. 16 and Fig. 17. Fig. 16 shows the block diagram of the cryptography processor 48, which can be derived directly from the preceding explanations of the elementary cell 10.

When designing the floor plan according to Fig. 17, several constraints must be observed: first, the structure of the elementary cell 10; second, the communication of the elementary cells with one another; and finally, ensuring that all required control signals are connected to the elementary cells.

Four components of the elementary cell 10 exchange information with the corresponding components of adjacent elementary cells: first the UD shift register 18 (cf. Fig. 7), second the bit adder 22, third the barrel shifter 20, and fourth the full adder 24. This implies that the individual components of the elementary cell are placed one above the other, because no additional communication paths to adjacent elementary cells then arise. It follows from the relatively high number of components that the elementary cell of the chosen embodiment of the processor has a low height and a large width.

Although the elementary cell is flat, the stacking it requires results in a narrow, tall tower, as can be seen on the left in Fig. 17. For manufacturing reasons, chips that are as square as possible are desired. The tower is therefore divided into individual stacks, which are then placed side by side (Fig. 17, right). Every second stack is turned upside down, because the elementary cells that were previously vertical neighbors at the dividing line then become lateral neighbors. The required information is transferred to the neighboring cells at the top and bottom of the stacks.
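The folding of the tower into alternately inverted stacks can be illustrated with a short sketch. The function names and the stack height of 66 cells are hypothetical choices for illustration; the text gives neither the number of stacks nor their height. The sketch only shows that cells which were neighbors in the tower remain directly adjacent after folding.

```python
def place_cell(index, stack_height):
    """Map a cell's position in the single tall tower to (column, row) after folding."""
    column, offset = divmod(index, stack_height)
    # Every second stack is turned upside down, so odd columns count rows downward.
    row = offset if column % 2 == 0 else stack_height - 1 - offset
    return column, row


def neighbours_stay_adjacent(cell_count, stack_height):
    """Check that consecutive cells of the tower end up one grid step apart."""
    positions = [place_cell(i, stack_height) for i in range(cell_count)]
    return all(
        abs(c1 - c0) + abs(r1 - r0) == 1
        for (c0, r0), (c1, r1) in zip(positions, positions[1:])
    )
```

For example, with a stack height of 66 the last cell of the first stack and the first cell of the second stack land at `(0, 65)` and `(1, 65)`, side by side at the top edge; without the inversion the second one would sit at the bottom of its stack.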

The units of the processor that have not yet been placed, the main control unit 42 and the I/O unit 44, require so little space compared with the encryption unit 40 that they can be placed at any position along its edge.

Claims

1. Cryptography method according to the public-key method of Rivest, Shamir and Adleman (RSA method), comprising the use of the following operations for encrypting or decrypting messages:

- selection of two large prime numbers p and q and a further large number E,
- formation of the product N = p * q,
- conversion of the messages to be encrypted into a sequence of elements P_i, preferably of equal length, whose value as a number is smaller than the value of the number N,
- encryption of these elements by raising each to the E-th power and subsequently reducing it modulo N (i.e., the numbers C_i = P_i^E modulo N arise),
- wherein the exponentiation is replaced by a sequence of multiplications and a modulo operation is performed immediately after each multiplication (that is, multiplication takes place in the residue class ring modulo N),
- wherein the multiplication is broken down into individual steps, so that the multiplication yields a sequence of additions, and
- wherein the modulo operation is replaced by a sequence of subtractions according to the classical division algorithm,

characterized by the use of a first look-ahead method (precalculation method) for the division (Fig. 5), so that the multiplication can likewise be carried out with a second look-ahead method (Fig. 4).
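The decomposition named in claim 1 (exponentiation as a sequence of modular multiplications, each multiplication as a sequence of additions, each reduction as subtractions) can be sketched as follows. This is a minimal illustration with hypothetical function names; it deliberately omits both look-ahead methods and processes one bit per step, so it does not reflect the speed-up the claim is about.

```python
def mult_mod(a, b, n):
    """Return a*b mod n using only shifts, additions and subtractions.

    Assumes 0 <= a < n and b >= 0. One multiplier bit is handled per step.
    """
    assert 0 <= a < n and b >= 0
    z = 0                          # intermediate result, kept below n
    for i in reversed(range(b.bit_length())):
        z <<= 1                    # shift the intermediate result by one bit
        if z >= n:
            z -= n                 # modulo step as a subtraction
        if (b >> i) & 1:
            z += a                 # multiplication step as an addition
            if z >= n:
                z -= n
    return z


def mod_exp(p, e, n):
    """Return p**e mod n as a sequence of mult_mod calls (square and multiply)."""
    assert n > 1 and 0 <= p < n and e >= 0
    result = 1
    for i in reversed(range(e.bit_length())):
        result = mult_mod(result, result, n)      # squaring for every exponent bit
        if (e >> i) & 1:
            result = mult_mod(result, p, n)       # extra multiplication for a 1 bit
    return result
```

For instance, `mod_exp(65, 17, 3233)` agrees with Python's built-in `pow(65, 17, 3233)`.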

2. Cryptography method according to claim 1, characterized by look-ahead algorithms that reduce the maximum number of additions or subtractions required.

3. Cryptography method according to claim 2, characterized in that the first look-ahead method for the modulo operation is chosen such that the probabilistic expected value of the number of operations skipped in the first look-ahead method is equal to the probabilistic expected value of the number of operations skipped in the second look-ahead method for the multiplication.

4. Cryptography method according to claim 3, characterized by a decoupling of the two look-ahead methods, wherein each of the two look-ahead methods generates a shift amount (sz or sn) indicating by how many bits the intermediate result (Z) of the multiplication or the modulus (N), respectively, is shifted per cycle, the intermediate result (Z) being shifted absolutely by sz bits and the modulus (N) being shifted relative to the intermediate result (Z) by sn bits.

5. Cryptography method according to one of the preceding claims 1-4, characterized by the combination of the addition or subtraction of the multiplication step and of the modulo step into a single operation (3-operand addition), with not two but three operands being added per step as follows: and wherein this 3-operand addition is divided into two sections.

6. Cryptography method according to claim 5, characterized in that the first of the two sections is chosen such that a sum of the three bits of the operands A, B and C is formed at each binary position, the sum of A[i], B[i] and C[i] lying between 0 and 3 so that it can be represented in binary form with the two bits S1 and S0, and two new numbers X and Y are composed from the two sum bits in the following way: Y[i] := least significant bit of A[i] + B[i] + C[i], Y[max + 1] := 0, X[i + 1] := more significant bit of A[i] + B[i] + C[i], and X[0] := 0 (i = 0, ..., max).

7. Cryptography method according to claims 5 and 6, characterized in that the second section is chosen such that the numbers X and Y are added with carry in a manner known per se, whereas the carry-free bit addition of the first section is executed at a time when the normal addition logic is being prepared for the next cycle by a precharge signal (preparation signal).
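The two sections of the 3-operand addition in claims 6 and 7 correspond to what is commonly called a carry-save step followed by an ordinary addition. A minimal sketch on non-negative Python integers, with hypothetical function names, might look like this:

```python
def bit_add_without_carry(a, b, c):
    """First section: per-position sum of three bits, split into two numbers.

    Y collects the least significant sum bit of each position.
    X collects the more significant sum bit, moved one position up (X[0] = 0).
    """
    y = a ^ b ^ c                               # low bit of A[i] + B[i] + C[i]
    x = ((a & b) | (a & c) | (b & c)) << 1      # high bit, placed at position i + 1
    return x, y


def three_operand_add(a, b, c):
    """Second section: ordinary addition with carry of the two numbers X and Y."""
    x, y = bit_add_without_carry(a, b, c)
    return x + y
```

The first section needs no carry chain at all, which is why, per claim 7, it can run while the carry logic is still being precharged; only the final `x + y` involves carry propagation.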

8. Cryptography method according to claim 7, characterized in that the addition comprises the following steps:

a) dividing the long and large numbers X and Y into small blocks (32),
b) simultaneous calculation of the carry bits within the blocks (32) according to a known carry-look-ahead method, and
c) forwarding the carry bit of one block to the subsequent block, in the event that the carry bits of adjacent blocks do not influence each other.

9. Cryptography method according to claim 8, characterized in that an interrupter (62) that can be activated by the blocks (32) is provided which, in the event that carry bits of adjacent blocks influence one another, provides the time required for their calculation and consideration (Fig. 14).

10. Cryptography processor for carrying out the cryptography method, characterized by a series of elementary cells (10) specialized for computing the individual operations, wherein several elementary cells (10) are combined in stages (Fig. 8) to form larger blocks (32; Fig. 9), and each block (28; 32) is assigned a tree-like, hierarchically structured carry-look-ahead element (30), which in the normal case ensures that the time for adding two numbers is independent of the length of these numbers, and wherein the carry is computed in parallel within each block (32).

11. Cryptography processor according to claim 10, characterized in that the elementary cells (10) contain the following registers and logic components:

- a register (12) for the multiplier (M),
- a code register (14),
- a data register (16),
- a UD shift register (18), which holds a multiple of the modulus (N) during the calculation and which, in addition to its storage function, is able to shift the modulus (N) by several places in either of the two directions,
- a barrel shifter (20), which can shift the result of the addition, an intermediate result (Z), by several bits,
- a bit adder (22) without a carry bit, which carries out the first step of the 3-operand operation,
- a full adder (24), which adds the two numbers obtained in the bit adder (22) and stores them as an intermediate result (Z), and
- a carry-look-ahead element (26), which calculates a carry bit.

12. Cryptography processor according to claim 11, characterized in that all components (12-26) work in parallel.

13. Cryptography processor according to one of the preceding claims 10-12, characterized in that a plurality of blocks (28) of elementary cells (10) are combined to form larger blocks (32), the carry being passed on serially from one block to the next, wherein the carry-look-ahead elements (30) of the blocks (32) are in turn arranged in a tree-like manner, wherein the carry is calculated simultaneously at each superordinate carry-look-ahead element (30) of a block (32), and in that a resulting signal optionally drives an interrupter (62), the interrupter (62) processing the signals coming from the carry-look-ahead elements (30) and interrupting the clock for about eight cycles if a carry-look-ahead element of a block (32) emits a signal.

14. Cryptography processor according to one of the preceding claims 10-13, characterized by a MultMod control unit (36; Fig. 12) for controlling the functions of the elementary cells (10), the MultMod control unit (36) comprising the following components:

- a shift logic (50) for the multiplication,
- a shift logic (52) for the modulo operation,
- a comparator (38), which compares in parallel the uppermost bits of the intermediate result (Z) of the full adder (24) with the uppermost bits of 1/3, 1/6, 1/12 etc. of the modulus N,
- a first limiter (54) for the multiplication, which limits the maximum shift amount of the intermediate result (Z) if necessary, and a second limiter (56) for the modulo operation, which limits the maximum shift amount of the UD shift register (18) if necessary, and
- two counters (58, 60), of which the first counter (58) indicates the bits of the register (12) still to be processed and the second counter (60) indicates the position of the modulus N in a buffer (34).

15. Cryptography processor according to one of the preceding claims 10-14, characterized by an elementary block designed as a buffer (34) with a length of approximately 20 bits, which decouples the look-ahead algorithms for the multiplication and for the modulo operation from each other in that N runs into the buffer (34), the MultMod control unit (36) ensuring by means of the delimiters (58, 60) that the modulus N does not run beyond the upper or lower buffer limit.

16. Cryptography processor according to claim 10, characterized in that twenty elementary cells (10) are combined to form a 20-cell block (32).

17. Cryptography processor according to claim 10, characterized by an encryption unit (40), a variably configurable register block (46), an input/output unit (44) and a main control unit (42), which are operatively connected to one another via data bus lines (Fig. 11).