FR2842051A1

FR2842051A1 - CRYPTOGRAPHY METHOD INCLUDING THE CALCULATION OF A MODULAR MULTIPLICATION WITHIN THE MEANING OF MONTGOMERY AND CORRESPONDING ELECTRONIC ENTITY

Info

Publication number: FR2842051A1
Application number: FR0205442A
Authority: FR
Inventors: Robert Naciri
Original assignee: Oberthur Card Systems SA France
Current assignee: Idemia France SAS
Priority date: 2002-04-30
Filing date: 2002-04-30
Publication date: 2004-01-09
Anticipated expiration: 2022-04-30
Also published as: WO2003093973A1; FR2842051B1; AU2003265508A1

Abstract

Calculation of a modular multiplication in the Montgomery sense. The two numbers x, a to be multiplied modulo m are expressed in words and the calculation is carried out in cycles, each cycle comprising:
- the calculation (102) of a word called the reduction coefficient (Qi),
- the multiplication (100) of a word (Xi) of x by a word (Aj) of a,
- the multiplication (101) of the reduction coefficient (Qi) by a word (Mj) of the modulus m,
- the addition (103) of the results to a word of a shift memory (104), and
- the shift of this memory by one word at the end of a cycle.

Description

The invention relates to a cryptographic method including the calculation of a modular multiplication in the Montgomery sense between two large binary numbers. More particularly, the invention concerns a way of calculating such a multiplication in which the binary representation of the operands is easily configurable on demand, so that the size of the operands can be changed. The invention also relates to any electronic entity implementing the method.

Since the introduction of the RSA system, much research has been carried out on modular computation techniques, in both software and hardware. Among all these techniques, the one based on the algorithm of P. L. Montgomery is the most attractive and still offers many prospects. This theory was set out in an article by Montgomery entitled "Modular Multiplication Without Trial Division" in the journal "Mathematics & Computations" of April 1985. The known hardware implementations of the Montgomery technique use binary representation only, the main drawback of which is that it requires working registers that are large and not configurable.

The most widely used public-key cryptographic systems (RSA, DSA, ECC, etc.) are based on modular arithmetic on large numbers. One must be able to compute efficiently the modular exponential of numbers whose operands require at least 1024 bits for their representation, which corresponds to about 309 decimal digits. Operand representation in computers is based on the binary word, which groups several bits. The word structures currently in use are:
- the byte, represented by 8 bits,
- words of 16 or 32 bits.

Thus a 1024-bit operand requires 128 bytes (or 32 words of 32 bits) for its representation.

A modular exponentiation can be carried out by performing a series of modular multiplications.

Given three numbers a, e and n, one seeks to compute $a^e \bmod n$, called the modular exponentiation. The operand n is called the modulus. To avoid exceeding the capacity of the computer, this operation can be broken down into a sequence of modular multiplications of the form $a \cdot x \bmod n$.

This is why modular multiplication techniques are of interest. The relation $a \cdot x \bmod n$ indicates that one must form the product of a by x and keep, in the result, only the remainder of the division by n. Recall that the operands in question are large and require several words for their representation. While most arithmetic operations pose no particular problem, division on large numbers without a special technique is very costly. At the price of a change of representation, Montgomery's algorithm makes it possible to circumvent this difficulty.

When an algorithm is implemented in an electronic circuit, one generally has physical operators, typically adders and multipliers, of a fixed size denoted w, for example 8, 16 or 32 bits. w is called the size of the "machine word".

Thus, performing an arithmetic operation (addition, subtraction or multiplication) on large numbers breaks down into elementary operations that apply the basic physical operators to machine words of size w. For a multiplication, this decomposition is generally carried out in software, as described below.

Let $(A_{u-1}, A_{u-2}, \ldots, A_1, A_0)$ and $(X_{u-1}, X_{u-2}, \ldots, X_1, X_0)$ represent the integers a and x, the $A_i$ and $X_i$ being machine words of size w. If h is the size of the operands, then $u = h/w$. Typically h = 1024, w = 16 or 32 bits, and u = 64 or 32. It follows that:

$$a = \sum_{i=0}^{u-1} A_i \, 2^{iw}, \qquad x = \sum_{i=0}^{u-1} X_i \, 2^{iw}$$

and therefore:

$$a \cdot x = \sum_{i,j=0}^{u-1} A_i X_j \, 2^{(i+j)w}$$

We now briefly recall what a modular multiplication in the Montgomery sense consists of.
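
The word decomposition of the product given above can be illustrated with a minimal Python sketch. The helper names `to_words`, `from_words` and `multiply_by_words` are hypothetical and chosen only for this illustration; Python's arbitrary-precision integers stand in for the hardware accumulator.

```python
def to_words(value, w, u):
    """Split a non-negative integer into u words of w bits, least significant first."""
    mask = (1 << w) - 1
    return [(value >> (i * w)) & mask for i in range(u)]


def from_words(words, w):
    """Recombine little-endian w-bit words into a single integer."""
    return sum(word << (i * w) for i, word in enumerate(words))


def multiply_by_words(a, x, w, u):
    """Compute a*x as the double sum of A_i * X_j * 2^((i+j)*w)."""
    A = to_words(a, w, u)
    X = to_words(x, w, u)
    total = 0
    for i in range(u):
        for j in range(u):
            # Each partial product involves only two machine words.
            total += (A[i] * X[j]) << ((i + j) * w)
    return total
```

With h = 1024 and w = 32, `to_words` yields the 32 words mentioned earlier in the description.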

We want to compute $a \cdot x \bmod n$, n being odd.

We choose a radix $r < n$, r being the machine word, that is, $r = 2^w$ (note that r is then relatively prime to n), and we compute the coefficients $r^{-1}$ and $n'$ such that $r \cdot r^{-1} - n \cdot n' = 1$.

By Bézout's identity we know that:
- such coefficients exist,
- $n'$ requires only a single word for its representation,
- the extended Euclidean algorithm can be used to compute $n' = -1/n \bmod r$ (the sign follows from the identity $r \cdot r^{-1} - n \cdot n' = 1$).
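
As an illustration of these coefficients, the following sketch computes $r^{-1}$ and $n'$ for a given odd modulus. It is an assumption of this example that Python's built-in `pow` with a negative exponent (available from Python 3.8) is used in place of an explicit extended Euclidean algorithm; the function name `montgomery_constants` is hypothetical.

```python
def montgomery_constants(n, w):
    """Return (r, r_inv, n_prime) with r = 2**w and r*r_inv - n*n_prime == 1."""
    if n < 3 or n % 2 == 0:
        raise ValueError("the modulus n must be odd and at least 3")
    r = 1 << w
    r_inv = pow(r, -1, n)             # r * r_inv == 1 (mod n)
    n_prime = (-pow(n, -1, r)) % r    # n * n_prime == -1 (mod r), fits in one word
    return r, r_inv, n_prime
```

For any odd n, the returned values satisfy the identity exactly, not only modulo n or r, which can be checked with `r * r_inv - n * n_prime == 1`.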

In what follows, "*" denotes multiplication in the Montgomery sense.

Let:
- a be the multiplicand,
- x be the multiplier expressed as words, $X_i$ being the current word at step i.

Multiplication in the Montgomery sense is expressed as a modular multiply-accumulate algorithm. This algorithm can be summarized as follows:

    c := 0
    for i = 0 to u-1:
        c := c + a·X_i           (1)
        Q_i := Low(c·n')         (2)
        c := (c + n·Q_i) / r     (3)

The function "Low" extracts the least significant word of the product $c \cdot n'$.
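
The three-line loop above can be sketched in Python as follows. This is a minimal illustration rather than the implementation of the invention: it keeps a, n and c as big integers, which is precisely the large-operand form discussed later in the description. The function name and the use of `pow(n, -1, r)` to obtain $n'$ are assumptions of the example.

```python
def montgomery_multiply(a, x, n, w, u):
    """Return c congruent to a*x*r^(-u) mod n, with r = 2**w and x read word by word."""
    r = 1 << w
    mask = r - 1
    n_prime = (-pow(n, -1, r)) % r            # n * n_prime == -1 (mod r)
    c = 0
    for i in range(u):
        x_i = (x >> (i * w)) & mask           # current word X_i of the multiplier
        c = c + a * x_i                       # line (1)
        q_i = ((c & mask) * n_prime) & mask   # line (2): Low(c * n')
        c = (c + n * q_i) >> w                # line (3): exact division by r
    return c
```

For a < n, the returned value stays below 2n, so at most one subtraction of n brings it into the range [0, n).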

As Montgomery shows in the reference article cited above, for the calculation of a modular exponentiation, modular multiplication can be replaced by Montgomery multiplication. Montgomery multiplication is faster than conventional modular multiplication.

Indeed, by choosing r as the machine word, that is, $r = 2^w$, division by r amounts to a simple shift by one word toward the least significant words. The division in line (3) is therefore very fast compared with a conventional Euclidean division.

In the algorithm above, $Q_i$ is referred to as the "reduction coefficient", a term that comes from Montgomery's theory.

Montgomery demonstrated in his article that the modular multiplication he defined computes, by accumulation in c, the expression $a \cdot x \cdot r^{-u}$ modulo n.

Moreover, a modular exponentiation in the Montgomery sense can easily be obtained using only multiplications in the Montgomery sense, namely:

$$a^{*e} = a * a * \cdots * a \quad (e \text{ times})$$

and a traditional modular exponentiation is readily obtained by a conversion using a Montgomery exponential, that is:

$$a^e \bmod n = 1 * \left[(a * R^2)^{*e}\right] \qquad (1)$$

with $R = r^u$.

The algorithm summarized above nevertheless requires multiplications involving large numbers, namely $a \cdot X_i$ in line (1) and $n \cdot Q_i$ in line (3).
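
The conversion (1) can be checked numerically with the sketch below. It follows the formula literally, multiplying e times, whereas a practical implementation would normally use a square-and-multiply schedule; that substitution is not shown here. The names `mont_mul` and `mont_exp`, and the final conditional subtraction that returns a value below n, are assumptions of this example.

```python
def mont_mul(a, b, n, w, u):
    """Montgomery product a*b = a·b·R^(-1) mod n, with R = 2**(w*u), for a < n and b < R."""
    r = 1 << w
    mask = r - 1
    n_prime = (-pow(n, -1, r)) % r
    c = 0
    for i in range(u):
        b_i = (b >> (i * w)) & mask
        c += a * b_i
        q_i = ((c & mask) * n_prime) & mask
        c = (c + n * q_i) >> w
    return c - n if c >= n else c          # c < 2n, so one subtraction suffices


def mont_exp(a, e, n, w, u):
    """Compute a^e mod n as 1 * [(a * R^2)^(*e)], for e >= 1 and a < n."""
    if e < 1:
        raise ValueError("this sketch assumes e >= 1")
    R = 1 << (w * u)
    a_bar = mont_mul(a, (R * R) % n, n, w, u)   # a * R^2 = a·R mod n
    acc = a_bar
    for _ in range(e - 1):                      # Montgomery power: e factors in total
        acc = mont_mul(acc, a_bar, n, w, u)
    return mont_mul(1, acc, n, w, u)            # 1 * (a^e · R) = a^e mod n
```

The first Montgomery product maps a into the form $a \cdot R \bmod n$, every later product preserves that form, and the final product by 1 removes the factor R.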

Indeed, a and n are large numbers expressed on 1024 bits, whereas $X_i$ and $Q_i$ are of "machine size", that is, 8, 16 or 32 bits.

For their concrete implementation in computer systems, these multiplications may involve large operators (able to handle large numbers such as a and n) or at least large registers, which necessarily have predetermined sizes. As a result, known systems are not configurable; that is, the size of the operands cannot be changed as needed.

To solve this problem, the invention proposes to split the large numbers into words of reduced size, for example the size of the machine word, starting from the observation that a sum of partial products of the type $A_j X_i + N_j Q_i$ affects only the value of $C_j$ (the word of corresponding rank in the word-wise representation of the result) and possibly also the neighbouring word $C_{j+1}$ of immediately higher rank.

Thus the invention can be defined as a cryptographic method including the calculation of a modular multiplication in the Montgomery sense (*) between two numbers (x, a), characterized in that said numbers and a modulus (m) are expressed in words, said words are stored in corresponding memories, and a shift memory is defined in which the results of cycles of multiplications and additions between words are accumulated word by word, until the result of said modular multiplication is obtained in said shift memory, and in that, for each word ($X_i$) of the succession of words of a first of said numbers, the following cycle of operations is carried out:
- calculating, for the whole cycle (i) considered, a word called the reduction coefficient ($Q_i$),
- multiplying said word ($X_i$) of said first number corresponding to said cycle (i) in turn by each of the words ($A_j$) constituting said second number, to obtain a succession of first intermediate results,
- multiplying said reduction coefficient ($Q_i$) corresponding to said cycle (i) in turn by each of the words ($M_j$) constituting said modulus, to obtain a succession of second intermediate results, and
- adding said first and second intermediate results to the words of the same ranks already contained in said shift memory and, at the end of each cycle (i), shifting the content of said shift memory by one word toward the least significant bits.
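
The cycle of operations defined above can be sketched at word level as follows. This is an illustrative model under stated assumptions, not the circuit of the invention: the shift memory is a Python list `C` of u+2 words, carries are propagated with a simple loop (the description does not fix a carry scheme at this point), and the coefficient $n'$ is derived from the low word of the modulus. All function names are hypothetical.

```python
def to_words(value, w, u):
    """Split an integer into u words of w bits, least significant first."""
    mask = (1 << w) - 1
    return [(value >> (i * w)) & mask for i in range(u)]


def from_words(words, w):
    """Recombine little-endian w-bit words into an integer."""
    return sum(word << (i * w) for i, word in enumerate(words))


def montgomery_multiply_words(A, X, M, w):
    """Word-level Montgomery product of A and X modulo M (lists of u words each)."""
    u = len(M)
    r = 1 << w
    mask = r - 1
    n_prime = (-pow(M[0], -1, r)) % r     # depends only on the low word of the modulus
    C = [0] * (u + 2)                     # shift memory, two spare words for carries
    for i in range(u):
        # Reduction coefficient, computed once for the whole cycle i.
        q = (((C[0] + A[0] * X[i]) & mask) * n_prime) & mask
        for j in range(u):
            # First and second intermediate results for rank j.
            t = A[j] * X[i] + M[j] * q
            k = j
            while t:                      # add into C[j], C[j+1], ... with carry
                t += C[k]
                C[k] = t & mask
                t >>= w
                k += 1
        # C[0] is now zero: shift the memory by one word toward the low-order end.
        C.pop(0)
        C.append(0)
    return C
```

Only word-sized multiplications appear in the inner loop, so changing the operand size amounts to changing the length of the lists.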

Said "reduction coefficient" comes from Montgomery's theory.

It is clear to a person skilled in the art that some of the operations listed above can be carried out independently of one another, and that the order in which they are stated is not an essential feature of the implementation of the invention.

According to one aspect of the invention, a simplification of the Montgomery algorithm given above can also be envisaged. Take again the expression $r \cdot r^{-1} - n \cdot n' = 1$ with the same assumptions on r, $r^{-1}$, n and $n'$. In particular, $r^{-1}$ is the inverse of r modulo n. One can restrict oneself to moduli n such that the corresponding coefficient $n'$ is equal to 1.

The algorithm then simplifies as follows:

    c := 0
    for i = 0 to u-1:
        c := c + a·X_i
        Q_i := Low(c)
        c := (c + n·Q_i) / r

When $n' = 1$ according to the invention, the modulus is said to be "normalized" in the Montgomery sense. By a change of variable, one can always reduce to the case of moduli normalized in the Montgomery sense.

Indeed, let $m = n \cdot n'$ and $m' = 1$; m is odd and therefore relatively prime to r. Bézout's identity can therefore always be considered as satisfied by the new set of coefficients: $r \cdot r^{-1} - m \cdot m' = 1$. Montgomery arithmetic therefore applies to the modulus m normalized in the Montgomery sense. In the remainder of the description, the modulus m is expressed as a series of words $M_j$. The invention can, however, be implemented without the simplification $n' = 1$, that is, without using the normalized modulus.
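
The normalization $m = n \cdot n'$ can be illustrated with the sketch below. It assumes that $n'$ is taken as $-1/n \bmod r$, consistent with the identity $r \cdot r^{-1} - n \cdot n' = 1$, so that m is congruent to -1 modulo r and the reduction coefficient is simply the low word of c. The function names are hypothetical, and u here counts the words of m, which may be one more than for n.

```python
def normalize_modulus(n, w):
    """Return m = n * n', a multiple of n that is normalized (m' = 1) for r = 2**w."""
    r = 1 << w
    n_prime = (-pow(n, -1, r)) % r
    return n * n_prime


def montgomery_multiply_normalized(a, x, m, w, u):
    """Simplified loop for a normalized modulus m: Q_i is just Low(c)."""
    mask = (1 << w) - 1
    c = 0
    for i in range(u):
        x_i = (x >> (i * w)) & mask
        c += a * x_i
        q_i = c & mask              # Low(c); no multiplication by m' since m' = 1
        c = (c + m * q_i) >> w      # exact division because m == -1 (mod r)
    return c
```

Since n divides m, a result that is correct modulo m is also correct modulo n, which is what allows a single final reduction by n.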

In the case $n' = 1$, the calculation of the reduction coefficient $Q_i$ is simplified: it is obtained by computing Low(c), and it is no longer necessary to compute $c \cdot n'$. This simplifies the algorithm described above because $Q_i$, at each cycle, can be formed as the sum of the first word contained in said shift memory and the low-order bits of the product of the first word $A_0$ of the second number by the word $X_i$ of the first number considered in that cycle. At the end of the calculation of a modular exponentiation using Montgomery multiplications with a normalized modulus, a result $T = a^e \bmod m$ is obtained by applying the conversion (1) above. The modular exponentiation modulo n, denoted U, can then be computed by a single transformation: $U = T \bmod n$.

According to one possible embodiment, the method consists, for each cycle and as the calculation proceeds, in adding one of said intermediate results to the word of the same rank and to the word of immediately higher rank contained in said shift memory, writing the results of these additions back to those same ranks of said memory, then repeating these operations with the other intermediate result, until the end of said cycle, which includes the aforementioned shift by one word.

According to another possible embodiment, the method consists, for each cycle, in dividing each intermediate result into two groups containing respectively the low-order bits and the high-order bits, temporarily storing these groups, adding to the value of each word of said shift memory the groups containing the low-order bits of the same rank (j) and the groups containing the high-order bits of the immediately lower rank (j-1), and writing the result of these additions each time to the location of the word j considered in said shift memory, until the end of said cycle, which includes the aforementioned shift by one word.
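
The path from T to U described above (exponentiation modulo the normalized modulus m, followed by a single reduction modulo n) can be traced with the following sketch. It is a hedged illustration only: the exponentiation multiplies e times rather than using a faster schedule, the number of words u is derived from the bit length of m, and every function name is hypothetical.

```python
def mont_mul_normalized(a, x, m, w, u):
    """Montgomery product modulo a normalized modulus m, reduced into [0, m)."""
    mask = (1 << w) - 1
    c = 0
    for i in range(u):
        c += a * ((x >> (i * w)) & mask)
        c = (c + m * (c & mask)) >> w     # Q_i = Low(c), then exact division by r
    return c - m if c >= m else c


def mod_exp_normalized(a, e, n, w):
    """Return (T, U) with T = a^e mod m and U = T mod n, for e >= 1 and a < n."""
    r = 1 << w
    m = n * ((-pow(n, -1, r)) % r)        # normalized modulus m = n * n'
    u = -(-m.bit_length() // w)           # number of w-bit words needed for m
    R = 1 << (w * u)
    a_bar = mont_mul_normalized(a, (R * R) % m, m, w, u)
    acc = a_bar
    for _ in range(e - 1):
        acc = mont_mul_normalized(acc, a_bar, m, w, u)
    T = mont_mul_normalized(1, acc, m, w, u)   # conversion (1), modulo m
    return T, T % n                            # single final transformation U = T mod n
```

The only operation modulo n is the last line, so the word-level loop never needs the coefficient $n'$ at run time.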

The invention also relates to an electronic entity comprising cryptographic means including a unit for calculating a modular multiplication, in the Montgomery sense, between two numbers, characterized in that said calculation unit comprises:
- three memories sized to receive respectively said numbers x, a and a modulus m, each expressed in words of predetermined size in the corresponding memory,
- a shift memory sized to receive the result of said modular multiplication and arranged to shift its own content, each time by one word of predetermined size, toward the least significant bits,
- a first multiplier for forming successive products of two words read respectively from the memories of the two numbers,
- a calculator of a reduction coefficient determined at the beginning of a cycle of multiplications of the words of one number by a word of the other,
- a second multiplier for forming successive products of said reduction coefficient by a word read from the memory of said modulus, and
- an adder connected, at its input, to the outputs of said first and second multipliers and to said shift memory and connected, at its output, to said shift memory so as to address selected words of it directly.

The invention will be better understood in the light of the following description, given by way of example and with reference to the accompanying drawings, in which:
- Figure 1 is a block diagram illustrating part of an electronic entity, more particularly constituting a unit for calculating a modular multiplication in the Montgomery sense, according to the invention;
- Figure 2 is a flowchart explaining the operation of the calculation unit of Figure 1;
- Figure 3 is a block diagram of a variant of the calculation unit of Figure 1; and
- Figure 4 is a flowchart explaining the operation of this variant.

The calculation unit 11 for a modular multiplication in the Montgomery sense, shown in Figure 1, can be integrated into any electronic entity comprising cryptographic means, in order to perform such modular multiplications and, consequently, modular exponentiations.

This calculation unit can, for example, be integrated into a microcircuit card comprising cryptographic means. Such a calculation unit 11 according to the invention is advantageous for this type of electronic entity because of its speed of calculation and the ease with which the size of the operands can be configured, that is to say the capacity occupied by the memories holding these operands. The calculation unit comprises three memories Mx, Ma, Mm sized to receive, respectively, the two numbers x, a and a normalized modulus m. The numbers x, a and the modulus m are each expressed in words (Xi, Aj, Mj...) of predetermined size in the corresponding memory, which means that each of these memories comprises u registers, each sized to receive one word of the number x or a or of the modulus m. The word size is the same for the numbers and for the modulus. Depending on the case, words of 8, 16 or 32 bits can be chosen according to the capacity of the multipliers and adders used in the calculation unit. The unit further comprises a shift memory 104 sized to receive the result of the modular multiplication by accumulation of successive calculations. This memory therefore consists of a series of registers C0, ..., Cu, each the size of a word, and it is arranged to shift its own contents, at each iteration as described below, by one word of said predetermined size in the direction of the least significant words, that is to say toward C0. It should be noted that the memories Mx, Ma, Mm and the shift memory 104 may be allocated portions (depending on the size of the operands) of a single random access memory (RAM) configured for the purpose.

Preferably, pointers may even be used, that is to say registers that simply contain the addresses of the words stored in the RAM. At each iteration on i or j, the contents of the pointer change.
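The word representation described above can be illustrated with a minimal sketch. The helper names `to_words` and `from_words` are hypothetical and do not appear in the patent; the sketch assumes words are stored least significant first, which matches C0 being the least significant word of the shift memory.

```python
def to_words(n, u, w=8):
    """Split the integer n into u words of w bits, least significant word first."""
    mask = (1 << w) - 1
    return [(n >> (w * k)) & mask for k in range(u)]


def from_words(words, w=8):
    """Reassemble an integer from words stored least significant first."""
    return sum(word << (w * k) for k, word in enumerate(words))
```

Changing `w` to 16 or 32 changes only the word size, which reflects the configurability of the operand layout that the description emphasizes.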

The calculation unit 11 further comprises a first multiplier 100 for forming successive products of two words read respectively from the memories Ma, Mx of the two numbers; a calculator 102 of a reduction coefficient Qi, this coefficient being determined at the beginning of a cycle (i) in which the words Aj of one number (a) are multiplied by one word Xi of the other; and a second multiplier 101 for forming successive products of said reduction coefficient Qi with a word Mj read from the memory Mm of the modulus. The calculation unit also comprises an adder 103, here a double adder, whose inputs are connected to the outputs of said first and second multipliers 100, 101 and to said shift memory 104, and whose output is connected to this same shift memory 104 with the possibility of directly addressing selected words of it. In other words, the output of the adder 103 can "point" directly at one or more word locations Cj, Cj+1 of said shift memory and write there the result of one or more corresponding additions.

FIG. 1 shows the word location C0 (the register) that holds the least significant word and, at the opposite end, the word location Cu that holds the most significant word. Two neighboring word locations of rank j and j+1 are also shown. These are the two locations that are read and then modified during a cycle j.

The memories Ma, Mx of the two numbers are connected to the inputs of the first multiplier 100 through buffer registers 106, 107, each the size of one word. Each register can thus hold the word of one number that is to be multiplied by a word of the other number at a given moment. The output of this multiplier is connected to the input of a buffer register 108 the size of two words. The output of this buffer register is connected to one of the inputs of the adder 103.

The calculator of the coefficient Qi comprises an adder 102 which receives at one of its inputs the least significant word C0 held in the corresponding register of the shift memory and, at its other input, the contents of another buffer register 109. This register holds a word called Low P1, calculated at the beginning of the cycle considered (described below) and made up of the least significant bits of the product of a word Xi of a first number x, which is constant throughout a cycle, and the first word A0 of a second number a, all of whose words are used during that same cycle. The least significant bits retained in Low P1 form one word of said predetermined size of x or a. The output of the adder 102 is connected to the input of a buffer register 110 whose output is connected to one input of the second multiplier 101. Another buffer register 111 is connected between the memory Mm and the other input of the second multiplier 101. This buffer register 111 can therefore receive each word Mj of the memory Mm in succession.

The output of the second multiplier 101 is connected to the input of a buffer register 113 the size of two words, and the output of this register is connected to one of the inputs of the adder 103. Finally, the adder has a third input connected to the shift memory 104. Two consecutive words of this memory can be applied to this third input. The words that can be fed to this input of the adder 103 are the same ones that are "pointed" at by the output 105 of this same adder.

In operation, the adder 103 is programmed to perform in succession: the addition of the contents of the buffer register 108 to the two words Cj, Cj+1 concerned of the shift memory; the transfer of the result to the same locations of said memory; and then the addition of the contents of the buffer register 113 to the two modified words Cj, Cj+1 of the shift memory.

The adder is conventionally provided with a carry memory Rt in which the carry of a given addition is recorded. As will be seen below, depending on the case, this carry may be added into the current addition of the two words. At the end of a cycle on j, the carry is written into the most significant register Cu of the shift memory 104.

The operation of the calculation unit of FIG. 1 will now be described for carrying out a Montgomery modular multiplication between the two numbers x and a, the normalized modulus being denoted m. Recall that the words of the number x are denoted Xi, the words of the number a are denoted Aj, and the words of the normalized modulus m are denoted Mj. The words stored in the shift memory are denoted Cj. The flowchart of FIG. 2 describes one complete cycle on j, in which all the words Aj and all the words Mj are used to modify all the words of the shift memory 104, Xi being constant. The Montgomery modular multiplication is complete only once all the cycles on i, using all the words Xi, have been carried out.

At the beginning of such a cycle on i, the carry memory Rt of the adder 103 is reset to zero and a counter is initialized to j = 0 (steps 115 and 116). Recall that the numbers x and a and the modulus m each comprise u words. At the start of each cycle on j, a test checks whether j = u (step 117).

If not, since this is a cycle on Xi, the multiplier 100 computes the product P1 = AjXi (step 118).

In parallel, test 119 checks whether j = 0 (beginning of the cycle).

If this test is positive, a coefficient Qi, valid for the whole cycle i, is calculated (step 120) using the reduction coefficient calculator, which consists mainly of the adder 102. As mentioned above, this coefficient Qi is the sum of the first (least significant) word C0 of the shift memory 104 at the beginning of the cycle and the least significant bits of the first product P1 that has just been calculated, that is to say half of the bits of the product A0Xi.
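A minimal sketch of the calculation of Qi at step 120 follows. Two points are assumptions rather than statements from this section: the sum is truncated to one word, and a "normalized" modulus is taken to be one whose least significant word M0 has all bits set (M0 = 2^w - 1), which is the condition under which this Qi makes the low word of C + XiA + Qim vanish. The function name is hypothetical.

```python
def reduction_coefficient(c0, xi, a0, w=8):
    """Qi = (C0 + Low P1) truncated to one word, with Low P1 = low word of Xi*A0."""
    mask = (1 << w) - 1
    low_p1 = (xi * a0) & mask          # least significant half of the product A0*Xi
    return (c0 + low_p1) & mask        # assumption: the sum is kept on one word


# With M0 = 2**w - 1 (assumed normalization), the low word cancels:
c0, xi, a0, m0 = 0x5A, 0x37, 0xC4, 0xFF
qi = reduction_coefficient(c0, xi, a0)
low_word = (c0 + xi * a0 + qi * m0) & 0xFF
```

Because Qi requires only an addition, it is available almost as soon as the first product of the cycle is known, so the second multiplier can start without waiting.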

As soon as the value of Qi for the cycle has been calculated, the product P2 = MjQi can also be calculated (step 121).

It is important to note at this point that the successive calculations of the products P1 and P2 by the respective multipliers 100 and 101, which take longer than the additions to be performed by the adder 103, can be carried out "in parallel", that is to say substantially at the same time.

As soon as a product P1 is known, step 123 follows, in which the adder 103 forms the sum (add) of the output of the multiplier 100 held in the register 108 and the two words Cj+1, Cj pointed at, at this stage, by the output 105 of the adder 103. The contents of the memory Rt (carry from the previous operation) are also added. As soon as this three-term addition has been performed, its result is written to the locations Cj+1, Cj of the shift memory 104. The carry Radd of this addition, which cannot be written into Cj+1, is written into the memory Rt. Step 124 follows, in which, as soon as the corresponding product P2 is known, a further addition is performed between the value of this product and the double word Cj+1, Cj written into the shift memory 104 at the previous step. This time, however, the carry is not added. The new carry Radd is added to the previous one and written back into the memory Rt. The additions performed by the adder 103 can be carried out in hidden time, while the following multiplications are being executed.
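The double-word addition of steps 123 and 124 can be sketched as follows. The function name is hypothetical, and one point is an assumption: the text does not say at which weight the pending carry Rt enters the sum, so the sketch adds it at the weight of the upper word Cj+1, which is where a carry out of the previous pair of words belongs.

```python
def add_double_word(c_lo, c_hi, product, carry_in=0, w=8):
    """Add a two-word product to the double word (c_hi, c_lo).

    Returns (new_c_lo, new_c_hi, carry_out). carry_in is assumed to carry the
    weight of the upper word.
    """
    mask = (1 << w) - 1
    total = ((c_hi << w) | c_lo) + product + (carry_in << w)
    return total & mask, (total >> w) & mask, total >> (2 * w)


# Step 123 (P1 plus pending carry), then step 124 (P2, no carry added):
lo, hi, r1 = add_double_word(0xFF, 0xFF, 0xFE01, carry_in=0)
lo, hi, r2 = add_double_word(lo, hi, 0x0200)
rt = r1 + r2    # the two carries are summed and written back into Rt
```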

Furthermore, as soon as the values of P1 and P2 are known, the counter j is incremented by one (step 122) and the process returns to test 117. As long as the answer to this test j = u is negative, new calculations of P1 and P2 are carried out, which causes new values to be written to the following locations of the shift memory 104, pointed at by the output of the adder 103.

When test 117 becomes positive, that is to say at the end of a cycle on Xi, the contents of the shift memory 104 are shifted by one word toward the least significant bits at step 125, and the last value written into the memory Rt is copied into the least significant bits of the location Cu (the most significant word of the shift memory 104) at step 126. Cycle i is then finished and a new cycle begins with the next word Xi.

When all the cycles have been carried out, the shift memory contains the Montgomery modular product of the numbers x and a.
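The complete flow of FIG. 2 can be sketched in software as follows. This is an illustrative model, not the hardware of the patent. It assumes a normalized modulus whose least significant word is 2^w - 1, a shift memory of u+1 words C0..Cu, and a pending carry that enters the next addition at the weight of the upper word. The function name is hypothetical. The sketch returns a value congruent to x·a·2^(-w·u) modulo m, without any final subtraction.

```python
def montgomery_multiply(x, a, m, u, w=8):
    """Word-serial Montgomery product following steps 115 to 126 of FIG. 2."""
    mask = (1 << w) - 1

    def split(n):
        return [(n >> (w * k)) & mask for k in range(u)]

    X, A, M = split(x), split(a), split(m)
    assert M[0] == mask, "sketch assumes a normalized modulus (low word all ones)"
    C = [0] * (u + 1)                          # shift memory 104, words C0..Cu
    for i in range(u):                         # one cycle per word Xi
        rt = 0                                 # steps 115/116: clear Rt, j = 0
        for j in range(u):                     # test 117: loop until j == u
            p1 = A[j] * X[i]                   # step 118
            if j == 0:                         # test 119
                qi = (C[0] + (p1 & mask)) & mask   # step 120
            p2 = M[j] * qi                     # step 121
            # step 123: (Cj+1, Cj) + P1 + pending carry
            s = ((C[j + 1] << w) | C[j]) + p1 + (rt << w)
            C[j], C[j + 1], rt = s & mask, (s >> w) & mask, s >> (2 * w)
            # step 124: (Cj+1, Cj) + P2, the carries are summed in Rt
            s = ((C[j + 1] << w) | C[j]) + p2
            C[j], C[j + 1] = s & mask, (s >> w) & mask
            rt += s >> (2 * w)
        C = C[1:] + [rt]                       # steps 125 and 126
    return sum(word << (w * k) for k, word in enumerate(C))


m = 0xF1FF                                     # low word 0xFF: normalized for w = 8
c = montgomery_multiply(0x1234, 0xABCD, m, u=2)
```

The low word C0 is zero at the end of each cycle by construction of Qi, so the one-word shift discards no information; it performs the division by 2^w that characterizes the Montgomery product.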

The variant of FIG. 3 differs from the calculation unit described with reference to FIG. 1 in the structure of the adder and in the way the adder points at the shift memory: word by word, rather than in groups of two words.

All the means for calculating the products P1 and P2, which are written in succession into the registers 108 and 113, are identical to those of FIG. 1 and will therefore not be described again. The memory 104 is also identical, except that it is "pointed" at word by word: the adder reads only one word of the shift memory at a time and modifies only that same word at the end of a calculation phase.

In this variant, however, the contents of the buffer registers 108 and 113 are handled differently, in that the product P1 is split into two words, a word L1 containing the least significant bits and a word H1 containing the most significant bits. Likewise, the product P2 is split into two words, a word L2 containing the least significant bits and a word H2 containing the most significant bits.

The word of the memory 104 pointed at by the adder 103 at a given moment is denoted Cj. These various words are fed to a set 112 of five input registers of the adder, which form a "queue". However, as will be seen below with reference to FIG. 4, the words L1, L2 (least significant) of a cycle j are added to the word Cj of the memory 104 together with the words H1 and H2 of cycle j-1. This is set out in the flowchart of FIG. 4.

At the beginning of a processing cycle with Xi constant, a counter is initialized to j = 0 (step 130), and the carry Rt is set to zero, as are the values H1 and H2 held in the registers 108 and 113.

A test 131 then checks whether j = u. If not, calculation of the product P1 begins (step 132), this being the product of Aj and Xi, which allows the corresponding values of L1 and H1 to be written into the register 108. At the same time, test 133 checks whether j = 0. If so, step 134 follows, in which the value of Qi for cycle i is calculated as in the previous example. This calculation can be performed as soon as the value L1 of the product A0Xi is known at step 132.

From then on, and for all subsequent values of j, it is possible to calculate (step 135) the succession of products P2, that is to say to obtain the values L2 and H2 that are written into the register 113. As before, the products L1, H1 and L2, H2 can be calculated "in parallel", independently of the operation of the adder 103.

As regards the adder, at each negative test 133 the current values of L1 and L2 are updated (step 136) for the adder 103: the words L1 and L2 held in the registers 108 and 113 are transferred to the corresponding registers of the set 112, while the words H1 and H2 are retained. The L1 and L2 registers are thus updated, but not the registers of the set 112 that hold H1 and H2 calculated in the previous cycle.

An addition is then performed at step 137, which consists of summing the values held in the set 112 feeding the adder 103, namely Cj, the values L1 and L2 of cycle j, the values H1 and H2 of the previous cycle, and the value held in the memory Rt. The result is written back to the location Cj of the shift memory, and the carry Radd of this addition is written into the memory Rt.

Step 138 follows, in which the current values of H1 and H2 are updated: the words H1 and H2 held in the registers 108 and 113 are transferred to the corresponding registers of the set 112. The counter j is incremented by one at step 139 and the process returns to test 131, where the same operations are repeated for cycle j + 1.

When test 131 becomes positive, step 140 follows, in which the value of Cu is calculated in the same way as Cj (step 137) but without L1 and L2, that is to say by adding Cu, H1, H2 and Rt; the carry Radd of this addition is written into Rt. At step 141, the contents of the shift memory 104 are shifted by one word toward the least significant bits. At step 142, the last value written into the memory Rt is copied to the location Cu (most significant bits) of the shift memory 104. Cycle i is then finished and a new cycle begins with the next word Xi.
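The word-by-word variant of FIG. 4 can be modeled in the same way. The sketch below is illustrative only; it assumes a normalized modulus whose least significant word is 2^w - 1 and a shift memory of u+1 words, and the function name is hypothetical. Each addition touches a single word Cj and combines the low halves of the current products with the high halves kept from rank j-1.

```python
def montgomery_multiply_wordwise(x, a, m, u, w=8):
    """Word-by-word Montgomery product following steps 130 to 142 of FIG. 4."""
    mask = (1 << w) - 1

    def split(n):
        return [(n >> (w * k)) & mask for k in range(u)]

    X, A, M = split(x), split(a), split(m)
    assert M[0] == mask, "sketch assumes a normalized modulus (low word all ones)"
    C = [0] * (u + 1)                          # shift memory 104, words C0..Cu
    for i in range(u):
        rt = h1 = h2 = 0                       # step 130
        for j in range(u):                     # test 131
            p1 = A[j] * X[i]                   # step 132: gives L1 and H1
            l1 = p1 & mask
            if j == 0:                         # test 133
                qi = (C[0] + l1) & mask        # step 134
            p2 = M[j] * qi                     # step 135: gives L2 and H2
            l2 = p2 & mask
            s = C[j] + l1 + l2 + h1 + h2 + rt  # step 137: H1, H2 are from j-1
            C[j], rt = s & mask, s >> w
            h1, h2 = p1 >> w, p2 >> w          # step 138
        s = C[u] + h1 + h2 + rt                # step 140: no L1, L2 at rank u
        C[u], rt = s & mask, s >> w
        C = C[1:] + [rt]                       # steps 141 and 142
    return sum(word << (w * k) for k, word in enumerate(C))


m = 0xF1FF
c = montgomery_multiply_wordwise(0x1234, 0xABCD, m, u=2)
```

Compared with the double adder, this arrangement needs only single-word reads and writes of the shift memory, at the cost of holding H1 and H2 for one extra step.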

Still other variants can be envisaged.

In particular, a single multiplier could be used, programmed to carry out a first series of calculations in the loop on j, for example to calculate all the products AjXi, and then a second series of calculations in another loop on j, this time to calculate all the products QiMj.
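One cycle i of this single-multiplier variant might look like the sketch below. The patent gives no detail for this variant, so the carry handling, the helper `mul_accumulate` and the function `cycle_one_multiplier` are all assumptions made for illustration; the normalized modulus is again assumed to have a least significant word equal to 2^w - 1.

```python
def mul_accumulate(C, words, scalar, w=8):
    """Add scalar * words into C in place, one word product at a time.

    Returns the carry out of the most significant word of C.
    """
    mask = (1 << w) - 1
    carry = 0
    for j, word in enumerate(words):
        s = C[j] + word * scalar + carry
        C[j], carry = s & mask, s >> w
    for k in range(len(words), len(C)):        # propagate into the upper words
        s = C[k] + carry
        C[k], carry = s & mask, s >> w
    return carry


def cycle_one_multiplier(C, xi, A, M, w=8):
    """One cycle i using a single multiplier in two successive loops on j."""
    mask = (1 << w) - 1
    qi = (C[0] + ((A[0] * xi) & mask)) & mask  # Qi from C0 and Low(A0*Xi)
    carry = mul_accumulate(C, A, xi, w)        # first loop: all Aj*Xi
    carry += mul_accumulate(C, M, qi, w)       # second loop: all Qi*Mj
    return C[1:] + [carry]                     # shift by one word, carry into Cu
```

The trade-off is one multiplier fewer against roughly twice the number of multiplication steps per cycle, since the two product series can no longer overlap.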

It should also be noted that the words of the numbers a, x, m and c may be of different sizes; the invention can still be implemented.

Claims

1. Cryptography method including the calculation of a Montgomery modular multiplication (*) between two numbers (x, a), characterized in that said numbers and a modulus (m) are expressed in words, said words being stored in corresponding memories, and a shift memory is defined in which the results of cycles of multiplications and additions between words are accumulated word by word until the result of said modular multiplication is recorded in said shift memory, and in that, for each word (Xi) of the succession of words of a first of said numbers, a cycle of operations is carried out consisting of: calculating, for the whole cycle (i) considered, a word called the reduction coefficient (Qi); multiplying said word (Xi) of said first number corresponding to said cycle (i) each time by one of the words (Aj) constituting said second number, to obtain a succession of first intermediate results; multiplying said reduction coefficient (Qi) corresponding to said cycle (i) each time by one of the words (Mj) constituting said modulus, to obtain a succession of second intermediate results; and adding said first and second intermediate results to the words of the same ranks already contained in said shift memory and, at the end of each cycle (i), shifting the contents of said shift memory by one word toward the least significant bits.

2. Method according to claim 1, characterized in that said reduction coefficient (Qi) consists of the sum of the first word contained in said shift memory and the least significant bits of the product of the first word (A0) of said second number and the word (Xi) of said first number considered in said cycle.

3. Method according to claim 1 or 2, characterized in that the two multiplications of words (XiAj, QiMj) are performed substantially at the same time.

4. Method according to one of the preceding claims, characterized in that it consists, for each cycle, of adding one of said intermediate results to the word of the same rank and to the word of the next higher rank contained in said shift memory, writing the results of these additions back to these same ranks of said memory, and then repeating these operations with the other intermediate result, until the end of said cycle, which includes the aforementioned shift by one word.

5. Method according to claim 4, characterized in that it consists, during a given cycle, of propagating the carry of said additions from word to word and, at the end of a cycle, of copying the last carry into the least significant bits of the most significant word (Cu) of said shift memory.

6. Method according to one of claims 1 to 3, characterized in that it consists, for each cycle, of splitting each intermediate result into two groups containing respectively the least significant bits and the most significant bits, storing these groups temporarily, adding to the value of each word of said shift memory, as the calculation proceeds, the groups containing the least significant bits of the same rank (j) and the groups containing the most significant bits of the immediately lower rank (j-1), and writing the result of these additions, each time, to the location of the word (j) considered of said shift memory, until the end of said cycle, which includes the aforementioned shift by one word.

7. Electronic entity comprising cryptographic means including a unit for calculating a Montgomery modular multiplication between two numbers, characterized in that said calculation unit comprises: three memories sized to receive, respectively, said numbers (a, b) and a modulus (m), each expressed in words of predetermined size in the corresponding memory; a shift memory (104) sized to receive the result of said modular multiplication and arranged to shift its own contents, each time by one word of said predetermined size, in the direction of the least significant bits; a first multiplier (100) for forming successive products of two words read respectively from the memories of the two numbers; a calculator (102) of a reduction coefficient determined at the beginning of a cycle of multiplications of the words of one number by one word of the other; a second multiplier (101) for forming successive products of said reduction coefficient with a word read from the memory of said modulus; and an adder (103) whose inputs are connected to the outputs of said first and second multipliers and to said shift memory and whose output is connected to said shift memory, directly addressing selected words of it.

8. Electronic entity according to claim 7, characterized in that said adder (103) is a double adder arranged to perform a first addition between an intermediate result produced by said first multiplier and two neighboring words (Cj, Cj+1) of corresponding ranks of said shift memory and to write the result to the two corresponding word locations of that memory, and then to perform a second, similar addition using the intermediate result produced by said second multiplier and to write the result to the same two corresponding word locations of said shift memory.

9. Electronic entity according to claim 7, characterized in that it comprises buffer registers (108, 113) interposed between said multipliers and said adder.

10. Electronic entity according to claim 9, characterized in that said buffer registers are arranged to receive separately the least significant parts and the most significant parts of the outputs of the two multipliers, the adder being arranged to write the result to one corresponding word location of said shift memory.