CA2446291A1

CA2446291A1 - Device and method for signing, marking and authenticating computer programs

Info

Publication number: CA2446291A1
Application number: CA002446291A
Authority: CA
Inventors: Michel Riguidel; Patrick Cousot; Arnaud Venet
Original assignee: Individual
Current assignee: Thales SA
Priority date: 2001-05-04
Filing date: 2002-04-23
Publication date: 2002-11-14
Also published as: WO2002091141A1; FR2824402A1; EP1390831A1

Abstract

The invention relates to a computer program product and a method for inserting watermarks into the source code of a software program, particularly Java; the watermarks respect the semantics of the program and are very difficult to detect. The invention can be used to: compute a secret semantic signature of a software or hardware computer program, chosen from among an infinite number of possible secret semantic signatures; mark a software or hardware computer program by watermarking, inserting a visible or invisible mark from which an authenticator of the original program can be recovered; and find the mark and extract that authenticator using the secret semantic signature of the watermarked program. The secret semantic signature is characteristic of the semantics of the program to be protected. The mark, inserted by watermarking into an original software or hardware computer program, can only be identified by finding the secret semantic signature of the watermarked program, which requires knowing the secret (or computing power beyond the capabilities of computer hardware). The mark withstands detection and scrubbing methods without affecting the performance of the program to be protected.

Description

DEVICE AND METHOD FOR SIGNING, MARKING AND AUTHENTICATING COMPUTER PROGRAMS
The present invention belongs to the field of devices and methods for preventing and/or auditing a use of a computer program that is not authorized by its author, publisher or distributor.
Prevention covers devices and methods that verify that the person or machine seeking to use the program in a certain way holds the necessary rights. Many devices and methods of this category have been proposed. In particular, US Patent 6,108,420 discloses a method and a device for producing a cryptographic fingerprint containing the data of the license granted to a user or a class of users, and then encrypting this fingerprint, which is attached to the program to be protected.
The drawback of these devices and methods is that they assume the cooperation of the user, who must neither communicate the license data to other users nor delete the part of the program that contains these identification data, a part that is easy to spot in source code because it is non-functional.
Auditing covers devices and methods that modify the program in a way that is characteristic of the individual copy, that is not easily detectable by the user, and that does not alter the program's functionality. In particular, US Patent 5,559,884 discloses a method and a device for modifying the order of execution of blocks of the program in a way that is characteristic of the copy.
The drawback of devices and methods of this type is that they require inserting call and return instructions into those blocks; such instructions are easy to locate automatically and affect the performance of the program to be protected.
The object of the present invention is to remedy the drawbacks of this second type by disclosing a device and a method for computing a signature of the software to be protected that is characteristic of the program, that resists detection methods without affecting the performance of the program to be protected, and that allows the software to be marked secretly, so that the holders of the secret can identify the mark and the signature.

To these ends, the invention discloses a computer program product for processing the instructions of a computer program in source code, characterized in that it comprises a module for choosing, according to predefined criteria, the instructions of said software to which a transcoding program will be applied, and a module for choosing, from among several, the secret transcoding method to be applied to said instructions.
Among the preferred embodiments, the invention also discloses a computer program product of the above type whose secret transcoding method produces a semantic signature of the software.
According to a variant of the invention, the computer program product of the above type comprises a module for inserting the transcoded instructions into the software.
The invention also discloses a method for processing the instructions of software in source code, comprising a step of choosing, according to predefined criteria, the instructions of said program to which a transcoding program is applied, and a step of choosing, from among several, the secret transcoding method to be applied to said instructions; a preferred embodiment of the method in which the secret transcoding method produces a semantic signature of the software; and a variant of the invention in which the method further comprises a step of inserting the transcoded instructions into the software.
The invention will be better understood, and its various characteristics and advantages will become apparent, from the following description, the example embodiments and the appended figures, in which:
- Figure 1 shows a block diagram of the device and method for choosing the transcoding method and the instructions transcoded in the program;
- Figure 2 shows the diagram of the device and method in one of its variant embodiments;
- Figure 3 shows the block diagram of the device and method for decoding a program modified according to the principle of Figure 1;

- Figure 4 shows the diagram of the device and method for decoding a program modified according to the device and method of Figure 2.
The invention finally discloses a computer program product and a method for recognizing the signature of the software.
In the claims, the description and the drawings, the expressions below are used with the meanings indicated:
- A software protection/prevention method is a set of techniques that make the copying and fraudulent use of software more difficult.
- The compilation of a program is its translation into another language.
- A computer program is a program that is interpretable, or compilable into an interpretable program.
- A program is written in a certain programming language, called the computer language of the program.
- The interpretation of a program is the translation of the sequence of words composing it into a sequence of actions.
- Labeling a document or a piece of code consists of including visible or invisible marks, separate from the content of the object, that identify the software object through fields (designation, author's name, recipient's name, terms of the license), together with a final digital-signature field that binds the label completely to the content of the object and thereby guarantees the object's integrity.
- Watermarking consists of inserting visible or invisible marks embedded in the body of the object. These marks, spread throughout the body of the object, are if not undetectable then at least indelible for anyone who does not know the secret key that generated the underlying pattern. A watermark is information hidden in digital data that does not change its meaning.
- Watermarking a piece of code consists of transforming it into equivalent code without modifying its semantics, adding hidden information that can be recovered by means of a secret called the key.

- Two pieces of software code are semantically equivalent if they have the same observable behavior, that is, for example, if for every possible input the outputs of the program are the same.
- A signature method consists of attaching to an object a characteristic data segment (digest), obtained by means of a secret method.
- A scrubbing technique is an attempt to erase the mark without changing the semantics.
- A program, written in a particular language (for example Java), represents a sequence of instructions operating on the state of the computer system.
- The state of the system at a time t consists of the values of the program variables and of the system variables (input/output stream queues).
- The associated system state is a set of variables made up of the existing variables and/or additional variables.
- A semantics is a mathematical model defining the set of possible behaviors of a program at run time, at a certain level of observation.
- Obfuscation is the transformation of a program into a semantically equivalent program in a form that is difficult for a programmer to understand.
- Static semantic analysis is the automatic determination of semantic properties of programs.
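The definitions of code watermarking, key and semantic equivalence can be illustrated by a minimal sketch. It is only a toy and not the method of the invention: it hides bits in a purely syntactic choice between `v + k` and `v - (-k)`, which is precisely the kind of mark the description later criticizes as easy to detect. The five-line program, the function names `embed`, `extract` and `run`, and the use of an integer seed as the key are all assumptions made for the illustration.

```python
import random

# Hypothetical five-statement program: each line adds a constant.
ORIGINAL = [
    "a = x + 3",
    "b = a + 5",
    "c = b + 7",
    "d = c + 11",
    "y = d + 13",
]

def embed(lines, bits, key):
    """Hide `bits` in key-selected lines: 'v + k' encodes 0, 'v - (-k)' encodes 1."""
    slots = random.Random(key).sample(range(len(lines)), len(bits))
    out = list(lines)
    for bit, i in zip(bits, slots):
        if bit:
            target, expr = out[i].split(" = ")
            var, const = expr.split(" + ")
            out[i] = f"{target} = {var} - (-{const})"
    return out

def extract(lines, nbits, key):
    """Recover the bits; the key determines which lines to read and in what order."""
    slots = random.Random(key).sample(range(len(lines)), nbits)
    return [1 if " - (-" in lines[i] else 0 for i in slots]

def run(lines, x):
    """Execute the toy program on input x and return its output y."""
    env = {"x": x}
    exec("\n".join(lines), {}, env)
    return env["y"]

MARKED = embed(ORIGINAL, [1, 0, 1], key=2001)
```

Running `run(MARKED, x)` and `run(ORIGINAL, x)` on the same inputs gives the same outputs, which is the observable-behavior criterion of semantic equivalence, while `extract(MARKED, 3, key=2001)` returns the hidden bits only because the same key reproduces the same line selection.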
To secure any digital document (image, sound, text, program, etc.), the classical techniques of cryptology can be used (electronic signatures, etc.). The security rules that apply to the document for the authorized user are attached to the content to be secured (for example, the software license data, although these rules may be more personalized and more complex). These rules are written in another digital document, generally a separate one such as a header or a label.
The document may or may not be protected by cryptographic methods (encryption of the content, for example). The label is protected by cryptographic techniques. The whole is generally bound together by cryptographic mechanisms (an electronic signature, for example).
The drawback of these devices and methods is that they assume the cooperation of all parties in the distribution chain of trust. None of them may disclose the content of the label, or modify or delete the label, whether it is outside or inside the program. No one may impersonate a legitimate user.
The identification data of a program, even if they depend on the time, the place and the context, can be diverted from their proper context.
These methods are universal and work whatever the nature of the document (text, signal, image, program, etc.).
They have a major flaw. They guarantee the content "bit for bit":
- they make no distinction between minute and profound modifications;
- they make no distinction between a modification that alters the semantic or aesthetic content of the original document and a modification that changes the "meaning" of this document.
Traditional cryptographic techniques simply grind any content into a digital "flour". The grinding method does not depend on the nature, format, syntax or semantics of the image or text.
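The "bit for bit" limitation can be made concrete with a short sketch. It assumes SHA-256 as a stand-in for the classical cryptographic techniques mentioned above, and three hypothetical versions of a small function named `area`: the original, a copy with renamed parameters (same behavior), and a copy whose operator was changed (different behavior).

```python
import hashlib

original = "def area(width, height):\n    return width * height\n"
renamed = "def area(w, h):\n    return w * h\n"                       # same semantics
tampered = "def area(width, height):\n    return width + height\n"   # different semantics

def digest(source):
    """SHA-256 digest of the source text, computed over its UTF-8 bytes."""
    return hashlib.sha256(source.encode("utf-8")).hexdigest()

def behaviour(source, inputs):
    """Outputs of the `area` function defined by `source` on each input pair."""
    namespace = {}
    exec(source, namespace)
    return [namespace["area"](w, h) for w, h in inputs]

inputs = [(1, 2), (3, 4), (5, 0)]
for name, src in [("original", original), ("renamed", renamed), ("tampered", tampered)]:
    print(name, digest(src)[:16], behaviour(src, inputs))
```

All three digests differ, yet only the third version behaves differently on these sample inputs: the digest reports that something changed, but not whether the change is harmless renaming or a change of meaning.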
The security of software depends on the security policy enforced by the software owner. In general, the aim is to guarantee:
- the availability of the software: preventing copying by pirates (for resale or illicit use); preventing unauthorized use from an original physical medium (CD-ROM);
- the confidentiality of the software: preventing the software from being understood (confidentiality of the algorithms of the source software, so as not to disclose the secret of the program's content, since knowledge of the algorithms enables understanding, similar rewriting, modification, etc.);

- the integrity of the software: preventing modification of the software, whether by changing only the syntax without changing the semantic content (renaming variables, adding useless instructions, etc., as "obfuscators" for Java programs do, for example) or by changing its content (adding a virus, adding or removing a "patch", ordinary modification, borrowing a portion, etc.);
- authentication, to guarantee the origin and the content of the software (a seal at a certain date): to establish that one watermark predates another, a security infrastructure is used (Value-Added Trusted Third Party).
Software watermarking methods modeled on the watermarking of other types of objects (images and sounds) are doomed to fail. Computer code is, by nature, completely different from audio, video and image objects. For the latter, a slight loss of information does not change the meaning, because our sensory receptors are imperfect. This is not the case for computer code, which tolerates only lossless compression. A modification, however minute, can make it non-functional. Only modifications that take the semantics into account are valid, for example the machine-code optimizations performed by compilers, or obfuscation.
The drawback of existing software watermarking methods and devices is that they treat the code as a syntactic object and do not exploit its semantics. For this reason, it is easy to find the watermark by syntactic analysis of the program and to remove it by syntactic transformation.
In addition, the general software watermarking methods developed so far induce modifications of the program that are easy to spot. Inserted jump and return instructions, for example, can be located automatically. As for the addition of graph structures, these structures are also easy to identify in code that is not fully compiled. They also slow down the operation of the program significantly.
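The weakness of a purely syntactic mark can be sketched as follows. The example assumes a hypothetical mark hidden in identifier names and removes it with a canonical renaming pass built on Python's standard `ast` module (`ast.unparse` requires Python 3.9 or later). The class `CanonicalRenamer` and the helper names are illustrative only, and the pass is deliberately naive: it assumes a function that uses no globals or builtins.

```python
import ast

# Hypothetical marked program: the mark lives in the identifier names.
MARKED = (
    "def area(w_thales, h_2001):\n"
    "    total_mark = w_thales * h_2001\n"
    "    return total_mark\n"
)

class CanonicalRenamer(ast.NodeTransformer):
    """Rename every parameter and variable to v0, v1, ... in order of first use."""

    def __init__(self):
        self.mapping = {}

    def _canonical(self, name):
        if name not in self.mapping:
            self.mapping[name] = f"v{len(self.mapping)}"
        return self.mapping[name]

    def visit_arg(self, node):
        node.arg = self._canonical(node.arg)
        return node

    def visit_Name(self, node):
        node.id = self._canonical(node.id)
        return node

def scrub(source):
    """Return semantically equivalent source with canonical identifier names."""
    tree = CanonicalRenamer().visit(ast.parse(source))
    return ast.unparse(tree)

def call_area(source, w, h):
    """Define `area` from `source` and call it."""
    namespace = {}
    exec(source, namespace)
    return namespace["area"](w, h)

SCRUBBED = scrub(MARKED)
print(SCRUBBED)
```

The scrubbed program computes the same result as the marked one, but every identifier that carried the mark is gone. A mark tied to the semantics of the program is intended to survive this kind of transformation.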
For software security, prevention covers devices and methods that verify that the person or machine seeking to use the program in a certain way holds the necessary rights. Many devices and methods of this category have been proposed. These are software protection devices (whose use generally depends on a hardware device: smart card, "dongle") or devices that deter fraudulent use of software (label, banner, mark outside or inside a source program, machine-language instructions in the executable program to identify a context of use).
With the advance of new technologies and the growth in the number of Internet users, the protection of intellectual property is becoming a priority for software producers and vendors. Many protection devices and methods have been proposed. One family of devices works with specialized hardware.
Among software-based protection methods, a distinction is made between software watermarking and obfuscation. Watermarking methods fall into two families:
- so-called dynamic methods, which take into account the temporal dimension of the program's execution. The inserted mark can be obtained:
  - by reading the output of the program for a given input;
  - by reading the content of a variable during execution, for a given input;
  - by noting the order of execution of the instruction blocks. In particular, US Patent 5,559,884 discloses a method and a device for modifying the order of execution of blocks of the program in a characteristic way, by inserting call and return instructions;
- so-called static syntactic methods, with which the mark is inserted and read without executing the program. The mark can be obtained:
  - by studying the order of the instructions;
  - by examining the data used by the program (text, photos, sounds).
A description of all the existing methods is given in the article by Collberg and Thomborson, Software Watermarking: Models and Dynamic Embeddings (1998).
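As an illustration of the second dynamic technique listed above (reading the content of a variable during execution, for a given input), the following sketch uses Python's `sys.settrace` hook to read a local variable at the moment a function returns. The function `total_price`, its extra variable `mark`, the secret input and the helper `read_local` are all hypothetical; this is not a method taken from the cited patent or article.

```python
import sys

def total_price(quantities):
    """Return the sum of the quantities; `mark` is an extra watermark variable."""
    total = 0
    mark = 0  # never influences the returned value
    for position, q in enumerate(quantities, start=1):
        total += q
        mark += position * q
    return total

def read_local(func, args, name):
    """Run func(*args) and return the value of local variable `name` at return time."""
    captured = {}

    def tracer(frame, event, arg):
        if frame.f_code is not func.__code__:
            return None  # do not trace other frames
        if event == "return":
            captured[name] = frame.f_locals.get(name)
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        func(*args)
    finally:
        sys.settrace(previous)
    return captured.get(name)

SECRET_INPUT = [7, 3, 9]  # known only to the party who reads the mark
print(read_local(total_price, (SECRET_INPUT,), "mark"))
```

On the secret input the variable ends with the value 1*7 + 2*3 + 3*9 = 40, which plays the role of the mark. The example also shows the weakness of such schemes: `mark` is dead code with respect to the result, so an optimizer or an attacker can remove it without changing the program's outputs.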

8 L'invention appartient à une troisième catégorie, celle des dispositifs et méthodes statiques sémantiques.
Dans cette catégorie, on peut trouver des signatures qui auront un sens arbitraire mais qui resteront néanmoins cachées car le logiciel comporte lui-même une part importante de texte non structuré.
Mais dans son mode de réalisation préféré, le dispositif proposé s'applique de manière spécifüque aux programmes logiciels qui ont une sémantique donnée, c'est à dire les programmes écrits dans un langage de programmation (« programming languages ») : il est particulièrement o applicable aux programmes interprétés (Java, PostScript, etc) car ces programmes, une fois écrits en langages machines, sont facilement déchiffrables par de la rétro-ingénierie dans leur forme originelle, compréhensible par un informaticien. Mais il est bien entendu que ce dispositif est aussi applicable aux programmes écrits en langages ~5 compilables (C, C++, Basic, Ada, Pascal, VHDL, Estérel, ...) pour authentifier et contrôler l'origine, le contenu ou la destination.
Le dispositif dans son mode de réalisation préféré dépend de la sémantique du langage. Ainsi, les tatoueurs Java, VHDL, etc sont différents.
Ils peuvent avoir des parties similaires (pour la sëmantique de leurs 20 opérations arithmétiques) mais parfois spécifiques (Les pointeurs en langage C). Le dispositif est applicable aux langages à balises pour les parties qui contiennent des programmes (« des scripts »). Pour XML, les autres parties relèvent des procédés de tatouages d'images (modifications imperceptibles de ta présentation sur écran ou papier). Le dispositif est également 25 applicable aux parties qui comportent des opérations (programmes « macros » pour « centrer », « justifier », etc).
Dans son mode de réalisation préféré, l'invention s'appuie sur les principes de l'analyse sémantique.
L'analyse sémantique a été appliquée avec succès à la 3o certification des programmes à exigence draconienne de fiabilité.
(référence 1 : Abstract Interpretation : a unified lattice modal for static analysis of programs by construction or approximations of fixpoints, P.Cousot & R.
Cousot January 17-19, 1977 référence 2 : P.Lacari, J.N. Montfort, Le Vinh Quy Ribal, A. Deutsch, and F. Gouthin. The software reliability verification 35 process : The Ariane 5 example. In Proceedings DA/A 98 - Data Systems in 8 The invention belongs to a third category, that of semantic static devices and methods.
In this category, one can find signatures that have an arbitrary meaning but nevertheless remain hidden, because the software itself contains a large amount of unstructured text.
In its preferred embodiment, however, the proposed device applies specifically to software programs that have a given semantics, that is, programs written in a programming language. It is particularly applicable to interpreted programs (Java, PostScript, etc.) because these programs, once written in machine language, are easily deciphered by reverse engineering back into their original form, which a programmer can understand. The device is of course also applicable to programs written in compiled languages (C, C++, Basic, Ada, Pascal, VHDL, Esterel, ...) in order to authenticate and check their origin, content or destination.
The device in its preferred embodiment depends on the semantics of the language. The watermarkers for Java, VHDL, etc. are therefore different.
They may have similar parts (for the semantics of their arithmetic operations) but also language-specific parts (pointers in C). The device is applicable to markup languages for the parts that contain programs ("scripts"). For XML, the other parts fall under image-watermarking processes (imperceptible modifications of the presentation on screen or on paper). The device is also applicable to the parts that involve operations ("macro" programs to "center", "justify", etc.).
In its preferred embodiment, the invention is based on the principles of semantic analysis.
Semantic analysis has been successfully applied to the certification of programs with draconian reliability requirements.
(reference 1: Abstract Interpretation: a unified lattice model for static analysis of programs by construction or approximation of fixpoints, P. Cousot & R. Cousot, January 17-19, 1977; reference 2: P. Lacari, J.N. Montfort, Le Vinh Quy Ribal, A. Deutsch, and F. Gouthin. The software reliability verification process: The Ariane 5 example. In Proceedings DASIA 98 - Data Systems in

Aerospace, Athens, Greece. ESA Publications, SP-422, 25-28 May 1998).
To prove that a program runs without failure in its operational context, one must in theory study all possible executions of that program. The work cited above showed that the proof of correct operation can be extended to the set of all possible executions. A program is considered to be built from a set of interconnected nodes. The instructions link the nodes together and thus cause state transitions, which take the form of a change in one or more variables and/or a change of node.
When a semantic analysis of a program is performed by abstract interpretation, one computes by iteration [cf. "Abstract Interpretation: A unified lattice model for static analysis of programs by construction or approximation of fixpoints", P. Cousot & R. Cousot, January 17-19, 1977], for each variable and for each node, a superset of the set of values that the variable can take at that node.
Finding these supersets amounts to solving a fixed-point equation, an approximation of which is obtained by iteration. The examples will show iteration methods in specific cases.
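As an illustration of such an iteration, here is a minimal sketch in Java. It is not taken from the patent: the analysed loop (i = 1; while (i < limit) i = i + 1), the interval representation and the names IntervalFixpoint and loopHeadInterval are assumptions chosen for the example. The iteration is repeated until the interval of possible values of i at the loop head stops growing, which is the fixed point.

```java
import java.util.Arrays;

/** Interval analysis of the hypothetical loop: i = 1; while (i < limit) { i = i + 1; } */
class IntervalFixpoint {
    /** Returns {lo, hi}, a superset of the values i can take at the loop head. */
    static int[] loopHeadInterval(int limit) {
        int lo = 1, hi = 1;                      // abstract state after "i = 1"
        while (true) {
            // Largest value that passes the test i < limit.
            int guardedHi = Math.min(hi, limit - 1);
            int newLo = lo, newHi = hi;
            if (lo <= guardedHi) {               // the loop body is reachable
                // Join the current interval with the incremented values.
                newHi = Math.max(hi, guardedHi + 1);
            }
            if (newLo == lo && newHi == hi) {
                return new int[] {lo, hi};       // nothing changed: fixed point
            }
            lo = newLo;
            hi = newHi;
        }
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(loopHeadInterval(10))); // prints [1, 10]
    }
}
```

This plain iteration needs as many rounds as the loop has iterations; widening and narrowing operators are the usual way to shorten it.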
The problem the inventor sets out to solve follows the same approach: if the instructions added to the program to be protected belong to a subset of the invariants of that program, then those instructions will execute without error and will not alter the program's functionality. Several relevant subsets exist. It is advisable to choose a subset that is particularly representative of the type of program to be protected.
Although other solutions can be envisaged, it is currently preferred that the instructions to be transcoded be selected according to predefined criteria, including the selection of subsets of program invariants as defined above.
Module (310) of Figure 1 performs this selection of the instructions to be transcoded, based on the definition of the characteristics of those instructions (operations on integers, operations on floats, variable initialization, conditional loop, branch, ...) and by applying at least partially predefined criteria. The parameterization of module (310) of the compiler (30) then allows the instructions to be transcoded to be selected automatically. Operator intervention will also be possible in order to select a self-contained part of the program to be signed semantically. In Java, for example, a program amounts to several classes made up of several methods, which may themselves call methods of other classes and other packages. For this reason, in this particular case we prefer to sign the smallest self-contained entity of a Java program: the method.
The parts of the program to be signed should be its innovative and/or sensitive parts. In general they concern the algorithmic part of the program.
The next step is to choose one transcoding method among several, by applying at least partially predefined criteria. This choice depends on how sensitive the information to be protected is. The transcoding will therefore be more or less complex, it being understood that it must respect the semantics of the instructions to be transcoded, that is, use the same vocabulary and the same grammar. Module (320) contains the possible transcoding methods for each language. A transcoding method can have a degree of sophistication matching the desired level of protection. For a given level of protection, an allocated number of possible parameters can be assigned to users or classes of users.
In this list, each transcoding method involves a secret that must be preserved. Knowing it would allow decoding and hence removal of the additional instructions, which would destroy any evidence of a later unauthorized copy.
A transcoding table is a mapping which, to every operation performed on the state of the system, associates another operation performed on the state of the associated system. Creating a secret semantics therefore amounts to creating a corresponding static analysis, which consists in defining the associated variables and the relation between the actions operating on the state of the system and those operating on the associated variables.
One can, for example, keep the same variables as those used by the program. The transcoding table then gives another interpretation of the program's instructions and assigns to the associated variables values other than those the initial variables would have taken. One can also consider one and only one associated variable V, whatever the initial variables. The sequence of program instructions is then interpreted as a sequence of actions on the variable V.
The invention takes into account all actions operating on the state of the environment of the computer program: such an action can affect the execution stack, modify the content of a system variable, call a function, trigger the reading of another variable, perform a logical or mathematical operation, create an object, send or receive an I/O stream, read an array element, ...
Using a secret transcoding table strengthens the signature and watermarking devices. Without this secret it is very difficult to find the same signature (by the classic cryptographic argument: one cannot reasonably try all possible combinations), and it is almost impossible to recover a mark.
The proposed semantic signature characterizes the algorithm developed as such, and makes it possible to find programs that have equivalent semantics.
The watermark is obtained by static analysis (after transcoding) of the program, which constitutes the key to our secret. Little noise is added, and the mark is therefore more robust.
The secret transcoding table makes the mark harder to detect.
At this stage, the signature of the software (10), as produced at the output of module (320), can be deposited with a Trusted Third Party (TTP). It therefore does not have to be inserted into the software (10).
The software is then authenticated with the TTP by comparing the signature produced from the software under analysis with the one that was deposited. In this case the software is not protected against copying, but any modifications it has undergone are detectable.
In a variant of the invention, the signature is inserted into the software (10) in the form of a mark.
Module (330) performs, on the one hand, the transcoding of the selected instructions with the chosen method and, on the other hand, the insertion of the transcoded instructions into the program to be protected. The main algorithm of module (330) aims to optimize the placement of the transcoded instructions.
It must be adapted to the types of instructions and of transcoding methods. Compilers normally contain tools for code optimization and for checking constant propagation.
The insertion of the transcoded instructions into the program to be protected must therefore follow a number of minimal robustness rules: one must ensure that the variables containing the watermarks influence the outputs of the program. In other words, to avoid outright removal by an optimizer, one must ensure that the content of the variable is used by an instruction of the initial program. Constants must also be concealed. To that end, one should avoid initializing a variable v with a constant value K followed by an instruction of the form w = f(v) at the start of the program, because an optimizer can then easily compute w directly. Loops, on the other hand, are a good choice: in a loop it is natural to initialize a variable (often to 1) and to increment it at each iteration. Suppose that within the loop i takes values from a to b. One could insert the instruction w = f(i) in the loop and compute f so that w takes the value of the key for some value of i between a and b.
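The loop idea can be sketched as follows. This is only an illustration under stated assumptions: the affine form of f, the hidden index i0 and the names LoopKey, f and keyAppears are hypothetical and do not come from the patent. The point is that w depends on the loop counter, so a constant-propagation pass cannot replace it by a literal, yet w equals the key at exactly one iteration.

```java
/** Sketch: a function of the loop counter that hits the key at one hidden iteration. */
class LoopKey {
    /** f is built so that f(i0) == key; slope is an arbitrary nonzero constant (assumption). */
    static int f(int i, int i0, int key, int slope) {
        return key + slope * (i - i0);
    }

    /** Runs a loop where i goes from a to b and reports whether w ever equals key. */
    static boolean keyAppears(int a, int b, int i0, int key, int slope) {
        boolean seen = false;
        for (int i = a; i <= b; i++) {
            int w = f(i, i0, key, slope);   // inserted instruction w = f(i)
            seen |= (w == key);
        }
        return seen;
    }

    public static void main(String[] args) {
        // The key 2507 is reached when i == 7, which lies between a = 1 and b = 10.
        System.out.println(keyAppears(1, 10, 7, 2507, 13)); // prints true
    }
}
```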
Once the semantic properties of the conjugate program have been determined, the instructions to be added to the initial program are computed through the inverse transcoding table. The added instructions must execute without error and must not alter the program's functionality. These transformations are iso-semantic: for any identical input, the initial and watermarked programs produce the same output. Their internal execution environments, however, are not identical.
The mark is therefore a set of semantic properties of the transcoded program. Our watermarking device thus consists in computing the instructions to be added to the initial program that yield these new properties in the hidden space.
The insertion of the transcoded instructions into the program to be protected must follow a number of minimal robustness rules: one must ensure that the variables and operations containing the watermarks appear to influence the outputs of the program. For example, to avoid outright removal by an optimizer, one must ensure that the content of the variable is used by an instruction of the initial program (dynamic value allocation). It is important to note that observing the values taken by the variables during execution does not reveal the mark. On the one hand, the secret table must be known in order to build the conjugate program.
On the other hand, in most cases the semantic properties can only be computed by static analysis, not by step-by-step execution.
Figure 2 shows an exemplary embodiment.
The method (310) of Figure 2 is therefore applied to each method of the program (or at least to those whose algorithmic content is to be protected). The selection of the instructions to be transcoded takes into account the execution dynamics of Java code: objects are allocated "on the heap", that is, according to the usage context of processor, memory and input/output resources and, where applicable, of the other members of the network. Instructions operating on objects are therefore not recommended for watermarking, since the watermark must be independent of the context.
The watermarking must therefore bear on scalar values or on references to objects, which are the only operands allocated on the program stack.
The predetermined criterion may include choosing, as the instructions to be transcoded, operations on integers, floats or boolean values. For standard programs, operations on integers will be chosen. For computer-aided design, simulation or three-dimensional modeling programs, where operations on floating-point scalars dominate, those operations will be chosen instead.
Module (310) of Figure 2 is then used only to select the instructions (110), (130), (150) of the methods of the program (10) that involve operations on integers.
A transcoding that respects the semantics of the instruction then consists in performing the same operation modulo an arbitrary integer Nk less than 2^32. An alternative is to perform a different operation chosen from a permutation table, again in modular algebra. In this case, the main function of module (320) is to hold the table of secrets assigned to given classes of users for given programs.
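A minimal sketch of the first variant (same operation, taken modulo a secret integer) is given below. The class and method names are hypothetical, and the modulus 10000 is only a sample value; the sketch also assumes operands small enough that their product fits in a Java long.

```java
/** Sketch of a transcoding that maps each integer operation to the same operation modulo n. */
class ModularTranscoding {
    /** Transcoded form of v = a + b. */
    static long add(long a, long b, long n) {
        return Math.floorMod(a + b, n);
    }

    /** Transcoded form of v = a * b (assumes a * b does not overflow a long). */
    static long mul(long a, long b, long n) {
        return Math.floorMod(a * b, n);
    }

    public static void main(String[] args) {
        long n = 10_000;                          // sample secret modulus, an assumption
        // Original instruction: v = a * b + c with a = 2507, b = 5, c = -28.
        long v = add(mul(2507, 5, n), -28, n);
        System.out.println(v);                    // prints 2507
    }
}
```

Math.floorMod is used rather than the % operator so that the result is always in the range 0 to n-1, including for negative intermediate values.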
Module (330), which computes and inserts the transcoded instructions, is a sequence of instructions of the compiler (30) of the type described below.
We want to insert p marks into each selected method.
Let p numbers n1, n2, ..., np and p mark values c1 < n1, c2 < n2, ..., cp < np constitute the secret set.
The sequence of macro-instructions of Annex 1 is repeated for each method to be watermarked.
It is of course possible to repeat the process several times with different mark values.
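The polynomial-construction step of those macro-instructions (create a polynomial P of degree 2 with P(x) = c + k*n for a small random integer k) can be sketched as below. The way the coefficients are drawn, the coefficient ranges and the names MarkPolynomial, build and eval are assumptions made for the illustration; the patent only requires that P be an arbitrary polynomial of degree 2 satisfying the constraint.

```java
import java.util.Random;

/** Sketch: build a degree-2 polynomial P with P(x) = c + k*n, so that P(x) is c modulo n. */
class MarkPolynomial {
    /** Returns {a2, a1, a0} such that a2*x*x + a1*x + a0 == c + k*n for a small random k. */
    static long[] build(long x, long c, long n, Random rng) {
        long a2 = 1 + rng.nextInt(50);        // nonzero, so the degree is really 2
        long a1 = rng.nextInt(1000) - 500;    // arbitrary linear coefficient
        long k = rng.nextInt(5);              // small random integer
        long a0 = c + k * n - a2 * x * x - a1 * x;  // solve the constraint for a0
        return new long[] {a2, a1, a0};
    }

    /** Evaluates the polynomial at x using Horner's rule. */
    static long eval(long[] p, long x) {
        return (p[0] * x + p[1]) * x + p[2];
    }

    public static void main(String[] args) {
        long[] p = build(17, 2507, 10_000, new Random());
        System.out.println(Math.floorMod(eval(p, 17), 10_000L)); // prints 2507
    }
}
```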
It is also possible to improve the robustness of the watermark by using various techniques of modular algebra.
The strongest technique for tying the watermark to the program is to make the watermark variables interfere with the original method. This requires properties of the watermark variable that carry over to signed 32-bit arithmetic. This is possible when the key K is a power of 2. Indeed, suppose that K = 2^k. Then we have
X = v + a * 2^k in Z
We know, still in Z, that for all j < k
X % 2^j = v % 2^j
where x % y denotes the operation that returns the remainder of the Euclidean division of x by y. This property trivially remains true in Z/2^32, which is the domain of Java integers. We can therefore use arithmetic properties of the watermark to modify computations of the method. For example, suppose that K = 2^16 and v = 18. Then, whatever the value of x, we always have x % 4 = 2.
If, for example, the original program contains an explicit constant 1, it can be replaced by x % 4 - 1. If a dynamic has also been added to x, we have a variable that appears to take stochastic values but of which a hidden invariant is used. The program thus modified is irreversibly degraded while retaining its original semantics. A pirate who tried to eliminate the watermark variable x would then render the program unusable. A thorough study of the behavior of x is needed to recover the information thus concealed.
Note that this constant-concealment technique can easily be applied automatically and randomly.
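The example with K = 2^16 and v = 18 can be checked with the short Java sketch below. The class and method names are hypothetical. One point is an addition to the text rather than part of it: Java's % operator returns a negative remainder for negative operands, so the sketch uses Math.floorMod, which computes the Euclidean remainder that the text refers to.

```java
/** Sketch: hiding the constant 1 behind a watermark variable x congruent to 18 modulo 2^16. */
class HiddenConstant {
    /** Returns 1 for every x of the form 18 + a * 65536, including after 32-bit wrap-around. */
    static int hiddenOne(int x) {
        return Math.floorMod(x, 4) - 1;   // x % 4 in the Euclidean sense is always 2
    }

    public static void main(String[] args) {
        for (int a : new int[] {0, 1, -3, 40_000}) {
            int x = 18 + a * 65_536;      // may wrap around in 32-bit arithmetic
            System.out.println(x + " -> " + hiddenOne(x));
        }
    }
}
```

With the plain % operator, a negative x such as 18 - 3 * 65536 would give -2 instead of 2, so the replacement expression would no longer evaluate to 1.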
The generic method for reading the watermark is shown in Figure 3. It assumes knowledge of at least part of the watermarking parameters. Marks specific to each of several levels of a software distribution chain (purchaser, wholesaler, distributor, retailer) can thus be assigned.
In the exemplary embodiment of Figure 4, knowledge of the secret number or numbers Nk makes it possible to find, by step-by-step execution of the program in the compiler, the variables that exhibit a congruence modulo Nk at a given instant, and then to trace such a variable back to its initialization.
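The congruence test at the heart of this reading step might look like the sketch below. It is an assumption-laden illustration, not the procedure of Figure 4: it supposes that the successive values of a candidate variable have already been recorded during the stepwise execution in the compiler, and the names MarkReader and commonResidue are hypothetical.

```java
/** Sketch: test whether all recorded values of a variable share one residue modulo n. */
class MarkReader {
    /** Returns the common residue modulo n, or -1 if the values are not all congruent. */
    static long commonResidue(long[] trace, long n) {
        if (trace.length == 0) {
            return -1;                       // nothing recorded, no candidate mark
        }
        long residue = Math.floorMod(trace[0], n);
        for (long value : trace) {
            if (Math.floorMod(value, n) != residue) {
                return -1;                   // not invariant modulo n
            }
        }
        return residue;
    }

    public static void main(String[] args) {
        long[] trace = {2507, 12_507, 22_507};   // sample recorded values (assumption)
        System.out.println(commonResidue(trace, 10_000)); // prints 2507
    }
}
```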
The table of secrets can easily be held on a smart card connected to the computer that hosts the interpreter.
The authorized user needs to know only a public key, which activates the program that reads the secrets corresponding to his or her user identification. The private keys that define the table of secrets therefore do not have to be disclosed.
Two examples given in Annexes 2 and 3 illustrate the application of semantic static analysis to the authentication of the software signature, with two simple cases of computing the fixed points of the hidden variables.
These devices and methods can be implemented without difficulty on off-the-shelf computers. Depending on the complexity of the software and of the signature, execution times will be longer or shorter, in particular for computing the hidden variables by the fixed-point method.
To limit these computation times, the widening and narrowing methods known to those skilled in the art are applied.
A piece of software can be watermarked as many times as desired (superimposed watermarks), unlike audio or image watermarks, which suffer some saturation of the subliminal channel (depending on the perception model chosen). The software is not altered but overloaded: memory and time resources are affected.
Different secret semantics can be used for each of these watermarks, so that the chain of trust can hold different secrets.
Of course, this watermarking device and method can be combined with prior-art devices and methods to render the program unusable by an unauthorized user (prevention) or to trace any dissemination of the program by an authorized user to unauthorized users (audit). For this it suffices not to communicate to authorized users the keys that would let them recover the watermark.
The device and method can also be used to authenticate automatically the code that is allowed to enter a network or a given workstation. The watermark is comparable to an authentication certificate for which the monitoring station of the network or workstation holds the reading key.
Annex 4 presents a glossary.
Annex 5 presents an embodiment of the invention with a finer division into modules.
Annex 6 and Annex 7 present further exemplary embodiments (examples 3 and 4).

ANNEXES

1.1 Split the method into two blocks A and B of equal size.
1.2 For i from 1 to p do
  1.2.a find a random position in block A
  1.2.b list the variables that have a determined value at this point of execution - 2 cases
    case 1: there are none
      Create a variable w initialized to a value x. Insert this initialization between the start of the method and the current position.
    case 2: there is at least one
      Select one of these variables w, and let x be its value at the current execution position.
  1.2.c create an arbitrary polynomial P of degree 2 satisfying P(x) = ci + k*ni (k a small random integer)
  1.2.d insert the following initialization instruction: int vi = P(w)
  1.2.e find a random position in block B
  1.2.f create an arbitrary polynomial Q of degree 2 satisfying Q(vi) = vi + l*ni (l a small random integer)
  1.2.g insert the instruction vi = Q(vi)

Example 1
The watermarked method is the main method of the class Fibonacci, which computes the value of the nth term of the Fibonacci sequence, defined by the relations
u(n+2) = u(n+1) + u(n)
u(0) = 0
u(1) = 1

Initial program
public class Fibonacci {
    public Fibonacci() {
    }
    public static void main(String[] args) {
        int n = Integer.parseInt(args[0]);
        int a = 0;
        int b = 1;
        for (int i = 1; i < n; i++) {
            int c = a + b;
            a = b; // a is u(i)
            b = c; // b is u(i+1)
        }
        System.out.println("The value of the sequence for n=" + n + " is " + b);
    }
}

Choice of semantic correspondence tables
We choose to insert two mark values. For this we use two transcoding tables.
The first will insert the hidden value 2507, the second 3012.

Our two tables associate with each algebraic operation on integers the identical algebraic operation modulo a number N. With every integer value returned by a call to a method of type int, they associate an initialization to a value V. For the first table, N is 10000 and V is 17.
For the second, N is 5421 and V is 50.

1. Transcoding table for 2507

Initial instruction                                       Transcoded instruction
v (integer) = a (integer) + b (integer)                   v = a + b modulo 10000
v (integer) = a (integer) * b (integer)                   v = a * b modulo 10000
v (integer) is the (integer) return of a function call    v = 17

2. Transcoding table for 3012

Initial instruction                                       Transcoded instruction
v (integer) = a (integer) + b (integer)                   v = a + b modulo 5421
v (integer) = a (integer) * b (integer)                   v = a * b modulo 5421
v (integer) is the (integer) return of a function call    v = 50

Watermarking of the method
Watermarking our method consists of inserting two variables j and k, which take the values 2507 and 3012 respectively in our secret space.
Le tatouage consiste en deux étapes - Initialisation de j et k en fonction de n en début de programme. Cette initialisation nous permet de calculer la valeur de j et k dans notre espace conjugué, par contre les valeurs de j et k au cours de l'exécution du 2o programme sont inconnues. Cela nous permet d'éviter la transformation de ces instructions par un optimiseur.
- Ajout d'instructions dans l'algorithme assurant l'invariance de j et k dans notre espace secret. .
Ces instructions sont des calculs de j et k à l'aide de polynômes de degré 2.
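The construction of such polynomials (steps 1.2.c and 1.2.f of the annex algorithm) can be sketched as follows. This is only an illustrative sketch, not the inventors' implementation: the class name MarkPolynomials and its two helper methods are hypothetical, and the leading coefficients are simply those that appear in the watermarked Fibonacci program. Given freely chosen coefficients a and b, the constant term is solved for so that the polynomial hits the desired value modulo N.

```java
final class MarkPolynomials {
    /** Returns the constant term d such that a*x*x + b*x + d is congruent to target modulo n. */
    static long constantTerm(long a, long b, long x, long target, long n) {
        return Math.floorMod(target - a * x * x - b * x, n);
    }

    /** Evaluates a*x*x + b*x + d and reduces the result into 0..n-1. */
    static long evalMod(long a, long b, long d, long x, long n) {
        return Math.floorMod(a * x * x + b * x + d, n);
    }

    public static void main(String[] args) {
        // Initialization polynomial P for table 1: P(17) must be 2507 modulo 10000.
        long dP = constantTerm(-34, -500, 17, 2507, 10000);
        // Invariance polynomial Q for table 2: Q(3012) must be 3012 modulo 5421.
        long dQ = constantTerm(1, 0, 3012, 3012, 5421);
        System.out.println("P constant term: " + dP + ", Q constant term: " + dQ);
        System.out.println("P(17) mod 10000 = " + evalMod(-34, -500, dP, 17, 10000));
        System.out.println("Q(3012) mod 5421 = " + evalMod(1, 0, dQ, 3012, 5421));
    }
}
```

With these inputs the sketch yields the constant terms 833 and 201, which are the constants used in the instructions j = -34*n*n - 500*n + 833 and k = k*k + 201 of Example 1.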

Anchoring the mark

The mark is anchored at the end of the method. The values of j and k are apparently used to compute the result b. This is a decoy, because the sequence of operations performed on b ultimately leaves this variable unchanged.

Watermarked program

    public class Fibonacci {
        public Fibonacci() {
        }
        public static void main(String[] args) {
            int n = Integer.parseInt(args[0]);
            int a = 0;
            int j = -34*n*n - 500*n + 833; // j is 2507 in conjugate space no. 1
                                           // (that is, modulo 10000, n->17)
            int b = 1;
            int k = 2*n - n*n - 9;         // k is 3012 in conjugate space no. 2
                                           // (that is, modulo 5421, n->50)
            for (int i = 1; i < n; i++) {
                int c = a + b;
                a = b;         // a is u(i)
                j = j*5 - 28;  // j is still 2507 in conjugate space 1
                b = c;         // b is u(i+1)
                k = k*k + 201; // k is still 3012 in conjugate space 2
            }
            b += k + j;
            b = b - 1 - k*j + (1-k)*(1-j); // anchoring of k and j
            System.out.println("The value of the sequence for n=" + n + " is " + b);
        }
    }
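The claim that the anchoring leaves b unchanged can be checked with a small sketch. The class FibonacciWatermarkCheck and its method names are hypothetical; the two method bodies copy the initial and watermarked programs, returning b instead of printing it. Expanding (1-k)*(1-j) = 1 - j - k + k*j shows that the two anchoring instructions cancel exactly, and the identity still holds under Java's wrapping int arithmetic.

```java
final class FibonacciWatermarkCheck {
    /** Body of the initial program, returning b instead of printing it. */
    static int original(int n) {
        int a = 0;
        int b = 1;
        for (int i = 1; i < n; i++) {
            int c = a + b;
            a = b;
            b = c;
        }
        return b;
    }

    /** Body of the watermarked program, returning b instead of printing it. */
    static int watermarked(int n) {
        int a = 0;
        int j = -34 * n * n - 500 * n + 833;
        int b = 1;
        int k = 2 * n - n * n - 9;
        for (int i = 1; i < n; i++) {
            int c = a + b;
            a = b;
            j = j * 5 - 28;
            b = c;
            k = k * k + 201;
        }
        b += k + j;
        b = b - 1 - k * j + (1 - k) * (1 - j); // cancels the previous line
        return b;
    }

    public static void main(String[] args) {
        for (int n = 0; n <= 40; n++) {
            if (original(n) != watermarked(n)) {
                throw new IllegalStateException("outputs differ for n=" + n);
            }
        }
        System.out.println("Same output for n = 0..40");
    }
}
```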

Analysis of the mark

The mark is recovered by static analysis, using the method of fixed-point approximation by iteration.
The conjugate program in space no. 1 is the following (here % denotes reduction modulo 10000 into the range 0..9999):

    public static void main(String[] args) {
        int n = 17;
        int a = 0;
        int j = (-34*n*n - 500*n + 833) % 10000;
        int b = 1;
        int k = (2*n - n*n - 9) % 10000;
        for (int i = 1; i < n; i++) {
            int c = (a + b) % 10000;
            a = b;
            j = (j*5 - 28) % 10000;
            b = c;
            k = (k*k + 201) % 10000;
        }
        b = (b + k + j) % 10000;
        b = (b - 1 - k*j + (1-k)*(1-j)) % 10000;
        System.out.println("The value of the sequence for n=" + n + " is: " + b);
    }
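As a complement to the static analysis, the conjugate program can simply be executed. The sketch below is an illustration under stated assumptions: the class ConjugateRun is hypothetical, the variables a, b and c are left out because they do not influence j and k, and Math.floorMod is used because Java's % operator returns a negative remainder for a negative left operand, whereas the tables of this annex use values in 0..N-1.

```java
final class ConjugateRun {
    /**
     * Runs the j and k computations of the watermarked main in a conjugate
     * space: the input n is replaced by v and every operation is taken
     * modulo 'modulus'. Returns {j, k} after the loop.
     */
    static long[] run(long modulus, long v) {
        long n = v;
        long j = Math.floorMod(-34 * n * n - 500 * n + 833, modulus);
        long k = Math.floorMod(2 * n - n * n - 9, modulus);
        for (long i = 1; i < n; i++) {
            j = Math.floorMod(j * 5 - 28, modulus);
            k = Math.floorMod(k * k + 201, modulus);
        }
        return new long[] {j, k};
    }

    public static void main(String[] args) {
        System.out.println("space 1: j = " + run(10000, 17)[0]);
        System.out.println("space 2: k = " + run(5421, 50)[1]);
    }
}
```

In space no. 1 (N = 10000, V = 17) the value of j is 2507 before and after the loop; in space no. 2 (N = 5421, V = 50) the same holds for k with the value 3012.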
We first transform the program into an execution flow graph.

Static semantic analysis: fixed-point approximation

    Start
      n = 17                                            State (I)
      a = 0                                             State (II)
      j = -34 x n^2 - 500 x n + 833 [10000]             State (III)
      b = 1                                             State (IV)
      k = 2 x n - n^2 - 9 [10000]                       State (V)
      i = 1                                             State (VI)
      if (i < n)                                        State (VII)
          c = a + b [10000]                             State (VIII)
          a = b                                         State (IX)
          j = j x 5 - 28 [10000]                        State (X)
          b = c                                         State (XI)
          k = k^2 + 201 [10000]                         State (XII)
          i = i + 1 [10000], back to State (VI)
      otherwise                                         State (XIII)
          b = b + k + j [10000]                         State (XIV)
          b = b - 1 - k x j + (1 - k) x (1 - j) [10000] State (XV)
          display(b)
    End
STATIC SEMANTIC ANALYSIS

For each state we study the set of possible values taken by the variables. We consider only the variables whose value has changed with respect to all possible preceding states.

    State   Variable studied   Initial value / interval
    I       n                  N = ∅
    II      a                  A = ∅
    III     j                  J = ∅
    IV      b                  B = ∅
    V       k                  K = ∅
    VI      i                  I = ∅
    VII     i                  I = ∅
    VIII    c                  C = ∅
    IX      a                  A = ∅
    X       j                  J = ∅
    XI      b                  B = ∅
    XII     k                  K = ∅
    XIII    i                  I = ∅
    XIV     b                  B = ∅
    XV      b                  B = ∅
Writing the interdependence of the states

$N_{I} = \{17\}$
$A_{II} = \{0\}$
$J_{III} = \{\, j = -34 \times n^2 - 500 \times n + 833 ;\ n \in N_{I} \,\}$
$B_{IV} = \{1\}$
$K_{V} = \{\, k = 2 \times n - n \times n - 9 ;\ n \in N_{I} \,\}$
$I_{VI} = \{1\} \cup \{ I_{VII} + 1 \}$
$I_{VII} = I_{VI} \cap \left] -\infty ;\ \sup_{n \in N_{I}} n \right[$
$C_{VIII} = \{\, c = a + b ;\ a \in (A_{II} \cup A_{IX}),\ b \in (B_{IV} \cup B_{XI}) \,\}$
$A_{IX} = B_{IV} \cup B_{XI}$
$J_{X} = \{\, j \times 5 - 28 ;\ j \in (J_{X} \cup J_{III}) \,\}$
$B_{XI} = C_{VIII}$
$K_{XII} = \{\, k \times k + 201 ;\ k \in (K_{V} \cup K_{XII}) \,\}$
$I_{XIII} = I_{VI} \cap \left[ \inf_{n \in N_{I}} n ;\ +\infty \right[$
$B_{XIV} = \{\, b = b + k + j ;\ b \in (B_{IV} \cup B_{XI}),\ k \in (K_{V} \cup K_{XII}),\ j \in (J_{III} \cup J_{X}) \,\}$
$B_{XV} = \{\, b = b - 1 - k \times j + (1-k) \times (1-j) ;\ b \in B_{XIV},\ k \in (K_{V} \cup K_{XII}),\ j \in (J_{III} \cup J_{X}) \,\}$

These sets are approximated by intervals.

    Iteration   0    1    2      3       4
    N_I         ∅    17   17     17      17
    J_III       ∅    ∅    2507   2507    2507
    K_V         ∅    ∅    9736   9736    9736
    I_VI        ∅    1    1      (1;2)   (1;2)
    I_VII       ∅    ∅    1      1       (1;2)
    C_VIII      ∅    ∅    1      (1;2)   (1;2)
    A_IX        ∅    ∅    1      1       1
    J_X         ∅    ∅    ∅      2507    2507
    B_XI        ∅    ∅    ∅      1       (1;2)
    K_XII       ∅    ∅    ∅      9897    (82;9930)
    I_XIII      ∅    ∅    ∅      ∅       ∅
    B_XIV       ∅    ∅    ∅      2244    (2244;2405)
    B_XV        ∅    ∅    ∅      ∅       (0;9999)

    Iteration   5          6          7          8          35 (FIXED POINT)
    I_VI        (1;3)      (1;3)      (1;4)      (1;4)      (1;17)
    I_VII       (1;2)      (1;3)      (1;3)      (1;4)      (1;16)
    C_VIII      (1;3)      (1;4)      (1;5)      (1;7)      (0;9999)
    A_IX        (1;2)      (1;2)      (1;3)      (1;4)      (0;9999)
    B_XI        (1;2)      (1;3)      (1;4)      (1;5)      (1;9999)
    K_XII       (2;9997)   (2;9997)   (2;9997)   (2;9997)   (2;9997)
    I_XIII      ∅          ∅          ∅          ∅          17
    B_XIV       (0;9999)   (0;9999)   (0;9999)   (0;9999)   (0;9999)
    B_XV        (0;9999)   (0;9999)   (0;9999)   (0;9999)   (0;9999)

We recover our first mark: the variable J always keeps the value 2507.
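The fixed-point iteration for the mark variable can be sketched in a few lines. This sketch deliberately departs from the text in one respect: it computes with exact finite sets of values rather than with intervals, which is enough to show why J stabilizes. The class MarkFixpoint and its methods are hypothetical names, and only the equations for J_III/J_X and K_V/K_XII are modelled.

```java
import java.util.Set;
import java.util.TreeSet;
import java.util.function.LongUnaryOperator;

final class MarkFixpoint {
    /**
     * Least solution of  X = {init} U { step(x) : x in X },
     * obtained by iterating from the empty set until nothing new is added.
     * Terminates whenever 'step' maps into a finite range such as 0..N-1.
     */
    static Set<Long> reachable(long init, LongUnaryOperator step) {
        Set<Long> current = new TreeSet<>();
        Set<Long> next = new TreeSet<>();
        next.add(init);
        while (!next.equals(current)) {
            current = next;
            next = new TreeSet<>(current);
            for (long x : current) {
                next.add(step.applyAsLong(x));
            }
        }
        return current;
    }

    /** All values of j (J_III united with J_X) in the conjugate space (modulus, n -> v). */
    static Set<Long> jValues(long modulus, long v) {
        long init = Math.floorMod(-34 * v * v - 500 * v + 833, modulus);
        return reachable(init, j -> Math.floorMod(j * 5 - 28, modulus));
    }

    /** All values of k (K_V united with K_XII) in the conjugate space (modulus, n -> v). */
    static Set<Long> kValues(long modulus, long v) {
        long init = Math.floorMod(2 * v - v * v - 9, modulus);
        return reachable(init, k -> Math.floorMod(k * k + 201, modulus));
    }

    public static void main(String[] args) {
        System.out.println("J in space 1: " + jValues(10000, 17));
        System.out.println("K in space 2: " + kValues(5421, 50));
    }
}
```

In space no. 1 the set computed for j is the singleton containing 2507, and in space no. 2 the set computed for k is the singleton containing 3012, in agreement with the marks announced in the text.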

The conjugate program in space no. 2 is the following (here % denotes reduction modulo 5421 into the range 0..5420):

    public static void main(String[] args) {
        int n = 50;
        int a = 0;
        int j = (-34*n*n - 500*n + 833) % 5421;
        int b = 1;
        int k = (2*n - n*n - 9) % 5421;
        for (int i = 1; i < n; i++) {
            int c = (a + b) % 5421;
            a = b;
            j = (j*5 - 28) % 5421;
            b = c;
            k = (k*k + 201) % 5421;
        }
        b = (b + k + j) % 5421;
        b = (b - 1 - k*j + (1-k)*(1-j)) % 5421;
        System.out.println("The value of the sequence for n=" + n + " is: " + b);
    }
As before, we transform the program into an execution flow graph.

    Start
      n = 50                                            State (I)
      a = 0                                             State (II)
      j = -34 x n^2 - 500 x n + 833 [5421]              State (III)
      b = 1                                             State (IV)
      k = 2 x n - n^2 - 9 [5421]                        State (V)
      i = 1                                             State (VI)
      if (i < n)                                        State (VII)
          c = a + b [5421]                              State (VIII)
          a = b                                         State (IX)
          j = j x 5 - 28 [5421]                         State (X)
          b = c                                         State (XI)
          k = k^2 + 201 [5421]                          State (XII)
          i = i + 1 [5421], back to State (VI)
      otherwise                                         State (XIII)
          b = b + k + j [5421]                          State (XIV)
          b = b - 1 - k x j + (1 - k) x (1 - j) [5421]  State (XV)
          display(b)
    End

STATIC SEMANTIC ANALYSIS

For each state we study the set of possible values taken by the variables. We consider only the variables whose value has changed with respect to all possible preceding states.

    State   Variable studied   Initial value / interval
    I       n                  N = ∅
    II      a                  A = ∅
    III     j                  J = ∅
    IV      b                  B = ∅
    V       k                  K = ∅
    VI      i                  I = ∅
    VII     i                  I = ∅
    VIII    c                  C = ∅
    IX      a                  A = ∅
    X       j                  J = ∅
    XI      b                  B = ∅
    XII     k                  K = ∅
    XIII    i                  I = ∅
    XIV     b                  B = ∅
    XV      b                  B = ∅

Writing the interdependence of the states

$N_{I} = \{50\}$
$A_{II} = \{0\}$
$J_{III} = \{\, j = -34 \times n^2 - 500 \times n + 833 ;\ n \in N_{I} \,\}$
$B_{IV} = \{1\}$
$K_{V} = \{\, k = 2 \times n - n \times n - 9 ;\ n \in N_{I} \,\}$
$I_{VI} = \{1\} \cup \{ I_{VII} + 1 \}$
$I_{VII} = I_{VI} \cap \left] -\infty ;\ \sup_{n \in N_{I}} n \right[$
$C_{VIII} = \{\, c = a + b ;\ a \in (A_{II} \cup A_{IX}),\ b \in (B_{IV} \cup B_{XI}) \,\}$
$A_{IX} = B_{IV} \cup B_{XI}$
$J_{X} = \{\, j \times 5 - 28 ;\ j \in (J_{X} \cup J_{III}) \,\}$
$B_{XI} = C_{VIII}$
$K_{XII} = \{\, k \times k + 201 ;\ k \in (K_{V} \cup K_{XII}) \,\}$
$I_{XIII} = I_{VI} \cap \left[ \inf_{n \in N_{I}} n ;\ +\infty \right[$
$B_{XIV} = \{\, b = b + k + j ;\ b \in (B_{IV} \cup B_{XI}),\ k \in (K_{V} \cup K_{XII}),\ j \in (J_{III} \cup J_{X}) \,\}$
$B_{XV} = \{\, b = b - 1 - k \times j + (1-k) \times (1-j) ;\ b \in B_{XIV},\ k \in (K_{V} \cup K_{XII}),\ j \in (J_{III} \cup J_{X}) \,\}$

These sets are approximated by intervals.

    Iteration   0    1    2      3       4
    N_I         ∅    50   50     50      50
    B_IV        ∅    1    1      1       1
    K_V         ∅    ∅    3012   3012    3012
    I_VI        ∅    1    1      (1;2)   (1;2)
    I_VII       ∅    ∅    1      1       (1;2)
    C_VIII      ∅    ∅    1      (1;2)   (1;2)
    A_IX        ∅    ∅    1      1       1
    J_X         ∅    ∅    ∅      1658    (0;5420)
    B_XI        ∅    ∅    ∅      1       (1;2)
    K_XII       ∅    ∅    ∅      3012    3012
    I_XIII      ∅    ∅    ∅      ∅       ∅
    B_XIV       ∅    ∅    ∅      2266    (0;4671)
    B_XV        ∅    ∅    ∅      ∅       (0;5420)

    Iteration   5          6          7          8          105 (FIXED POINT)
    B_IV        1          1          1          1          1
    I_VI        (1;3)      (1;3)      (1;4)      (1;4)      (1;50)
    I_VII       (1;2)      (1;3)      (1;3)      (1;4)      (1;49)
    C_VIII      (1;3)      (1;4)      (1;5)      (1;7)      (0;5421)
    A_IX        (1;2)      (1;2)      (1;3)      (1;4)      (0;5421)
    J_X         (0;5420)   (0;5420)   (0;5420)   (0;5420)   (0;5420)
    B_XI        (1;2)      (1;3)      (1;4)      (1;5)      (0;5420)
    I_XIII      ∅          ∅          ∅          ∅          50
    B_XIV       (0;5420)   (0;5420)   (0;5420)   (0;5420)   (0;5420)
    B_XV        (0;5420)   (0;5420)   (0;5420)   (0;5420)   (0;5420)

We recover our second mark: 3012.

Example 2

The method from which the semantic signature is extracted is a bubble-sort method.

Initial program

    public class Bulle {
        public static void main(String[] args) {
            int[] table = new int[args.length];
            for (int i = 0; i < args.length; i++) {
                table[i] = Integer.parseInt(args[i]);
            }
            print(table);
            print(tri(table));
        }

        public static void print(int[] table) {
            System.out.println("");
            for (int i = 0; i < table.length; i++) {
                System.out.print(table[i] + " ");
            }
            System.out.println("");
        }

        public static int[] tri(int[] table) {
            int[] table2 = table;
            boolean flag = false;
            int i = 0;
            int v = 0;
            while (!flag) {
                flag = true;
                for (i = 0; i < table2.length - 1; i++) {
                    if (table2[i] > table2[i+1]) {
                        v = table2[i];
                        table2[i] = table2[i+1];
                        table2[i+1] = v;
                        flag = false;
                    }
                }
            }
            return (table2);
        }
    }

Choice of a semantic correspondence table

We wish to obtain a semantic property after transcoding. To do so we use a transcoding table.
For the variables of our conjugate space, we add to the variables of our initial space two new variables W and W' of integer type, both initialized to 0.
We then try to locate the transpositions performed on the arrays of the program, and we replace these operations on the array by operations on the variable W. A static analysis of W will show that it keeps a constant value equal to 1 in all situations.
Transcoding table

    Initial instruction                     Transcoded instruction
    Start of program                        W is 0; W' is 0
    Sequence (see below)                    Sequence (see below)
    T is an array, input parameter          T is the array (5; 2; 4; 1; 3)

Initial sequence:
    <arbitrary sequence of operations (1)>
    <a variable X takes the value of an array T at index P>
    <arbitrary sequence of operations (2) affecting neither X, nor T, nor P>
    <a variable Y takes the value of T at index Q>
    <arbitrary sequence of operations (3) affecting neither X, nor Y, nor T, nor P, nor Q>
    <the value of Y is written at index P of T>
    <arbitrary sequence of operations (4) affecting neither X, nor T, nor Q>
    <the value of X is written at index Q of T>
    <arbitrary sequence of operations (5)>

Transcoded sequence:
    <sequence of operations (1)>
    <sequence of operations (2)>
    <sequence of operations (3)>
    <sequence of operations (4)>
    <sequence of operations (5)>
    If W = 0 then W = 1
    W = W * AbsoluteValue(P - Q)
    W' = min(P, Q)

Signature of the method

The signature of the method consists of the static semantic analysis of the method and the detection of properties.

Analysis of the program

The properties are recovered by static analysis, using the method of fixed-point approximation by iteration.

The execution graph of the conjugate program is the following:

    Start
      table = [5;2;4;1;3]                                          State (I)
      flag = false                                                 State (II)
      i = 0; v = 0                                                 State (III)
      W = 0; W' = 0                                                State (IV)
      if (flag = false), otherwise End                             State (V)
      i = 0                                                        State (VI)
      if (i < length(table2)), otherwise leave the for loop        State (VII)
      if (table[i] > table[i+1]), otherwise skip to i = i + 1      State (VIII)
      W = W x |i - (i+1)|                                          State (IX)
      W' = min(i+1; i)                                             State (X)
      flag = false                                                 State (XI)
      i = i + 1, back to State (VI)

Writing the interdependence of the states

$TABLE_{I} = \{(5;2;4;1;3)\}$
$FLAG_{II} = \{false\}$
$I_{III} = \{0\}$, $V_{III} = \{0\}$
$W_{IV} = \{0\}$, $W'_{IV} = \{0\}$
$FLAG_{V} = FLAG_{II} \cap \{false\}$
$I_{VI} = \{0\} \cup \{ I_{VII} + 1 \}$
$I_{VII} = I_{VI} \cap \left] -\infty ;\ length(table) \right[$
$TABLE_{VIII} = \{\, table \in (TABLE_{I} \cup TABLE_{VIII}) \text{ such that } \exists i \in I_{VII},\ table(i) > table(i+1) \,\}$
$I_{VIII} = \{\, i \in I_{VII} \text{ such that } \exists table \in (TABLE_{I} \cup TABLE_{VIII}),\ table(i) > table(i+1) \,\}$
$W_{IX} = \{\, f(w) \times |i - (i+1)| ;\ i \in I_{VIII},\ w \in (W_{IX} \cup W_{IV}) \,\}$ with $f : 0 \mapsto 1,\ x \mapsto x \text{ for } x \neq 0$
$W'_{X} = \{\, \min(i+1 ; i) ;\ i \in I_{VIII} \,\}$
$FLAG_{XI} = \{false\}$

    Iteration    0    1             2
    FLAG_II      ∅    false         false
    I_VI         ∅    0             0;1
    I_VII        ∅    0             0;1
    TABLE_VIII   ∅    (5;2;4;1;3)   (5;2;4;1;3)
    I_VIII       ∅    0             0
    W_IX         ∅    1             1
    W'_X         ∅    0             0
    FLAG_XI      ∅    false         false

    Iteration    3             4             5 (FIXED POINT)
    FLAG_II      false         false         false
    I_VI         0;1;2         0;1;2;3       0;1;2;3;4
    I_VII        0;1;2         0;1;2;3       0;1;2;3
    TABLE_VIII   (5;2;4;1;3)   (5;2;4;1;3)   (5;2;4;1;3)
    I_VIII       0;2           0;2           0;2
    W_IX         1             1             1
    W'_X         0;2           0;2           0;2
    FLAG_XI      false         false         false

Signature

We note the signature of our bubble-sort method. Considering the secret transcoding table, we note that:
- "the associated variable W keeps a constant value equal to 1; our variable W' takes the values 0 and 2".
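The effect of the transcoding table on the bubble sort can be illustrated by instrumenting the concrete sort. This is a sketch under stated assumptions, not the static analysis described above: the class BubbleSignature and the method sortAndTrack are hypothetical, and the auxiliary variables are updated at run time at each transposition of T[P] and T[Q]. It only checks the property on W; it does not attempt to reproduce the fixed-point table, whose values for W' come from the abstract analysis.

```java
final class BubbleSignature {
    /**
     * Sorts 'table' in place with the bubble sort of Example 2 and applies the
     * transcoded instructions at every transposition. Returns {W, last W'}.
     */
    static int[] sortAndTrack(int[] table) {
        int w = 0;       // W is 0 at the start of the program
        int wPrime = 0;  // W' is 0 at the start of the program
        boolean flag = false;
        while (!flag) {
            flag = true;
            for (int i = 0; i < table.length - 1; i++) {
                if (table[i] > table[i + 1]) {
                    int p = i;
                    int q = i + 1;
                    int v = table[p];
                    table[p] = table[q];
                    table[q] = v;
                    if (w == 0) {
                        w = 1;
                    }
                    w = w * Math.abs(p - q);   // adjacent indices, so the factor is 1
                    wPrime = Math.min(p, q);
                    flag = false;
                }
            }
        }
        return new int[] {w, wPrime};
    }

    public static void main(String[] args) {
        int[] table = {5, 2, 4, 1, 3};
        int[] result = sortAndTrack(table);
        System.out.println("W = " + result[0]);
        System.out.println(java.util.Arrays.toString(table));
    }
}
```

Because a bubble sort only transposes adjacent elements, the factor |P - Q| is always 1 and W stays at 1 once the first transposition has occurred. For an input that needs no transposition, W simply keeps its initial value 0.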

Glossary

- A software or hardware protection-prevention method is a set of techniques that make the copying and fraudulent use of software or electronic circuits more difficult.
- A program is written in a certain programming language, called the computer language of the program.
- The interpretation of a program is the translation of the sequence of words composing it into a sequence of actions called the execution of the program.
- The compilation of a source program, written in a high-level language, is its translation into another language or its materialization as an automaton, generally machine language or an electronic circuit.
- A software computer program is a program that is interpretable, or compilable into an interpretable program.
- A hardware computer program is a program that can be realized by an electronic circuit and is specified in a circuit description language.
- An element of a computer program is a not necessarily contiguous part of the program text corresponding to one or more instructions, possibly compound (such as a conditional or case selection command, a loop, etc.), a declaration or description of one or more data structures, possibly including some or all of the operations acting on those data structures, one or more procedures or methods, one or more modules, etc.
- A semantics of a software or hardware program is a mathematical model defining the set of possible behaviors of a program at execution, at a certain level of observation.
- Static semantic analysis is the automatic determination of semantic properties of programs.
- Two software programs are semantically equivalent (or functionally equivalent) if they have the same observable behavior, that is, they execute in a functionally equivalent manner (for example, if for every possible input the outputs of the program are the same).
- An abstract semantics of a software or hardware program is a mathematical model defining an over-approximation or an under-approximation of the set of possible behaviors of a program at execution.
- An abstract semantics is secret if its specification for a software or hardware program requires knowledge of a secret.
- A signature is a characteristic piece of information (label, sticker or digest) associated with an object (here, a software or hardware program). This information may depend on an intrinsic or extrinsic property of the object. These properties can authenticate the form and substance of the object's content (encoding, format, syntax, aesthetics, semantics) or its traceability (the history and/or the future of the object).
- A secret signature is a signature obtained using a method that relies on a secret.
- A semantic signature is a signature specified as a function of the semantics of the object (here, of the program written in a programming language with a defined semantics).
- An authenticator is a secret proof of possession of a piece of information or of a right (for example designation of the object, name of the author, name of the recipient, terms of the user license, etc.).
- A mark is a component of an object that makes it possible to identify it and that, in the context of the invention, makes it possible to recover a signature of the object.
- Labeling an object or a program consists of producing a label that is generally separate from the content of the object or embedded in the object at an easily located place.
- By contrast, watermarking consists of embedding one or more marks in the body of the object. This mark, spread through the body of the object, is generally indelible if not undetectable. The object grafted with this mark is called the watermarked object.
- Obfuscation is the transformation of a program into a semantically equivalent program in a form that is difficult for a computer scientist to understand but usable by a user. Unlike watermarking, obfuscation makes a program confidential (thanks to the difficulty of understanding it) but does not allow authentication [cf. Obfuscation techniques for enhancing software security, New Zealand Patent Application #328057, WO 99/01815, PCT/US98/12017, 9 June 1997].
- Mark scrubbing is an attempt to erase or modify the mark, or to overload the program so as to drown the mark, without changing the semantics of the program.
- A program watermark is robust if it withstands compiler optimization, obfuscation, labeling and another watermarking, all these operations being applied subsequently.

Analytical description of the modules and theoretical background

An abstract semantics method A

The inventor uses the principles and techniques of static semantic analysis of programs by abstract interpretation, cf. [P. Cousot & R. Cousot, "Abstract Interpretation: A unified lattice model for static analysis of programs by construction or approximation of fixpoints", international conference "Principles of programming languages, POPL'77", p. 238-252, ACM Press, January 1977] and [P. Cousot & R. Cousot, "Systematic design of program analysis frameworks", international conference "Principles of programming languages, POPL'79", p. 269-282, ACM Press, January 1979].

The method/module A defines an infinity of secret abstract semantics that depend on:
- an abstract domain D(K) parameterized by a secret key K (which, up to an injection, can be regarded as a number N = b(K), this injection b possibly itself constituting a secret);
- a particular secret key K that defines the abstract domain D(K) used in the secret abstract semantics SAS.

This secret abstract semantics SAS consists of the secret abstract domain D(N) and of the corresponding abstract operations for the constructs and primitives of the family of programming languages considered, obtained using the principles of abstract interpretation.

Any abstract domain D(K) that is obtained by an abstraction, in the sense of abstract interpretation theory (cf. the two references by P. Cousot & R. Cousot cited above), and that is parameterized by a secret key K can be used as a parameter for module A. Thanks to the use of an injection N = b(K), which can remain secret, any abstract domain D(N) parameterized by a number N can be reused with arbitrary keys. There is therefore an infinity of usable abstract domains D(K) parameterized by a key K. However, the abstract domains published in the scientific literature are not suitable, in the sense that they are not parameterized by a key K (even one chosen to be isomorphic to the integers or to the ordinals).

Part of the invention is the transformation of any abstract domain usable for classical static semantic analysis, whose concretization provides information on integer values (up to an injective encoding), into an abstract domain D(N) parameterized by an integer N constituting the secret key K = N (up to said encoding). The method consists in abstracting (in the sense of abstract interpretation) concrete operations into an abstract operation that considers the concrete operation performed modulo N.
This abstraction is then extended to the control elements of the program according to the principles of static semantic analysis by abstract interpretation, possibly using the widening and narrowing techniques known to the person skilled in the art.
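A minimal sketch of such a keyed domain, assuming the simplest case of constant propagation modulo N, could look as follows. The class ModDomain and its method names are hypothetical and chosen only for illustration; the abstraction of a concrete integer is its residue modulo the secret key N, and each abstract operation is the concrete operation performed modulo N.

```java
final class ModDomain {
    private final long n; // the secret key K = N

    ModDomain(long n) {
        if (n <= 0) {
            throw new IllegalArgumentException("N must be a positive integer");
        }
        this.n = n;
    }

    /** Abstraction of a concrete integer: its residue in 0..N-1. */
    long alpha(long x) {
        return Math.floorMod(x, n);
    }

    /** Abstract addition: the concrete addition performed modulo N. */
    long add(long a, long b) {
        return Math.floorMod(a + b, n);
    }

    /** Abstract multiplication: the concrete multiplication performed modulo N. */
    long mul(long a, long b) {
        return Math.floorMod(a * b, n);
    }

    public static void main(String[] args) {
        ModDomain d = new ModDomain(10000);
        // Abstract execution of j = j*5 - 28 starting from a concrete j = 12507.
        long j = d.alpha(12507);
        long next = d.add(d.mul(j, d.alpha(5)), d.alpha(-28));
        System.out.println(j + " -> " + next);
    }
}
```

With N = 10000 the abstract value of j before and after the instruction is 2507, although the concrete value changes from 12507 to 62507. A different key N gives a different abstract semantics for the same program text.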
Examples of the transformation of such domains D into domains D(N) are described below:

- For a program P written in an algorithmic language (with additions, multiplications, loops, branches, ...) whose author wants to secure the sensitive algorithmic content, one can define, for module A, an abstract domain D(K) for integer, fixed-point or floating-point variables, or even reals, corresponding to constant propagation, powers, intervals, octagons, linear equality or inequality invariants, etc., modulo a positive integer value N constituting the secret key K = N;
- For a program P written in a programming language with symbolic data (with pointers, lists, arrays, ...) whose author wants to secure the sensitive course of the manipulation of variables and arrays, one can use a symbolic abstract domain (for example an automaton or tree domains) parameterized by integers (for example loop counters in the automaton or a measure of the size of the symbolic data), reusing the preceding abstract domain D(N) to analyze the integer parameters, modulo a positive integer value N constituting the secret key K, again up to an injection (which may be secret).

There is on the one hand an infinity of possible abstract domains Di and on the other hand an infinity of instances Di(K) of each particular domain as a function of the secret key K.

Unlike classical cryptology, if module A is secret, breaking it by brute force is not conceivable, because there is no possible enumeration of all usable parameterized abstract domains.

For a given secret abstract domain Di(N), different keys lead to different secret abstract semantics SAS, as is the case for the examples above. When the secret abstract domain has been disclosed or is public, the secrecy of the key must be managed as in classical cryptography in order to limit brute-force attacks (as in the case of classical cryptology based on NP-complete problems or on secret-key encryption mechanisms).

The secret abstract semantics SAS must be invariant under those transformations of the primitives and constructs of the family of software or hardware languages considered that leave their standard semantics invariant, as is the case for the examples above. This is important to avoid obfuscation attacks, which consist in transforming a program P into a syntactically different but semantically equivalent program, for example the mathematical expression (a = 2b) being changed into (a = b + b).
(See Figure 5)

A module DA for abstract semantics

Among the preferred embodiments of the above method, the invention also discloses a device (which can be realized by a computer product/program) including a module DA based on an abstract domain Di(K) parameterized by a secret key K chosen from among a very large number of possibilities, making it possible to generate a program implementing the secret abstract semantics SAS of the family of software or hardware programming languages considered.

(See Figure 5A)

A signature method S

The method/module S uses a secret abstract semantics SAS for the constructs and primitives of the family of programming languages considered in order to perform the static semantic analysis of the original software or hardware program P, the result of this analysis providing the secret semantic signature SSS of program P. To do this, module S builds a representation of the system of fixed-point equations whose approximate solution defines the abstract semantics SASP of said software or hardware program P. This abstract semantics SASP of the program P is defined, according to the principles of static analysis by abstract interpretation, from the concrete semantics of the program as a function of the secret abstract semantics SAS for the constructs and primitives of the family of programming languages considered. This abstract semantics SASP is computed by elimination, or by iteration with convergence acceleration. The secret semantic signature SSS of program P is an injective function SSS = i(SASP) of the secret abstract semantics SASP. This injective function i can itself constitute a secret.

The secret semantic signature SSS of a software or hardware program P, computed according to module S or by the device DS, makes it possible to check whether a software or hardware program P' has an equivalent semantic signature.

At this stage, it is possible to deposit the signature SSS of the software or hardware program P, as produced according to module S or output by the device DS, with a Trusted Third Party (TTP). It therefore does not have to be inserted into the program P.
The authentication of the program P is then carried out with the TTP by comparing the signature SSS' produced for the program P' under analysis with the signature SSS that was deposited. In this case, the program P is not protected against the copy P', but two cases arise:
- SSS' differs from SSS: the modifications that P has undergone are detectable, since in the case of semantic modifications (for example, addition of a virus, modification of an algorithm) the signature is different;
- SSS' equals SSS: the program P' is a copy of P, or, if the program P has been obfuscated (manually or automatically) to become P', the signature SSS' of P' will equal the signature SSS of P, which will prove, with an extremely low probability of error, that the program P has been pirated, this piracy being masked by cosmetic editing.
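The deposit and comparison of signatures can be illustrated with a deliberately small model. Everything in this sketch is an assumption made for illustration: the class SignatureRegistry stands in for the Trusted Third Party, a program is modelled as a function of one integer input, and the "signature" is reduced to the value of its result in the conjugate space (input replaced by V, result taken modulo N), which is far simpler than the fixed-point analysis performed by module S.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongUnaryOperator;

final class SignatureRegistry {
    private final Map<String, Long> deposited = new HashMap<>();

    /** Deposits the signature SSS of a program under an identifier. */
    void deposit(String programId, long sss) {
        deposited.put(programId, sss);
    }

    /** Compares the signature SSS' of a program under analysis with the deposited SSS. */
    boolean matches(String programId, long sssPrime) {
        Long sss = deposited.get(programId);
        return sss != null && sss == sssPrime;
    }

    /** Toy signature: result of the program in the conjugate space (input v, modulo n). */
    static long signature(LongUnaryOperator program, long v, long n) {
        return Math.floorMod(program.applyAsLong(v), n);
    }

    public static void main(String[] args) {
        long n = 10000;
        long v = 17;
        LongUnaryOperator p = b -> 2 * b;            // a = 2b
        LongUnaryOperator obfuscated = b -> b + b;   // a = b + b, same semantics
        LongUnaryOperator modified = b -> 2 * b + 1; // semantic modification

        SignatureRegistry ttp = new SignatureRegistry();
        ttp.deposit("P", signature(p, v, n));
        System.out.println("obfuscated copy matches: " + ttp.matches("P", signature(obfuscated, v, n)));
        System.out.println("modified copy matches:   " + ttp.matches("P", signature(modified, v, n)));
    }
}
```

In this model the obfuscated variant (a = b + b) yields the same signature as the original (a = 2b), while the semantically modified variant does not, which mirrors the two cases distinguished above.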
(See Figure 6)

A signature module DS

A module DS makes it possible to compute the signature SSS of a software or hardware program P according to the secret abstract semantics SAS obtained using the preceding module DA.

(See Figure 6A)

A marking method M

For the invention, a mark m is a software or hardware program element that can be inserted into a software or hardware program by the method described in module B below. Let Pm be the simplest possible software or hardware program into which the mark m can be inserted. For any secret abstract semantics SAS, this program Pm has a secret semantic signature, computable by module S, called the secret semantic signature of said mark m and denoted SSS(m).

A "mark" m is a triple consisting of a "mark location", an "initialization mark" and an "induction mark", either of the last two possibly being empty.

Let Pm be a software or hardware program of the language family considered, the simplest possible, which declares the "mark location" 'X' if necessary, executes the "initialization mark" 'Ia', then executes the "induction mark" 'If' one or more times, or even infinitely many times.
The marking method M is parameterized by a secret abstract semantics SAS (for example, produced by module A from an abstract domain Di(K) as a function of a secret key K) and by a secret semantic signature SSS.

Using a bijection β, which may be secret, module M computes an abstract value 's' of the abstract domain Di(K) as a function of the secret semantic signature SSS, according to the encoding of the secret abstract semantics SAS. This abstract value 's' is chosen so that there are very many concrete values 'x' whose abstraction, according to the principles of abstract interpretation, or that of the singleton '{x}', is the abstract value 's'. For example, this value 'x' may be that of a variable in the case of a non-relational analysis, that of a vector of variables for a relational analysis, or the value of any object analogous to a vector of variables.

Method M chooses, according to the principles of abstract interpretation, any one of the concrete values 'a' whose abstraction, or that of the singleton '{a}', is the abstract value 's'.

Method M then chooses, according to the principles of abstract interpretation, a concrete operation 'f' whose functional abstraction leaves the abstract value 's' invariant but not the corresponding concrete values (that is, if 'x' is a concrete value whose abstraction, or that of the singleton '{x}', is the abstract value 's', then 'f(x)' is in general different from 'x', whereas the abstraction of 'f(x)', or that of the singleton '{f(x)}', is equal to 's').

Method M then chooses a "mark location" 'X', which can be an existing variable of the program that is useless or not live in the elements of program P to be watermarked, a new auxiliary variable, a useless or additional field of a dynamically allocated data structure, etc., whose values are taken into account in the static semantic analysis used to determine the secret abstract semantics of the program.

Method M then determines an "initialization mark" 'Ia', which consists of one or more primitives or constructs of the language family considered that are interpreted as an assignment of the concrete value 'a' defined above to the mark location 'X'.

Method M then determines an "induction mark" 'If' comprising one or more primitives or constructs of the language family considered that are interpreted as an assignment of the value of 'f(X)' to the mark location 'X'. Depending on the language family considered, these primitives or constructs will be assignments, parameter passings, unifications, etc.
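The choices made by method M can be sketched for the modulo-N domain with an affine operation f(x) = m*x + c, the family to which the instruction j = j*5 - 28 of Example 1 belongs. The class Mark and its method build are hypothetical, and the generated strings are merely one possible textual form of the triple (mark location, initialization mark, induction mark). The constant c is solved for so that f fixes the abstract value s modulo N while changing the concrete value.

```java
final class Mark {
    final String location;        // mark location X
    final String initialization;  // initialization mark Ia
    final String induction;       // induction mark If

    Mark(String location, String initialization, String induction) {
        this.location = location;
        this.initialization = initialization;
        this.induction = induction;
    }

    /**
     * Builds a mark for abstract value s in the domain "modulo n".
     * t selects the concrete value a = s + t*n; m is the multiplier of f.
     */
    static Mark build(String x, long s, long n, long m, long t) {
        long a = s + t * n;                    // concrete value whose abstraction is s
        long c = Math.floorMod(s - m * s, n);  // makes f(x) = m*x + c fix s modulo n
        String init = "long " + x + " = " + a + "L;";
        String step = x + " = " + x + " * " + m + " + " + c + ";";
        return new Mark(x, init, step);
    }

    /** Abstraction in the domain "modulo n". */
    static long alpha(long x, long n) {
        return Math.floorMod(x, n);
    }

    public static void main(String[] args) {
        Mark mark = build("j", 2507, 10000, 5, 3);
        System.out.println(mark.initialization);
        System.out.println(mark.induction);
    }
}
```

For s = 2507, N = 10000, m = 5 and t = 3, the sketch produces the initialization "long j = 32507L;" and the induction "j = j * 5 + 9972;", where 9972 is congruent to -28 modulo 10000. Applying the induction once to 32507 gives 172507, a different concrete value with the same abstraction 2507.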
La marque m est choisie telle que, selon les principes de l'interprétation abstraite, !'analyse sémantique statique du programme Pm, selon la sémantique abstraite secrète et conformément aux directives du module S, détermine de façon reconnaissable par le détenteur du secret SAS, la valeur abstraite 's' définie ci-dessus et telle que la signature sémantique secrète du programme Pm, comme définie ci-dessus, est celle 2o SSS(m) de (a marque.
En utilisant des domaines abstraits ~ qui sont des produits cartésiens ou des produits réduits de domaines abstraits élémentaires, il est toujours possible de considérer les marques qui sont des ensembles finis de marques élémentaires.
(Voir Figure 7) Un Module DM de marquage Parmi les modes préférés de réalisation du procédé M ci-dessus, l'invention divulgue également un dispositif DM (pouvant être réalisé par un produit/programme d'ordinateur) qui, à partir du programme implantant (a sémantique abstraite secrète SAS et des données représentant la sémantique secrète SSS du programme P, calcule le texte de la marque m.
(See Figure 7A)
A camouflage method B
The camouflage method/module B takes as parameters the original software or hardware program P, a selection of the elements of the program P to be watermarked, and a mark m. The module B produces the watermarked program PT, which is a merge of the original program P and the mark m.
This merge is performed without altering either the secret abstract semantics of the mark or the functionality of the original program, which makes it possible to recover the secret semantic signature SSS(m) of the mark m from the secret semantic signature SSS(PT) of the watermarked program PT.
It is important to note that, in the preferred embodiment of the invention, observing the values taken by the variables during execution of the watermarked software or hardware program PT does not make it possible to recover the mark. On the one hand, the secret abstract semantics must be known in order to establish the watermarked program PT. On the other hand, for non-trivial abstract domains D(K), semantic properties of this watermarked program PT can only be computed by static semantic analysis, not by step-by-step execution.
The insertion of instructions into the software or hardware program P to be protected must respect a number of minimal robustness rules. For example, to prevent outright removal by an optimizer using "slicing" extraction techniques, the variables and operations containing the watermark marks must appear to have a possible influence on the observable semantics (for example the outputs) of the program. This potential dependency can be a simple syntactic dependency, chosen so that demonstrating that it does not entail a semantic dependency requires the complex proof of a semantic equivalence of software or hardware programs. This is the case, for example, of the dynamic allocation (semantically useless, but this is undecidable) of certain values computed by the operations containing the watermark marks.
The mark m created by the module M, and the selection of the elements to be watermarked in a software or hardware program P as chosen by the module PS and used by the module B, must satisfy the criteria set out below.
The "induction marks" 'If' must be included in the parts of the software or hardware program containing repetition primitives or constructs (iterations, recursions, etc.).
The "initialization marks" 'Ia' are used to perform the initializations required by the semantics of the family of languages considered. In addition, if necessary for the family of languages considered, declarations must be added to the watermarked program PT when the nature of the "mark location" 'X' requires it.
Finally, the camouflage module B adds primitives or constructs to the watermarked program PT that make live the values possibly used in the "mark locations" 'X'. One possible solution is to use these values in the computation of live variables of the watermarked hardware or software program PT, or to assign the values of the "mark locations" 'X' to dynamic variables of the watermarked program PT, more generally by ensuring that the dynamic lifetime of the "mark locations" 'X' is that of the watermarked program PT. Again, these transformations of the program P into the watermarked program PT must be performed without altering either the secret abstract semantics of the mark or the functionality of the original program, so that, according to the principles of static analysis by abstract interpretation, the secret semantic signature SSS(m) of the mark m can be recovered from the secret semantic signature SSS(PT) of the watermarked program PT.
(See Figure 8)
A camouflage module DB
Among the preferred embodiments of the method B above, the invention also discloses a device DB (which may be implemented as a computer program product) which, from the text of the software or hardware program P, selection data for the elements to be watermarked in P and the text of the mark m, produces the text of the software or hardware program PT, whose semantics is functionally equivalent to that of P and which conceals the mark m.
(See Figure 8A)
A security policy method PS
The method PS for selecting the watermarking abstract domain and the elements to be watermarked in the software or hardware program P depends on the security policy to be applied to this program P.
As with steganographic techniques, the mark can be spread through the program at different, distant places by choosing, for example, static semantic analyses by abstract interpretation based on abstract domains that allow control structures to be ignored when computing the secret semantic signature, so that the random distribution of the marks has no effect on this analysis; this strategy is applicable, for example, when the mark serves to authenticate an entire program P of relatively large size.
Another security policy consists in watermarking the innovative and/or sensitive parts of the program P. In general, the elements selected for watermarking concern the algorithmic part of the program (algebraic operations or manipulations of variables in main memory). Depending on the family of programming languages considered and the programming methodology used, this algorithmic part may be:
- Significant elements of the program that are semantically autonomous. In Java™, for example, a program is regarded as several classes composed of several methods, which can themselves call methods of other classes and other packages. A security policy for Java™ can therefore consist in watermarking methods, which are the smallest significant autonomous entities of the program. Another choice would be to watermark Beans™.
- Elements distributed throughout the program that form a semantically coherent whole, such as an abstract algebraic data type implemented by data structures, procedures and functions spread over the whole program;
In this case the elements to be watermarked can be selected automatically on syntactic criteria (procedures, modules, etc.) or semantic criteria (by a static dependency analysis), or manually by an operator.
Several different security policies can be implemented on the same program P by applying successive watermarks to the same original software or hardware program for different actors. It is thus possible to assign their own marks to several levels of a hardware or software distribution chain (buyer, wholesaler, distributor, retailer). A software or hardware program can thus be watermarked as many times as desired (by superimposing watermarks), unlike watermarks of sounds or images, which undergo a certain saturation of the subliminal channel (depending on the perception model chosen). The software or hardware program is not altered; it is overloaded, and only memory and time resources are affected.
Different secret semantics can be used for each of these watermarks, so that the chain of trust can have different secrets.
Of course, this watermarking device and method can be combined with prior-art devices and methods that make the hardware or software program unusable by an unauthorized user (prevention), in order to trace any dissemination of said program by an authorized user to unauthorized users (audit). For this purpose it suffices not to communicate to authorized users the keys that would allow them to recover the watermark.
It is also possible to use the device and method to automatically authenticate the software or hardware programs that will be allowed to transit over a network or to be hosted on a given computer workstation. The watermark is comparable to an authentication certificate for which the monitoring station of the network or workstation holds the reading key.
(See Figure 9)
A security policy module DPS
Among the preferred embodiments of the method PS above, the invention also discloses a device DPS (which may be implemented as a computer program product) which, from the text of the software or hardware program P and a family of abstract domains D1, ..., Dm, chooses an abstract domain D and selects the elements of the program P to be watermarked.
(See Figure 9A)
A watermarking method T
The watermarking method T uses an encryption module C, which implements a bijective function to compute a secret semantic signature SSS from an authenticator Auth, possibly using the secret abstract semantics SAS. From this secret semantic signature SSS and said secret abstract semantics SAS, the marking module M described above produces a mark m. The module T also uses the module B described above which, from said mark m, the software or hardware program P and the selection of the elements to be watermarked in P, produces the watermarked software or hardware program PT.
(See Figure 10)
A watermarking module DT
Among the preferred embodiments of the watermarking method T above, the invention also discloses a device DT (which may be implemented as a computer program product) for concealing an authenticator Auth in the text of an original software or hardware program P by means of a program implementing a secret abstract semantics SAS and selection data for the elements of P to be watermarked.
(See Figure 10A)
A general method G
The general method G serves to watermark a program P. To do so, the module PS described above is used to choose, automatically or interactively, the parameterized abstract domain that will later be used by the authentication module Au to perform the static analysis by abstract interpretation, and to select, automatically or interactively, the elements of the software or hardware program P to be watermarked.
Next, the module A described above is used to compute the secret abstract semantics SAS from the previously selected parameterized abstract domain and a secret key, generally chosen in interaction with an operator but which can also be chosen automatically, or even randomly. Finally, the module T described above uses the program P, the secret abstract semantics SAS, an authenticator Auth and the selection of the elements to be watermarked in P to produce the watermarked software or hardware program PT. Variants consist in fixing once and for all the abstract domain used by the module G and in choosing the secret key K using a standard cryptographic method.
Application to authenticating compilation
Compilation preserves the concrete semantics of software or hardware programs up to a morphism. Consequently, compilation also preserves the secret abstract semantics of the object software or hardware program Po, which is the same as that of the source program P, up to this same morphism. As a result, compiling a watermarked source program PT with a correct compiler does not wash out the watermark in the object program PTo. Knowing the target computer, one knows the concrete semantics of the object code and therefore that of the object program PTo. By adapting the devices DA and DS, according to the principles of abstract interpretation theory, to the concrete semantics of the machine language of the target computer or computer system using said compilation morphism, one obtains devices DAo and DSo that conform to modules A and S for the concrete semantics of the object code and such that the composition of DA then DS for PT gives the same secret semantic signature SSS (up to said morphism) as the composition of DAo and DSo for PTo. Consequently, the device DAu using the device DSo computes a representation of the secret semantic signature SSS of the program PTo, which is also that of the program PT and can therefore be used by the device DAu to recover the authenticator of the program PT.
One application of this patent is an authenticating compilation module Ca, which compiles a program while inserting a watermark into it according to the principles of module G. In its preferred embodiment, an authenticating compiler DCa integrates the device DG of Figure 7 bis into a compiler for a hardware or software computer language, where it can optionally be used to watermark the object code. As explained above, the device DAu makes it possible to recover the authenticator of the watermarked object program.
(See Figure 11)
The general module DG
Among the preferred embodiments of the watermarking method G above, the invention also discloses a device DG (which may be implemented as a computer program product) for concealing an authenticator Auth in the text of an original software or hardware program P (according to the choice of an abstract domain parameterized by a secret key).
(See Figure 11A)
An authentication method Au
The authentication method Au takes as parameters the marked software or hardware program PT and the secret abstract semantics SAS. It computes the original authenticator Auth. This module Au is composed of two sub-modules: the module S, described above, which from the watermarked software program PT and the secret semantics extracts the secret semantic signature SSS(PT) and consequently the signature SSS(m) of the mark m hidden in PT; and a decryption and extraction module F, which is the inverse of the encryption module C used in the watermarking module T and takes as parameter the secret semantic signature SSS(PT) to extract the original authenticator Auth of P.
The decryption module F can also constitute a secret, just like the module C.
(See Figure 12)
An authentication module DAu
Among the preferred embodiments of the authentication method Au above, the invention also discloses an authentication device DAu (which may be implemented as a computer program product using the device DS above and a decryption device DF) for authenticating a watermarked program PT by computing its authenticator Auth.
(See Figure 12A)
Presentation of the additional figures
Figure 5 shows a block diagram of the method A for generating the secret abstract semantics SAS from an abstract domain Di(K) depending on a secret key K;
Figure 6 shows a block diagram of the method S for computing the signature SSS, according to the secret semantics SAS, of a software or hardware program P;
Figures 5A and 6A show the diagram of the devices DA and DS corresponding to the methods A and S in one of their embodiment variants;
Figure 7 (respectively 7A) shows the block diagram of the method M (respectively of the device DM in one of its embodiment variants) for producing a mark given a secret abstract semantics SAS (resulting for example from applying the principle of the method of Figure 1) and a secret semantic signature SSS;
Figure 8 (respectively 8A) shows the block diagram of the method B (respectively of the device DB in one of its embodiment variants) for watermarking a software or hardware program P by inserting a mark m into a selection of elements of said P to be watermarked;
Figure 9 (respectively 9A) shows the block diagram of the method PS (respectively of the device DPS in one of its embodiment variants) which, given a software or hardware program P, chooses an abstract domain D from a family of possible abstract domains and selects elements of P to be watermarked;
Figure 10 (respectively 10A) shows the block diagram of the method T (respectively of the device DT in one of its embodiment variants) for inserting a mark m characteristic of an authenticator Auth into a selection of elements of an original software or hardware program P using a secret abstract semantics SAS;
Figure 11 (respectively 11A) shows the block diagram of the method G (respectively of the device DG in one of its embodiment variants) for choosing an abstract domain parameterized by a key, a particular secret key and an authenticator in order to watermark a software or hardware program P by transforming it into a functionally equivalent watermarked program PT;
Figure 12 (respectively 12A) shows the block diagram of the method Au (respectively of the device DAu in one of its embodiment variants) for authenticating a watermarked program PT.

Example 3
Consider a very simple original program P, to be watermarked by the device DG of Figure 11A, which is the following:
    public class Fibonacci {
        public Fibonacci() {}
        public static void main(String[] args) {
            int n=Integer.parseInt(args[0]);
            int a=0;
            int b=1;
            for (int i=1;i<n;i++) {
                int c=a+b;
                a=b; // a equals u_i
                b=c; // b equals u_(i+1)
            }
            System.out.println("La valeur de la suite pour n="+n+" est : "+b);
        }
    }
By way of example, the device DPS of Figure 9A selects methods to be watermarked, such as the method main.
A very simple example of watermarking Java™ methods uses a collecting semantics that is the set of descendants of the entry states of the method (other alternatives being the semantics of the ascendants of the exit states, or combinations such as the intersection of these collecting semantics).
In this example, the secret abstract semantics SAS is the abstraction of the collecting semantics of a method that retains only the local integer variables and completely abstracts away the other variables, the control flow graph, external elements and the context of the method. As a result, the static analysis is insensitive to program transformations (for example for obfuscation or scrambling purposes) that modify the control flow graph, and to transformations of arithmetic expressions by equivalence. For simplicity in this example of secret abstract semantics SAS, only the basic arithmetic operations +, - and * are considered (including all other arithmetic operations that allow arithmetic expressions containing these basic operations to be rewritten in an arithmetically equivalent form, such as unary minus, etc.). The concrete semantics is chosen over the ring (Z, +, *) of mathematical integers (and not over integers modulo 2^32 as in Java™; the arithmetic equivalences mentioned above must take this into account).
In this example, the secret abstract semantics SAS uses a domain of finite height for the local integer variables, which makes it insensitive to the chaotic iteration strategies used and, again for the sake of a simpler example, avoids the use of widening and narrowing operators.
In this example, the secret key K is the product K1*K2*...*Kn of strictly positive natural integers that are mutually coprime. In the example considered below, n=2, K1=10000 and K2=5421. The abstract domain D(K), used to compute the secret abstract semantics SAS by abstract interpretation of the collecting semantics, is that of constant propagation modulo K. By the Chinese remainder theorem, Z/KZ = Z/(K1*...*Kn)Z is isomorphic to the product ring Z/K1Z x ... x Z/KnZ. Given the secret semantic signature SSS of the watermarked program PT in Z/KZ = Z/(K1*...*Kn)Z, we consider its image (s1, ..., sn) in Z/K1Z x ... x Z/KnZ, whose components are given by the canonical projection onto the ring Z/KiZ, i = 1, ..., n. In the example considered, the device DC of Figure 6 bis is assumed to compute the secret semantic signature (2507, 3012) from the authenticator of the program. The secret static semantics is then obtained by n static analyses, for all the integer variables local to the method, over the abstract domains Z/KiZ, i = 1, ..., n, corresponding to constant propagation modulo Ki. The device DA of Figure 5A therefore consists of a program of successive static analyses by constant propagation modulo (K1, ..., Kn).
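The correspondence between Z/(K1*...*Kn)Z and the product ring can be illustrated with a minimal Java sketch. The class and method names (`CrtSignature`, `project`, `combine`) are hypothetical and are not part of the patent; only the keys (10000, 5421) and the signature (2507, 3012) come from the example. The sketch assumes the keys are mutually coprime, as the text requires.

```java
import java.math.BigInteger;

/** Illustrative sketch (hypothetical names): Chinese remainder projection and recombination. */
final class CrtSignature {

    /** Projects a value of Z/(K1*...*Kn)Z onto each ring Z/KiZ. */
    static long[] project(BigInteger value, long[] keys) {
        long[] parts = new long[keys.length];
        for (int i = 0; i < keys.length; i++) {
            parts[i] = value.mod(BigInteger.valueOf(keys[i])).longValueExact();
        }
        return parts;
    }

    /** Recombines residues (s1, ..., sn) into the unique value modulo K1*...*Kn. */
    static BigInteger combine(long[] residues, long[] keys) {
        BigInteger product = BigInteger.ONE;
        for (long k : keys) {
            product = product.multiply(BigInteger.valueOf(k));
        }
        BigInteger result = BigInteger.ZERO;
        for (int i = 0; i < keys.length; i++) {
            BigInteger ki = BigInteger.valueOf(keys[i]);
            BigInteger others = product.divide(ki);
            // modInverse throws ArithmeticException if the keys are not coprime.
            BigInteger inverse = others.modInverse(ki);
            result = result.add(BigInteger.valueOf(residues[i]).multiply(others).multiply(inverse));
        }
        return result.mod(product);
    }

    public static void main(String[] args) {
        long[] keys = {10000, 5421};       // K1, K2 from the example
        long[] signature = {2507, 3012};   // (s1, s2) from the example
        BigInteger combined = combine(signature, keys);
        System.out.println(combined + " -> " + java.util.Arrays.toString(project(combined, keys)));
    }
}
```

Projecting the recombined value gives back the pair of residues, which is why the n analyses modulo each Ki carry the same information as one analysis modulo K.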
In this example, the device DM of Figure 7A uses the secret abstract semantics SAS and the secret semantic signature SSS defined above to produce a mark m whose text is the following:
    int <watermark:10000:2507>;
    int <tmp:10000:2507>;
    <watermark:10000:2507>=1;
    <tmp:10000:2507>=<watermark:10000:2507>+227492;
    <tmp:10000:2507>=<watermark:10000:2507>*<tmp:10000:2507>;
    <watermark:10000:2507>=<tmp:10000:2507>+155014;
    <tmp:10000:2507>=<watermark:10000:2507>*1323;
    <tmp:10000:2507>=<tmp:10000:2507>+153;
    <tmp:10000:2507>=<tmp:10000:2507>*<watermark:10000:2507>;
    <watermark:10000:2507>=<tmp:10000:2507>+9109;
    int <watermark:5421:3012>;
    int <tmp:5421:3012>;
    <watermark:5421:3012>=1;
    <tmp:5421:3012>=<watermark:5421:3012>+-35539;
    <tmp:5421:3012>=<watermark:5421:3012>*<tmp:5421:3012>;
    <watermark:5421:3012>=<tmp:5421:3012>+11445;
    <tmp:5421:3012>=<watermark:5421:3012>*658;
    <tmp:5421:3012>=<tmp:5421:3012>+971;
    <tmp:5421:3012>=<tmp:5421:3012>*<watermark:5421:3012>;
    <watermark:5421:3012>=<tmp:5421:3012>+4623;
In a simplified case where Ki is not too large, and for a given secret semantic signature s in Z/KiZ, the text of the mark m created by the device DM of Figure 7A can consist of a single initialization mark, which can be an assignment
    <watermark:Ki:s> = s';
where s' = s + a.Ki and the value a in Z is chosen not too large, so that s' does not overflow the 32 bits of Java integers.
The value of the variable <watermark:Ki:s> is always constant in a static analysis by constant propagation modulo Ki, and equal to s.
This value does not appear in clear in the mark, and is all the harder to find as the secret key Ki is unknown.
A more sophisticated instantiation of the device DM of Figure 7A makes it possible to choose this initialization mark as a polynomial Q (in the variable <watermark:Ki:s>) of the form
    a_k.<watermark:Ki:s>^k + ... + a_1.<watermark:Ki:s> + a
where a_k, ..., a_1 are random values and the value a is given by
    a ≡ s - a_k.v^k - a_(k-1).v^(k-1) - ... - a_1.v mod Ki
where the initial value v is chosen arbitrarily at random.
In this case, the initialization mark consists of the assignments
    <watermark:Ki:s> = v;
    <watermark:Ki:s> = a_k.<watermark:Ki:s>^k + ... + a_1.<watermark:Ki:s> + a;
so that we always have
    <watermark:Ki:s> = s mod Ki
Using a single initialization mark has the drawback of leaving the value of <watermark:Ki:s> constant, and therefore easy to spot by a static analysis propagating constants. To avoid this, a more sophisticated device DM will add an induction mark that gives dynamics to the mark variable <watermark:Ki:s>, so that it takes stochastic values in Z but remains constant in Z/KiZ. For this, a polynomial Q' is used that has the stability property, that is,
    Q'(s) ≡ s mod Ki
The polynomial Q' is generated as explained above, and the induction mark consists of the instruction
    <watermark:Ki:s> = a'_k'.<watermark:Ki:s>^k' + ... + a'_1.<watermark:Ki:s> + a';
or any equivalent, for example using Horner's evaluation scheme.
The device DB of Figure 8A can place this induction mark in a loop or in a recursive call of the method to be watermarked in P, because executing it does not change the value of <watermark:Ki:s> in the ring Z/KiZ chosen to compute the secret static semantics s. By contrast, the value observed in the interpretation domain of Java integers will be totally stochastic.
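As an illustration only, the following Java sketch executes the two marks of the example over the mathematical integers Z (using `BigInteger`, since the text fixes the concrete semantics on Z rather than on 32-bit integers) and reduces the mark variable modulo Ki after each round. The class and method names are hypothetical; the numeric constants are those of the mark text above.

```java
import java.math.BigInteger;

/** Illustrative sketch (hypothetical names): runs the example marks over Z. */
final class MarkResidueCheck {

    /** Initialization mark followed by `rounds` executions of the induction mark. */
    static BigInteger run(long initAdd, long initConst,
                          long indMul, long indAdd, long indConst, int rounds) {
        BigInteger watermark = BigInteger.ONE;
        BigInteger tmp = watermark.add(BigInteger.valueOf(initAdd));
        tmp = watermark.multiply(tmp);
        watermark = tmp.add(BigInteger.valueOf(initConst));
        for (int i = 0; i < rounds; i++) {
            tmp = watermark.multiply(BigInteger.valueOf(indMul));
            tmp = tmp.add(BigInteger.valueOf(indAdd));
            tmp = tmp.multiply(watermark);
            watermark = tmp.add(BigInteger.valueOf(indConst));
        }
        return watermark;
    }

    /** Non-negative residue of value modulo key. */
    static long residue(BigInteger value, long key) {
        return value.mod(BigInteger.valueOf(key)).longValueExact();
    }

    public static void main(String[] args) {
        for (int rounds = 0; rounds <= 4; rounds++) {
            BigInteger w1 = run(227492, 155014, 1323, 153, 9109, rounds);  // key 10000
            BigInteger w2 = run(-35539, 11445, 658, 971, 4623, rounds);    // key 5421
            System.out.println(rounds + ": " + residue(w1, 10000) + ", " + residue(w2, 5421)
                    + " (bit lengths " + w1.bitLength() + ", " + w2.bitLength() + ")");
        }
    }
}
```

The concrete values grow and change at every round, while the residues stay at 2507 and 3012. Note that with Java `int` variables and a key such as 10000, which does not divide 2^32, wraparound would not preserve the residue; this is consistent with the text's choice of Z for the concrete semantics.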
In the example of watermarking the program P above, the mark defined above uses two local variables <watermark:Ki:s> and <tmp:Ki:s>. The initialization mark is the initial code segment consisting of a polynomial Q computing the initial value of <watermark:Ki:s>. The induction mark consists of a polynomial Q' having the stability property for s in the ring Z/KiZ considered.
Second-degree polynomials are used for convenience. The values of the coefficients of the second-degree polynomial performing the initialization are random. This polynomial is given as follows in Z/KiZ:
    Q(x) = (x - 1)(x - s) + s = x^2 + coeff1.x + coeff2.
A random number of periods of the modulus Ki is added to or subtracted from the coefficients so as not to reveal the key s. The initial value of <watermark:Ki:s> is therefore
    <watermark:Ki:s> = Q(1) = s in Z/KiZ.
The initialization mark consists in computing this polynomial Q using Horner's evaluation scheme. The induction mark consists of a polynomial Q' (also of second degree, again for convenience) satisfying the stability property
    Q'(s) = s in Z/KiZ
The polynomial Q' is written
    Q'(x) = a.x^2 + b.x + c
The coefficients a and b are drawn at random, while c is chosen so as to ensure the stability property for the secret key, as explained above. The induction mark consists of the assignment instruction
    <watermark:Ki:s> = Q'(<watermark:Ki:s>);
where the polynomial is computed by Horner's method.
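The construction of the two second-degree polynomials can be sketched in Java as follows. This is a hedged illustration, not the patent's device DM: the names `MarkPolynomials`, `initialization`, `induction` and `eval` are hypothetical, the ranges of the random draws are arbitrary, and the sketch assumes 0 <= s < key < 2^20 so that `long` arithmetic cannot overflow. The initialization coefficients follow the numbers of the example mark (coeff1 ≡ -(1+s) and coeff2 ≡ 2s modulo Ki, so that Q(1) ≡ s).

```java
import java.util.Random;

/** Illustrative sketch (hypothetical names): generating Q and Q' for a signature s and key Ki. */
final class MarkPolynomials {

    /** Horner evaluation of {c_n, ..., c_0} at x, reduced modulo key at each step. */
    static long eval(long[] coeffs, long x, long key) {
        long xr = Math.floorMod(x, key);
        long acc = 0;
        for (long c : coeffs) {
            acc = Math.floorMod(acc * xr + Math.floorMod(c, key), key);
        }
        return acc;
    }

    /** Q(x) = x^2 + coeff1.x + coeff2 with Q(1) congruent to s, coefficients shifted by random periods. */
    static long[] initialization(long s, long key, Random rng) {
        long coeff1 = -(1 + s) + key * (rng.nextInt(41) - 20);
        long coeff2 = 2 * s + key * (rng.nextInt(41) - 20);
        return new long[] {1, coeff1, coeff2};
    }

    /** Q'(x) = a.x^2 + b.x + c with a, b random and c chosen so that Q'(s) is congruent to s. */
    static long[] induction(long s, long key, Random rng) {
        long a = 1 + rng.nextInt(2000);
        long b = 1 + rng.nextInt(2000);
        long c = Math.floorMod(s - (a * s + b) * s, key) + key * rng.nextInt(3);
        return new long[] {a, b, c};
    }

    public static void main(String[] args) {
        Random rng = new Random();
        long s = 2507, key = 10000;
        long[] q = initialization(s, key, rng);
        long[] qPrime = induction(s, key, rng);
        System.out.println("Q(1) mod Ki = " + eval(q, 1, key));
        System.out.println("Q'(s) mod Ki = " + eval(qPrime, s, key));
    }
}
```

The same `eval` routine applied to the coefficients printed in the mark text, for instance {1323, 153, 9109} at s = 2507 modulo 10000, returns s, which matches the stability property stated above.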
The watermarked program PT corresponding to the program P including the mark m above is the following:
    public class FibonacciTatoue {
        public FibonacciTatoue() {}
        public static void main(String[] args) {
            int n=Integer.parseInt(args[0]);
            int a=0;
            int b=1;
            int d=1;
            int e=-35538;
            int f=1;
            int g=227493;
            e=d*e;
            d=e+11445;
            g=f*g;
            f=g+155014;
            for (int i=1;i<n;i++) {
                int c=a+b;
                e=d*658;
                g=f*1323;
                a=b; // a equals u_i
                g=g+153;
                e=e+971;
                g=g*f;
                e=e*d;
                b=c; // b equals u_(i+1)
                d=e+4623;
                f=g+9109;
            }
            System.out.println("La valeur de la suite pour n="+n+" est : "+b);
        }
    }
The location of the marks is random, within the order of the marks, the initialization mark being placed at the beginning of the program and the induction mark preferably being placed in a loop or in a recursive call branch of the method.
In the example considered of the device DB of Figure 8A, an additional transformation of the program is then necessary to bind the watermark to the original program (to resist, for example, watermark removal by automatic "slicing"), such as for example:
    public class FibonacciTatoue {
        public FibonacciTatoue() {}
        public static void main(String[] args) {
            int n=Integer.parseInt(args[0]);
            int a=0;
            int b=1;
            int d=1;
            int e=-35538;
            int f=1;
            int g=227493;
            e=d*e;
            d=e+11445;
            g=f*g;
            f=g+155014;
            for (int i=1;i<n;i++) {
                int c=a+b;
                e=d*658;
                g=f*1323;
                a=b+(155014/f); // a equals u_i
                g=g+153;
                e=e+971;
                g=g*f;
                e=e*d;
                b=c+(e/f); // b equals u_(i+1)
                d=e+4623;
                b=b-(e/f);
                a=a-(155014/f);
                f=g+9109;
            }
            System.out.println("La valeur de la suite pour n="+n+" est : "+b);
        }
    }
Removing the variables d, e, f and g used for the watermark would make the program erroneous and therefore unusable.
In a more realistic example, the device DB of Figure 8A must ensure that the transformation of the original program does not introduce run-time errors that could modify the semantics of the original program. The classical techniques of abstract interpretation are used for this purpose.
In a more realistic example, the device DB of Figure 8A will use more sophisticated methods to bind the mark m to the original program P in the watermarked program PT. In the context of the example considered above, this requires secret invariant properties of the watermarking variable that carry over to signed 32-bit arithmetic. This is possible very simply when, for example, the key Ki is a power of 2. Indeed, suppose that Ki = 2^k. Then we have
    x = s + a.2^k in Z
We have, still in Z, that for every j less than or equal to k:
    x % 2^j = s % 2^j
where x % y denotes the operation that returns the remainder of the Euclidean division of x by y. This property remains trivially true in Z/2^32Z, which is the domain of Java™ integers. The arithmetic properties of the watermark can be used to modify computations of the method. For example, suppose that Ki = 2^16 and that s = 18. Then, whatever the value of x, the invariant x % 4 = 2 is always satisfied. As a result, an explicit constant 2 in the original program P can be replaced by x % 4 in the watermarked program PT. The other constants or variable values of the watermarked program PT can also easily be computed from this value. For example, 1 in P will simply be replaced by (x % 4)-1 in PT.
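The power-of-two case can be tried out with the short Java sketch below, given under stated assumptions: the class `PowerOfTwoInvariant`, its methods and the coefficients 1323 and 153 are illustrative choices, not taken from the patent for this key. One practical point is certain about Java: its `%` operator takes the sign of the dividend, so for a negative `x` the expression `x % 4` can evaluate to -2. The Euclidean remainder the text refers to corresponds to `Math.floorMod(x, 4)` or, equivalently here, to the bit mask `x & 3`.

```java
/** Illustrative sketch (hypothetical names): invariant of a watermark variable when Ki = 2^16. */
final class PowerOfTwoInvariant {
    static final int K_BITS = 16;  // Ki = 2^16, as in the text
    static final int S = 18;       // s = 18, as in the text

    /** One induction step in 32-bit Java arithmetic; overflow wraps modulo 2^32. */
    static int step(int x, int a, int b, int c) {
        return (a * x + b) * x + c;
    }

    /** Chooses c so that (a*s + b)*s + c is congruent to s modulo 2^16. */
    static int stableConstant(int a, int b) {
        return Math.floorMod(S - (a * S + b) * S, 1 << K_BITS);
    }

    /** The j low-order bits of x, i.e. the Euclidean remainder of x by 2^j. */
    static int lowBits(int x, int j) {
        return x & ((1 << j) - 1);
    }

    public static void main(String[] args) {
        int a = 1323, b = 153;              // arbitrary illustrative coefficients
        int c = stableConstant(a, b);
        int x = S;
        for (int i = 0; i < 10; i++) {
            x = step(x, a, b, c);
            System.out.println(x + " : low 16 bits = " + lowBits(x, K_BITS)
                    + ", floorMod(x, 4) = " + Math.floorMod(x, 4));
        }
    }
}
```

Because 2^16 divides 2^32, wraparound of `int` arithmetic leaves the low 16 bits untouched, so the invariant survives overflow, unlike the case of a key such as 10000.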
This concludes the description of the watermarking device DT on the example considered, by chaining the devices DC, DM and DB as shown in Figure 10A, and of the device DG, by chaining the devices DPS, DA and DT as shown in Figure 11A.
In this example, the device DS of Figure 6A computes the secret static semantics s = (s1, ..., sn) by performing n successive static analyses of the watermarked program according to the abstract interpretation defined above, consisting of a forward analysis, which may or may not ignore the control flow, with constant propagation modulo the keys K1, ..., Kn, which remain secret.
From this secret static semantics s = (s1, ..., sn), the device DF computes the original authenticator of the program P as shown in Figure 8 bis.
If the domain D(K) used in the example above is public (but not the key K), then the secret static semantics s can be discovered by brute force, at least for very small programs containing few integer variables. In a more realistic example, one will therefore use not keys s1, ..., sn coded on 32 bits, but one or more 512-bit keys s using, for example, an arithmetic coding of the value of <watermark:Ki:s> on 16 variables of 32 bits or 8 variables of 64 bits, or using any other technique for the computer coding of large integers.
This concludes the description of the device DAu of Figure 12A for the example considered.

Example 4
A fourth example is the following sorting program:
    private static void bubbleSort (double [] data, int size) {
        int index1, index2;
        double temp;
        boolean exchanged;
        for (index1 = size; index1 >= 2; index1--) {
            exchanged = false;
            for (index2 = 1; index2 <= index1 - 1; index2++) {
                if (data [index2] > data [index2 + 1]) {
                    temp = data [index2];
                    data [index2] = data [index2 + 1];
                    data [index2 + 1] = temp;
                    exchanged = true;
                }
            }
            if (!exchanged) break;
        }
    }
Again n=2, K1=10000 and K2=5421. The bubbleSort method is watermarked twice, a first time with the same secret semantic signature (2507, 3012) as above and a second time with the secret signature (9876, 2345). The second watermark could equally well have been produced with different values of K1 and K2. After compilation, watermarking, obfuscation and decompilation, the following bubbleSort code is obtained:
    private static void bubbleSort (double [] r0, int i0) {
        int i1, i2, i3, i4, i5, i6, i7, i8, i9, i10, i11;
        double d0;
        i1 = 1;
        i2 = i1;
        i3 = i2;
        i4 = i3;
        i5 = i2 - 35539;
        i6 = i1 - 62508;
        i7 = i3 - 129877;
        i6 = i1 * i6;
        i5 = i2 * i5;
        i1 = i6 - 144986;
        i8 = i4 + 84390;
        i7 = i3 * i7;
        i2 = i5 - 75291;
        i8 = i4 * i8;
        i3 = i7 + 169752;
        i4 = i8 + 10111;
        for (i9 = i0; i9 >= 2; i9--) {
            i10 = 0;
            for (i11 = 1; i11 <= i9 - 1; i11++) {
                if (r0 [i11] > r0 [i11 + 1]) {
                    i6 = i1 * 620;
                    i6 = i6 + 1151;
                    d0 = r0 [i11];
                    i6 = i6 * i1;
                    i1 = i6 + 6570;
                    r0 [i11] = r0 [i11 + 1];
                    i8 = i4 * 936;
                    i5 = i2 * 620;
                    i5 = i5 + 1151;
                    i5 = i5 * i2;
                    i8 = i8 + 1057;
                    i2 = i5 + 2961;
                    r0 [i11 + 1] = d0;
                    i7 = i3 * 1231;
                    i7 = i7 + 1699;
                    if ((i2 - 2961) == i5) i10 = 1;
                    i8 = i8 * i4;
                    i7 = i7 * i3;
                    i3 = i7 + 2696;
                    i4 = i8 + 1389;
                }
            }
            if (i10 == 0) break;
        }
    }
(10) which involve operations on integers.
Transcoding that respects the semantics of the instruction then consists in performing the same operation modulo any integer Nk less than 2^32. An alternative is to perform a different operation chosen from a permutation table, still in modular algebra. In that case, the main function of the module (320) is to keep the table of secrets assigned to given user classes for given programs.
The module (330) for calculating and inserting the transcoded instructions is a sequence of instructions of the compiler (30) of the type described below.
We want to insert p marks in each selected method.
Let there be p numbers n1, n2, ..., np and p mark values c1 < n1; c2 < n2; ...; cp < np constituting the secret set.
The sequence of macro-instructions in Annex 1 is repeated for each method to be watermarked.
It is of course possible to repeat the process several times with different mark values.
It is also possible to improve the robustness of the watermark by using different techniques of modular algebra.
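As a hedged illustration of how the polynomials required by the macro-instructions of Annex 1 (steps 1.2.c and 1.2.f) might be built, the Java sketch below draws two coefficients at random and solves for the constant term. The class name, the method names, the coefficient ranges and the use of long arithmetic with Math.floorMod are assumptions of this sketch, not elements of the patent.

```java
import java.util.Random;

final class MarkPolynomials {
    /** Evaluates p[0]*t^2 + p[1]*t + p[2]. */
    static long eval(long[] p, long t) {
        return p[0] * t * t + p[1] * t + p[2];
    }

    /** Builds P of degree 2 with P(x) = c + k*n exactly, k a small random integer (step 1.2.c). */
    static long[] initPolynomial(long x, long c, long n, Random rnd) {
        long a2 = 1 + rnd.nextInt(50);
        long a1 = rnd.nextInt(1000) - 500;
        long k = rnd.nextInt(20) - 10;
        long a0 = c + k * n - a2 * x * x - a1 * x; // solve for the constant term
        return new long[] {a2, a1, a0};
    }

    /** Builds Q of degree 2 with Q(v) congruent to c modulo n whenever v is congruent to c (step 1.2.f). */
    static long[] updatePolynomial(long c, long n, Random rnd) {
        long a2 = 1 + rnd.nextInt(50);
        long a1 = rnd.nextInt(1000) - 500;
        long l = rnd.nextInt(20) - 10;
        long a0 = c + l * n - a2 * c * c - a1 * c; // Q(c) = c + l*n
        return new long[] {a2, a1, a0};
    }

    public static void main(String[] args) {
        Random rnd = new Random(42);
        long n = 10000, c = 2507, x = 17; // values borrowed from Example 1
        long[] p = initPolynomial(x, c, n, rnd);
        long[] q = updatePolynomial(c, n, rnd);
        long v = eval(p, x);
        for (int step = 0; step < 5; step++) {
            System.out.println(Math.floorMod(v, n));
            // Reduced here only to keep the demonstration free of overflow;
            // a real watermark would not expose the residue like this.
            v = Math.floorMod(eval(q, v), n);
        }
    }
}
```

The residue printed at each step is the mark value c, whatever coefficients were drawn, which is the invariant the reading step later looks for.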
The strongest technique for binding the watermark to the program is to make the watermark variables interfere with the original method. To do this, one can use those properties of the watermarking variable that carry over to signed 32-bit arithmetic. This is possible when the key K is a power of 2. Indeed, suppose that K = 2^k. Then we have
X = v + a.2^k in Z
We know, still in Z, that for all j < k
X % 2^j = v % 2^j
where x % y denotes the operation that returns the remainder of the Euclidean division of x by y. This property remains trivially true in Z/2^32, which is the domain of Java integers. The arithmetic properties of the watermark can therefore be used to modify the computations of the method. For example, suppose that K = 2^16 and that v = 18. Then, whatever the value of x, we always have x % 4 = 2.
If, for example, the original program contains an explicit constant 1, it can be replaced by x % 4 - 1. If x has also been given a dynamic, we obtain a variable that apparently takes stochastic values but whose hidden invariant is used. The program thus modified is irreversibly degraded while retaining its original semantics. A hacker attempting to eliminate the watermark variable x would render the program unusable. A thorough study of the behaviour of x is needed to recover the information thus concealed.
Note that this technique of concealing constants can easily be applied automatically and at random.
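A minimal Java sketch of this concealment, assuming K = 2^16 and v = 18 as in the text. The update x -> x*x - 306 is an assumption chosen for illustration (18*18 - 306 = 18), and the names are hypothetical. Because Java's % operator can return a negative remainder for a negative x, the sketch uses Math.floorMod to obtain the Euclidean remainder the text refers to.

```java
final class PowerOfTwoMark {
    /** One update of the watermark variable: keeps x congruent to 18 modulo 2^16, even after int overflow. */
    static int step(int x) {
        return x * x - 306;
    }

    /** Expression that can stand in for the literal constant 1 of the original program. */
    static int hiddenOne(int x) {
        return Math.floorMod(x, 4) - 1;
    }

    public static void main(String[] args) {
        int x = 18 + 5 * 65536; // v = 18 plus an arbitrary multiple of K = 2^16
        for (int i = 0; i < 10; i++) {
            System.out.println(x + " -> hidden constant " + hiddenOne(x));
            x = step(x);
        }
    }
}
```

The printed values of x look erratic because of 32-bit wrap-around, while the hidden constant stays the same at every step, since 2^16 divides 2^32.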
The generic method for reading the watermark is shown in Figure 3. It assumes knowledge of at least part of the watermark parameters. The different levels of a software distribution chain (buyer, wholesaler, distributor, retailer) can thus each be assigned their own marks.
In the embodiment of FIG. 4, knowledge of the secret number or numbers Nk makes it possible to find, by step-by-step execution of the program in the compiler, the variables that satisfy a congruence modulo Nk at a given instant, and then to trace each such variable back to its initialization.
The table of secrets can easily be held on a microprocessor card connected to the computer comprising the interpreter. The authorized user only needs to know a public key, which activates the program for reading the secrets corresponding to his user identification. The private keys that define the table of secrets therefore do not have to be disclosed.
Two examples given in Annexes 2 and 3 illustrate the application of semantic static analysis to the authentication of the software signature, with two simple cases of calculating the fixed points of hidden variables.
These devices and methods can be implemented without difficulty on commercial computers. Depending on the complexity of the software and of the signature, execution times will be longer or shorter, in particular for calculating the hidden variables by the fixed-point method.
To limit these calculation times, the widening and narrowing methods known to those skilled in the art are applied.
Software can be watermarked as many times as desired (overlapping watermarks), unlike audio or image watermarks, which saturate the subliminal channel (according to the chosen perception model). The software is not altered but overloaded; memory and time resources are affected.
A different secret semantics can be used for each of these watermarks, so that each link in the chain of trust can hold its own secret.
Of course, this watermarking device and method can be combined with prior-art devices and methods to make the program unusable by an unauthorized user (prevention) or to trace any dissemination of the program by an authorized user to unauthorized users (audit). For this purpose it suffices not to give authorized users the keys that would allow them to find the watermark.
It is also possible to use the device and the method to automatically authenticate the code that is allowed to enter a given network or station. The watermark is then comparable to an authentication certificate for which the monitoring station of the network or of the station holds the reading key.
Annex 4 presents a glossary.
Annex 5 presents an embodiment of the invention with a division into finer modules.
Annexes 6 and 7 present further embodiment examples (examples 3 and 4).

NOTES

1.1 Split the method into 2 blocks A and B of equivalent size.
1.2 For i varying from 1 to p do
1.2.a find a random position in block A
1.2.b list the variables whose value is determined at this point of the execution; 2 cases:
case 1: there is none. Create a variable w initialized to a value x. Insert this initialization between the start of the method and the current position.
case 2: there is at least one. Select one of these variables w, and let x be its value at the current execution position.
1.2.c create any polynomial P of degree 2 satisfying P(x) = ci + k * ni (k a small random integer)
1.2.d insert the following initialization instruction: int vi = P(w)
1.2.e find a random position in block B
1.2.f create any polynomial Q of degree 2 satisfying Q(vi) = vi + l * ni (l a small random integer)
1.2.g insert the instruction vi = Q(vi)

Example 1
The watermarked method is the main method of the class Fibonacci, which computes the value of the n-th term of the Fibonacci sequence, defined by the relations u(n+2) = u(n+1) + u(n), u0 = 0, u1 = 1.
Initial program
public class Fibonacci {
    public Fibonacci () {
    }
    public static void main (String [] args) {
        int n = Integer.parseInt (args [0]);
        int a = 0;
        int b = 1;
        for (int i = 1; i < n; i++) {
            int c = a + b;
            a = b; // a is u(i)
            b = c; // b is u(i+1)
        }
        System.out.println ("The value of the sequence for n = " + n + " is " + b);
    }
}
Choice of semantic correspondence tables
We choose to insert two mark values. To do so we use two transcoding tables.
The first will insert the hidden value 2507, the second 3012.

Our two tables associate with each algebraic operation on integers the identical algebraic operation modulo a number N. With every integer value returned by a call to a method of type int, we associate an initialization to a value V. For the first table, N is 10000 and V is 17.
For the second, N is 5421 and V is 50.
1. Transcoding table for 2507
Initial instruction | Transcoded instruction
v (integer) = a (integer) + b (integer) | v = a + b modulo 10000
v (integer) = a (integer) * b (integer) | v = a * b modulo 10000
v (integer) is the (integer) return value of a function call | v = 17
2. Transcoding table for 3012
Initial instruction | Transcoded instruction
v (integer) = a (integer) + b (integer) | v = a + b modulo 5421
v (integer) = a (integer) * b (integer) | v = a * b modulo 5421
v (integer) is the (integer) return value of a function call | v = 50
Watermarking the method
The watermark of our method consists of the insertion of two variables j and k, taking respectively the values 2507 and 3012 in our secret space.
The watermarking consists of two steps:
- Initialization of j and k as a function of n at the start of the program. This initialization lets us compute the values of j and k in our space; however, the values of j and k during the execution of the program are unknown. This prevents an optimizer from transforming these instructions.
- Addition of instructions to the algorithm that ensure the invariance of j and k in our secret space.
These instructions compute j and k using polynomials of degree 2.

Anchoring the mark
The mark is anchored at the end of the method. The values of j and k are apparently used in the calculation of the result b. This is a lure, because the sequence of operations performed on b ultimately leaves this variable unchanged.
Watermarked program
public class Fibonacci {
    public Fibonacci () {
    }
    public static void main (String [] args) {
        int n = Integer.parseInt (args [0]);
        int a = 0;
        int j = -34 * n * n - 500 * n + 833; // j is 2507 in conjugate space no. 1 (i.e. modulo 10000, n -> 17)
        int b = 1;
        int k = 2 * n - n * n - 9; // k is 3012 in conjugate space no. 2 (i.e. modulo 5421, n -> 50)
        for (int i = 1; i < n; i++) {
            int c = a + b;
            a = b; // a is u(i)
            j = j * 5 - 28; // j is always 2507 in conjugate space 1
            b = c; // b is u(i+1)
            k = k * k + 201; // k is always 3012 in conjugate space 2
        }
        b += k + j;
        b = b - 1 - k * j + (1 - k) * (1 - j); // anchor of k and j
        System.out.println ("The value of the sequence for n = " + n + " is " + b);
    }
}
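The claim that the anchoring leaves b unchanged can be checked mechanically. The following Java sketch, with hypothetical class and method names, places the original loop and the marked loop side by side; the two final statements on b cancel algebraically, since (1 - k)(1 - j) = 1 - k - j + k*j, and this identity also holds under 32-bit wrap-around.

```java
final class FibonacciCheck {
    /** Original computation of Example 1. */
    static int plain(int n) {
        int a = 0, b = 1;
        for (int i = 1; i < n; i++) {
            int c = a + b;
            a = b;
            b = c;
        }
        return b;
    }

    /** Same computation with the two mark variables j and k and the anchor on b. */
    static int watermarked(int n) {
        int a = 0;
        int j = -34 * n * n - 500 * n + 833;
        int b = 1;
        int k = 2 * n - n * n - 9;
        for (int i = 1; i < n; i++) {
            int c = a + b;
            a = b;
            j = j * 5 - 28;
            b = c;
            k = k * k + 201;
        }
        b += k + j;
        b = b - 1 - k * j + (1 - k) * (1 - j); // restores the value b had before the previous line
        return b;
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 20; n++) {
            System.out.println(n + ": " + plain(n) + " / " + watermarked(n));
        }
    }
}
```

Both columns printed by main agree for every n, which is what "retaining the original semantics" means for this example.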

Mark analysis
The mark is found by static analysis, using the method of fixed-point approximation by iteration.
The conjugate program in space 1 is as follows:
public static void main (String [] args) {
    int n = 17;
    int a = 0;
    int j = (-34 * n * n - 500 * n + 833) % 10000;
    int b = 1;
    int k = (2 * n - n * n - 9) % 10000;
    for (int i = 1; i < n; i++) {
        int c = (a + b) % 10000;
        a = b;
        j = (j * 5 - 28) % 10000;
        b = c;
        k = (k * k + 201) % 10000;
    }
    b = (b + k + j) % 10000;
    b = (b - 1 - k * j + (1 - k) * (1 - j)) % 10000;
    System.out.println ("The value of the sequence for n = " + n + " is: " + b);
}
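The conjugate program can also be executed directly to observe the invariant. In the sketch below, the % of the listing is read as the Euclidean remainder and is therefore implemented with Math.floorMod, because Java's own % operator would give a negative result for the negative initial value of j. The class and method names are assumptions made for this illustration.

```java
import java.util.Arrays;

final class ConjugateSpace {
    /** Values of j before the loop and after each iteration, with n fixed to nValue and arithmetic modulo m. */
    static long[] traceJ(int nValue, long m) {
        long n = nValue;
        long[] trace = new long[nValue];
        long j = Math.floorMod(-34 * n * n - 500 * n + 833, m);
        trace[0] = j;
        for (int i = 1; i < nValue; i++) {
            j = Math.floorMod(j * 5 - 28, m);
            trace[i] = j;
        }
        return trace;
    }

    /** Same trace for the second mark variable k. */
    static long[] traceK(int nValue, long m) {
        long n = nValue;
        long[] trace = new long[nValue];
        long k = Math.floorMod(2 * n - n * n - 9, m);
        trace[0] = k;
        for (int i = 1; i < nValue; i++) {
            k = Math.floorMod(k * k + 201, m);
            trace[i] = k;
        }
        return trace;
    }

    public static void main(String[] args) {
        System.out.println("j in space 1: " + Arrays.toString(traceJ(17, 10000)));
        System.out.println("k in space 2: " + Arrays.toString(traceK(50, 5421)));
        System.out.println("k in space 1: " + Arrays.toString(traceK(17, 10000)));
    }
}
```

In space 1 only j is constant, and in space 2 only k is; the other variable wanders, which is why each mark can be read only with its own pair (N, V).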
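First, the program is transformed into an execution flow graph.
Semantic static analysis: approximation of the fixed point
[Figure: execution flow graph of the conjugate program in space 1. Each state is the program point reached after the instruction shown; [10000] denotes reduction modulo 10000.]
begin -> n = 17 -> State (I)
State (I) -> a = 0 -> State (II)
State (II) -> j = (-34 x n^2 - 500 x n + 833) [10000] -> State (III)
State (III) -> b = 1 -> State (IV)
State (IV) -> k = (2 x n - n^2 - 9) [10000] -> State (V)
State (V) -> i = 1 -> State (VI)
State (VI) -> if (i < n) -> State (VII); otherwise -> State (XIII)
State (VII) -> c = (a + b) [10000] -> State (VIII)
State (VIII) -> a = b -> State (IX)
State (IX) -> j = (j x 5 - 28) [10000] -> State (X)
State (X) -> b = c -> State (XI)
State (XI) -> k = (k^2 + 201) [10000] -> State (XII)
State (XII) -> i = (i + 1) [10000] -> State (VI)
State (XIII) -> b = (b + k + j) [10000] -> State (XIV)
State (XIV) -> b = (b - 1 - k x j + (1 - k) x (1 - j)) [10000] -> State (XV)
State (XV) -> display (b) -> end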

SEMANTIC STATIC ANALYSIS
For each state, we study the set of possible values taken by the variables. We consider only the variables whose value has changed with respect to all possible preceding states.
State | Variable studied | Set / interval | Initial value
I | n | N | ∅
II | a | A | ∅
III | j | J | ∅
IV | b | B | ∅
V | k | K | ∅
VI | i | I | ∅
VII | i | I | ∅
VIII | c | C | ∅
IX | a | A | ∅
X | j | J | ∅
XI | b | B | ∅
XII | k | K | ∅
XIII | i | I | ∅
XIV | b | B | ∅
XV | b | B | ∅

Writing the interdependence of the states
N_I = {17}
A_II = {0}
J_III = {j = -34 x n^2 - 500 x n + 833; n ∈ N_I}
B_IV = {1}
K_V = {k = 2 x n - n x n - 9; n ∈ N_I}
I_VI = {1} ∪ {I_VII + 1}
I_VII = I_VI ∩ ]-∞; sup n[, n ∈ N_I
C_VIII = {c = a + b; a ∈ (A_II ∪ A_IX), b ∈ (B_IV ∪ B_XI)}
A_IX = B_IV ∪ B_XI
J_X = {j x 5 - 28; j ∈ (J_X ∪ J_III)}
B_XI = C_VIII
K_XII = {k x k + 201; k ∈ (K_V ∪ K_XII)}
I_XIII = I_VI ∩ [inf n; +∞[, n ∈ N_I
B_XIV = {b = b + k + j; b ∈ (B_IV ∪ B_XI), k ∈ (K_V ∪ K_XII), j ∈ (J_III ∪ J_X)}
B_XV = {b = b - 1 - k x j + (1 - k) x (1 - j); b ∈ B_XIV, k ∈ (K_V ∪ K_XII), j ∈ (J_III ∪ J_X)}
These sets are approximated by intervals.
[Table: interval approximations of the sets above for iterations 0 to 8 and at the fixed point, reached at iteration 35. At the fixed point, I_VI = [1; 17], I_VII = [1; 16], I_XIII = 17 and K_XII = [2; 9997], and the intervals for a, b and c extend up to 9999, while J_III and J_X are equal to 2507 from the iteration at which they are first defined and K_V is equal to 9736.]
We find our first mark: the variable J always keeps the value 2507.
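To make the iteration concrete, the following Java sketch solves only the two loop-counter equations of the analysis above by plain upward iteration on intervals: the set of values of i at the loop head is {1} joined with the body values plus 1, and the set inside the loop body is the head set restricted to values below n. The encoding of intervals as int pairs and all names are assumptions of this sketch; no widening is applied, so the number of rounds grows with n.

```java
import java.util.Arrays;

final class LoopCounterIntervals {
    static final int[] EMPTY = {1, 0}; // lo > hi encodes the empty interval

    static boolean isEmpty(int[] a) {
        return a[0] > a[1];
    }

    /** Smallest interval containing both arguments. */
    static int[] join(int[] a, int[] b) {
        if (isEmpty(a)) return b;
        if (isEmpty(b)) return a;
        return new int[] {Math.min(a[0], b[0]), Math.max(a[1], b[1])};
    }

    /** Intersection with ]-inf; bound]. */
    static int[] below(int[] a, int bound) {
        return isEmpty(a) ? a : new int[] {a[0], Math.min(a[1], bound)};
    }

    /** Interval translated by d. */
    static int[] shift(int[] a, int d) {
        return isEmpty(a) ? a : new int[] {a[0] + d, a[1] + d};
    }

    /** Returns {headLo, headHi, bodyLo, bodyHi} once both equations are stable. */
    static int[] solve(int n) {
        int[] head = EMPTY;
        int[] body = EMPTY;
        while (true) {
            int[] newHead = join(new int[] {1, 1}, shift(body, 1)); // head = {1} U (body + 1)
            int[] newBody = below(newHead, n - 1);                   // body = head with i < n
            if (Arrays.equals(newHead, head) && Arrays.equals(newBody, body)) {
                return new int[] {head[0], head[1], body[0], body[1]};
            }
            head = newHead;
            body = newBody;
        }
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(solve(17)));
    }
}
```

For n = 17 the iteration stabilises on the head interval [1; 17] and the body interval [1; 16]. With widening and narrowing, as the description suggests, far fewer rounds would be needed for large n.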

The conjugate program in space 2 is as follows:
public static void main (String [] args) {
    int n = 50;
    int a = 0;
    int j = (-34 * n * n - 500 * n + 833) % 5421;
    int b = 1;
    int k = (2 * n - n * n - 9) % 5421;
    for (int i = 1; i < n; i++) {
        int c = (a + b) % 5421;
        a = b;
        j = (j * 5 - 28) % 5421;
        b = c;
        k = (k * k + 201) % 5421;
    }
    b = (b + k + j) % 5421;
    b = (b - 1 - k * j + (1 - k) * (1 - j)) % 5421;
    System.out.println ("The value of the sequence for n = " + n + " is: " + b);
}
As before, the program is transformed into an execution flow graph.
[Figure: execution flow graph of the conjugate program in space 2. Each state is the program point reached after the instruction shown; [5421] denotes reduction modulo 5421.]
begin -> n = 50 -> State (I)
State (I) -> a = 0 -> State (II)
State (II) -> j = (-34 x n^2 - 500 x n + 833) [5421] -> State (III)
State (III) -> b = 1 -> State (IV)
State (IV) -> k = (2 x n - n^2 - 9) [5421] -> State (V)
State (V) -> i = 1 -> State (VI)
State (VI) -> if (i < n) -> State (VII); otherwise -> State (XIII)
State (VII) -> c = (a + b) [5421] -> State (VIII)
State (VIII) -> a = b -> State (IX)
State (IX) -> j = (j x 5 - 28) [5421] -> State (X)
State (X) -> b = c -> State (XI)
State (XI) -> k = (k^2 + 201) [5421] -> State (XII)
State (XII) -> i = (i + 1) [5421] -> State (VI)
State (XIII) -> b = (b + k + j) [5421] -> State (XIV)
State (XIV) -> b = (b - 1 - k x j + (1 - k) x (1 - j)) [5421] -> State (XV)
State (XV) -> display (b) -> End
SEMANTIC STATIC ANALYSIS
For each state, we study the set of possible values taken by the variables. We consider only the variables whose value has changed with respect to all possible preceding states.
State | Variable studied | Set / interval | Initial value
I | n | N | ∅
II | a | A | ∅
III | j | J | ∅
IV | b | B | ∅
V | k | K | ∅
VI | i | I | ∅
VII | i | I | ∅
VIII | c | C | ∅
IX | a | A | ∅
X | j | J | ∅
XI | b | B | ∅
XII | k | K | ∅
XIII | i | I | ∅
XIV | b | B | ∅
XV | b | B | ∅

Writing the interdependence of the states
N_I = {50}
A_II = {0}
J_III = {j = -34 x n^2 - 500 x n + 833; n ∈ N_I}
B_IV = {1}
K_V = {k = 2 x n - n x n - 9; n ∈ N_I}
I_VI = {1} ∪ {I_VII + 1}
I_VII = I_VI ∩ ]-∞; sup n[, n ∈ N_I
C_VIII = {c = a + b; a ∈ (A_II ∪ A_IX), b ∈ (B_IV ∪ B_XI)}
A_IX = B_IV ∪ B_XI
J_X = {j x 5 - 28; j ∈ (J_X ∪ J_III)}
B_XI = C_VIII
K_XII = {k x k + 201; k ∈ (K_V ∪ K_XII)}
I_XIII = I_VI ∩ [inf n; +∞[, n ∈ N_I
B_XIV = {b = b + k + j; b ∈ (B_IV ∪ B_XI), k ∈ (K_V ∪ K_XII), j ∈ (J_III ∪ J_X)}
B_XV = {b = b - 1 - k x j + (1 - k) x (1 - j); b ∈ B_XIV, k ∈ (K_V ∪ K_XII), j ∈ (J_III ∪ J_X)}
These sets are approximated by intervals.
[Table: interval approximations of the sets above for iterations 0 to 8 and at the fixed point, reached at iteration 105. At the fixed point, I_VI = [1; 50], I_VII = [1; 49], I_XIII = 50 and J_X = [0; 5420], while K_V and K_XII are equal to 3012 from the iteration at which they are first defined.]
We find our second mark: 3012.

Example 2
The method from which the semantic signature is extracted is a bubble sort method.
Initial program
public class Bulle {
    public static void main (String [] args) {
        int [] table = new int [args.length];
        for (int i = 0; i < args.length; i++)
            table [i] = Integer.parseInt (args [i]);
        print (table);
        print (tri (table));
    }
    public static void print (int [] table) {
        System.out.println ("");
        for (int i = 0; i < table.length; i++)
            System.out.print (table [i] + " ");
        System.out.println ("");
    }
    public static int [] tri (int [] table) {
        int [] table2 = table;
        boolean flag = false;
        int i = 0;
        int v = 0;
        while (!flag) {
            flag = true;
            for (i = 0; i < table2.length - 1; i++)
                if (table2 [i] > table2 [i + 1]) {
                    v = table2 [i];
                    table2 [i] = table2 [i + 1];
                    table2 [i + 1] = v;
                    flag = false;
                }
        }
        return (table2);
    }
}

Choice of a semantic correspondence table
We want to obtain a semantic property after transcoding. For this we use a transcoding table.
For the variables of our conjugate space, we add to the variables of our initial space two new integer variables W and W', initialized to the value 0.
We then try to locate the transpositions performed on the arrays of the program, and we replace these operations on the array by operations on the variable W. A static analysis of W will show that it keeps a constant value equal to 1 in all situations.
Transcoding table
Initial instruction | Transcoded instruction
Start of program | W is 0, W' is 0
Sequence: | Sequence:
<sequence of any operations (1)> | <sequence of operations (1)>
<a variable X takes the value of an array T at index P> |
<sequence of any operations (2) affecting neither X nor T nor P> | <sequence of operations (2)>
<a variable Y takes the value of T at index Q> |
<sequence of any operations (3) affecting neither X, Y, nor T, nor P, nor Q> | <sequence of operations (3)>
<the value of Y is written at index P of T> |
<sequence of any operations (4) affecting neither X nor T nor Q> | <sequence of operations (4)>
<the value of X is written at index Q of T> |
<sequence of any operations (5)> | <sequence of operations (5)>
 | If W = 0 Then W = 1
 | W = W * Absolute value (P - Q)
 | W' = min (P, Q)
T is an array parameter | T is the array (5; 2; 4; 1; 3)
Method signature
The signature of the method consists of the semantic static analysis of the method and the detection of properties.
Program analysis
The properties are found by static analysis, using the method of fixed-point approximation by iteration.

The execution graph of the combined program is! E following beginning stable - [5; 2; 4; 1; 3 [
State (I) ! ~ flag = ~ false i The _._._._._.
State (II) ii = O; v = Oi ..
State (III) ~ W --O; W ~ = 0i i._ ._._._ State (IV) ~ if (f lag = ~ false)! if not !
i ._._._._._._._. 1 i ._._. ~
State (~
â ~ = Oi Fin ._._ i.
if not !
State (V1) ~ ''-'-' ~ If (i <length (table2)) i i ._._._._._._._._._._._ State (VII) ! if (table [i]> table [i +1]) t otherwise I
L _._._._._._._._._._. ~ I ._._. ~
State (VIII) ~ W --W ~ eI i -. (I + 1 ~!
._._._._._ State (IX) iyy ~ - ~ n (i + l i) ~
State (X]
! ~ flag =. false i i ._._._._._.
State (X1) II ~ i =, i + lj Writing of the interdependence of states ~ TABLE, _ ~ 5; 2; 4; 1; 3 ~
FLAGII = ~ false ~
1O IIII ~~~ WIV ~ O
VIII ~~~ W'N
FLAGV = FLAGII n ~ false ~
he Ivl = ~ O ~ U ~ Imlr ~ 1J
Ivll = 1 ,,, (~ ~ - ~; length (tablell TABLEVIII = table E (TABLE, I UTABLEvIIr) such as ~ i E Ivll such as table (i) ~ -table (i +1) ~
Ivlll = ~ i E Ivll such as stable E (TABLEII ~ JTABLEVIII) such as table (i) ~ -table (i +1) ~
W Uf (n '~ f: OH ~ l rx = x ~ OH ~ xXli- (i + l ~; iE IvIrrI
~ (WauwN) W'X = ~ min (i + l; i} i E IVIII ~ _ FLAGxI = ~ f'alse ~.
Iterations | 0 | 1 | 2
FLAG | ∅ | false | false
I | ∅ | 0 | 0; 1
I | ∅ | 0 | 0; 1
TABLE | ∅ | (5; 2; 4; 1; 3) | (5; 2; 4; 1; 3)
I | ∅ | 0 | 0
W | ∅ | 1 | 1
W' | ∅ | 0 | 0
FLAG | ∅ | false | false

Iterations | 3 | 4 | 5 - FIXED POINT
FLAG | false | false | false
I | 0; 1; 2 | 0; 1; 2; 3 | 0; 1; 2; 3; 4
I | 0; 1; 2 | 0; 1; 2; 3 | 0; 1; 2; 3
TABLE | (5; 2; 4; 1; 3) | (5; 2; 4; 1; 3) | (5; 2; 4; 1; 3)
I | 0; 2 | 0; 2 | 0; 2
W | 1 | 1 | 1
W' | 0; 2 | 0; 2 | 0; 2
FLAG | false | false | false
Signature
We note the signature of our bubble sort method. Considering the secret transcoding table, we note that "the associated variable W keeps a constant value equal to 1; our variable W' takes the values 0 and 2".

Glossary
- A software or hardware protection/prevention process is a set of techniques that make the fraudulent copying and use of software or electronic circuits more difficult.
- A program is written in a certain programming language, called a computer programming language.
- The interpretation of a program is the translation of the sequence of words composing it into a series of actions called the execution of the program.
- The compilation of a source program written in a high-level language is its translation into another language or its materialization as an automaton, in general machine language or an electronic circuit.
- A software computer program is a program that is interpretable or compilable into an interpretable program.
- A hardware computer program is a program that can be realized by an electronic circuit and is specified in a circuit description language.
- An element of a computer program is a part, not necessarily contiguous, of the program text corresponding to one or more instructions, possibly compound (such as a conditional or case command, a loop, etc.), a declaration or description of one or more data structures, possibly including the operations acting on these data structures, one or more procedures or methods, one or more modules, etc.
- A semantics of a software or hardware program is a mathematical model defining the set of possible behaviours of a program at execution, at a certain level of observation.
- Semantic static analysis is the automatic determination of semantic properties of programs.
- Two software programs are semantically equivalent (or functionally equivalent) if they have the same observable behaviour, that is to say they execute in a functionally equivalent way (for example if, for any possible input, the outputs of the programs are the same).
- An abstract semantics of a software or hardware program is a mathematical model defining an over-approximation or an under-approximation of all the possible behaviours of a program at execution.
- An abstract semantics is secret if its specification for a software or hardware program requires knowledge of a secret.
- A signature is characteristic information (label, thumbnail or digest) associated with an object (here, a software or hardware program). This information may depend on an intrinsic or extrinsic property of the object. These properties can authenticate the form and substance of the content of the object (coding, format, syntax, aesthetics, semantics) or its traceability (the history and/or the future of the object).
- A secret signature is a signature obtained using a secret method.
- A semantic signature is a signature specified as a function of the semantics of the object (here, of the program written in a programming language with a defined semantics).
- An authenticator is a secret proof of the possession of a piece of information or of a right (for example designation of the object, name of the author, name of the recipient, terms of the user licence, etc.).
- A mark is a component of an object that makes it possible to identify it and which, in the context of the invention, makes it possible to find a signature of the object.
- Labeling an object or a program consists in producing a label which is generally separate from the content of the object or embedded in the object in an easily identifiable place.
- In contrast, watermarking consists in embedding one (or more) mark(s) in the body of the object. This mark, distributed throughout the body of the object, is generally, if not undetectable, at least indelible. The object carrying this mark is called the watermarked object.
- Obfuscation is the transformation of a program into a semantically equivalent program in a form that is difficult for a computer specialist to understand but usable by a user. Unlike watermarking, obfuscation makes a program confidential (because it is difficult to understand) but does not allow authentication [cf. Obfuscation techniques for enhancing software security, New Zealand Patent Application # 328057, WO 99/01815, PCT/US98/12017, June 9, 1997].
- Mark scrubbing is an attempt to erase or modify the mark, or to overload the program in order to drown the mark, without changing the semantics of the program.
- A program watermark is robust if it withstands compiler optimization, obfuscation, labeling and a further watermarking, all these operations being applied afterwards.

Analytical description of the modules and background theory
A process A of abstract semantics
The inventor uses the principles and techniques of semantic static analysis of programs by abstract interpretation, cf. [P. Cousot & R. Cousot, "Abstract interpretation: a unified lattice model for static analysis of programs by construction or approximation of fixpoints", International Conference "Principles of Programming Languages, POPL'77", p. 238-252, ACM Press, January 1977] and [P. Cousot & R. Cousot, "Systematic design of program analysis frameworks", International Conference "Principles of Programming Languages, POPL'79", p. 269-282, ACM Press, January 1979].
Process/module A defines an infinity of secret abstract semantics depending on:
- an abstract domain D(K) parameterized by a secret key K (which, up to an injection, can be considered as a number N = b(K), this injection b possibly constituting a secret itself);
- a specific secret key K which defines the abstract domain D(K) used in the secret abstract semantics SAS.
This secret abstract semantics SAS consists of the secret abstract domain D(N) and of the corresponding abstract operations for the constructs and primitives of the family of programming languages considered, obtained using the principles of abstract interpretation.
Any abstract domain D(K) obtained by an abstraction, in the sense of the theory of abstract interpretation (cf. the two references cited above), and parameterized by a secret key K can be used as a parameter for module A. Thanks to the use of an injection N = b(K), which can remain secret, any abstract domain D(N) parameterized by a number N can be reused with arbitrary keys. There is therefore an infinity of usable abstract domains D(K) parameterized by a key K.
However, the abstract domains published in the scientific literature are not suitable, in the sense that they are not parameterized by a key K (even one chosen isomorphic to the integers or to the ordinals).
Part of the invention is the transformation of any abstract domain usable for classical semantic static analysis, whose concretization provides information on integer values (up to an injective coding), into an abstract domain D(N) parameterized by an integer N constituting the secret key K = N (up to said coding). The process consists in abstracting (in the sense of abstract interpretation) the concrete operations into an abstract operation by considering the concrete operation performed modulo N.
This abstraction is then extended to the control elements of the program according to the principles of semantic static analysis by abstract interpretation, possibly using the widening and narrowing techniques known to those skilled in the art.
Examples of the transformation of such domains D into domains D(N) are described below:
- For a program P written in an algorithmic language (with additions, multiplications, loops, branches, ...) whose author wants to secure sensitive algorithmic content, one can define, for module A, an abstract domain D(K) for integers, fixed-point or floating-point numbers, or even reals, corresponding to the propagation of constants, powers, intervals, octagons, linear equality or inequality invariants, etc., modulo a positive integer value N constituting the secret key K = N;
- For a program P written in a programming language with symbolic data (with pointers, lists, arrays, ...) whose author wants to secure the sensitive handling of variables and arrays, one can use a symbolic abstract domain (for example an automaton or tree domains) parameterized by integers (for example loop counters in the automaton or a measure of the size of the symbolic data), reusing the previous abstract domain D(N) to analyze the integer parameters modulo a positive integer value N constituting the secret key K, again up to an injection (which may be secret).
On the one hand there is an infinity of possible abstract domains Di, and on the other hand an infinity of instances Di(K) of each particular domain as a function of the secret key K.
Unlike in classical cryptology, if module A is secret, it cannot be broken by brute force: no enumeration of all usable parameterized abstract domains is possible.
For a given secret abstract domain Di(N), different keys lead to different secret abstract semantics SAS, as is the case for the examples above. When the secret abstract domain has been disclosed or is public, the secrecy of the key must be managed as in classical cryptography to limit brute-force attacks (as in classical cryptology based on NP-complete problems or secret-key encryption mechanisms).
The secret abstract semantics SAS must be invariant under those transformations of the primitives and constructs of the family of software or hardware languages considered that leave their standard semantics invariant, as is the case for the examples above. This is important to withstand obfuscation attacks, which consist in transforming a program P into a syntactically different but semantically equivalent program, for example by changing the mathematical expression (a = 2b) into (a = b + b).
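A minimal sketch of the simplest such domain, constant propagation modulo a secret integer N, illustrates this invariance: the abstract value of a is the same whether the program computes 2*b or b + b. The class and method names are hypothetical, and the key value 10000 is borrowed from the examples purely for illustration.

```java
final class ModularConstants {
    /** Abstraction of a concrete integer: its residue modulo the secret key n. */
    static long abs(long value, long n) {
        return Math.floorMod(value, n);
    }

    /** Abstract addition: the concrete operation performed modulo n. */
    static long add(long x, long y, long n) {
        return Math.floorMod(x + y, n);
    }

    /** Abstract multiplication: the concrete operation performed modulo n. */
    static long mul(long x, long y, long n) {
        return Math.floorMod(x * y, n);
    }

    public static void main(String[] args) {
        long n = 10000;                  // secret key K = N
        long b = abs(123456789L, n);     // abstract value of the variable b
        long viaProduct = mul(abs(2, n), b, n); // a = 2b
        long viaSum = add(b, b, n);             // a = b + b
        System.out.println(viaProduct + " " + viaSum);
    }
}
```

The two abstract results coincide for any b and any N, because reduction modulo N commutes with addition and multiplication; an obfuscator that rewrites one form into the other therefore does not change what the analysis computes.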
(See Figure 5) A DA module of abstract semantics Among the preferred embodiments of the above method, the invention also discloses a device (which can be produced by a computer product / program) including A DA module based on an abstract domain D; (K) parameterized by a secret key K chosen from a very large number possible, allowing to generate a program implementing the SAS secret abstract semantics the family of software or hardware programming languages considered;
(See Figure 5A) A signature process Method / module S uses secret abstract semantics o SAS for the constructs and primitives of the family of languages programming considered, to make the semantic static analysis of the original program P software or hardware, the result of this analysis providing the SSS secret semantic signature of the P program.
do, module S builds a representation of the system of equations of fixed-point whose approximate solution defines the SASP abstract semantics said software or hardware program P. This SASP abstract semantics of software or hardware program P is defined according to the principles of analysis static by abstract interpretation of the concrete semantics of the program based on the SAS secret abstract semantics for constructions and 2o primitives of the family of programming languages considered. This SASP abstract semantics is computed by elimination, or by iteration with acceleration of convergence. The SSS secret semantic signature of the program P is an injective function SSS = i (SASP) of the semantics SASP secret abstract. This injective function i can itself constitute a secret.
The secret semantic signature SSS of a software or hardware program P, computed according to module S or by the DS device, makes it possible to check whether a software or hardware program P' has an equivalent semantic signature.
At this stage, it is possible to deposit the signature SSS of the software or hardware program P, as produced according to module S or output by the DS device, with a trusted third party (TPC). It therefore does not have to be inserted into the software or hardware program P.
The authentication of the software or hardware program P then takes place with the TPC, by comparing the signature SSS' computed for the software or hardware program P' under analysis with the signature SSS that was deposited. In this case, the software or hardware program P is not protected against copying into P', but two cases arise:
- SSS' is different from SSS: the changes that P has undergone are detectable, since semantic modifications (for example, adding a virus or modifying an algorithm) change the signature;
- SSS' is equal to SSS: either P' is a copy of P, or P has been obfuscated (manually or automatically) to become P'. In the latter case the signature SSS' of P' is equal to the signature SSS of P, which proves, with an extremely low probability of error, that the software or hardware program P has been pirated and that the piracy was masked by cosmetic editing.
(See Figure 6)
DS signature module
A DS module for computing the signature SSS of a software or hardware program P according to the secret abstract semantics SAS obtained using the previous DA module.
(See Figure 6A)
An M marking process
For the invention, a mark m is a software or hardware program element that can be inserted into a software or hardware program by the process described in module B below. Let Pm be a software or hardware program, as simple as possible, into which the mark m can be inserted. For any secret abstract semantics SAS, this program Pm has a secret semantic signature, computable by the module S, called the secret semantic signature of said mark m and denoted SSS(m).
A "mark" m is a triplet consisting of a "mark location", an "initialization mark" and an "induction mark"; one of the last two may be empty.
Let Pm be a software or hardware program of the family of languages considered, as simple as possible, which declares the "mark location" 'X' if necessary, executes the "initialization mark" 'Ia', then executes the "induction mark" 'If' one or more times, or even infinitely many times.
The M marking process is parameterized by a secret abstract semantics SAS (for example, produced by module A from an abstract domain Di(K) as a function of a secret key K) and by a secret semantic signature SSS;
Using a bijection β, which can be secret, the module M computes an abstract value 's' of the abstract domain Di(K) as a function of the secret semantic signature SSS, according to the coding of the secret abstract semantics SAS. This abstract value 's' is chosen so that there are many concrete values 'x' whose abstraction, or that of the singleton '{x}', according to the principles of abstract interpretation, is the abstract value 's'. For example, this value 'x' can be that of a variable in the case of a non-relational analysis, or a vector of variables for a relational analysis, or the value of any object analogous to a vector of variables;
Method M chooses, according to the principles of abstract interpretation, any one of the concrete values 'a' whose abstraction, or that of the singleton '{a}', is the abstract value 's';
Method M then chooses, according to the principles of abstract interpretation, a concrete operation 'f' whose functional abstraction leaves the abstract value 's' invariant, but not the corresponding concrete values (i.e. if 'x' is a concrete value whose abstraction, or that of the singleton '{x}', is the abstract value 's', then 'f(x)' is in general different from 'x', while the abstraction of 'f(x)', or that of the singleton '{f(x)}', is equal to 's');
Method M then chooses a "mark location" 'X', which may be an existing program variable that is unused or not live in the elements of the program P to be tattooed, a new auxiliary variable, an unused or additional field of a dynamically allocated data structure, etc., and whose values are taken into account in the static semantic analysis used to determine the secret abstract semantics of the program;
Method M then determines an "initialization mark" 'Ia', which consists of one or more primitives or constructs of the family of languages considered that are interpreted as an assignment of the concrete value 'a' defined above to the mark location 'X';
Method M then determines an "induction mark" 'If', comprising one or more primitives or constructs of the family of languages considered that are interpreted as an assignment of the value 'f(X)' to the mark location 'X'. Depending on the family of languages considered, these primitives or constructs will be assignments, parameter passing, unifications, etc.
The mark m is chosen such that, according to the principles of abstract interpretation, the static semantic analysis of the program Pm, carried out according to the secret abstract semantics and in accordance with the directives of module S, determines, in a manner recognizable by the holder of the secret SAS, the abstract value 's' defined above, and such that the secret semantic signature of the program Pm, as defined above, is the signature SSS(m) of the mark.
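The choice of 'a' and 'f' described above can be illustrated by a minimal sketch. It assumes, purely for illustration, that the abstraction of a concrete integer is its residue modulo a key K (the domain used later in Example 3); the class name MarkSketch, the method names and the additive form of 'f' are hypothetical and are not taken from the patent.

```java
import java.util.function.LongUnaryOperator;

/** Hypothetical illustration: the abstraction of x is "x modulo K". */
final class MarkSketch {
    /** Abstraction of a single concrete value x: its residue in [0, k). */
    static long alpha(long x, long k) {
        return Math.floorMod(x, k);
    }

    /** A concrete value 'a' whose abstraction is s (r is an arbitrary small integer). */
    static long chooseInitialValue(long s, long k, long r) {
        return s + r * k;
    }

    /** A concrete operation 'f' that changes x but leaves its abstraction unchanged (t != 0). */
    static LongUnaryOperator chooseInduction(long k, long t) {
        return x -> x + t * k;
    }

    public static void main(String[] args) {
        long k = 10000, s = 2507;
        long x = chooseInitialValue(s, k, 3);            // initialization mark: X = a
        LongUnaryOperator f = chooseInduction(k, 7);
        for (int i = 0; i < 5; i++) {
            x = f.applyAsLong(x);                        // induction mark: X = f(X)
            System.out.println(x + " has abstraction " + alpha(x, k));
        }
    }
}
```

An additive 'f' is the simplest choice that satisfies the stated condition; the polynomial marks of Example 3 satisfy the same condition while being harder to spot.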
Using abstract domains that are Cartesian products or reduced products of elementary abstract domains, it is always possible to consider marks that are finite sets of elementary marks.
(See Figure 7)
A DM marking module
Among the preferred embodiments of method M above, the invention also discloses a DM device (which can be realized as a computer program product) which, from the program implementing the secret abstract semantics SAS and from data representing the secret semantic signature SSS of the program P, computes the text of the mark m.
(See Figure 7A)
A camouflage process B
The camouflage process/module B takes as parameters the original software or hardware program P, a selection of elements of the program P to be tattooed, and a mark m. Module B provides the tattooed program PT, which is a fusion of the original program P and the mark m.
This fusion is made without altering the secret abstract semantics of the mark or the functionality of the original program, which makes it possible to find the secret semantic signature SSS(m) of the mark m from the secret semantic signature SSS(PT) of the tattooed program PT.
It is important to note that, in the preferred embodiment of the invention, observing the values taken by the variables during the execution of the tattooed software or hardware program PT does not make it possible to find the mark. On the one hand, the secret abstract semantics must be known to establish the tattooed program PT. On the other hand, in the case of non-trivial abstract domains D(K), the semantic properties of this tattooed program PT can only be computed by static semantic analysis, not by step-by-step execution.
The insertion of instructions into the software or hardware program P to be protected must respect a minimum number of robustness rules. For example, to avoid outright deletion by an optimizer using "slicing" extraction techniques, the variables and operations containing the marks must appear to have a possible influence on the observable semantics (for example the outputs) of the program. This potential dependency can be a simple syntactic dependency, chosen so that showing that it does not induce a semantic dependency requires a complex proof of semantic equivalence of software or hardware programs. This is the case, for example, of the dynamic allocation (semantically useless, but this is undecidable) of certain values computed by the operations containing the marks.
The mark m created by the module M, and the selection of the elements to be tattooed in a software or hardware program P as chosen by the PS module and used by module B, must meet the criteria set out below.
The "induction marks" 'If' should be placed in the parts of the software or hardware program that contain repetition primitives or constructs (iterations, recursion, etc.).
The "initialization marks" 'Ia' are used to perform the initializations required by the semantics of the family of languages considered. In addition, if the family of languages considered so requires, any necessary declarations must be added to the tattooed program PT when the nature of the mark location 'X' requires it.
Finally, the camouflage module B adds to the tattooed program PT primitives or constructs that make live the values used in the mark locations 'X'. One possible solution is to use the values held in the mark locations in the computation of live variables of the tattooed software or hardware program PT, or to assign the values of the mark locations 'X' to dynamic variables of the tattooed program PT, and more generally to ensure that the dynamic lifetime of the mark locations 'X' is that of the tattooed program PT. Again, these transformations of the program P into the tattooed program PT must be made without altering the secret abstract semantics of the mark or the functionality of the original program, so that, according to the principles of static analysis by abstract interpretation, the secret semantic signature SSS(m) of the mark m can be found from the secret semantic signature SSS(PT) of the tattooed program PT.
(See Figure 8)
A DB camouflage module
Among the preferred embodiments of method B above, the invention also discloses a DB device (which can be realized as a computer program product) which, from the text of the software or hardware program P, data selecting the elements to be tattooed in P and the text of the mark m, produces the text of the software or hardware program PT, whose semantics is functionally equivalent to that of P and which conceals the mark m.
(See Figure 8A)
A PS security policy process
The PS process for selecting the abstract tattoo domain and the elements to be tattooed in the software or hardware program P depends on the security policy to be applied to this program P.
As with steganographic techniques, the mark can be distributed over different and distant places in the program by choosing, for example, static semantic analyses by abstract interpretation based on abstract domains that ignore the control structures when computing the secret semantic signature, so that the random distribution of the marks has no effect on this analysis. This strategy is applicable, for example, when the mark is used to authenticate an entire program P of relatively large size.
Another security policy is to tattoo the innovative and/or sensitive parts of the program P. In general, the elements selected for the tattoo relate to the algorithmic part of the program (algebraic operations or manipulation of variables in RAM). Depending on the family of programming languages considered and the programming methodology used, this algorithmic part can be:
- Significant elements of the program that are semantically autonomous. In Java™ for example, a program consists of several classes composed of several methods, which can themselves call methods of other classes and other packages. A security policy for Java™ may therefore consist of tattooing the methods, which are the smallest autonomous entities of the program. Another choice would be to tattoo Beans™.
- Elements distributed throughout the program that form a semantically coherent set, such as an abstract algebraic data type implemented by data structures, procedures and functions distributed throughout the program;

In this case the elements to be tattooed can be selected automatically on syntactic criteria (procedures, modules, etc.) or semantic criteria (by a static dependency analysis), or manually by operator intervention.
Several different security policies can be applied to the same program P by repeating successive tattoos of the same original software or hardware program for different players. It is thus possible to assign marks of their own to several levels of a hardware or software distribution chain (buyer, wholesaler, distributor, retailer). A software or hardware program can thus be tattooed as many times as desired (by superimposing tattoos), unlike watermarks of sounds or images, which suffer a certain saturation of the subliminal channel (depending on the perception model chosen). The software or hardware program is not altered but overloaded; only its memory and time resources are affected.
A different secret semantics can be used for each of these tattoos, so that the chain of trust may hold different secrets.
Of course, this tattooing device and method can be combined with devices and methods of the prior art, both to make the hardware or software program unusable by an unauthorized user (prevention) and to trace the possible dissemination of said program by an authorized user to unauthorized users (audit). For this purpose it suffices not to communicate to authorized users the keys that would allow them to find the tattoo.
It is also possible to use the device and the method to automatically authenticate the software or hardware programs that are allowed to transit on a network or to be hosted on a given computer station. The tattoo is then comparable to an authentication certificate for which the monitoring station of the network or of the computer station holds the reading key.
(See Figure 9)
A DPS security policy module
Among the preferred embodiments of the above PS method, the invention also discloses a DPS device (which can be realized as a computer program product) which, from the text of the software or hardware program P and from a family of abstract domains D1, ..., Dm, chooses an abstract domain D and selects the elements of the program P to be tattooed.
(See Figure 9A)
A T tattoo process
The watermarking method T uses an encryption module C, which implements a bijective function, to compute a secret semantic signature SSS from an authenticator Auth, possibly using the secret abstract semantics SAS. From this secret semantic signature SSS and said secret abstract semantics SAS, the marking module M described above produces a mark m. Module T then uses the module B described above which, from said mark m, the software or hardware program P and the selection of the elements to be tattooed in P, provides the tattooed software or hardware program PT.
(See Figure 10)
A DT tattoo module
Among the preferred embodiments of the tattooing method T above, the invention also discloses a DT device (which can be realized as a computer program product) for concealing an authenticator Auth in the text of an original software or hardware program P, by means of a program implementing the secret abstract semantics SAS and of data selecting the elements of P to be tattooed.
(See Figure 10A)
A general G process
The general method G is used to tattoo a program P. To do this, the PS module described above is used to choose, automatically or interactively, the parameterized abstract domain that will later be used by the Au authentication module to perform the static analysis by abstract interpretation, and to select, automatically or interactively, the elements of the software or hardware program P to be tattooed.
Then the module A described above is used to compute the secret abstract semantics SAS from the previously selected parameterized abstract domain and a secret key, usually chosen in interaction with an operator but which can also be chosen automatically, or even randomly. Finally, the module T described above uses the program P, the secret abstract semantics SAS, an authenticator Auth and the selection of the elements to be tattooed in P to produce the tattooed software or hardware program PT. Variants consist in fixing once and for all the abstract domain used by the module G, and in choosing the secret key K by a standard cryptographic method.
Application to authenticating compilation
Compilation preserves the concrete semantics of software or hardware programs up to a morphism. Compilation therefore also preserves the secret abstract semantics: that of the object program Po is the same as that of the source program P, up to this same morphism. Consequently, compiling a tattooed source program PT with a correct compiler does not wash the watermark out of the object program PTo. Knowing the target computer, we know the concrete semantics of the object code and therefore that of the object program PTo. By adapting the devices DA and DS, according to the principles of the theory of abstract interpretation, to the concrete semantics of the machine language of the target computer or computing system using said compilation morphism, we obtain devices DAo and DSo conforming to modules A and S for the concrete semantics of the object code, such that the composition of DA then DS for PT gives the same secret semantic signature SSS (up to the morphism) as the composition of DAo and DSo for PTo. Therefore, the DAu device using the DSo device computes a representation of the secret semantic signature SSS of the program PTo, which is also that of the program PT and can therefore be used by the DAu device to find the authenticator of the program PT.
One of the applications of this patent is an authenticating compilation module Ca, which compiles a program while inserting a watermark according to the principles of module G. In its preferred embodiment, an authenticating compiler DCa integrates the DG device of Figure 7 bis into a compiler for a hardware or software computer language, where it can be used as an option to tattoo the object code. As explained above, the DAu device makes it possible to find the authenticator of the tattooed object program.
(See Figure 11)
The DG general module
Among the preferred embodiments of the tattooing method G above, the invention also discloses a DG device (which can be realized as a computer program product) for concealing an authenticator Auth in the text of an original software or hardware program P (depending on the choice of an abstract domain parameterized by a secret key).
(See Figure 11A)
An authentication process
The authentication process Au takes as parameters the tattooed software or hardware program PT and the secret abstract semantics SAS. It computes the original authenticator Auth. This module Au is composed of two sub-modules:
- module S, described above, which from the tattooed software program PT and the secret semantics extracts the secret semantic signature SSS(PT), and therefore the signature SSS(m) of the mark m hidden in PT;
- a decryption and extraction module F, the inverse of the encryption module C used in the watermarking module T, which takes the secret semantic signature SSS(PT) as a parameter to extract the original authenticator Auth of P. The decryption module F can also constitute a secret, just like module C.
(See Figure 12)
DAu authentication module
Among the preferred embodiments of the authentication method Au above, the invention also discloses an authentication device DAu (which can be realized as a computer program product using the DS device above and a decryption device DF) for authenticating a tattooed program PT by computing its authenticator Auth.
(See Figure 12A)
Presentation of additional figures
Figure 5 shows the block diagram of process A for generating the secret abstract semantics SAS from an abstract domain Di(K) that depends on a secret key K;
Figure 6 shows the block diagram of process S for computing the signature SSS of a software or hardware program P according to the secret semantics SAS;
Figures 5A and 6A show the diagrams of the DA and DS devices corresponding to processes A and S in one of their variant embodiments;
Figure 7 (respectively 7A) shows the block diagram of method M (respectively of the DM device in one of its variants) for producing a mark from a secret abstract semantics SAS (resulting for example from the application of the method of FIG. 1) and a secret semantic signature SSS;
Figure 8 (respectively 8A) shows the block diagram of method B (respectively of the DB device in one of its variants) for tattooing a software or hardware program P by inserting a mark m into a selection of elements of said P to be tattooed;
Figure 9 (respectively 9A) shows the block diagram of the PS process (respectively of the DPS device in one of its variant embodiments) which, given a software or hardware program P, chooses an abstract domain D from a family of possible abstract domains and selects the elements of P to be tattooed;
Figure 10 (respectively 10A) shows the block diagram of method T (respectively of the DT device in one of its variant embodiments) for inserting a mark m characteristic of an authenticator Auth into a selection of elements of an original software or hardware program P, using a secret abstract semantics SAS;
Figure 11 (respectively 11A) shows the block diagram of method G (respectively of the DG device in one of its variants) for choosing an abstract domain parameterized by a secret key and an authenticator in order to tattoo a software or hardware program P by transforming it into a functionally equivalent tattooed program PT;
Figure 12 (respectively 12A) shows the block diagram of the Au process (respectively of the DAu device in one of its variant embodiments) for authenticating a tattooed program PT.

Example 3
Consider a very simple original program P to be tattooed by the DG device of FIG. 11A, which is the following:
public class Fibonacci {
    public Fibonacci() {}
    public static void main(String[] args) {
        int n = Integer.parseInt(args[0]);
        int a = 0;
        int b = 1;
        for (int i = 1; i < n; i++) {
            int c = a + b;
            a = b; // a is u_i
            b = c; // b is u_(i+1)
        }
        System.out.println("The value of the sequence for n = " + n + " is: " + b);
    }
}
For example, the DPS device of FIG. 9A selects methods to tattoo, like the main method.
A very simple example of tattooing Java™ methods consists in using a collecting semantics, namely the set of descendants of the input states of the method (other alternatives being the semantics of the ascendants of the exit states, or combinations such as the intersection of these collecting semantics).
In this example, the secret abstract semantics SAS is the abstraction of the collecting semantics of a method that retains only the integer local variables and completely disregards the other variables, the control flow graph, the external elements and the context of the method. The static analysis is therefore insensitive to program transformations (for example for obfuscation or scrambling) that modify the control flow graph, and to transformations of arithmetic expressions into equivalent ones. To keep this example simple, the secret abstract semantics SAS considers only the basic arithmetic operations +, - and * (together with all the other arithmetic operations that allow arithmetic expressions built from these basic operations to be rewritten in an arithmetically equivalent form, such as unary minus, etc.). The concrete semantics is chosen over the ring (Z, +, *) of mathematical integers (and not over the integers modulo 2^32 as in Java™, in which case the arithmetic equivalences mentioned above would have to take this fact into account).
In this example, the secret abstract semantics SAS uses a domain of finite height for the integer local variables, which makes it insensitive to the chaotic iteration strategy used and avoids, again to keep the example simple, the use of widening and narrowing operators.
In this example, the secret key K is the product K1 * K2 * ... * Kn of strictly positive, pairwise coprime natural numbers. In the example considered below, n = 2, K1 = 10000 and K2 = 5421. The abstract domain D(K), used to compute the secret abstract semantics SAS by abstract interpretation of the collecting semantics, is that of constant propagation modulo K. By the Chinese remainder theorem, Z/KZ = Z/(K1 * ... * Kn)Z is isomorphic to the product ring Z/K1Z x ... x Z/KnZ. Given the secret semantic signature SSS of the tattooed program PT in Z/KZ = Z/(K1 * ... * Kn)Z, we consider its image (s1, ..., sn) in Z/K1Z x ... x Z/KnZ, whose components are given by the canonical projections onto the rings Z/KiZ, i = 1, ..., n. In the example considered, it is assumed that the DC device of Figure 6a computes the secret semantic signature (2507, 3012) from the authenticator of the program. The secret static semantics is then obtained by n static analyses, for all the integer variables local to the method, over the abstract domains Z/KiZ, i = 1, ..., n, corresponding to constant propagation modulo Ki. The device DA of FIG. 5A therefore consists of a program of successive static analyses by constant propagation modulo (K1, ..., Kn).
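The correspondence between a value of Z/KZ and its pair of residues can be sketched as follows for n = 2, using the example's K1 = 10000 and K2 = 5421. This is a minimal illustration of the Chinese remainder theorem only; the class and method names are hypothetical, and nothing here reproduces the patent's DC or DA devices.

```java
import java.math.BigInteger;

/** Hypothetical sketch of the isomorphism Z/(k1*k2)Z ~ Z/k1Z x Z/k2Z for coprime k1, k2. */
final class CrtSketch {
    /** Canonical projection of x onto the pair (x mod k1, x mod k2). */
    static long[] project(long x, long k1, long k2) {
        return new long[] { Math.floorMod(x, k1), Math.floorMod(x, k2) };
    }

    /**
     * Inverse direction: the unique x in [0, k1*k2) with x = s1 (mod k1) and x = s2 (mod k2),
     * for 0 <= s1 < k1 and 0 <= s2 < k2. Throws ArithmeticException if k1 and k2 are not coprime.
     */
    static long combine(long s1, long k1, long s2, long k2) {
        BigInteger m1 = BigInteger.valueOf(k1);
        BigInteger m2 = BigInteger.valueOf(k2);
        BigInteger inverse = m1.modInverse(m2);            // k1^(-1) modulo k2
        BigInteger t = BigInteger.valueOf(s2 - s1).multiply(inverse).mod(m2);
        return s1 + k1 * t.longValueExact();               // x = s1 + k1 * t
    }

    public static void main(String[] args) {
        long x = combine(2507, 10000, 3012, 5421);
        long[] image = project(x, 10000, 5421);
        System.out.println(x + " projects to (" + image[0] + ", " + image[1] + ")");
    }
}
```

Because 10000 = 2^4 * 5^4 and 5421 = 3 * 13 * 139 share no prime factor, the two residues can be analysed independently, which is what the n successive analyses rely on.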
In this example, the device DM of FIG. 7A uses the secret abstract semantics SAS and the secret semantic signature SSS defined above to produce a mark m whose text is as follows:
int <watermark:10000:2507>;
int <tmp:10000:2507>;
<watermark:10000:2507> = 1;
<tmp:10000:2507> = <watermark:10000:2507> + 227492;
<tmp:10000:2507> = <watermark:10000:2507> * <tmp:10000:2507>;
<watermark:10000:2507> = <tmp:10000:2507> + 155014;
<tmp:10000:2507> = <watermark:10000:2507> * 1323;
<tmp:10000:2507> = <tmp:10000:2507> + 153;
<tmp:10000:2507> = <tmp:10000:2507> * <watermark:10000:2507>;
<watermark:10000:2507> = <tmp:10000:2507> + 9109;
int <watermark:5421:3012>;
int <tmp:5421:3012>;
<watermark:5421:3012> = 1;
<tmp:5421:3012> = <watermark:5421:3012> + -35539;
<tmp:5421:3012> = <watermark:5421:3012> * <tmp:5421:3012>;
<watermark:5421:3012> = <tmp:5421:3012> + 11445;
<tmp:5421:3012> = <watermark:5421:3012> * 658;
<tmp:5421:3012> = <tmp:5421:3012> + 971;
<tmp:5421:3012> = <tmp:5421:3012> * <watermark:5421:3012>;
<watermark:5421:3012> = <tmp:5421:3012> + 4623;
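What a constant propagation modulo Ki would compute on this mark can be checked with a short sketch. It takes the constants of the mark text above and evaluates the four initialization steps and the four induction steps in Z/KiZ; the class and method names are hypothetical, and reading every step as an addition or a multiplication is an assumption wherever the extracted text has lost an operator.

```java
/** Hypothetical check: evaluates the example mark in Z/kZ, as a modulo-k constant propagation would. */
final class MarkTextCheck {
    /** Initialization mark: W = 1; T = W + c1; T = W * T; W = T + c2 (all modulo k). */
    static long init(long k, long c1, long c2) {
        long w = 1;
        long t = Math.floorMod(w + c1, k);
        t = Math.floorMod(w * t, k);
        return Math.floorMod(t + c2, k);
    }

    /** Induction mark: T = W * m; T = T + p; T = T * W; W = T + q (all modulo k). */
    static long step(long k, long w, long m, long p, long q) {
        long t = Math.floorMod(w * m, k);
        t = Math.floorMod(t + p, k);
        t = Math.floorMod(t * w, k);
        return Math.floorMod(t + q, k);
    }

    public static void main(String[] args) {
        long w1 = init(10000, 227492, 155014);
        long w2 = init(5421, -35539, 11445);
        for (int i = 0; i < 3; i++) {
            System.out.println("(" + w1 + ", " + w2 + ")");
            w1 = step(10000, w1, 1323, 153, 9109);
            w2 = step(5421, w2, 658, 971, 4623);
        }
    }
}
```

Both components stay fixed at the signature values (2507, 3012) however many times the induction step is applied, which is the property the marking process requires.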
In a simplified case where Ki is not too large, and for a secret semantic signature s given in Z/KiZ, the text of the mark m created by the device DM of FIG. 7A can consist of a single initialization mark, which can be an assignment
<watermark:Ki:s> = s';
where s' = s + a * Ki and the value a in Z is chosen not too large, so that s' does not overflow the 32 bits of Java integers.
The value of the variable <watermark:Ki:s> is always constant in a static analysis by constant propagation modulo Ki, and equal to s.
This value does not appear in clear in the mark, and is all the harder to find because the secret key Ki is unknown.
A more sophisticated instantiation of the DM device of FIG. 7A chooses this initialization mark to be a polynomial Q (in the variable <watermark:Ki:s>) of the form
a_k * <watermark:Ki:s>^k + ... + a_1 * <watermark:Ki:s> + a_0
where a_k, ..., a_1 are random values and the value a_0 is given by
a_0 = s - a_k * v^k - a_(k-1) * v^(k-1) - ... - a_1 * v mod Ki
where the initial value v is chosen randomly. In this case, the initialization mark consists of the assignments
<watermark:Ki:s> = v;
<watermark:Ki:s> = a_k * <watermark:Ki:s>^k + ... + a_1 * <watermark:Ki:s> + a_0;
so that we always have <watermark:Ki:s> = s mod Ki.
The disadvantage of using only an initialization mark is that it leaves the value of <watermark:Ki:s> constant, and therefore easily identifiable by a static analysis propagating constants. To avoid this, a more sophisticated DM device adds an induction mark that gives the mark variable <watermark:Ki:s> a dynamic behavior, so that it takes stochastic values in Z but remains constant in Z/KiZ. For this we use a polynomial Q' that has the stability property, i.e.
Q'(s) = s mod Ki
The polynomial Q' is generated as explained above and the induction mark is formed by the instruction
<watermark:Ki:s> = a'_k' * <watermark:Ki:s>^k' + ... + a'_1 * <watermark:Ki:s> + a'_0;
or any equivalent, for example using Horner's evaluation scheme.
The DB device of FIG. 8A can place this induction mark in a loop or in a recursive call of the method to be tattooed in P, because its execution does not modify the value of <watermark:Ki:s> in the ring Z/KiZ chosen to compute the secret static semantics s. However, the value observed in the domain of interpretation of Java integers will appear totally stochastic.
In the example of the tattooing of program P above, the mark defined above uses two local variables, <watermark:Ki:s> and <tmp:Ki:s>. The initialization mark is the initial code segment, consisting of a polynomial Q that computes the initial value of <watermark:Ki:s>. The induction mark consists of a polynomial Q' having the stability property for s in the ring Z/KiZ considered.
Second-degree polynomials are used for convenience. The values of the coefficients of the second-degree polynomial performing the initialization are random. This polynomial is given as follows in Z/KiZ:
Q(x) = (x - 1)(x - s) + s = x^2 + coeff1 * x + coeff2.
A random multiple of the modulus Ki is added to or subtracted from each coefficient so as not to reveal the secret value s. The initial value of <watermark:Ki:s> is therefore <watermark:Ki:s> = Q(1) = s in Z/KiZ.
The initialization mark consists in computing this polynomial Q using Horner's evaluation scheme. The induction mark consists of a polynomial Q' (also of the second degree, again for convenience) satisfying the stability property
Q'(s) = s in Z/KiZ
The polynomial Q' is written
Q'(x) = a * x^2 + b * x + c
The coefficients a and b are drawn at random, while c is chosen so as to ensure the stability property for the secret value s, as explained above. The induction mark consists of the assignment instruction
<watermark:Ki:s> = Q'(<watermark:Ki:s>);
where the polynomial is evaluated by Horner's method.
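The construction of a stable second-degree polynomial can be sketched as follows: given randomly drawn a and b, the constant term c is solved from Q'(s) = s in Z/KiZ, and Q' is then evaluated by Horner's scheme. The class and method names are hypothetical; the sketch reduces every intermediate value modulo k, which models the abstract (modulo Ki) view rather than the concrete Java computation.

```java
/** Hypothetical sketch: builds and evaluates a polynomial a*x^2 + b*x + c that is stable at s modulo k. */
final class StablePolynomial {
    /** Constant term c in [0, k) such that a*s^2 + b*s + c = s (mod k). */
    static long stableConstant(long a, long b, long s, long k) {
        long as2 = Math.floorMod(Math.floorMod(a * s, k) * s, k);
        long bs = Math.floorMod(b * s, k);
        return Math.floorMod(s - as2 - bs, k);
    }

    /** Horner evaluation of a*x^2 + b*x + c, i.e. ((a*x + b)*x + c), reduced modulo k. */
    static long hornerMod(long a, long b, long c, long x, long k) {
        long xr = Math.floorMod(x, k);                 // reduce x first to avoid overflow
        long r = Math.floorMod(a * xr + b, k);
        return Math.floorMod(r * xr + c, k);
    }

    public static void main(String[] args) {
        long k = 10000, s = 2507, a = 1323, b = 153;
        long c = stableConstant(a, b, s, k);
        System.out.println("c = " + c + ", Q'(s) mod k = " + hornerMod(a, b, c, s, k));
    }
}
```

With the coefficients that appear in the example mark (a = 1323, b = 153 for K1, and a = 658, b = 971 for K2), this computation gives back the constants 9109 and 4623 used there.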
The tattooed program PT corresponding to the program P including the mark m above is as follows:
public class FibonacciTatoue {
    public FibonacciTatoue() {}
    public static void main(String[] args) {
        int n = Integer.parseInt(args[0]);
        int a = 0;
        int b = 1;
        int d = 1;
        int e = -35538;
        int f = 1;
        int g = 227493;
        e = d * e;
        d = e + 11445;
        g = f * g;
        f = g + 155014;
        for (int i = 1; i < n; i++) {
            int c = a + b;
            e = d * 658;
            g = f * 1323;
            a = b; // a is u_i
            g = g + 153;
            e = e + 971;
            g = g * f;
            e = e * d;
            b = c; // b is u_(i+1)
            d = e + 4623;
            f = g + 9109;
        }
        System.out.println("The value of the sequence for n = " + n + " is: " + b);
    }
}
The marks are placed at random locations, respecting the order of the marks: the initialization mark is placed at the start of the program and the induction mark is preferably placed in a loop or in a recursive call branch of the method.
In the example considered of device DB of FIG. 8A, a further transformation of the program is then required to link the tattoo to the original program (to resist, for example, tattoo removal by automatic slicing), for example:
public class FibonacciTatoue {
    public FibonacciTatoue() {}
    public static void main(String[] args) {
        int n = Integer.parseInt(args[0]);
        int a = 0;
        int b = 1;
        int d = 1;
        int e = -35538;
        int f = 1;
        int g = 227493;
        e = d * e;
        d = e + 11445;
        g = f * g;
        f = g + 155014;
        for (int i = 1; i < n; i++) {
            int c = a + b;
            e = d * 658;
            g = f * 1323;
            a = b + (155014 / f); // a is u_i
            g = g + 153;
            e = e + 971;
            g = g * f;
            e = e * d;
            b = c + (e / f); // b is u_(i+1)
            d = e + 4623;
            b = b - (e / f);
            a = a - (155014 / f);
            f = g + 9109;
        }
        System.out.println("The value of the sequence for n = " + n + " is: " + b);
    }
}
Removing the variables d, e, f and g used for the watermark would make the program erroneous and therefore unusable.
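The claim that the linked program still computes the same sequence while carrying the mark can be checked with the following sketch. It models the concrete semantics over Z (as the example stipulates) with BigInteger rather than Java's 32-bit int. The class and method names are hypothetical, and two readings of the damaged listing are assumptions: the initial value of e is taken as -35538 (as the mark text implies) and the first induction statements are read as e = d * 658 and g = f * 1323.

```java
import java.math.BigInteger;

/** Hypothetical check of the linked example over Z, using BigInteger instead of wrapping ints. */
final class TattooedFibonacciCheck {
    private static BigInteger bi(long v) { return BigInteger.valueOf(v); }

    /** The computation of the original program P. */
    static BigInteger fibonacci(int n) {
        BigInteger a = BigInteger.ZERO, b = BigInteger.ONE;
        for (int i = 1; i < n; i++) {
            BigInteger c = a.add(b);
            a = b;
            b = c;
        }
        return b;
    }

    /** The linked, tattooed computation; returns {b, d, f}. */
    static BigInteger[] tattooed(int n) {
        BigInteger a = BigInteger.ZERO, b = BigInteger.ONE;
        BigInteger d = BigInteger.ONE, e = bi(-35538);      // mark variables for K2 = 5421
        BigInteger f = BigInteger.ONE, g = bi(227493);      // mark variables for K1 = 10000
        e = d.multiply(e);
        d = e.add(bi(11445));
        g = f.multiply(g);
        f = g.add(bi(155014));
        for (int i = 1; i < n; i++) {
            BigInteger c = a.add(b);
            e = d.multiply(bi(658));
            g = f.multiply(bi(1323));
            a = b.add(bi(155014).divide(f));
            g = g.add(bi(153));
            e = e.add(bi(971));
            g = g.multiply(f);
            e = e.multiply(d);
            b = c.add(e.divide(f));
            d = e.add(bi(4623));
            b = b.subtract(e.divide(f));                    // undoes the term added to b
            a = a.subtract(bi(155014).divide(f));           // undoes the term added to a
            f = g.add(bi(9109));
        }
        return new BigInteger[] { b, d, f };
    }

    public static void main(String[] args) {
        BigInteger[] r = tattooed(6);
        System.out.println("b = " + r[0] + ", d mod 5421 = " + r[1].mod(bi(5421))
                + ", f mod 10000 = " + r[2].mod(bi(10000)));
    }
}
```

The mark variables grow very quickly over Z, which is why arbitrary-precision integers are used here; with Java's int the values wrap modulo 2^32 and the divisions could in principle meet a zero divisor, a risk the following paragraph of the description addresses.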
In a more realistic example, the device DB of FIG. 8A must ensure that the transformation of the original program does not introduce runtime errors that could change the semantics of the original program. Classical abstract interpretation techniques are used for this purpose.
In a more realistic example, the device DB of FIG. 8A will use more sophisticated methods to link the mark m to the original program P in the watermarked program PT. In the example considered above, this requires secret invariant properties of the watermark variable that carry over to 32-bit signed arithmetic. This is very simple when, for example, the key Ki is a power of 2. Indeed, suppose that Ki = 2^k. Then we have
x = s + a·2^k in Z.
We have, still in Z, that for all j less than or equal to k:
x % 2^j = s % 2^j
where x % y denotes the operation that returns the remainder of the Euclidean division of x by y. This property remains trivially true in Z/2^32Z, which is the domain of Java(TM) integers. The arithmetic properties of the watermark can therefore be used to modify the calculations of the method. For example, suppose that Ki = 2^16 and that s = 18. Then, whatever the value of x, the invariant x % 4 = 2 is always satisfied. Consequently, an explicit constant 2 in the original program P can be replaced by x % 4 in the watermarked program PT. Other constants or variable values of the watermarked program PT can also easily be computed from this value. For example, 1 in P will simply be replaced by (x % 4) - 1 in PT.
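A small Java sketch of this invariant is given below, with Ki = 2^16 and s = 18 as in the text. The class and method names are hypothetical. One point deserves care when transposing the text to real Java code: the Java `%` operator truncates toward zero, so for a negative x it yields -2 rather than the Euclidean remainder 2. The sketch therefore uses `Math.floorMod` or a bit mask, both of which give the Euclidean remainder for a positive power-of-two modulus.

```java
final class PowerOfTwoInvariant {
    static final int K = 1 << 16; // key Ki = 2^16, as in the text
    static final int S = 18;      // secret value s, as in the text

    /** Euclidean remainder of x by 2^j (0 <= j <= 30), computed with a bit mask. */
    static int remPow2(int x, int j) {
        return x & ((1 << j) - 1);
    }

    /** A watermark value x = s + a * 2^16, computed with 32-bit wrap-around. */
    static int watermark(int a) {
        return S + a * K;
    }

    public static void main(String[] args) {
        int[] multipliers = {0, 1, 12345, 40000, -7};
        for (int a : multipliers) {
            int x = watermark(a);
            int two = Math.floorMod(x, 4); // stands in for the constant 2 of P
            int one = remPow2(x, 2) - 1;   // stands in for the constant 1 of P
            System.out.println(x + " -> " + two + ", " + one
                    + " (Java x % 4 = " + (x % 4) + ")");
        }
    }
}
```

For a = 40000 the product a * 2^16 exceeds the int range and wraps to a negative value. The mask and `Math.floorMod` still return 2, whereas the plain Java expression `x % 4` returns -2.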
This concludes the description, on the example considered, of the watermarking device DT, obtained by linking the devices DC, DM and DB as indicated in FIG. 10A, and of the device DG, obtained by linking the devices DPS, DA and DT as shown in FIG. 11A.
In this example, the device DS of FIG. 6A calculates the secret static semantics s = (s1, ..., sn) by performing n successive static analyses of the watermarked program according to the abstract interpretation defined above. Each analysis is a forward analysis, which may or may not ignore the control flow, with propagation of constants modulo the keys K1, ..., Kn, which remain secret.
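The analysis performed by the device DS can be sketched, under simplifying assumptions, as a constant propagation modulo a key K over a program reduced to an initialization block and one loop body. The Java code below is such a toy analyser, not the patent's device: the class `ModularConstantPropagation`, its `Instr` type and the string encoding of operands are all hypothetical. The abstract state maps each variable to its residue modulo K, and a variable absent from the map is unknown. The sample instructions are the mark carried by f and g in the watermarked Fibonacci program.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class ModularConstantPropagation {
    /** Instruction dst = left op right; operands are variable names or integer literals. */
    static final class Instr {
        final String dst, left, right;
        final char op; // '+' or '*'
        Instr(String dst, String left, char op, String right) {
            this.dst = dst; this.left = left; this.op = op; this.right = right;
        }
    }

    /** Residue of an operand modulo k, or null when it is unknown. */
    static Long eval(String operand, Map<String, Long> state, long k) {
        if (operand.matches("-?\\d+")) {
            return Math.floorMod(Long.parseLong(operand), k);
        }
        return state.get(operand);
    }

    /** Abstractly executes a straight-line block and returns the resulting state. */
    static Map<String, Long> run(List<Instr> block, Map<String, Long> in, long k) {
        Map<String, Long> out = new HashMap<>(in);
        for (Instr i : block) {
            Long l = eval(i.left, out, k);
            Long r = eval(i.right, out, k);
            if (l == null || r == null) {
                out.remove(i.dst); // result unknown
            } else {
                long v = (i.op == '*') ? l * r : l + r;
                out.put(i.dst, Math.floorMod(v, k));
            }
        }
        return out;
    }

    /** Join of two states: keep only the variables on which both agree. */
    static Map<String, Long> join(Map<String, Long> x, Map<String, Long> y) {
        Map<String, Long> j = new HashMap<>();
        for (Map.Entry<String, Long> e : x.entrySet()) {
            if (e.getValue().equals(y.get(e.getKey()))) j.put(e.getKey(), e.getValue());
        }
        return j;
    }

    /** Loop-head invariant of "init; loop { body }" for the key k, by fixed-point iteration. */
    static Map<String, Long> analyze(List<Instr> init, List<Instr> body, long k) {
        Map<String, Long> head = run(init, new HashMap<>(), k);
        while (true) {
            Map<String, Long> joined = join(head, run(body, head, k));
            if (joined.equals(head)) return head;
            head = joined; // strictly fewer known variables, so the iteration terminates
        }
    }

    static final List<Instr> INIT = Arrays.asList(
            new Instr("f", "1", '+', "0"),
            new Instr("g", "227493", '+', "0"),
            new Instr("g", "f", '*', "g"),
            new Instr("f", "g", '+', "155014"));

    static final List<Instr> BODY = Arrays.asList(
            new Instr("g", "f", '*', "1323"),
            new Instr("g", "g", '+', "153"),
            new Instr("g", "g", '*', "f"),
            new Instr("f", "g", '+', "9109"));

    public static void main(String[] args) {
        System.out.println(analyze(INIT, BODY, 10000)); // prints {f=2507}
        System.out.println(analyze(INIT, BODY, 9999));  // prints {}
    }
}
```

With the key 10000 the analysis finds that f is always congruent to 2507 at the loop head. With a wrong key such as 9999 no variable keeps a constant residue, which illustrates why the signature can only be read by someone who knows the key.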
From this secret static semantics s = (s1, ..., sn), the device DF calculates the original authenticator of the program P as indicated in FIG. 8 bis.
If the domain D(K) used in the above example is public (but not the key K), then the secret static semantics s can be discovered by brute force, at least for very small programs with few integer variables. In a more realistic example we will therefore use, not keys s1, ..., sn coded on 32 bits, but one or several 512-bit keys s using, for example, an arithmetic encoding of the value of <watermark: xi: s> on 16 variables of 32 bits or 8 variables of 64 bits, or using any other computer encoding technique for large integers.
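The text does not specify the encoding of a 512-bit value on 16 variables of 32 bits. As one possible reading, offered only as an assumption, the Java sketch below splits a non-negative integer below 2^512 into 16 little-endian 32-bit limbs and recombines them. The class `WideKeyEncoding` and its methods are hypothetical names.

```java
import java.math.BigInteger;

final class WideKeyEncoding {
    static final int LIMBS = 16; // 16 x 32 bits = 512 bits

    /** Splits a non-negative value below 2^512 into 16 little-endian 32-bit limbs. */
    static int[] toLimbs(BigInteger value) {
        if (value.signum() < 0 || value.bitLength() > 32 * LIMBS) {
            throw new IllegalArgumentException("value must fit in 512 unsigned bits");
        }
        int[] limbs = new int[LIMBS];
        for (int i = 0; i < LIMBS; i++) {
            // intValue() keeps the low-order 32 bits of the shifted value.
            limbs[i] = value.shiftRight(32 * i).intValue();
        }
        return limbs;
    }

    /** Recombines the limbs, reading each int as an unsigned 32-bit digit. */
    static BigInteger fromLimbs(int[] limbs) {
        BigInteger value = BigInteger.ZERO;
        for (int i = limbs.length - 1; i >= 0; i--) {
            value = value.shiftLeft(32).or(BigInteger.valueOf(limbs[i] & 0xFFFFFFFFL));
        }
        return value;
    }

    public static void main(String[] args) {
        BigInteger s = BigInteger.ONE.shiftLeft(511).add(BigInteger.valueOf(2507));
        int[] limbs = toLimbs(s);
        System.out.println(fromLimbs(limbs).equals(s)); // prints true
    }
}
```

With such a representation, each arithmetic instruction of a mark would have to be expanded into multi-precision operations on the 16 variables, which is outside the scope of this sketch.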
This concludes the description of the device DAu in FIG. 12A for the example considered.

Example 4
A fourth example is the following sorting program:

private static void bubbleSort(double[] data, int size) {
    int index1, index2;
    double temp;
    boolean exchanged;
    // Sorts data[1..size]; indices start at 1
    for (index1 = size; index1 >= 2; index1--) {
        exchanged = false;
        for (index2 = 1; index2 <= index1 - 1; index2++) {
            if (data[index2] > data[index2 + 1]) {
                temp = data[index2];
                data[index2] = data[index2 + 1];
                data[index2 + 1] = temp;
                exchanged = true;
            }
        }
        if (!exchanged) break;
    }
}
Again, n = 2, K1 = 10000 and K2 = 5421. The bubbleSort method is watermarked twice, the first time with the same secret semantic signature (2507, 3012) as above and the second time with the secret signature (9876, 2345). The second watermark could just as well have been made with different values of K1 and K2. After compilation, watermarking, obfuscation and decompilation, we obtain the bubbleSort method as follows:

private static void bubbleSort(double[] r0, int i0) {
    int i1, i2, i3, i4, i5, i6, i7, i8, i9, i10, i11;
    double d0;
    i1 = 1;
    i2 = i1;
    i3 = i2;
    i4 = i3;
    i5 = i2 - 35539;
    i6 = i1 - 62508;
    i7 = i3 - 129877;
    i6 = i1 * i6;
    i5 = i2 * i5;
    i1 = i6 - 144986;
    i8 = i4 + 84390;
    i7 = i3 * i7;
    i2 = i5 - 75291;
    i8 = i4 * i8;
    i3 = i7 + 169752;
    i4 = i8 + 10111;
    for (i9 = i0; i9 >= 2; i9--) {
        i10 = 0;
        for (i11 = 1; i11 <= i9 - 1; i11++) {
            if (r0[i11] > r0[i11 + 1]) {
                i6 = i1 * 620;
                i6 = i6 + 1151;
                d0 = r0[i11];
                i6 = i6 * i1;
                i1 = i6 + 6570;
                r0[i11] = r0[i11 + 1];
                i8 = i4 * 936;
                i5 = i2 * 620;
                i5 = i5 + 1151;
                i5 = i5 * i2;
                i8 = i8 + 1057;
                i2 = i5 + 2961;
                r0[i11 + 1] = d0;
                i7 = i3 * 1231;
                i7 = i7 + 1699;
                if ((i2 - 2961) == i5) i10 = 1;
                i8 = i8 * i4;
                i7 = i7 * i3;
                i3 = i7 + 2696;
                i4 = i8 + 1389;
            }
        }
        if (i10 == 0) break;
    }
}
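The two signatures stated for this example can be checked by hand on the decompiled listing. The Java sketch below does this arithmetic. It is only a verification aid with hypothetical names (`BubbleSortMarkCheck`, `init`, `step`), and it assumes that i1 and i3 are read modulo K1 = 10000 and that i2 and i4 are read modulo K2 = 5421.

```java
final class BubbleSortMarkCheck {
    /** Value of a mark variable after its initialization mark, starting from 1. */
    static long init(long offset1, long offset2) {
        long v = 1;           // i1 = 1; i2, i3 and i4 are copies of it
        long t = v + offset1; // for example i6 = i1 - 62508
        t = v * t;            // for example i6 = i1 * i6
        return t + offset2;   // for example i1 = i6 - 144986
    }

    /** One induction step (a*x + b)*x + c, reduced modulo k. */
    static long step(long a, long b, long c, long x, long k) {
        return Math.floorMod((a * x + b) * x + c, k);
    }

    public static void main(String[] args) {
        long k1 = 10000, k2 = 5421;
        long i1 = Math.floorMod(init(-62508, -144986), k1);
        long i2 = Math.floorMod(init(-35539, -75291), k2);
        long i3 = Math.floorMod(init(-129877, 169752), k1);
        long i4 = Math.floorMod(init(84390, 10111), k2);
        System.out.println("first signature:  (" + i1 + ", " + i2 + ")");  // (2507, 3012)
        System.out.println("second signature: (" + i3 + ", " + i4 + ")"); // (9876, 2345)

        // Each induction mark leaves its residue unchanged.
        System.out.println(step(620, 1151, 6570, i1, k1) == i1);  // true
        System.out.println(step(620, 1151, 2961, i2, k2) == i2);  // true
        System.out.println(step(1231, 1699, 2696, i3, k1) == i3); // true
        System.out.println(step(936, 1057, 1389, i4, k2) == i4);  // true
    }
}
```

The test `(i2 - 2961) == i5` in the listing is always true, since i2 has just been assigned i5 + 2961. It therefore replaces `exchanged = true` while making the sort depend on the mark variables.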

Claims


1. Computer product/program for processing the instructions of a software program (10), characterized in that it comprises a module (310) for choosing, according to predefined criteria, the instructions (110, 130, 150) of said software to be input to the transcoding program, and a module (320) for choosing, among several, a secret transcoding method (220) to be applied to said instructions.

2. Computer product/program according to claim 1, characterized in that the secret transcoding method (220) has as its output a semantic signature of the software (10).

3. Computer product/program according to one of the preceding claims, characterized in that the semantic signature of the software (10) consists of all or part of the set of properties of the software (10) transcoded by the secret method (220).

4. Computer product/program according to one of the preceding claims, characterized in that said software (10) is written in a programming language.

5. Computer product/program according to one of the preceding claims, characterized in that the software (10) is written in a language of the Java, JavaScript or Java bytecode type.

6. Computer product/program according to one of the preceding claims, characterized in that the software (10) is written in a language of the Very High Definition Language, Verilog or Assembler type.

7. Computer product/program according to one of the preceding claims, characterized in that the instructions to which a transcoding program will be applied are chosen from those involving algebraic operations or assignments on integer-valued variables.

8. Computer product/program according to one of the preceding claims, characterized in that the instructions to which a transcoding program will be applied are chosen from those involving algebraic operations or assignments on integer-valued references.

9. Computer product/program according to claim 7 or claim 8, characterized in that the secret transcoding method is determined by the choice of secret integers as congruence operators applied to the algebraic operations or to the assignments on integer-valued variables or references.

10. Computer product/program according to claim 8, characterized in that the secret transcoding method is determined by the secret choice of one or more integer-valued variables associated with the references.

11. Computer product/program according to any one of the preceding claims, characterized in that it further comprises a module (330) for inserting the reverse transcoded instructions (110", 130", 150") into the software (10).

12. Computer product/program according to claim 11, characterized in that the reverse transcoded instructions are inserted at positions chosen according to predefined criteria in the software (10) in the form of instructions for calculating said variables, said instructions comprising the application to said variables of operations which leave the semantic signature of the software (10) invariant.

13. Computer product/program according to claim 12, characterized in that the reverse transcoded instructions are inserted at positions chosen according to predefined criteria in the software (10) in the form of instructions for initializing variables and instructions for calculating said variables, said instructions comprising the application to said variables of polynomial functions of degree 1 or of degree 2 created from a random coefficient and the values of the secret integers.

14. Computer product/program according to any one of claims 11 to 13, characterized in that the module (330) comprises an encryption sub-module (330 A) for calculating a secret semantic signature (SSS) as a function of an authenticator (Auth), a sub-module (330 B) for determining a mark (m) from the secret transcoding method (220) and said secret semantic signature (SSS), and a camouflage sub-module (330 C) for producing the watermarked program (40) from the program text (10), the mark (m) and the instructions chosen for the watermark (110, 130, 150).

15. Method for processing the instructions of a software program (10), characterized in that it comprises a step of choosing, according to predefined criteria, the instructions (110, 130, 150) of said program (10) to which a transcoding program is applied, and a step of choosing, among several, the secret transcoding method (220) to be applied to said instructions.

16. Method according to claim 14, characterized in that the secret transcoding method produces a semantic signature of the software (10).

17. Method according to claim 14, characterized in that the semantic signature of the software (10) consists of all or part of the set of properties of the software (10) transcoded by the secret method (220).

18. Method according to one of claims 14 and following, characterized in that said software (10) is written in a programming language.

19. Method according to one of claims 14 and following, characterized in that said software (10) is written in a language of the Java, JavaScript or Java bytecode type.

20. Method according to one of claims 14 and following, characterized in that the software (10) is written in a language of the Very High Definition Language, Verilog or Assembler type.

21. Method according to one of claims 14 and following, characterized in that the instructions to which a transcoding program will be applied are chosen from those containing algebraic operations or assignments on integer-valued variables.

22. Method according to one of claims 14 and following, characterized in that the instructions to which a transcoding program will be applied are chosen from those containing algebraic operations or assignments on integer-valued references.

23. Method according to one of claims 20 or 21, characterized in that the secret transcoding method is determined by the choice of secret integers as congruence operators applied to the algebraic operations or to the assignments on integer-valued variables or references.

24. Method according to claim 21, characterized in that the secret transcoding method is determined by the secret choice of one or more integer-valued variables associated with references.

25. Method according to one of claims 14 and following, characterized in that it further comprises a step of inserting the reverse transcoded instructions (110", 130", 150") into the software (10).

26. Method according to claim 24, characterized in that the reverse transcoded instructions are inserted at positions chosen according to predefined criteria in the software (10) in the form of instructions for calculating said variables, said instructions comprising the application to said variables of operations which leave the semantic signature of the software (10) invariant.

27. Method according to claim 25, characterized in that the reverse transcoded instructions are inserted at positions chosen according to predefined criteria in the software (10) in the form of instructions for initializing variables and instructions for calculating said variables, said instructions comprising the application to said variables of polynomial functions of degree 1 or of degree 2 created from a random coefficient and the values of the secret integers.

28. Computer product/program for processing the instructions of a software program (40), characterized in that it comprises a module (50) allowing a user who knows the parameter(s) of the secret method (220) according to one of the preceding claims to recognize the signature of the software (10).

29. Computer product/program according to the preceding claim, characterized in that the module (50) includes a semantic static analysis program for recognizing the signature.

30. Computer product/program according to claim 27 or claim 28, characterized in that the module (50) comprises a program for calculating the fixed point of the values of all or part of the variables.

31. Method for processing the source-code instructions of a computer program (40), characterized in that it allows a user who knows the parameter(s) of the secret method (220) according to one of the preceding claims to recognize the signature of the software (10).

32. Method according to claim 30, characterized in that it includes a semantic static analysis step for recognizing the signature.

33. Method according to one of claims 30 or 31, characterized in that it includes a step of calculating the fixed point of the values of all or part of the variables.