EP3123398A1

EP3123398A1 - System for identifying a photographic camera model associated with a jpeg-compressed image, and associated method performed in such a system, uses and applications

Info

Publication number: EP3123398A1
Application number: EP15718526.5A
Authority: EP
Inventors: Rémi COGRANNE; Florent RETRAINT; Thanh Hai THAI
Original assignee: Universite de Technologie de Troyes
Current assignee: Universite de Technologie de Troyes
Priority date: 2014-03-28
Filing date: 2015-03-27
Publication date: 2017-02-01
Also published as: FR3019350A1; WO2015145092A1; JP2017517222A; FR3019350B1; US20180173993A1

Abstract

The invention relates to a system and method for identifying a photographic camera model from a photograph taking the form of a compressed image that has undergone post-acquisition processing and fulfilling a linear relationship between the expectation and the variance of the pixels, such that $\sigma^2_{y_{m,n}} = a\,\mu_{y_{m,n}} + b$. The system is characterised in that it comprises an image processing device that can supply an analytical relationship between parameters (α, β) from the model of the discrete cosine transform (DCT) coefficients and the parameters (a, b) of the camera in the form (I), such that the parameters (c, d) determine fingerprints that characterise a photographic camera model and depend on both the frequency (p, q) and parameters a and b. The system also comprises a device for carrying out statistical hypothesis tests on the DCT coefficients and a statistical analysis device for determining if the photograph was taken by the photographic camera model or by another photographic camera model. The statistical properties of these tests have been established, allowing error probabilities to be controlled a priori. These results can be used, in particular, to guarantee compliance with a constraint on the probability of one of the two possible errors in the test (false positive or false negative). The invention further relates to the use of the method and to the application thereof.

Description

SYSTEM FOR IDENTIFYING A CAMERA MODEL ASSOCIATED WITH A JPEG-COMPRESSED IMAGE, ASSOCIATED METHOD IMPLEMENTED IN SUCH A SYSTEM, AND ASSOCIATED USES AND APPLICATIONS

1 - TECHNICAL FIELD OF THE INVENTION

[001] The invention relates to the identification of a camera model. More particularly, it concerns a system for identifying a camera model from a digital photograph that has undergone all the processing of the acquisition chain and may even have been compressed according to the JPEG standard. The invention also relates to a method implemented in such a system. Such systems have important applications in determining the provenance of a photograph.

[002] Digital forensics, the search for evidence in digital media, has developed considerably over the past decade. In this field, the proposed methods fall into two categories depending on whether the aim is to identify the camera model or the individual device (an instance of a given model).

[003] In general, identification methods are either passive or active. In active methods, the digital data representing the image content are modified in order to insert an identifier (a technique known as watermarking). When the inspected image contains no watermark, the acquisition device must be identified from the image data alone.

2 - STATE OF THE PRIOR ART

[004] In image forensics, two key problems are identified: identifying the origin of an image and detecting forged images (see [1]-[3] and the references therein). Image origin identification aims to verify whether a given digital image was acquired by a specific camera (i.e. an instance) and/or to determine its model. Forgery detection aims to detect any manipulation such as splicing, removal or addition of content in an image. There are two approaches to these problems, active and passive. Digital watermarking is considered an active approach. It has some limitations, cf. [3], because the embedding mechanism must be available and the credibility of the information embedded in the image remains debatable. The passive approach has been studied increasingly over the last decade. It requires neither a watermark nor any prior information about the image, including the availability of the original image. Passive forensic methods rely on the fingerprints that the camera leaves in the image to identify its origin and verify its authenticity. These fingerprints arise from the image acquisition processing chain; see references [4]-[6] for an overview of the different stages and of the structure of the processing inside a digital camera.

[005] The passive forensic methods proposed for image origin identification can be divided into two basic categories. Methods in the first category rest on the assumption that camera models differ both in their image processing techniques and in their hardware components. Lens aberration, cf. [7], the Color Filter Array (CFA), the interpolation algorithm and demosaicing, cf. [8]-[11], and JPEG compression, see references [12], [13], are considered influential factors for camera model identification, whereas white-balancing algorithms [14] are used to identify the source device. Based on these factors, a set of features is built and fed to a machine learning algorithm. The main difficulty is that image processing techniques remain identical or similar across models, and that components produced by a small number of manufacturers are shared between camera models. Moreover, as in every machine learning application, it is difficult to select an accurate set of features. Finally, establishing the detection performance analytically remains an open problem, cf. [15].

[006] Methods in the second category aim to identify the unique characteristics, or fingerprints, of the acquisition device. The Sensor Pattern Noise (SPN) results from imperfections in the manufacturing process of the image sensor and from non-uniformity in the photo-electronic conversion caused by the inhomogeneity of the silicon wafers (also called Photo-Response Non-Uniformity, or PRNU). It constitutes a unique fingerprint; see references [17]-[21]. Methods based on the PRNU are also used in reference [22] to identify the camera model. They rest on the assumption that the fingerprint obtained from an image in TIFF or JPEG format contains traces of the intrinsic watermark formed by the SPN, which carries information about the camera model.

[007] The two main components of the SPN are the Fixed Pattern Noise (FPN) and the Photo-Response Non-Uniformity (PRNU). The FPN, which is used in reference [23] for device identification, is generally compensated inside the camera by subtracting a dark frame from the output image. Consequently, the FPN is not a robust fingerprint and is not used in later work. The PRNU is exploited directly in some works; see references [17], [18], [21]. The main challenge in this category is the ability to extract this noise reliably from the image. Another challenge is the falsification of the image origin through counter-forensic activities; see reference [24]. However, existing methods make very limited use of hypothesis testing theory and of statistical image models. Consequently, their performance remains analytically unestablished.

[008] Most image forensic methods rely on sensor noise, see references [22], [25], or on features tied to the operations performed inside the camera, see reference [10]. Most digital cameras export images in JPEG format. The ability to extract such features from these images is questionable, because JPEG compression can severely damage them. In the patent application referenced in [25], we proposed to exploit the parameters (a, b) to identify a camera model passively. That method is based on the heteroscedasticity of the noise present in a RAW image, a RAW image being one that has undergone none of the post-acquisition steps of the processing chain. The patent application referenced in [25] reports perfect detection performance for identifying a camera model from RAW images, uncompressed images or losslessly compressed images. It is nevertheless useful to extend this method to images in compressed TIFF or JPEG format.

[009] The problem addressed by the present invention is the passive identification of the camera model that acquired a given compressed image. Passive identification means making a decision when the image is not assumed to contain any information identifying its source. The problems to be solved are (1) ensuring that a photograph was not taken by a given camera when that photograph is compromising or, (2) conversely, guaranteeing that an inspected photograph was indeed taken by one camera rather than another. Many examples can be given, among them: whether a given camera is the source of a photograph of a confidential document (such as those available on WikiLeaks); whether a given photograph of a child-pornographic nature could have been acquired with a suspect's camera; whether the copy of a contract was scanned with the client's device; whether the photograph of a document was used to copy and mark that document; and so on.

3 - DESCRIPTION OF THE INVENTION

[0010] The object of the invention is to provide a system for identifying a camera model from images acquired with a known camera model and from a photograph in the form of a compressed image, said photograph having undergone post-acquisition processing and satisfying the following linear relationship between the expectation and the variance of the pixels:

$\sigma^2_{y_{m,n}} = a\,\mu_{y_{m,n}} + b$

where a and b are two parameters characterising said camera model, and $\mu_{y_{m,n}}$ and $\sigma^2_{y_{m,n}}$ are respectively the mathematical expectation and variance of the pixel $y_{m,n}$ at position (m, n) after post-acquisition processing. The system is characterised in that it comprises an image processing device able to provide an analytical relationship between the parameters (α, β) of the statistical distribution model of the Discrete Cosine Transform (DCT) coefficients and the parameters (a, b) of the camera, in the form of relation (I), so that the parameters (c, d) determine fingerprints characterising a camera model and depend both on the frequency (p, q) and on the parameters a and b. The system further comprises a device for performing statistical hypothesis tests on the DCT coefficients and a statistical analysis device for determining whether said photograph was taken by said camera model or by another camera model. In essence, the images whose acquisition model is known serve as the reference for identifying the camera model.
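The linear expectation-variance relation above can be illustrated with a small numerical experiment. The sketch below is only an illustration under stated assumptions: the values a = 0.05 and b = 2.0, the function names `simulate_levels` and `fit_noise_line`, and the use of an ordinary least-squares fit on synthetic Gaussian pixels are all hypothetical choices made here, not the estimator described in the invention.

```python
import numpy as np

def simulate_levels(a, b, levels, n_samples, seed=0):
    """Draw Gaussian pixels whose variance is a * mean + b at each intensity level."""
    rng = np.random.default_rng(seed)
    means, variances = [], []
    for mu in levels:
        y = rng.normal(mu, np.sqrt(a * mu + b), size=n_samples)
        means.append(y.mean())          # empirical expectation at this level
        variances.append(y.var(ddof=1)) # empirical variance at this level
    return np.array(means), np.array(variances)

def fit_noise_line(means, variances):
    """Ordinary least-squares fit of variance = a * mean + b (illustrative stand-in)."""
    a_hat, b_hat = np.polyfit(means, variances, deg=1)
    return a_hat, b_hat

# Hypothetical camera parameters used only for this synthetic demonstration.
means, variances = simulate_levels(a=0.05, b=2.0,
                                   levels=np.linspace(20, 200, 10),
                                   n_samples=20000)
a_hat, b_hat = fit_noise_line(means, variances)
print(f"a_hat={a_hat:.4f}, b_hat={b_hat:.3f}")
```

With enough samples per level, the fitted slope and intercept come back close to the values used for the simulation, which is the sense in which (a, b) can act as a signature of the noise.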

[0011] Advantageously, the analysis device provides an indication of the identity of said camera model while certifying the correctness of the identification with a previously defined accuracy.

[0012] The invention also relates to a method implemented in said system, which comprises the following steps:

- prior analysis of images acquired with a known camera model,

- reading of a compressed image Z in order to determine the matrices representing the pixel values,

- estimation of the fingerprint parameters $(c_{p,q}, d_{p,q})$,

- removal of the content of the compressed image Z in order to obtain a residual image,

- estimation of the DCT coefficients of said residual image,

- execution of comparative statistical tests in order to identify a camera.

The prior analysis of images acquired with a known camera model is carried out in order to estimate the fingerprint parameters $(c_{p,q}, d_{p,q})$, following the same procedure as for the analysis of an unknown image.
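Part of the steps listed above can be sketched in a few lines. The example below covers only content removal and the 8×8 block DCT of the residual image; the estimation of the fingerprint parameters and the statistical tests are not shown. It is a rough sketch under assumptions: the 3×3 mean filter used as a content estimate, the synthetic input image, and the names `dct_matrix`, `box_blur`, `block_dct` and `residual_dct` are hypothetical choices for illustration, not the processing specified by the invention.

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II matrix of size n x n."""
    k = np.arange(n)[:, None]   # frequency index
    m = np.arange(n)[None, :]   # sample index
    d = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * m + 1) * k / (2 * n))
    d[0, :] = np.sqrt(1.0 / n)
    return d

def box_blur(img):
    """3x3 mean filter with edge replication (crude stand-in content estimate)."""
    padded = np.pad(img, 1, mode="edge")
    h, w = img.shape
    out = np.zeros((h, w), dtype=float)
    for dy in range(3):
        for dx in range(3):
            out += padded[dy:dy + h, dx:dx + w]
    return out / 9.0

def block_dct(img, n=8):
    """2-D DCT of each non-overlapping n x n block; result has shape (bh, bw, n, n)."""
    h, w = img.shape
    h, w = h - h % n, w - w % n          # crop to a whole number of blocks
    blocks = img[:h, :w].reshape(h // n, n, w // n, n).swapaxes(1, 2)
    d = dct_matrix(n)
    return d @ blocks @ d.T

def residual_dct(img, n=8):
    """Remove the estimated content, then take the block DCT of the residual."""
    img = np.asarray(img, dtype=float)
    residual = img - box_blur(img)
    return block_dct(residual, n)

# Synthetic single-channel image standing in for a decoded photograph.
rng = np.random.default_rng(0)
demo = rng.normal(128.0, 5.0, size=(64, 64))
coeffs = residual_dct(demo)
samples_p2_q3 = coeffs[:, :, 2, 3].ravel()   # all coefficients at frequency (2, 3)
```

Grouping the coefficients by frequency index, as in `coeffs[:, :, p, q]`, gives one sample set per frequency (p, q), which is the form in which a per-frequency statistical model would be fitted.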

[0013] Advantageously, the statistical tests are executed subject to a prescribed constraint on the error probability.
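To illustrate what a prescribed constraint on the error probability means in practice, the sketch below computes a decision threshold for a given false-positive rate. It rests on an assumption that is not taken from the source: the test statistic is supposed to be normalised so that it follows a standard Gaussian distribution when the photograph comes from the reference camera model. The function names are hypothetical.

```python
from statistics import NormalDist

def threshold_for_false_alarm(alpha0):
    """Threshold tau such that P(N(0, 1) > tau) = alpha0 (assumed null distribution)."""
    return NormalDist().inv_cdf(1.0 - alpha0)

def reject_reference_model(statistic, alpha0):
    """Return True when the statistic exceeds the threshold, i.e. the reference model is rejected."""
    return statistic > threshold_for_false_alarm(alpha0)

tau = threshold_for_false_alarm(0.05)
print(round(tau, 4))
```

Under that assumption, fixing the false-positive rate fixes the threshold before any image is inspected, which is what allows the error probability to be controlled a priori.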

[0014] According to the invention, the photograph is an image compressed according to the JPEG compression standard.

[0015] Advantageously, the photograph is a reference image, or interframe, belonging to a video stream and compressed according to the MPEG compression standard.

[0016] The invention also relates to the use of the above method for the unsupervised detection of the falsification of an area of a photograph.

[0017] Furthermore, the invention relates to the use of the above method for the supervised detection of the falsification of an area of a photograph. This detection is performed by testing whether an area known a priori comes from the same camera as the rest of the inspected image.

[0018] The invention also relates to the use of the above method in the search for evidence from a compromising image.

[0019] The invention relates to the application of the above method in specialised software for the search for evidence from digital media.

4 - BRIEF DESCRIPTION OF THE FIGURES

[0020] Other features, details and advantages of the invention will emerge from the following description, given with reference to the appended figures, in which:

- Figure 1 shows a system for identifying a camera model according to the invention;

- Figure 2 illustrates the complete acquisition chain of a natural image and the processing performed in digital cameras;

- Figure 3 compares the exact DCT coefficient model of equation (7) with the approximate DCT coefficient model of equation (8);

- Figure 4 shows the parameters $(\hat{\alpha}_{64}, \hat{\beta}_{64})$ estimated from each image of the Dresden image database captured with two distinct camera models, Canon Ixus 70 and Nikon D200;

- Figure 5 shows the parameters $(\hat{c}_{64}, \hat{d}_{64})$ estimated from each image of the Dresden image database captured with two distinct camera models, Canon Ixus 70 and Nikon D200;

- Figure 6 shows the detection performance of the test $\delta^{*}$ for data simulated with the parameters a = 3, $c_0$ = 11.5, $d_0$ = -4, $c_1$ = 13, $d_1$ = -5.5;

- Figure 7 shows the detection performance of the tests $\hat{\delta}^{*}_{1}$ and $\hat{\delta}^{*}_{2}$ for data simulated with the parameters a = 3, $c_0$ = 11.5, $d_0$ = -4, $c_1$ = 13, $d_1$ = -5.5;

- Figure 8 shows the detection performance of the tests δ for 500 images from the Dresden image database, acquired with two cameras, Canon Ixus 70 and Nikon D200.

[0021] For clarity, identical or similar elements are identified by identical reference signs in all the figures.

5 - DETAILED DESCRIPTION OF AN EMBODIMENT

[0022] Figure 1 shows a system for identifying a camera. Reference 1 designates the system and reference 2 the camera model that took a photograph 3.

[0023] It is from this photograph 3 that the system 1 determines the camera model that captured it. The system comprises a photo analyser that examines the photograph 3. The photograph 3 takes the form of a compressed file suitable for the processing described below. A file in JPEG format, or any image obtained by decompressing an image previously compressed according to the JPEG standard, is suitable for this processing.

[0024] The system 1 can be implemented on a PC-type computer. It is provided with an input unit 10 that receives the data of the photograph 3. These data are processed by a processing unit 12, which carries out the processing explained below. A device for performing statistical hypothesis tests on the DCT coefficients and a statistical analysis device 14 then provide an indication of the camera model that produced said photograph.

[0025] According to the method of the invention, in the first step the digital photograph 2 is viewed as one or more matrices whose elements represent the value of each pixel. For a grayscale image, the photograph can be represented by a single matrix:

$$Z = \{z_i\}, \quad 1 \le i \le L$$

For color images, three distinct colors are usually used: red, green and blue. An image can then be treated as three distinct matrices, one per color channel:

$$Z = \{z_i^k\}, \quad 1 \le k \le 3$$

[0026] The second step of the method separates the color channels when the analyzed image is in color. Since the subsequent operations are carried out identically on each of the matrices representing the color channels, the image is considered below to be represented by a single matrix (the index k is omitted).

[0027] The noise present in digital photographs is heteroscedastic: its stochastic (random) properties are not constant across the pixels of the image.

[0028] Because of the large number of photons incident on the sensor, the counting process can be approximated with high accuracy by a Gaussian random variable.

[0029] Figure 2 illustrates the complete acquisition chain of a natural image and the processing performed. This chain comprises several processing steps (demosaicing, white balancing and gamma correction), after which a full-color image is obtained from the light intensity measured by each photosensitive cell of the sensor. Depending on the method used at each step, the quality of the final image can vary significantly. Each step affects the final output image. Note that the sequence of operations differs from one manufacturer to another.

[0030] Today, JPEG is increasingly established as a standard format for digital images. Most digital cameras and software encode images in this format. JPEG compression is a trade-off between storage size and image quality: an image compressed with a high compression factor requires little storage space, at the cost of lower visual quality.

[0031] The Discrete Cosine Transform (DCT) is one of the key steps of JPEG compression. Given an arbitrary image Z, the DCT is applied to each block of 8 × 8 pixels of Z as follows:

$$I_{p,q} = \frac{1}{4}\, T_p T_q \sum_{m=0}^{7} \sum_{n=0}^{7} z_{m,n} \cos\left(\frac{(2m+1)p\pi}{16}\right) \cos\left(\frac{(2n+1)q\pi}{16}\right) \tag{1}$$

where $z_{m,n}$ denotes a pixel inside an 8 × 8 block of Z, $0 \le m \le 7$, $0 \le n \le 7$, $I_{p,q}$ denotes the two-dimensional DCT coefficient, and

$$T_p = \begin{cases} 1/\sqrt{2} & \text{for } p = 0 \\ 1 & \text{for } p > 0 \end{cases} \tag{2}$$

[0032] The term $T_q$ is obtained by analogy with $T_p$. The DCT coefficient at position (0, 0), called the direct current (DC) coefficient, represents the mean value of the pixels in the 8 × 8 block. The remaining 63 coefficients are called the alternating current (AC) coefficients. The two main advantages of the DCT are suboptimal decorrelation and energy compaction. After the DCT, the energy is concentrated mainly in the low frequencies, while the high frequencies mainly contain noise components.
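The block transform described above can be sketched in a few lines. This is a minimal illustration, assuming the usual JPEG normalisation (a factor 1/4 and $T_0 = 1/\sqrt{2}$); the function name `dct_8x8` is hypothetical. Under this normalisation the DC coefficient equals 8 times the block mean, and the transform preserves the energy of the block.

```python
import math


def dct_8x8(block):
    """Two-dimensional DCT of an 8x8 block (list of 8 lists of 8 numbers).

    Assumes the JPEG normalisation (1/4) * T_p * T_q, which makes the
    transform orthonormal.
    """
    def t(p):
        return 1.0 / math.sqrt(2.0) if p == 0 else 1.0

    coeffs = [[0.0] * 8 for _ in range(8)]
    for p in range(8):
        for q in range(8):
            total = 0.0
            for m in range(8):
                for n in range(8):
                    total += (block[m][n]
                              * math.cos((2 * m + 1) * p * math.pi / 16)
                              * math.cos((2 * n + 1) * q * math.pi / 16))
            coeffs[p][q] = 0.25 * t(p) * t(q) * total
    return coeffs


# A perfectly flat block: all the energy ends up in the DC coefficient.
flat_block = [[100.0] * 8 for _ in range(8)]
flat_coeffs = dct_8x8(flat_block)

# A ramp block, used to check that energy is preserved.
ramp_block = [[float(8 * m + n) for n in range(8)] for m in range(8)]
ramp_coeffs = dct_8x8(ramp_block)
```

The quadruple loop is deliberately naive; a production implementation would use a separable or fast transform.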

[0033] The modeling of the distribution of DCT coefficients has been widely studied in the literature. The Laplacian [27], generalized Gaussian [28] and generalized Gamma (GΓ) [29] models have been proposed for the DC coefficients. The Laplacian distribution nevertheless remains the dominant choice in image processing because of its simplicity and relative accuracy. In our previous work, we established a rigorous mathematical framework for modeling the AC coefficients; see reference [6] for details. Since the DC coefficient represents the mean value of the pixels in each 8 × 8 block, the distribution of the DC coefficient cannot be derived directly, because of the heterogeneity of a natural image. Let I be an AC coefficient; for clarity, its index is omitted. Because the block variance varies, the probability density function (pdf) of I is given by a doubly stochastic model [27]:

$$f_I(x) = \int_0^\infty f_{I \mid \sigma^2}(x \mid t)\, f_{\sigma^2}(t)\, dt \tag{3}$$

where $f_X(x)$ denotes the pdf of a random variable X and $\sigma^2$ denotes the variance of the block considered. The pixels inside a block are assumed to be independent and identically distributed [27]. Given a constant block variance $\sigma^2$, the distribution of the AC coefficient I can be approximated by a zero-mean Gaussian distribution, by virtue of the Lindeberg central limit theorem (CLT) for correlated random variables [27], [30]:

$$f_{I \mid \sigma^2}(x \mid t) = \frac{1}{\sqrt{2\pi t}} \exp\left(-\frac{x^2}{2t}\right) \tag{4}$$

In addition, the distribution of the variance $\sigma^2$ over all the blocks of an image can be modeled approximately by the gamma distribution $\mathcal{G}(\alpha, \beta)$ [6], defined by the following pdf:

$$f_{\mathcal{G}}(x) = \frac{x^{\alpha-1}}{\beta^{\alpha}\,\Gamma(\alpha)} \exp\left(-\frac{x}{\beta}\right), \quad x > 0 \tag{5}$$

where $\alpha$ is a positive shape parameter, $\beta$ is a positive scale parameter, and $\Gamma(\cdot)$ denotes the gamma function. From (3), (4) and (5), the statistical distribution model of the AC coefficient I is given by:

$$f_I(x) = \int_0^\infty \frac{1}{\sqrt{2\pi t}} \exp\left(-\frac{x^2}{2t}\right) \frac{t^{\alpha-1}}{\beta^{\alpha}\,\Gamma(\alpha)} \exp\left(-\frac{t}{\beta}\right) dt \tag{6}$$

From [31], using the integral representation of the modified Bessel function $K_\nu(\cdot)$, the integral (6) can be written

$$f_I(x) = \sqrt{\frac{2}{\pi}}\; \frac{\left(|x|\sqrt{\beta/2}\right)^{\alpha-\frac{1}{2}}}{\beta^{\alpha}\,\Gamma(\alpha)}\; K_{\alpha-\frac{1}{2}}\left(|x|\sqrt{\frac{2}{\beta}}\right) \tag{7}$$

where $K_\nu(x)$ denotes the modified Bessel function [31, chapter 5.5]. The AC coefficient model proposed in (7) includes the Laplacian and Gaussian distributions as special cases [6]. As indicated in [6], this model outperforms the Laplacian and Gaussian models, at the cost of more complex expressions. Using the Laplace approximation [32] (see Appendix A for details), the function $f_I(x)$ can be approximated as:

$$f_I(x) \approx \frac{|x|^{\alpha-1}}{(2\beta)^{\alpha/2}\,\Gamma(\alpha)} \exp\left(-|x|\sqrt{\frac{2}{\beta}}\right) \tag{8}$$

[0034] Note that other polynomial expansions of the modified Bessel function $K_\nu(x)$ are also given in [31], so a polynomial approximation of $f_I(x)$ could be derived as well. Such approximations are not considered in the present application. The accuracy of the approximation given in (8) depends on the choice of the functions g(t) and h(t) (see relation (56) below). The main advantage of relation (8) is that it provides an approximation in the form of an exponential function. This approximate model is used to simplify the calculation of the likelihood functions and of the threshold for the likelihood ratio used in the proposed tests. The parameters themselves are estimated on the basis of the exact model defined in relation (7).
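The relationship between the Gaussian-gamma mixture and its exponential approximation can be checked numerically. The sketch below is an illustration only, assuming the approximate density has the form $|x|^{\alpha-1} / ((2\beta)^{\alpha/2}\Gamma(\alpha)) \cdot \exp(-|x|\sqrt{2/\beta})$ and evaluating the mixture by a trapezoidal rule in $u = \log t$; the function names and integration bounds are hypothetical choices. For α = 1 both expressions reduce to a Laplacian density, which illustrates the special case mentioned above.

```python
import math


def approx_pdf(x, alpha, beta):
    """Exponential-form approximation of the AC coefficient density (x != 0)."""
    ax = abs(x)
    norm = (2.0 * beta) ** (alpha / 2.0) * math.gamma(alpha)
    return ax ** (alpha - 1.0) / norm * math.exp(-ax * math.sqrt(2.0 / beta))


def mixture_pdf(x, alpha, beta, lo=-30.0, hi=10.0, steps=4000):
    """Gaussian density with gamma-distributed variance, integrated numerically.

    Uses the substitution t = exp(u) and a trapezoidal rule on [lo, hi].
    """
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps + 1):
        u = lo + k * h
        t = math.exp(u)
        gauss = math.exp(-x * x / (2.0 * t)) / math.sqrt(2.0 * math.pi * t)
        gam = (t ** (alpha - 1.0) * math.exp(-t / beta)
               / (beta ** alpha * math.gamma(alpha)))
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * gauss * gam * t  # dt = t du
    return total * h


# alpha = 1: the mixture is a Laplacian and the approximation is exact.
laplace_value = 0.5 * math.exp(-1.0)
exact_at_1 = mixture_pdf(1.0, 1.0, 2.0)
approx_at_1 = approx_pdf(1.0, 1.0, 2.0)

# alpha = 2: the approximation is close in the tail but not exact.
tail_ratio = mixture_pdf(10.0, 2.0, 1.0) / approx_pdf(10.0, 2.0, 1.0)
```

For α < 1 the approximate density diverges at x = 0, so `approx_pdf` should not be evaluated there.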

[0035] Note that this approximate model is a special case of the generalized gamma distribution model when γ = 1 (the variable γ is defined in reference [29, Eq. (6)]). Figure 3 gives an example illustrating the accuracy of the exact model and of the approximate model of the distribution of the DCT coefficients of a JPEG image. The empirical data are extracted from a real image of the Dresden image database [33], previously compressed with the JPEG compression standard. The parameters are estimated from the empirical data on the basis of the exact model of the distribution of the DCT coefficients. The exact and approximate models are plotted with the parameters estimated by maximum likelihood. The main drawback of the approximate model is that, as x tends to 0, the function (8) tends to 0 when α > 1 and tends to infinity when α < 1. This leads to inaccuracy in the vicinity of 0, as shown in Figure 3. Nevertheless, this causes no loss of detection performance when designing the likelihood ratio test (LRT), either in the ability to guarantee a false-alarm probability or in the probability of correct detection. Since the AC coefficient model defined by relation (7) is symmetric, the odd moments vanish. From the law of total expectation, a calculation shows that

$$E[I^2] = \alpha\beta \tag{9}$$

$$E[I^4] = 3\alpha(\alpha+1)\beta^2 \tag{10}$$

where $E[X]$ denotes the mathematical expectation of a random variable X. Consequently, the method of moments (MM) gives the following estimates of the parameters (α, β):

$$\hat{\alpha}_{MM} = \frac{3\hat{m}_2^2}{\hat{m}_4 - 3\hat{m}_2^2} \tag{11}$$

$$\hat{\beta}_{MM} = \frac{\hat{m}_4 - 3\hat{m}_2^2}{3\hat{m}_2} \tag{12}$$

where $\hat{m}_2$ and $\hat{m}_4$ are respectively the second and fourth empirical moments of I. The maximum likelihood (ML) estimates of the parameters (α, β) are defined as the solution of the maximization problem

$$(\hat{\alpha}_{ML}, \hat{\beta}_{ML}) = \arg\max_{(\alpha, \beta)} \sum_{i=1}^{N} \log f_I(I_i; \alpha, \beta) \tag{13}$$

where N is the number of coefficients at the same frequency. Note that the likelihood function is differentiable, but computing its derivative is extremely difficult. Since the maximum likelihood problem (13) has no closed-form solution, it is solved numerically using the Nelder-Mead optimization method [34]. The MM estimates $(\hat{\alpha}_{MM}, \hat{\beta}_{MM})$ are taken as the initial solution of the optimization algorithm.
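The method-of-moments initialisation can be illustrated as follows. This is a sketch under the assumption that the second and fourth moments of the model are $\alpha\beta$ and $3\alpha(\alpha+1)\beta^2$, which follows from a zero-mean Gaussian whose variance is gamma distributed; the helper names and the simulation routine are hypothetical and are not part of the described system.

```python
import math
import random


def mm_from_moments(m2, m4):
    """Solve m2 = alpha*beta and m4 = 3*alpha*(alpha+1)*beta^2 for (alpha, beta)."""
    excess = m4 - 3.0 * m2 * m2
    if excess <= 0.0:
        raise ValueError("fourth moment too small for this model")
    alpha = 3.0 * m2 * m2 / excess
    beta = excess / (3.0 * m2)
    return alpha, beta


def mm_estimate(samples):
    """Method-of-moments estimate of (alpha, beta) from AC coefficient samples."""
    n = len(samples)
    m2 = sum(x * x for x in samples) / n
    m4 = sum(x ** 4 for x in samples) / n
    return mm_from_moments(m2, m4)


def simulate_ac(alpha, beta, n, seed=0):
    """Draw n samples: variance ~ Gamma(shape=alpha, scale=beta), value ~ N(0, variance)."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, math.sqrt(rng.gammavariate(alpha, beta)))
            for _ in range(n)]


# With exact moments for alpha = 2, beta = 3 the estimator recovers both values.
alpha_exact, beta_exact = mm_from_moments(6.0, 162.0)

# With simulated data the estimates are close, but the fourth moment is noisy.
alpha_hat, beta_hat = mm_estimate(simulate_ac(2.0, 3.0, 200000, seed=1))
```

Because the empirical fourth moment has a large variance, these estimates are best treated as a starting point for a numerical maximum likelihood search rather than as final values.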

[0036] As discussed above, one of the main advantages of the approximation in (8) is energy compaction. The energy tends to be located mainly in the low frequencies, while the high frequencies mainly contain noise components. Consequently, there is a difference of scale between the DCT coefficients, which do not share the same parameters (α, β). Since the parameters (α, β) are estimated separately at each frequency, the parameters associated with the DCT coefficient $I_{p,q}$ are denoted $(\alpha_{p,q}, \beta_{p,q})$.

5.A Intrinsic camera fingerprints

[0037] In our previous work [25], [35], the parameters (a, b) were exploited to identify a camera model passively, where a and b are parameters characteristic of a camera. This method relies on the heteroscedasticity of the noise present in a RAW image: the stochastic (random) properties of the noise are not constant across the pixels of the image. More precisely, the value of each pixel depends linearly on the number of incident photons. This model, which accounts for all the noise contaminating the RAW image, gives the noise variance as a linear function of the mathematical expectation of the pixel, according to the following relation:

$$\sigma^2_{s_{m,n}} = a\,\mu_{s_{m,n}} + b \tag{14}$$

where $s_{m,n}$ is the measured value of the RAW pixel at position (m, n) and $\mu_{s_{m,n}}$ is its mathematical expectation. Although this method shows almost perfect detection performance, it has two main limitations. First, it applies only to RAW images, which may not be available in practice. The most difficult part of extending this method to other image formats, such as TIFF and JPEG, is the impact of post-acquisition processing (demosaicing, white balancing and gamma correction) and of compression: demosaicing introduces spatial correlation between pixels, and non-linear operations destroy the linear relationship between the expectation and the variance of the pixel. Second, the proposed fingerprint, defined by the parameters (a, b), depends on the ISO sensitivity. This is not crucial in practice, because there are few ISO sensitivities and a small number of images suffices to estimate the reference parameters (a, b) for each of them. It is nevertheless desirable to rely on a fingerprint that is invariant to image content and robust to non-linear transformations (for example, gamma correction).

[0038] To render a full-color output image and improve its visual quality, a RAW image requires post-acquisition processing, for example demosaicing, white balancing and gamma correction. To extend the method of [21] to compressed images, assume that the effect of the demosaicing and white-balancing algorithms on the heteroscedastic relationship between the expectation and the variance of the pixel is negligible. The compressed image then satisfies the following relationship between the expectation and the variance of the pixels: $\sigma^2_{y_{m,n}} = a\,\mu_{y_{m,n}} + b$, where a and b are two parameters characterizing a camera model, and $\mu_{y_{m,n}}$ and $\sigma^2_{y_{m,n}}$ are respectively the mathematical expectation and variance of the pixel $y_{m,n}$ at position (m, n).
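The linear relation between pixel expectation and variance suggests a simple way to picture how (a, b) could be obtained from measured pairs of local mean and local variance. The sketch below uses ordinary least squares, which is an assumption made for illustration; the text does not state which estimator is used, and the function name `fit_noise_line` and the sample values are hypothetical.

```python
def fit_noise_line(means, variances):
    """Ordinary least-squares fit of variance = a * mean + b.

    `means` and `variances` are equal-length sequences of local pixel
    expectations and local noise variances.
    """
    n = len(means)
    mean_x = sum(means) / n
    mean_y = sum(variances) / n
    sxx = sum((x - mean_x) ** 2 for x in means)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(means, variances))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


# Noise-free pairs generated with a = 0.05 and b = 2.0 are recovered exactly.
sample_means = [10.0, 50.0, 100.0, 200.0]
sample_variances = [0.05 * m + 2.0 for m in sample_means]
a_hat, b_hat = fit_noise_line(sample_means, sample_variances)
```

On real data the variance of each local estimate itself grows with the mean, so a weighted fit would usually be preferable to the unweighted one shown here.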

[0039] We now provide an analytical relationship between the parameters (α, β) of the DCT (discrete cosine transform) coefficient model and the camera parameters (a, b). The pixels in each 8 × 8 block are assumed to be independent and identically distributed. In addition, the effect of the demosaicing and white-balancing algorithms on the heteroscedastic relationship between the expectation and the variance of the pixel is assumed to be negligible. Gamma correction is the transformation, applied to each pixel independently, defined as follows:

$$z_{m,n} = |y_{m,n}|^{1/\gamma} = |\mu_{y_{m,n}} + \eta_{m,n}|^{1/\gamma} \tag{15}$$

where |·| denotes the absolute value and γ is the correction factor (typically γ = 2.2). Here, $y_{m,n}$ is the white-balanced pixel and $\eta_{m,n}$ is a zero-mean signal representing the pixel noise at position (m, n), with variance $\sigma^2_{\eta_{m,n}} = \mathrm{Var}[\eta_{m,n}] = a\,\mu_{y_{m,n}} + b$. The first-order Taylor series expansion of $(1 + x)^{1/\gamma}$ at x = 0 leads to:

Taking the expectation and the variance of both sides of equation (16), we obtain:

where $\mu_{z_{m,n}}$ and $\sigma^2_{z_{m,n}}$ denote, respectively, the expectation and the variance of the pixel $z_{m,n}$. Relation (17) holds under the assumption $\eta_{m,n} \ll \mu_{y_{m,n}}$. Moreover, the variance of the gamma-corrected pixel, $\sigma^2_{z_{m,n}}$, is directly proportional to the noise variance $\sigma^2_{\eta_{m,n}}$ and inversely proportional to the expectation $\mu_{z_{m,n}}$. Taking the variance of both sides of equation (1), it follows that $\mathrm{Var}[I_{p,q}] = \mathrm{Var}[z_{m,n}] = \sigma^2_{z_{m,n}}$. This holds under the assumption that the pixels are independent and identically distributed within the 8 × 8 block. It follows from equation (9) that

$$\alpha_{p,q}\,\beta_{p,q} = \sigma^2_{z_{m,n}} \tag{18}$$
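The expansion referred to as (16) and the moments referred to as (17) can be written out as a short worked derivation. This is a reconstruction under stated assumptions, namely that gamma correction has the form $z = |\mu_y + \eta|^{1/\gamma}$ with $\mu_y > 0$ and $\eta \ll \mu_y$, and that $\sigma^2_\eta = a\,\mu_y + b$; the notation in the original figures may differ. Pixel indices (m, n) are dropped for readability.

```latex
\begin{align}
z &= \mu_y^{1/\gamma}\left(1 + \frac{\eta}{\mu_y}\right)^{1/\gamma}
   \approx \mu_y^{1/\gamma}\left(1 + \frac{\eta}{\gamma\,\mu_y}\right)
   && \text{(first-order Taylor expansion)} \\
\mu_z &\approx \mu_y^{1/\gamma}
   && \text{(since } E[\eta] = 0\text{)} \\
\sigma_z^2 &\approx \frac{1}{\gamma^2}\,\mu_y^{2/\gamma - 2}\,\sigma_\eta^2
   = \frac{1}{\gamma^2}\,\mu_z^{2 - 2\gamma}\left(a\,\mu_z^{\gamma} + b\right)
   && \text{(using } \mu_y = \mu_z^{\gamma}\text{)}
\end{align}
```

The last line makes the two stated dependencies visible: the variance scales linearly with the noise variance, and for γ > 1 the factor $\mu_z^{2-2\gamma}$ decreases as the expectation grows.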

Moreover, from (9) and (10), we obtain

(19)

where, from (16),

(22)

Consequently, from (19)-(22), we obtain:

For simplicity, solving equations (18) and (23) and again using a Taylor series expansion yields an approximately linear relationship

where the parameters $(c_{p,q}, d_{p,q})$ depend on both the frequency (p, q) and the camera parameters (a, b). Consequently, the parameters $(c_{p,q}, d_{p,q})$ can be used as a fingerprint for identifying the camera model.

5.B Estimation of the camera fingerprint

[0040] Recall that JPEG compression involves three basic steps: the DCT, uniform quantization of the DCT coefficients with a quantization matrix, and entropy coding of the quantized values. It might therefore seem that the camera fingerprints $(c_{p,q}, d_{p,q})$ given by the mathematical analysis above could be used with the quantized DCT coefficients extracted from the JPEG file. However, quantization discards information that is not visually significant, and most high-frequency coefficients are zero. There are therefore not enough statistics to estimate the parameters $(c_{p,q}, d_{p,q})$. In addition, the parameters $(c_{p,q}, d_{p,q})$ estimated at low frequencies can be contaminated, because the corresponding DCT coefficients are strongly influenced by the image content.
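The effect of quantization on high-frequency coefficients can be illustrated with a toy example. The sketch below is not taken from the described system: the coefficient magnitudes, the flat quantization step of 16 and the function name `quantize` are all hypothetical, chosen only to show why few non-zero values remain at high frequencies.

```python
def quantize(coeffs, qtable):
    """Uniform quantization: divide each DCT coefficient by its step and round."""
    return [[int(round(coeffs[p][q] / qtable[p][q])) for q in range(8)]
            for p in range(8)]


# Toy coefficients whose magnitude decays with frequency (hypothetical values).
toy_coeffs = [[400.0 / (1 + p + q) ** 2 for q in range(8)] for p in range(8)]

# A flat quantization table with step 16 (real JPEG tables vary per frequency).
flat_table = [[16] * 8 for _ in range(8)]

quantized = quantize(toy_coeffs, flat_table)
zero_count = sum(1 for row in quantized for v in row if v == 0)
```

In this toy block every coefficient with p + q of 7 or more is rounded to zero, so more than half of the 64 values carry no information after quantization.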

[0041] Let Z be a color image with three components $Z = \{Z^c\}$, where $c \in \{R, G, B\}$ denotes the index of the red, green and blue channels. To attenuate the impact of the image content, we propose to remove the content from the image Z in the spatial domain by applying a denoising filter $\mathcal{D}$ to each color channel, which yields a residual image $W^c$:

$$W^c = Z^c - \mathcal{D}(Z^c) \tag{25}$$

The residual image with three components $\{W^c\}$ is converted to a monochrome image as

$$\tilde{W} = 0.298\,W^R + 0.587\,W^G + 0.114\,W^B \tag{26}$$
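The residual extraction and monochrome conversion can be sketched as below. The denoising filter is not specified in this passage, so a 3 × 3 mean filter with edge replication is used purely as a stand-in; the function names are hypothetical, and the channel weights are the ones read from relation (26).

```python
def mean_filter_3x3(channel):
    """Stand-in denoiser (an assumption): 3x3 mean filter with edge replication."""
    height, width = len(channel), len(channel[0])
    out = [[0.0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            acc = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr = min(max(r + dr, 0), height - 1)
                    cc = min(max(c + dc, 0), width - 1)
                    acc += channel[rr][cc]
            out[r][c] = acc / 9.0
    return out


def residual(channel, denoise=mean_filter_3x3):
    """Residual W = Z - D(Z) for one color channel."""
    smooth = denoise(channel)
    return [[channel[r][c] - smooth[r][c] for c in range(len(channel[0]))]
            for r in range(len(channel))]


def to_monochrome(w_r, w_g, w_b, weights=(0.298, 0.587, 0.114)):
    """Weighted sum of the three residual channels."""
    k_r, k_g, k_b = weights
    return [[k_r * w_r[r][c] + k_g * w_g[r][c] + k_b * w_b[r][c]
             for c in range(len(w_r[0]))]
            for r in range(len(w_r))]


# A constant channel contains no noise, so its residual is zero everywhere.
flat_channel = [[5.0] * 4 for _ in range(4)]
flat_residual = residual(flat_channel)

ones = [[1.0]]
mono_value = to_monochrome(ones, ones, ones)[0][0]
```

Any denoiser with the same call shape can be passed through the `denoise` argument, which keeps the residual step independent of the filter choice.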

The residual image $\tilde{W}$ is then transformed into the DCT domain:

$$I^W = \mathrm{DCT}(\tilde{W}) \tag{27}$$

where $I^W$ is the image of the DCT coefficients of the residual image $\tilde{W}$.

[0042] For clarity, the image $I^W$ is arranged into 64 vectors of DCT coefficients following the zig-zag order used in the JPEG compression standard. Let $I^W_k = (I^W_{k,1}, \ldots, I^W_{k,N})^T$, $k \in \{1, \ldots, 64\}$, be the vector of length N representing the k-th DCT coefficient, where $I^W_{k,i}$, $1 \le i \le N$, denotes the value of the k-th DCT coefficient of block i and $U^T$ denotes the transpose of the matrix U. Similarly, the parameters characterizing the distribution of $I^W_k$ are denoted $(\alpha_k, \beta_k)$ and the camera fingerprints are denoted $(c_k, d_k)$.
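The zig-zag arrangement into 64 coefficient vectors can be sketched as follows. The function names are hypothetical; the ordering is the standard JPEG zig-zag scan, generated here by sorting positions along anti-diagonals and alternating the direction of travel.

```python
def zigzag_order(n=8):
    """Return the (row, column) positions of an n x n block in zig-zag order."""
    def key(pos):
        r, c = pos
        diagonal = r + c
        # Odd diagonals run downwards (row increasing), even ones run upwards.
        return (diagonal, r if diagonal % 2 else c)

    return sorted(((r, c) for r in range(n) for c in range(n)), key=key)


def coefficient_vectors(dct_blocks):
    """Arrange N blocks of 8x8 DCT coefficients into 64 vectors of length N.

    Vector k (0-based) holds the coefficient at zig-zag position k of every block.
    """
    order = zigzag_order(8)
    return [[block[r][c] for block in dct_blocks] for (r, c) in order]


order = zigzag_order()

# Two identical demo blocks whose entries encode their own position.
demo_block = [[8 * r + c for c in range(8)] for r in range(8)]
vectors = coefficient_vectors([demo_block, demo_block])
```

With 0-based indexing, `vectors[0]` collects the DC coefficients and `vectors[1]` to `vectors[63]` collect the AC coefficients, matching k = 1 and k = 2, ..., 64 in the text.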

[0043] The mathematical analysis above rests on the strong assumption that pixels are identically distributed within a small 8 × 8 block. This assumption may be unrealistic because of edges or details in a natural image. Therefore, to approximately satisfy the mathematical framework, we work only on homogeneous 8 × 8 blocks and estimate the parameters on the selected data. Before block selection, saturated blocks must be excluded because they violate the framework. A pixel is considered saturated if its gray level lies in the lowest or highest 10% of the dynamic range (typically [0, 255] for an 8-bit image, so the saturated zones are [0, 25] and [220, 255]). A block is considered saturated if it contains at least one saturated pixel. Block selection uses a simple non-adaptive method based on the standard deviation of the block: in a homogeneous block without edges or details, the standard deviation should be small. It is estimated with a robust estimator, the median of the absolute deviations from the mean (MAD). A block $i$ is therefore selected if two conditions, governed by the thresholds $T_1$ and $T_2$, are both met, where $k \in \{2, \ldots, 64\}$ and $I_{k,i}$ denotes the $k$-th DCT coefficient in block $i$ of the grayscale image $\tilde{Z}$ obtained with the same transformation as in (26). Instead of computing the MAD of each block in the spatial domain, we compute it in the DCT domain to take advantage of the decorrelation and energy compaction properties of the DCT and thus obtain a better estimate of the standard deviation. The DC coefficients are excluded from the calculation. The thresholds are set to $T_1 = 1.5$ and $T_2 = 0.8$. The first condition eliminates blocks with strong edges, and the second removes blocks where the denoising filter may have introduced a disturbance.
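The saturation check and the DCT-domain MAD estimate described above can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the MAD is centered on the median and scaled by 1.4826 (the usual Gaussian consistency constant, which the text does not specify), and the two selection conditions involving $T_1$ and $T_2$ are left to the caller because their exact form is not reproduced here.

```python
import statistics


def is_saturated_block(block, low=25, high=220):
    """True if any pixel lies in the saturated zones [0, low] or [high, 255]."""
    return any(p <= low or p >= high for row in block for p in row)


def mad_std_estimate(values):
    """Robust standard-deviation estimate from the median absolute deviation.

    Assumption: deviations are taken from the median and scaled by 1.4826.
    """
    values = list(values)
    center = statistics.median(values)
    mad = statistics.median([abs(v - center) for v in values])
    return 1.4826 * mad


def block_noise_scale(zigzag_coeffs):
    """MAD-based estimate from the 63 AC coefficients of one block.

    zigzag_coeffs holds the 64 DCT coefficients of the block in zig-zag
    order; the DC coefficient (index 0) is excluded, as in the text.
    """
    return mad_std_estimate(zigzag_coeffs[1:])
```

A caller would first discard blocks for which `is_saturated_block` returns true, then compare the value returned by `block_noise_scale` with the thresholds of the selection rule.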

[0044] In practice, the camera fingerprint $(c_k, d_k)$ must be estimated from a single image, so several pairs $(\alpha_k, \beta_k)$ are needed from that single image. Let $\tilde{N}$ be the number of selected blocks. For each frequency $k$, we randomly draw $N$ samples, $N \le \tilde{N}$, of $I^w_{k,i}$, and repeat the draw for $\ell \in \{1, \ldots, L\}$; from each set of $N$ samples, we compute the maximum likelihood (ML) estimates $(\hat{\alpha}_{k,\ell}, \hat{\beta}_{k,\ell})$. The parameters $(c_k, d_k)$ are then estimated from the $L$ pairs $(\hat{\alpha}_{k,\ell}, \hat{\beta}_{k,\ell})$, $\ell \in \{1, \ldots, L\}$, by ordinary least squares (OLS), see [36]. From relation (24) we obtain:
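The resampling and least-squares steps can be sketched as below. This is a hedged illustration: the function names are hypothetical, the draws are assumed to be without replacement, and the ML estimation of each pair is omitted because it depends on the DCT coefficient model (8). The regression is written for generic pairs $(x_\ell, y_\ell)$; which function of $(\hat{\alpha}_{k,\ell}, \hat{\beta}_{k,\ell})$ plays the role of $x$ and $y$ is fixed by relation (24).

```python
import random


def draw_subsets(values, n, num_draws, seed=0):
    """Draw num_draws random subsets of size n (without replacement)."""
    rng = random.Random(seed)
    return [rng.sample(values, n) for _ in range(num_draws)]


def ols_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept
```

For example, feeding `ols_line` pairs that lie exactly on a line with slope 11.5 and intercept -4 returns those two values up to rounding; with real estimates, the slope and intercept play the role of $(\hat{c}_k, \hat{d}_k)$.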

[0045] The OLS estimates $(\hat{c}_k, \hat{d}_k)$ are unbiased and asymptotically equivalent to ML estimates, because the ML estimates $(\hat{\alpha}_{k,\ell}, \hat{\beta}_{k,\ell})$ are asymptotically Gaussian, see [37]. Consequently, the estimates $(\hat{c}_k, \hat{d}_k)$ are also asymptotically Gaussian. To ensure that there are enough samples for the estimation, we use $L = 200$ and $N = 4096$ (about 10% of $\tilde{N}$).

[0046] Even after the image content has been removed, the low-frequency DCT coefficients $I^w_k$ may still be slightly affected. It is therefore more relevant to use the high-frequency DCT coefficients $I^w_k$. Fig. 4 gives an example illustrating the linear relationship between $\alpha_k$ and $\beta_k^{-1}$. In addition, Fig. 5 shows a scatter plot of the parameters $(\hat{c}, \hat{d})$ estimated from each image with the method above. The images used for Fig. 4 and Fig. 5 cover different scenes and different camera settings. The parameters $(c, d)$ are invariant and robust to processing by nonlinear algorithms. Furthermore, because the parameters $(c, d)$ are related to the parameters $(a, b)$ proposed for camera model identification in [25] and [35], they are also discriminative across camera models and can likewise be exploited for camera model identification.

6- Formulation of the statistical hypothesis test

[0047] The present invention aims to identify camera models from DCT statistics. This part considers two camera models, 0 and 1. Each camera model $j$, $j \in \{0, 1\}$, is characterized by parameters $(c_{k,j}, d_{k,j})$, where $k$ denotes the frequency, $k \in \{2, \ldots, 64\}$. In a binary hypothesis test, the inspected image $Z$ was acquired either by camera model 0 or by camera model 1. The objective of the test is to decide between the two hypotheses defined by:

where $P_{\alpha_k, \beta_{k,j}}$ denotes the statistical distribution of the DCT coefficients $I^w_{k,i}$ under hypothesis $\mathcal{H}_j$ with parameters $(\alpha_k, \beta_{k,j})$. As explained previously, the emphasis is on guaranteeing a prescribed false-alarm probability. We therefore define the class of tests, denoted $\mathcal{K}_{\alpha_0}$, whose false-alarm probability is upper-bounded by the prescribed rate $\alpha_0$. Here, $P_{\mathcal{H}_j}(E)$ denotes the probability of event $E$ under hypothesis $\mathcal{H}_j$, $j \in \{0, 1\}$, and the supremum over $\theta$ is to be understood as taken over any value of the model parameters. Among all tests in the class $\mathcal{K}_{\alpha_0}$, we seek a test $\delta$ that maximizes the power function, defined as the probability of correct detection:

[0048] The problem defined in (30) highlights three fundamental difficulties of camera model identification. First, even when all model parameters $(\alpha_k, c_{k,j}, d_{k,j})$ are known, the most powerful test, namely the likelihood ratio test (LRT), has never been studied for this problem. The second difficulty concerns the nuisance parameters $\alpha_k$, which are unknown in practice. One possible approach to unknown nuisance parameters is to eliminate them using the invariance principle, see [38]. This approach has been discussed in [39], [40] and has performed well in some applications, see [41]. However, it cannot be applied here because the tested hypotheses do not have the "symmetry" properties required by the invariance principle in statistics. Another approach is to design a generalized likelihood ratio test (GLRT) by replacing the unknown parameters with their ML estimates, see [42]. Finally, the two hypotheses $\mathcal{H}_0$ and $\mathcal{H}_1$ are composite because the camera parameters $(c_{k,j}, d_{k,j})$ are unknown.

[0049] For the sake of clarity, we assume that the camera parameters $(c_{k,0}, d_{k,0})$ are known and solve only the problem in which the alternative hypothesis is composite, that is, the camera parameters $(c_{k,1}, d_{k,1})$ are unknown. It should be noted that a test maximizing the detection power whatever the value of $(c_{k,1}, d_{k,1})$ could exist. Our main goal is to study the LRT and to design the GLRT in order to address the second and third difficulties.

[0050] In addition, the GLRT that handles the unknown nuisance parameters $\alpha_k$ when the camera parameters are known can be interpreted as a closed hypothesis test, in which a given image was acquired either by camera model 0 or by camera model 1. By contrast, the GLRT that handles the unknown camera parameters $(c_{k,1}, d_{k,1})$ becomes an open hypothesis test, in which a given image was acquired by camera model 0 or not; the image may indeed have been acquired by an unknown camera model. The two proposed tests can therefore be applied according to the requirements of the context.

6.A- Likelihood ratio test for two simple hypotheses

[0051] When all model parameters are known, by virtue of the Neyman-Pearson lemma [31, Theorem 3.2.1], the most powerful test $\delta$ that solves problem (30) is the LRT given by the following decision rule:

where the decision threshold $\tau$ is the solution of the equation

$P_{\mathcal{H}_0}\big(\Lambda(Z) \ge \tau\big) = \alpha_0$ (32)

to ensure that the LRT is in the class $\mathcal{K}_{\alpha_0}$. Using the approximate DCT coefficient model defined in (8), the likelihood ratio (LR) of an observation $I_{k,i}$ is defined by

The statistical distribution of the LR $\Lambda(Z)$ under each hypothesis $\mathcal{H}_j$ is given by

$\Lambda(Z) \xrightarrow{d} \mathcal{N}(m_j, v_j)$ (34)

where $\xrightarrow{d}$ denotes convergence in distribution and $(m_j, v_j)$ denote respectively the mean and the variance of the LR $\Lambda(Z)$, defined later in (66) and (67). Since a natural RAW image is heterogeneous, it is proposed to normalize the LR $\Lambda(Z)$ so that the decision threshold can be set independently of the image content. The normalized LR is defined by

As a result, the corresponding test $\delta^*$ is rewritten as follows:

[0052] Normalizing the LR $\Lambda(Z)$ makes the test $\delta^*$ applicable to any natural RAW image, because the normalized LR $\Lambda^*(Z)$ follows the standard Gaussian distribution under hypothesis $\mathcal{H}_0$. The decision threshold $\tau^*$ and the power function $\beta_{\delta^*}$ are given by the following theorem:

[0053] Theorem 1: Assuming that all model parameters $(\alpha_k, c_{k,j}, d_{k,j})$ are exactly known, the decision threshold and the power function $\beta_{\delta^*}$ of the test $\delta^*$ are given by

$\tau^* = \Phi^{-1}(1 - \alpha_0)$ (37)

where $\Phi(\cdot)$ and $\Phi^{-1}(\cdot)$ denote respectively the cumulative distribution function of the standard Gaussian distribution and its inverse.

This decision threshold guarantees that the false-alarm probability equals $\alpha_0$, that is, the probability of deciding that the photograph does not come from camera 0 when it actually does.
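The threshold (37) and the resulting decision can be sketched with the Python standard library. The function names are hypothetical. Two points are assumptions rather than statements of the source: the normalization is taken to be $(\Lambda(Z) - m_0)/\sqrt{v_0}$, which is what the standard Gaussian limit under $\mathcal{H}_0$ suggests, and the power expression is the one that follows from the Gaussian limit (34) under $\mathcal{H}_1$ with that normalization.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard Gaussian, mean 0 and variance 1


def decision_threshold(alpha0):
    """Threshold of equation (37): inverse Gaussian CDF at 1 - alpha0."""
    return _STD_NORMAL.inv_cdf(1.0 - alpha0)


def normalized_lr(lr, m0, v0):
    """Assumed normalization: center and scale with the H0 mean and variance."""
    return (lr - m0) / math.sqrt(v0)


def decide(lr, m0, v0, alpha0):
    """Return 1 (camera model 1) if the normalized LR reaches the threshold."""
    return 1 if normalized_lr(lr, m0, v0) >= decision_threshold(alpha0) else 0


def power(alpha0, m0, v0, m1, v1):
    """Detection probability under H1, assuming LR ~ N(m1, v1) under H1."""
    tau = decision_threshold(alpha0)
    return 1.0 - _STD_NORMAL.cdf((m0 - m1 + tau * math.sqrt(v0)) / math.sqrt(v1))
```

When the two hypotheses share the same mean and variance, `power` returns `alpha0`, which matches the intuition that indistinguishable camera models cannot be separated better than by chance at the prescribed false-alarm rate.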

[0054] Since we propose to design the LRT using the approximate DCT coefficient model defined in (8), it is desirable to evaluate the loss of power between the theoretical LRT and the approximate LRT. In the theoretical LRT, we use the exact DCT coefficient model defined in (7), and the mean and variance of the LR are computed numerically by integration. The power function of the two LRTs is shown in Fig. 6 for data simulated with the parameters $\alpha = 3$, $c_0 = 11.5$, $d_0 = -4$, $c_1 = 13$, $d_1 = -5.5$. These values correspond to the parameters at frequency 64 of the Canon Ixus 70 and Nikon D200 cameras, respectively, from the Dresden image database, see [33] (see Fig. 5). They are used to generate two vectors of $2^{10}$ and $2^{12}$ coefficients. The simulation is performed with 5000 repetitions.

[0055] Fig. 6 clearly shows that the loss of power between the theoretical LRT and the approximate LRT is negligible. The detection power $\beta_{\delta^*}$ serves as an upper bound for any statistical test addressing the camera identification problem. The test $\delta^*$ guarantees a prescribed false-alarm rate and also maximizes the detection probability. Because its statistical performance is established analytically, it provides an analytically predictable result for any false-alarm probability $\alpha_0$.

In fact, the LRT makes its decision from the ratio between the likelihood function of a given image under the alternative hypothesis $\mathcal{H}_1$, characterized by the camera parameters $(c_{k,1}, d_{k,1})$, and the likelihood function under the null hypothesis $\mathcal{H}_0$, characterized by the camera parameters $(c_{k,0}, d_{k,0})$. If this ratio is below a threshold, the null hypothesis $\mathcal{H}_0$ is accepted; otherwise, the alternative hypothesis $\mathcal{H}_1$ is accepted. Consequently, the smaller the distance between the two points $(c_{k,0}, d_{k,0})$ and $(c_{k,1}, d_{k,1})$, the more difficult the identification of the camera.

6.B- Generalized likelihood ratio test

[0056] In this part, two GLRTs are designed to deal with unknown parameters. It is proposed to replace the unknown parameters by their ML estimates defined in (33). Assuming that the camera parameters $(c_{k,0}, d_{k,0})$ and $(c_{k,1}, d_{k,1})$ are known, the first GLRT is designed as follows

(39) where, again to ensure that $\hat{\delta}_1$ is in the class $\mathcal{K}_{\alpha_0}$, $\hat{\tau}_1$ is the solution of the equation

and the generalized likelihood ratio (GLR) $\hat{\Lambda}_1(I_{k,i})$ is given by

Here, the ML estimate $\hat{\alpha}_k$ is given by the method proposed previously, and $\hat{\beta}_{k,j} = c_{k,j}\,\hat{\alpha}_k + d_{k,j}$, $j \in \{0, 1\}$. The ML estimate $\hat{\alpha}_k$ converges asymptotically to its true value: $\hat{\alpha}_k \to \alpha_k$. Therefore, by Slutsky's theorem [38, Theorem 11.2.11], we obtain the asymptotic statistical distribution of the GLR $\hat{\Lambda}_1(Z)$ under each hypothesis $\mathcal{H}_j$:

$\hat{\Lambda}_1(Z) \xrightarrow{d} \mathcal{N}(m_j, v_j)$ (42)

As in the case of the LRT, it is proposed to normalize the GLR $\hat{\Lambda}_1(Z)$. However, the expectation $m_0$ and the variance $v_0$ are not available, because the parameter $\alpha_k$ is unknown in practice. We replace $\alpha_k$ by $\hat{\alpha}_k$ in (66) and (67) to obtain estimates of $m_0$ and $v_0$, denoted $\hat{m}_0^{(1)}$ and $\hat{v}_0^{(1)}$ respectively. The normalized GLR $\hat{\Lambda}_1^*(Z)$ is then defined by: and the corresponding test $\hat{\delta}_1^*$ is rewritten as follows

Using Slutsky's theorem [38, Theorem 11.2.11] again, we obtain the decision threshold and the power function of the test $\hat{\delta}_1^*$.

[0057] Theorem 2. When the camera parameters $(c_{k,0}, d_{k,0})$ and $(c_{k,1}, d_{k,1})$ are known, the decision threshold and the power function of the test $\hat{\delta}_1^*$ are given by

When the camera parameters $(c_{k,1}, d_{k,1})$ are not known, a GLRT can also be designed following the procedure above. The unknown parameters $(c_{k,1}, d_{k,1})$ are replaced by the OLS estimates $(\hat{c}_{k,1}, \hat{d}_{k,1})$ defined in (33)

where $\hat{\beta}_{k,0} = c_{k,0}\,\hat{\alpha}_k + d_{k,0}$ and $\hat{\beta}_{k,1} = \hat{c}_{k,1}\,\hat{\alpha}_k + \hat{d}_{k,1}$.

[0058] Since the OLS estimates $(\hat{c}_{k,1}, \hat{d}_{k,1})$ are consistent, the GLR $\hat{\Lambda}_2(Z) = \sum_{k}\sum_{i=1}^{N} \hat{\Lambda}_2(I_{k,i})$ also converges to the Gaussian distribution with mean $m_j$ and variance $v_j$ under each hypothesis $\mathcal{H}_j$. Similarly, the normalized GLR $\hat{\Lambda}_2^*(Z)$ is defined by where $\hat{m}_0^{(2)}$ and $\hat{v}_0^{(2)}$ are the estimates of the expectation $m_0$ and the variance $v_0$ obtained by replacing $\alpha_k$ by $\hat{\alpha}_k$ and $(c_{k,1}, d_{k,1})$ by $(\hat{c}_{k,1}, \hat{d}_{k,1})$. The corresponding test $\hat{\delta}_2^*$ is rewritten as follows

By Slutsky's theorem, the decision threshold and the power function of the test $\hat{\delta}_2^*$ are given by the following theorem: Theorem 3. When the camera parameters $(c_{k,0}, d_{k,0})$ are known and the parameters $(c_{k,1}, d_{k,1})$ are not known, the decision threshold and the power function of the test $\hat{\delta}_2^*$ are given by:

[0059] Both proposed GLRTs can be applied in practice. The first GLRT, $\hat{\delta}_1^*$, decides whether a given image was acquired by camera model 0 or by camera model 1, whereas the GLRT $\hat{\delta}_2^*$ verifies whether the given image was acquired by camera model 0.

[0060] This decision threshold guarantees that the false-alarm probability equals $\alpha_0$, that is, the probability of deciding that the photograph does not come from camera 0 when it actually does.

[0061] Fig. 7 shows the power function of the two GLRTs $\hat{\delta}_1^*$ and $\hat{\delta}_2^*$ compared with the LRT $\delta^*$ for $2^{10}$ and $2^{12}$ coefficients. A loss of power appears, caused by the estimation error on the model parameters. This loss decreases as the number of coefficients increases. With $2^{14}$ coefficients, all tests are perfect, $\beta_{\delta^*} = \beta_{\hat{\delta}_1^*} = \beta_{\hat{\delta}_2^*} = 1$, that is, there is no detection error over 5000 images simulated from camera model 0 and 5000 images simulated from camera model 1.

[0062] The proposed method can be extended to images taken from a video stream. A video stream is a succession of images displayed at a fixed rate. Video compression reduces the amount of data while minimizing the impact on the visual quality of the video, which lowers the cost of storing and transmitting video files. Video sequences contain very high statistical redundancy in both the temporal and the spatial domain. The fundamental statistical property that compression techniques exploit is the correlation between pixels. This correlation is both spatial (adjacent pixels of the current image are similar) and temporal (pixels of past and future images are also very close to the current pixel). MPEG-type video compression algorithms apply the DCT (discrete cosine transform) to blocks of 8 x 8 pixels in order to analyze efficiently the spatial correlations between neighboring pixels of the same image. Thus, in the method according to the invention, the photograph may be an image taken from a video stream and compressed according to the MPEG standard.

[0063] The system for identifying a camera model can also be used to determine whether an area of the image has been falsified, by copy/paste from another photograph or by removal of an element.
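The 8 x 8 block DCT mentioned above for MPEG-type compression can be sketched as follows. This is a plain, unoptimized illustration of the orthonormal two-dimensional DCT-II; the helper names `dct2_block` and `blocks` are hypothetical, and real codecs use fast factorizations and their own scaling conventions.

```python
import math

N = 8  # block size used by JPEG and MPEG-type codecs


def dct2_block(block):
    """Orthonormal 2-D DCT-II of one 8 x 8 block given as a list of 8 rows."""
    def scale(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    out = [[0.0] * N for _ in range(N)]
    for p in range(N):
        for q in range(N):
            acc = 0.0
            for m in range(N):
                for n in range(N):
                    acc += (block[m][n]
                            * math.cos((2 * m + 1) * p * math.pi / (2 * N))
                            * math.cos((2 * n + 1) * q * math.pi / (2 * N)))
            out[p][q] = scale(p) * scale(q) * acc
    return out


def blocks(image):
    """Yield the non-overlapping 8 x 8 blocks of a 2-D list of pixel values.

    Rows and columns that do not fill a complete block are ignored.
    """
    rows, cols = len(image), len(image[0])
    for r in range(0, rows - N + 1, N):
        for c in range(0, cols - N + 1, N):
            yield [row[c:c + N] for row in image[r:r + N]]
```

For a constant block, all the energy ends up in the coefficient at frequency (0, 0), which is why the coefficients at the other frequencies (p, q) carry the information about local variations and noise.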

[0064] Finally, the system for identifying a camera can also be used to determine, in a supervised manner, whether an area of the image has been falsified (by copy/paste from another photograph or by removal of an element). Here "supervised" means that the user wishes to verify the integrity of a previously defined area. The principle is then to apply the identification method to the two "sub-images" formed respectively by the area targeted by the user and by its complement (the rest of the image). If the inspected element comes from another photograph and was added by copy/paste, its noise properties will be different, which the proposed system will be able to identify (assuming that the photographs were not taken under the same acquisition conditions and with the same camera model, which seems reasonable).

[0065] The proposed method for identifying the camera model addresses the two weaknesses of the methods briefly presented in the state of the art: 1) their performance is not established, and 2) they can be defeated by calibration of the camera. Because the proposed method relies on the noise properties inherent to the acquisition of compressed photographs, it applies whatever post-acquisition processing a user has applied (in particular to improve visual quality). In addition, the parametric modeling of the statistical distribution of pixel values in the frequency domain makes it possible to establish analytically the performance of the proposed test. This advantage makes it possible, in particular, to guarantee a prescribed constraint on the error probability.
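The supervised case described above, in which the image is split into the user-defined area and its complement, could be prepared along the following lines. This sketch assumes a rectangular target area and works on the 8 x 8 block grid; discarding the blocks that straddle the boundary is a design choice of this example, not a step stated in the description, and the function name is hypothetical.

```python
def split_blocks(n_rows, n_cols, top, left, height, width, size=8):
    """Classify grid blocks as lying inside or outside a rectangular area.

    Returns two lists of (row, column) block origins: blocks entirely inside
    the target area and blocks entirely outside it.
    """
    target, complement = [], []
    for r in range(0, n_rows - size + 1, size):
        for c in range(0, n_cols - size + 1, size):
            inside = (top <= r and r + size <= top + height
                      and left <= c and c + size <= left + width)
            outside = (r + size <= top or r >= top + height
                       or c + size <= left or c >= left + width)
            if inside:
                target.append((r, c))
            elif outside:
                complement.append((r, c))
            # Blocks straddling the boundary mix both sources and are skipped.
    return target, complement
```

The identification test would then be run separately on the coefficients of each group of blocks, and a disagreement between the two results would point to a possible falsification.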

[0066] The main fields of application of the invention are, on the one hand, the search for evidence from a "compromising" image and, on the other hand, the guarantee that a photograph was acquired by a given camera.

[0067] The proposed method can be extended to checking the integrity of a photograph. The goal is then to guarantee that a photograph has not been modified or falsified since its acquisition. This makes it possible, for example, to detect photographs containing elements that come from a different camera, i.e. imported after acquisition, or to ensure the integrity of a scanned or photographed document (a legal document, for example).

[0068] The method of the invention may be implemented in the specialized software of software vendors for the search for evidence from digital media. The method according to the invention can be used in court as a decision-support tool.

7- Numerical experiments

7.A- Results on a large database

ANNEX A

Laplace approximation of the DCT coefficient model

We briefly describe the Laplace approximation [43]. Laplace's method provides an approximation of integrals of the form $\int \exp(-g(t))\,dt$ when the function $g(t)$ attains its global minimum at $t^*$. Consider the integral

$$I = \int \exp\big(-g(t)\big)\,dt.$$

The second-order Taylor expansion of the function $g(t)$ at $t^*$ gives

$$g(t) \approx g(t^*) + g'(t^*)\,(t - t^*) + \tfrac{1}{2}\,g''(t^*)\,(t - t^*)^2,$$

where $g'(t)$ and $g''(t)$ denote respectively the first and second derivatives of $g(t)$. Since $g(t)$ attains a minimum at $t^*$, $g'(t^*) = 0$. Consequently, the integral $I$ can be approximated as

$$I \approx \exp\big(-g(t^*)\big) \int \exp\Big(-\tfrac{1}{2}\,g''(t^*)\,(t - t^*)^2\Big)\,dt.$$

This integral takes the form of a Gaussian integral, and we obtain

$$I \approx \exp\big(-g(t^*)\big)\,\sqrt{\frac{2\pi}{g''(t^*)}}.$$

A generalization to an arbitrary function $h(t)$ was given in [32]:

$$\int h(t)\,\exp\big(-g(t)\big)\,dt \approx h(t^*)\,\exp\big(-g(t^*)\big)\,\sqrt{\frac{2\pi}{g''(t^*)}}.$$

From relation (6), the DCT coefficient model $f_I(x)$ is rewritten as

$$f_I(x) = \frac{1}{\sqrt{2\pi}\,\beta^{\alpha}\,\Gamma(\alpha)} \int_0^{\infty} h(t)\,\exp\big(-g(t)\big)\,dt, \qquad g(t) = \frac{x^2}{2t} + \frac{t}{\beta}, \quad h(t) = t^{\alpha - \frac{3}{2}}. \qquad (58)$$

The function $g(t)$ attains its minimum at $t^* = |x|\sqrt{\beta/2}$ and its second derivative is $g''(t) = x^2/t^3$. Consequently, the function $f_I(x)$ can be approximated as

$$f_I(x) \approx \frac{|x|^{\alpha - 1}}{(2\beta)^{\alpha/2}\,\Gamma(\alpha)}\,\exp\Big(-|x|\sqrt{\tfrac{2}{\beta}}\Big).$$
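The quality of this approximation can be checked numerically. The sketch below assumes that a DCT coefficient is zero-mean Gaussian given its variance $t$ and that the variance follows a Gamma distribution with shape $\alpha$ and scale $\beta$, which is consistent with the minimizer $t^* = |x|\sqrt{\beta/2}$ and the second derivative $g''(t) = x^2/t^3$ stated above. The function names, the integration grid, and the choice of the trapezoidal rule are assumptions of this example.

```python
import math


def mixture_pdf(x, alpha, beta, n=6000, lo=-25.0, hi=12.0):
    """Numerically integrate N(x; 0, t) * Gamma(t; alpha, scale=beta) over t > 0.

    The substitution t = exp(u) gives a smooth integrand in u, which is
    integrated with the trapezoidal rule. Requires x != 0.
    """
    du = (hi - lo) / n
    total = 0.0
    for i in range(n + 1):
        t = math.exp(lo + i * du)
        gauss = math.exp(-x * x / (2.0 * t)) / math.sqrt(2.0 * math.pi * t)
        gamma = (t ** (alpha - 1.0) * math.exp(-t / beta)
                 / (beta ** alpha * math.gamma(alpha)))
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * gauss * gamma * t * du  # dt = t du
    return total


def laplace_pdf_approx(x, alpha, beta):
    """Laplace approximation h(t*) exp(-g(t*)) sqrt(2 pi / g''(t*)) of the same integral."""
    t_star = abs(x) * math.sqrt(beta / 2.0)          # minimizer of g
    g_star = x * x / (2.0 * t_star) + t_star / beta  # g(t*)
    g2_star = x * x / t_star ** 3                    # g''(t*)
    h_star = t_star ** (alpha - 1.5)                 # h(t*)
    const = 1.0 / (math.sqrt(2.0 * math.pi) * beta ** alpha * math.gamma(alpha))
    return const * h_star * math.exp(-g_star) * math.sqrt(2.0 * math.pi / g2_star)
```

Under these assumptions the approximation is exact for $\alpha = 1$, where the mixture reduces to a Laplace density, and its relative error decreases as $|x|\sqrt{2/\beta}$ grows for other values of $\alpha$.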

ANNEX B

Statistical distribution of the LR $\Lambda(Z)$

[0069] Note that, from relation (33), it is necessary to define the first two moments of the variable $|I|$. Given a known variance $\sigma^2$, the random variable $I$ is zero-mean Gaussian with variance $\sigma^2$. Thus, the random variable $|I|$ follows the half-normal distribution; see [44]. Therefore, we get

$$\mathbb{E}\big[|I| \,\big|\, \sigma^2\big] = \sigma\sqrt{\frac{2}{\pi}}, \qquad \mathrm{Var}\big[|I| \,\big|\, \sigma^2\big] = \sigma^2\Big(1 - \frac{2}{\pi}\Big).$$

By the law of total expectation, the expectation of $|I|$ is given by

$$\mathbb{E}_I\big[|I|\big] = \mathbb{E}_{\sigma^2}\Big[\mathbb{E}\big[|I| \,\big|\, \sigma^2\big]\Big] = \sqrt{\frac{2\beta}{\pi}}\;\frac{\Gamma\big(\alpha + \tfrac{1}{2}\big)}{\Gamma(\alpha)}. \qquad (61)$$

In addition, the variance of $|I|$ is given by

$$\mathrm{Var}_I\big[|I|\big] = \mathbb{E}_I\big[|I|^2\big] - \mathbb{E}_I^2\big[|I|\big] = \alpha\beta - \frac{2\beta}{\pi}\;\frac{\Gamma^2\big(\alpha + \tfrac{1}{2}\big)}{\Gamma^2(\alpha)}. \qquad (62)$$
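The two moments of $|I|$ can be cross-checked by simulation. The following sketch assumes that the variance $\sigma^2$ follows a Gamma distribution with shape $\alpha$ and scale $\beta$ and that $I$ is zero-mean Gaussian given $\sigma^2$; the closed forms coded here follow from the half-normal distribution and the law of total expectation under that assumption. The function names, the sample size, and the seed are arbitrary choices for this example.

```python
import math
import random


def abs_moments(alpha, beta):
    """Closed-form mean and variance of |I| for a Gamma(alpha, scale=beta) variance."""
    mean = (math.sqrt(2.0 * beta / math.pi)
            * math.gamma(alpha + 0.5) / math.gamma(alpha))
    var = alpha * beta - mean ** 2  # E[I^2] = E[sigma^2] = alpha * beta
    return mean, var


def simulate_abs_moments(alpha, beta, n=200_000, seed=0):
    """Monte Carlo estimate of the same two moments."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        variance = rng.gammavariate(alpha, beta)  # shape alpha, scale beta
        samples.append(abs(rng.gauss(0.0, math.sqrt(variance))))
    mean = sum(samples) / n
    var = sum((s - mean) ** 2 for s in samples) / (n - 1)
    return mean, var
```

For $\alpha = 1$ the coefficient follows a Laplace distribution, and `abs_moments(1.0, beta)` returns $\sqrt{\beta/2}$ and $\beta/2$, which gives a quick sanity check of the closed forms.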

As a result, it follows from (33) that the first two moments of the LR $\Lambda(I_{k,i})$ under each hypothesis $\mathcal{H}_j$ are given by relations (63) and (64), where $\mathbb{E}_j[\cdot]$ and $\mathrm{Var}_j[\cdot]$ denote respectively the expectation and the variance under hypothesis $\mathcal{H}_j$. Because a natural image contains a large number of coefficients, Lindeberg's central limit theorem (CLT) [38, Theorem 11.2.5] gives the statistical distribution of the LR $\Lambda(Z)$ as

$$\frac{\Lambda(Z) - \mathbb{E}_j\big[\Lambda(Z)\big]}{\sqrt{\mathrm{Var}_j\big[\Lambda(Z)\big]}} \xrightarrow{\;d\;} \mathcal{N}(0, 1), \qquad (65)$$

where $\xrightarrow{\;d\;}$ denotes convergence in distribution and where $\mathbb{E}_j[\Lambda(Z)]$ and $\mathrm{Var}_j[\Lambda(Z)]$ are the sums, over all the coefficients, of $\mathbb{E}_j[\Lambda(I_{k,i})]$ and $\mathrm{Var}_j[\Lambda(I_{k,i})]$ defined respectively in (63) and (64).

REFERENCES

[1] T. V. Lanh, K.-S. Chong, S. Emmanuel, and M. Kankanhalli, "A survey on digital camera image forensic methods," in Multimedia and Expo, 2007 IEEE International Conference on, 2007, pp. 16-19.

[2] T.-T. Ng, S.-F. Chang, C.-Y. Lin, and Q. Sun, "Passive-blind image forensics," in Multimedia Security Technologies for Digital Rights, 2006.

[3] H. Farid, "A survey of image forgery detection," IEEE Signal Processing Magazine, vol. 26, no. 2, pp. 16-25, 2009.

[4] R. Ramanath, W. E. Snyder, Y. Yoo, and M. S. Drew, "Color image processing pipeline," Signal Processing Magazine, IEEE, vol. 22, no. 1, pp. 34-43, Jan. 2005.

[5] J. Nakamura, Image Sensors and Signal Processing for Digital Still Cameras. CRC Press, 2005.

[6] T. H. Thai, R. Cogranne, and F. Retraint, "Statistical Model of Natural Images," in ICIP 2012, International Conference on Image Processing, Sep. 2012, pp. 2525-2528.

[7] K. S. Choi, E. Y. Lam, and K. Wong, "Source camera identification using footprints from lens aberration," in Proc. of the SPIE, vol. 6069, Feb. 2006, pp. 172-179.

[8] M. Kharrazi, H. T. Sencar, and N. Memon, "Blind source camera identification," in Image Processing, 2004. ICIP '04. 2004 International Conference on, vol. 1, Oct. 2004, pp. 709-712.

[9] S. Bayram, H. Sencar, N. Memon, and I. Avcibas, "Source camera identification based on CFA interpolation," in Image Processing, 2005. ICIP 2005. IEEE International Conference on, vol. 3, Sept. 2005, pp. 69-72.

[10] A. Swaminathan, M. Wu, and K. J. R. Liu, "Nonintrusive component forensics of visual sensors using output images," Information Forensics and Security, IEEE Transactions on, vol. 2, no. 1, pp. 91-106, 2007.

[11] H. Cao and A. C. Kot, "Accurate detection of demosaicing regularity for digital image forensics," Information Forensics and Security, IEEE Transactions on, vol. 4, no. 4, pp. 899-910, Dec. 2009.

[12] K. S. Choi, E. Lam, and K. Wong, "Source camera identification by JPEG compression statistics for image forensics," in TENCON 2006. 2006 IEEE Region 10 Conference, Nov. 2006, pp. 1-4.

[13] G. Xu, S. Gao, Y. Q. Shi, R. Hu, and W. Su, "Camera model identification using Markovian transition probability matrix," in Proc. 9th Int. Workshop Digital Watermarking, vol. 5703. Springer, Aug. 2009, pp. 294-307.

[14] Z. Deng, A. Gijsenij, and J. Zhang, "Source camera identification using auto-white balance approximation," in Computer Vision (ICCV), 2011 IEEE International Conference on, Nov. 2011, pp. 57-64.

[15] C. Scott, "Performance measures for Neyman-Pearson classification," Information Theory, IEEE Transactions on, vol. 53, no. 8, pp. 2852-2863, Aug. 2007.

[16] J. Lukas, J. Fridrich, and M. Goljan, "Digital camera identification from sensor pattern noise," Information Forensics and Security, IEEE Transactions on, vol. 1, no. 2, pp. 205-214, Jun. 2006.

[17] M. Chen, J. Fridrich, M. Goljan, and J. Lukas, "Determining image origin and integrity using sensor noise," Information Forensics and Security, IEEE Transactions on, vol. 3, no. 1, pp. 74-90, Mar. 2008.

[18] M. Goljan, J. Fridrich, and T. Filler, "Large scale test of sensor fingerprint camera identification," in Proc. SPIE, Electronic Imaging, Security and Forensics of Multimedia Contents XI, vol. 7254, Jan. 2009, pp. 18-22.

[19] C.-T. Li, "Source camera identification using enhanced sensor pattern noise," Information Forensics and Security, IEEE Transactions on, vol. 5, no. 2, pp. 280-287, June 2010.

[20] X. Kang, Y. Li, Z. Qu, and J. Huang, "Enhancing source camera identification performance with a camera reference phase sensor pattern noise," Information Forensics and Security, IEEE Transactions on, vol. 7, no. 2, pp. 393-402, April 2012.

[21] C.-T. Li and Y. Li, "Color-decoupled photo response non-uniformity for digital image forensics," Circuits and Systems for Video Technology, IEEE Transactions on, vol. 22, no. 2, pp. 260-271, Feb. 2012.

[22] T. Filler, J. Fridrich, and M. Goljan, "Using sensor pattern noise for camera model identification," in Image Processing, ICIP 2008. 15th IEEE International Conference on, Oct. 2008, pp. 1296-1299.

[23] K. Kurosawa, K. Kuroki, and N. Saitoh, "CCD fingerprint method-identification of a video camera from videotaped images," in Image Processing, International Conference on, vol. 3, 1999, pp. 537-540.

[24] T. Gloe, M. Kirchner, A. Winkler, and R. Böhme, "Can we trust digital image forensics?" in International Conference on Multimedia, 2007, pp. 78-86.

[25] T. H. Thai, R. Cogranne, and F. Retraint, "Camera Model Identification based on the Heteroscedastic Noise Model," revised and resubmitted to Image Processing, IEEE Transactions on, 2012.

[26] T. H. Thai, R. Cogranne, and F. Retraint, "Steganalysis of Jsteg algorithm based on a novel statistical model of quantized DCT coefficients," to be published in ICIP, International Conference on Image Processing, Sep. 2013.

[27] E. Y. Lam and J. W. Goodman, "A mathematical analysis of the DCT coefficient distributions for images," Image Processing, IEEE Transactions on, vol. 9, no. 10, pp. 1661-1666, Oct. 2000.

[28] F. Muller, "Distribution shape of two-dimensional DCT coefficients of natural images," Electronics Letters, vol. 29, no. 22, pp. 1935-1936, Oct. 1993.

[29] J.-H. Chang, J.-W. Shin, N. S. Kim, and S. K. Mitra, "Image probability distribution based on generalized gamma function," Signal Processing Letters, IEEE, vol. 12, no. 4, pp. 325-328, Apr. 2005.

[30] M. Blum, "On the central limit theorem for correlated random variables," Proceedings of the IEEE, vol. 52, no. 3, pp. 308-309, Mar. 1964.

[31] I. M. Ryzhik and I. S. Gradshteyn, Table of Integrals, Series, and Products. United Kingdom: Elsevier, 2007.

[32] R. Butler and A. Wood, "Laplace approximations for hypergeometric functions with matrix argument," The Annals of Statistics, vol. 30, no. 4, pp. 1155-1177, 2002.

[33] T. Gloe and R. Böhme, "The 'Dresden image database' for benchmarking digital image forensics," Proc. ACM SAC, vol. 2, pp. 1585-1591, 2010.

[34] J. Nelder and R. Mead, "A simplex method for function minimization," The Computer Journal, vol. 7, pp. 308-313, 1965.

[35] T. H. Thai, R. Cogranne, and F. Retraint, "Camera model identification based on hypothesis testing theory," in Signal Processing Conference (EUSIPCO), 2012 Proceedings of the 20th European, Aug. 2012, pp. 1747-1751.

[36] C. Rao and H. Toutenburg, Linear Models: Least Squares and Alternatives, 2nd ed. Springer, 1999.

[37] L. Le Cam, Asymptotic Methods in Statistical Decision Theory. New York: Series in Statistics, Springer, 1986.

[38] E. L. Lehmann and J. P. Romano, Testing Statistical Hypotheses, 3rd ed. New York: Springer, 2005.

[39] M. Fouladirad and I. Nikiforov, "Optimal statistical fault detection with nuisance parameters," Automatica, vol. 41, pp. 1157-1171, 2005.

[40] L. Scharf and B. Friedlander, "Matched subspace detectors," IEEE Trans. Signal Process., vol. 42, no. 8, pp. 2146-2157, 1994.

[41] L. Fillatre, I. Nikiforov, and F. Retraint, "Epsilon-optimal non-Bayesian anomaly detection for parametric tomography," IEEE Trans. Image Process., vol. 17, no. 11, pp. 1985-1999, 2008.

[42] D. Birkes, "Generalized likelihood ratio tests and uniformly most powerful tests," The American Statistician, vol. 44, no. 2, pp. 163-166, 1990.

[43] L. Tierney and J. B. Kadane, "Accurate approximations for posterior moments and marginal densities," Journal of the American Statistical Association, vol. 81, no. 393, pp. 82-86, Mar. 1986.

[44] N. L. Johnson, S. Kotz, and N. Balakrishnan, Continuous Univariate Distributions, 2nd ed.

Claims

1. System for identifying a camera model (2) from images acquired with a known camera model and from a photograph (3) in the form of a compressed image, said photograph (3) having undergone post-acquisition processing and satisfying the linear relationship between the expectation and the variance of the pixels $\sigma^2_{y_{m,n}} = a\,\mu_{y_{m,n}} + b$, where a and b are two parameters characterizing said camera model, and $\mu_{y_{m,n}}$ and $\sigma^2_{y_{m,n}}$ are respectively the expectation and the variance of the pixel $y_{m,n}$ at position (m, n) after the post-acquisition processing, the system being characterized in that it comprises an image processing device (12) capable of providing an analytical relationship between the parameters (α, β) of the statistical distribution model of the DCT (discrete cosine transform) coefficients and the parameters (a, b) of the camera, in the form:

such that the parameters (c, d) determine fingerprints that characterize a camera model and depend on both the frequency (p, q) and the parameters a and b, and in that the system further comprises a device for performing statistical hypothesis tests on the DCT coefficients and a statistical analysis device (14) for determining whether said photograph was taken by said camera model or by another camera model.

2. System according to claim 1, characterized in that the statistical analysis device (14) provides an indication of the identified camera model and certifies the accuracy of the identification to a previously defined precision.

3. Method implemented in the system according to claims 1 and 2, characterized in that it comprises the following steps:

- prior analysis of images acquired with a known camera model,

- reading of a compressed image Z in order to determine the matrices representing the value of the pixels,

- estimation of the fingerprint parameters $(c_{p,q}, d_{p,q})$,

- removal of the content of the compressed image Z in order to obtain a residual image,

- estimation of the DCT coefficients of said residual image,

- performance of statistical hypothesis tests to identify a camera.

4. Method according to claim 3, characterized in that the statistical hypothesis tests are performed according to a prescribed constraint on the probability of error.

5. Method according to claim 3, characterized in that the photograph is an image compressed according to the JPEG compression standard.

6. Method according to claim 3 characterized in that the photograph is a reference image, or inter-frame, belonging to a video stream and compressed according to the MPEG compression standard.

7. Use of the method according to one of claims 3 to 6 for the unsupervised detection of the falsification of an area of a photograph.

8. Use of the method according to one of claims 3 to 6 for the detection, in a supervised manner, of the falsification of an area of a photograph.

9. Use of the method according to one of claims 3 to 6 for the search for evidence from a compromising photograph.

10. Application of the method according to one of claims 3 to 6 in specialized software for the search of evidence from digital media.