FR2920560A1

FR2920560A1 - Three-dimensional synthetic actor i.e. avatar, constructing and immersing method, involves constructing psychic profile from characteristic points and features, and fabricating animated scene from head of profile and animation base

Info

Publication number: FR2920560A1
Application number: FR0706201A
Authority: FR
Inventors: Alain Staron
Original assignee: BOTTON UP SOC RESPONSABILITE L
Current assignee: BOTTON UP SOC RESPONSABILITE L
Priority date: 2007-09-05
Filing date: 2007-09-05
Publication date: 2009-03-06

Abstract

The method involves morphing a three-dimensional head (26) so that the characteristic points of the head match the characteristic points (2) of a face (19), taking characteristic features (23) into account. The texture of the extracted face is corrected in luminance and chrominance, and a sample of the extracted face texture is pasted onto the hidden parts of the deformed head. Computer-generated accessories taken from a database are added. A psychological profile is constructed from the points and features, and an animated scene is produced from the deformed, mapped head, the profile, and an animation database.

Description

The present invention relates to a method for automatically creating and immersing three-dimensional synthetic characters capable of animation from two-dimensional photos of faces taken under arbitrary shooting conditions. The method is intended to produce manipulable synthetic characters that resemble the original photo, and to immerse them in scenes whose animations reflect psychological traits close to those of the person photographed.

The development of multimedia communication tools, in particular virtual worlds, whether games such as World of Warcraft (registered trademark, 8 million registered users) or simply places of exchange such as Second Life (registered trademark, 3 million registered users), creates the need for an image that can represent the user in a virtual world. Failing that, the growth of community video-sharing sites encourages users to stage themselves in increasingly risky scenarios (for example, the e-suicide published on Scoopeo (registered trademark) in early 2007).

Today, users are mainly represented by avatars.

These are chosen and configured manually by users according to their tastes. Each user can try to deform the avatar so that it resembles them, using more or less sophisticated systems ranging from deformation of the head to the mapping of a photo onto it. In general, these systems are front ends to preconfigured professional software that performs face synthesis. Professional software is most often intended for experts (MAYA, 3DSMax, Softimage, all registered trademarks) and is not directly usable by the general public.

However, truly mass-market use of such a function can only develop if the creation of an avatar resembling the user is completely automatic.

The diversity of human face shapes, combined with the diversity of shooting conditions and image quality inherent in the project (consumer conditions), means that neither solutions based exclusively on resemblance to an example nor solutions based exclusively on deformation of a model can be used to set up a 100% automatic process for recognizing and detecting the relevant parameters of a face.

The particular case of hair, which is very important in identifying with a face, is a complex problem in itself: an American study from July 2006 shows that hair detection has a success rate of about 50% on a series of photos taken under homogeneous lighting conditions. Moreover, there is no known way to animate real hair; the invention therefore proposes a mechanism for superimposing artificial hair on a head resembling a real person.

The problem of removing glasses (which must be solved in order to animate a face wearing glasses by making the glasses an independent object) consists in identifying whether glasses are present in a photo, erasing them from the face by recreating the missing parts, and then adding in 3D an artificial pair of glasses resembling the original pair. On its own, this problem has no solution to date.

Reconstructing a 3D volume from a single 2D image (necessary if the avatar is to rotate relative to the camera) is also a complex problem. It is solvable in theory provided the object has a known shape. In the present case, however, human profiles vary so widely that an approach based exclusively on a model is not possible.

Downstream, other kinds of problems must be solved concerning the automatic creation of scenes when the final shapes of the characters are not known at creation time, for example collisions between shapes that interpenetrate. This arises in particular when preparing scenes into which avatars will be immersed and given movements defined directly by users (as in games, for example).

Adding a layer of artificial intelligence to give avatars more natural behavior according to the situations encountered is also a largely unexplored problem, at the boundary between artificial intelligence and the behavioral sciences. It finds its first applications in resolving psychological problems of group dynamics or in managing groups of people in crisis situations. Mapping a real psychological profile onto an artificial psychological outline is likewise a new subject.

The state of the art shows that there are systems for creating avatars based on the user's physical appearance, built from photos and from parameters supplied by the user. In most cases, they rely on a representation of N characteristic points of the face, which serve as a bridge between the real photo and its virtual representation. N can range from about thirty to nearly 200 points.

For example, the company Gizmoz (registered trademark) creates lifelike avatars by requesting a certain amount of information about the user in advance (so the process is not entirely automatic) and by limiting the avatar's animation possibilities so that its limitations do not show (hair in particular is not animated).

More specifically, face detection in images is a fairly well-known process (the company LTU Technologies, for example, markets Image Seeker (registered trademark), an image indexing system). Many publications describe techniques suited to face recognition for security purposes, for identity verification (leading in some cases to commercial products: for example, the company Bioscrypt (registered trademark) in Canada announced a product in March 2007, and the current beta version from the company PolarRose (registered trademark) in Sweden offers to recognize faces in photos), for securing corporate PCs, for managing multimedia databases, or for creating entertainment content. Recent progress in pattern recognition and learning makes it possible to develop increasingly robust techniques for these uses.

One example is the work of Osuna (Osuna and Girosi, 1997), in which the authors propose an original detector based on a support vector machine to sort candidates into two classes: faces and non-faces.

Support vector machines (SVMs) have often been used as tools for detecting shapes, and hence faces, including in real-time applications or in applications that tolerate small rotations of the face, for example in remote surveillance [Carminati, 2003]. Detecting a face, and then characteristic locations within it (eyes, mouth), is the subject of work such as that of Rachid Belaroussi (Université Pierre et Marie Curie, 2005), which is based on the fusion of three detectors (ellipse, appearance, skin) for the face, and on the detection of local gradient maxima combined with a Chinese transformation. The detection success rate is about 90% in this case.

In another example, patent WO0163560, "3D game avatar using physical characteristics", describes a system comprising a user interface for receiving data, a process for creating 3D representations, and an interface for creating 3D environments into which the created representations are integrated. This patent explicitly mentions various means by which the user can supply information to the system so that their image can be modeled in 3D, and states that it could incorporate a feature recognition process, without indicating how that process would work. All the methods set out in this patent rely on one or more user actions beyond simply sending a photo.

Patent EP 1510973, "Method and apparatus for image-based photorealistic 3D face modeling", describes a method made up of the following phases: detection of features in frontal and profile images; generation of a 3D head by fitting a generic head; generation of a realistic texture from the frontal and profile images; and application of the texture to the 3D head. The methods described refer to generic models for each element of the face to be reconstructed. Moreover, two photos are needed to produce a three-dimensional reconstruction.

Patent 6,044,168 of 28 March 2000, entitled "Model based faced coding and decoding using feature detection and eigenface coding", relies on a three-dimensional face model to analyze unknown input images and reconstruct lifelike faces as output, by texture mapping using eigenface technology. In this example, the search is performed by comparison with a theoretical face model, which makes it possible to locate the key elements of the face: eyes, nose, mouth.

The article "Hierarchical wavelet networks for facial feature localization" describes a wavelet-based technique for locating the characteristic features of a face. The technique works in two stages and by comparison with a database of faces.

The article "Primitive-based image coding technique for still pictures" describes a compression technique that projects an unknown image onto a space of primitives obtained by coding an already known database.

The article "Component-based LDA method for face recognition with one training sample" addresses face recognition in the absence of a large database of faces, by deforming a single reference face. The article "View-based and Modular Eigenspaces for Face Recognition" presents a solution for recognizing a face in a database of several thousand faces, by seeking a unique signature that is independent of the photo's exposure conditions.

Patent WO 2004/040502, "Image processing method for removing glasses from color facial images", of 13 May 2004, describes a method for removing glasses from a photo of a person. The method relies on the difference between the texture of the spectacle frame and the surrounding texture to erase the frame. Note that it cannot remove sunglasses, for example.

Patent No. WO 99/27486 of 3 June 1999, "Image recognition or reconstruction with principal component analysis and with exclusions of occlusions and noise", describes a method for recognizing an image against a model by eliminating the parts that differ (glasses worn occasionally, for example).

The patent entitled "3D object recognition", No. US 2006/0039600 of 23 February 2006, sets out a method for recognizing a 3D object of which a 2D image is known, by extending that image to 3D, comparing it with a 3D database, and classifying it. The learning phase relies on a set of 2D projections of a 3D object.

Many other examples exist, but their way of analyzing the face is even more laborious.

Other studies attempt to create expressions automatically for pre-existing avatars (Marco Paleari and Christine Lisetti, Humaine summerschool, September 2006, Italy). Many studies aim to detect facial expressions in order to reproduce them on avatars (see "acquisition et animation de clones réalistes pour les télécommunications", AC Andres del Valle, JL Dugelay, E Garcia, S Valente, Institut Eurecom, 2000).

The state of the art thus shows a large body of work, but relatively mixed results.

In particular, detecting elements such as the eyes, nose or mouth (to take the three most frequently studied examples) precisely enough to place on them the characteristic points that will support the animations, robustly enough to cope with consumer shooting conditions, and simply enough to require only a single 2D photo as input, remains an unsolved problem. Likewise, the automatic construction of an avatar capable of animation that resembles, physically and psychologically, a person for whom only a 2D photo is available, and its automatic immersion in 3D scenes where it plays a role matching its character, has not been achieved.

Against this state of the art, the object of the present invention is a particularly robust and completely automatic method for building a 3D avatar resembling a 2D model photo taken under arbitrary shooting conditions, associating with it psychological characteristics close to those of the model, and immersing it in computer-generated scenes whose scenarios are adapted to obtain a rendering that is realistic both physically and psychologically.

In summary, the invention is a method for automatically creating a three-dimensional synthetic character from a two-dimensional photo and immersing it in 3D scenes, characterized in that it comprises: a step of receiving a photo (16) through a transmission channel (17); a step (18) of extracting the face or faces (19) present in the photo; a step (20) of determining the characteristic points (2) of said face, by comparing the area surrounding each characteristic point with the equivalent area in the faces of a predefined database (21) and by testing the coherence of the relative positions of the points against a model (39); a step (22) of determining the characteristic features (23) of the extracted face or faces, by calculation from the characteristic points found (2), by comparison with the equivalent parts in the faces of the predefined database (21) or with pre-computed accessories (24); a step (25) of morphing a neutral 3D head (26) so that the characteristic points (2) of the head match those produced by step (20), taking the characteristic features (23) into account; a step (27) of mapping the texture of the extracted face (19), corrected in luminance and chrominance (28) by a process (29), and of pasting a sample of uniform texture extracted from the face (28), at a position determined by (22), onto the hidden parts of the deformed head (30); a step (31) of adding computer-generated accessories, taken from the database (24) and corresponding to the face (19) as detected by step (22) of determining the characteristic features (23), to the deformed and mapped head (32); a step (33) of constructing a psychological profile (34), defined from the characteristic points (2) and the characteristic features (23), which is associated with the deformed, mapped and accessorized head (35); and a step (36) of producing an animated scene (38) from the deformed, mapped and accessorized head (35), the psychological profile (34), and a database of animations (37).

More precisely, representing the area around each characteristic point as a spiral vector makes it possible to give extra weight to the immediate neighborhood of the characteristic point.
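The text does not specify how the spiral vector is built. As a minimal sketch of one plausible reading, the Python code below reads a square patch around a point along a square spiral that starts at the center, so that the first components of the vector are the pixels closest to the characteristic point; truncating the vector or applying decreasing weights would then emphasize the immediate neighborhood. The function names `spiral_offsets` and `spiral_vector`, the square shape of the spiral, and the image layout (a list of rows) are assumptions made for illustration and do not come from the patent.

```python
def spiral_offsets(radius):
    """Return the (dx, dy) offsets of a (2*radius+1)^2 square patch,
    ordered along a square spiral that starts at the center.
    (Hypothetical helper: the patent does not define the spiral.)"""
    offsets = [(0, 0)]
    total = (2 * radius + 1) ** 2
    x = y = 0
    dx, dy = 1, 0              # the first leg moves one pixel to the right
    step = 1                   # length of the current leg
    while len(offsets) < total:
        for _ in range(2):     # two legs share each leg length
            for _ in range(step):
                x, y = x + dx, y + dy
                # keep only the positions that fall inside the patch
                if abs(x) <= radius and abs(y) <= radius:
                    offsets.append((x, y))
            dx, dy = -dy, dx   # quarter turn
        step += 1
    return offsets


def spiral_vector(image, cx, cy, radius):
    """Read the pixels around (cx, cy) in spiral order.
    `image` is a list of rows; the patch is assumed to lie inside it."""
    return [image[cy + dy][cx + dx] for dx, dy in spiral_offsets(radius)]
```

With this ordering, the first element is always the characteristic point itself, followed by its eight immediate neighbors, then the next ring, and so on.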

The identification of the characteristic points (2) as such combines statistical analysis, namely reduction of the representation space by principal component analysis followed by classification with the k-nearest-neighbors algorithm, with model-based solutions, namely coherence tests that take into account the particular topography of each zone of the face, the coherence of the zones of the face with one another, the coherence of the points within each zone with one another, and proximity to the barycenter of the statistical cloud of points deemed good candidates in the database.
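The following Python sketch illustrates two of the ingredients named above: a k-nearest-neighbors vote and a barycenter proximity test. It assumes that the descriptors have already been projected into the reduced space (the principal component analysis itself is omitted), and the function names, labels, and the `max_dist` threshold are hypothetical choices for illustration rather than values given in the patent.

```python
import math
from collections import Counter


def knn_label(query, samples, k=3):
    """Majority label among the k samples nearest to `query`.
    `samples` is a list of (vector, label) pairs, with the vectors
    assumed to be already expressed in the PCA-reduced space."""
    nearest = sorted(samples, key=lambda s: math.dist(query, s[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def barycenter(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def is_coherent(candidate, reference_positions, max_dist):
    """Accept a candidate position only if it lies within `max_dist`
    (a hypothetical threshold) of the barycenter of the positions
    recorded for this characteristic point in the database."""
    return math.dist(candidate, barycenter(reference_positions)) <= max_dist
```

In such a scheme, a candidate would be retained only if the vote labels it as the expected characteristic point and the coherence test accepts its position; the other coherence tests listed in the text (zone topography, consistency between zones) are not modeled here.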

As for the characteristic features, they include theoretical models of beards, haircuts and pairs of glasses, which are used for recognition and then reused in the synthesis of the final 3D head. To recognize the hair, the piecewise integral of a curve characterizing the thickness of the hair along cuts passing through a central point of the face is used to classify the unknown haircut among a finite number of typical haircuts prepared as synthetic images. Glasses are detected on the image to be analyzed by edge detection and matching against standard glasses; they are then erased by reconstructing the pixels hidden by the glasses, extending the visible zones immediately surrounding the glasses by analogy with the equivalent zones in images from a known database of faces without glasses. The third dimension of the points detected on the 2D image is calculated by matching all or part of the 2D projections of the characteristic points found with all or part of those of the 3D head, taken from a database of 3D heads, whose characteristic points have the closest 2D projection; the characteristic point being calculated is then assigned the third dimension of the corresponding point of the model, or an average of the third dimensions of the nearest points. The invention is strengthened by applying an emotion detector to the reconstructed face and moving the characteristic points (2) to obtain a neutral emotion before the morphing step.

The psychological profile is determined by a morphopsychology technique that uses the coordinates of the characteristic points to evaluate distances, angles, areas and volumes specific to each face, so as to classify each face in a predetermined psychological profile. The interactions of this profile with other profiles, or its behavior in key situations, are defined in advance and are used to choose how a scene evolves: the scene is divided into several tableaux, each with several pre-written possibilities, one of which is chosen according to the profile.
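The depth assignment described above can be illustrated with a minimal sketch. It assumes that the 2D points and the 3D heads are already expressed in the same normalized frame, that the 2D projection of a head is obtained simply by dropping its z coordinate, and that "closest" means smallest mean squared distance over the shared points. The function name `assign_depth` and the data layout are hypothetical and do not come from the patent.

```python
def assign_depth(points_2d, heads_3d, n_nearest=2):
    """points_2d: {name: (x, y)}; heads_3d: list of {name: (x, y, z)}.

    Returns {name: (x, y, z)}, taking z from the database head whose
    2D projection is closest to the detected points (assumed metric).
    """
    def projection_error(head):
        shared = [n for n in points_2d if n in head]
        if not shared:
            return float("inf")
        total = sum((points_2d[n][0] - head[n][0]) ** 2 +
                    (points_2d[n][1] - head[n][1]) ** 2 for n in shared)
        return total / len(shared)

    best = min(heads_3d, key=projection_error)
    result = {}
    for name, (x, y) in points_2d.items():
        if name in best:
            # Third dimension of the corresponding model point.
            z = best[name][2]
        else:
            # Otherwise, average the depth of the nearest model points.
            nearest = sorted(best.values(),
                             key=lambda p: (p[0] - x) ** 2 + (p[1] - y) ** 2)
            nearest = nearest[:n_nearest]
            z = sum(p[2] for p in nearest) / len(nearest)
        result[name] = (x, y, z)
    return result
```

A point with no counterpart in the selected head falls back to the average depth of its nearest model points, which corresponds to the second option mentioned in the text.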

This method breaks down into three main steps, shown in Figure 18.

A first step ((20), (22)), the extraction of the characteristic points and characteristic features of a face, is original in that it relies both on a statistical analysis of resemblance to a database of real faces (21) and on an analysis of relevance with respect to theoretical models of faces. This coupling offers several advantages: robustness to shots taken under difficult lighting and pose conditions, extraction of points or features hidden in the photo, 3D reconstruction from a single 2D photo (16), and recognition and classification of accessories such as beard, hair or glasses. It also allows the detection process to improve continuously as the training database grows, while the modeling component avoids the drawbacks inherent in statistical methods, computation time in particular.

A second step, modeling ((25), (27), (31)), is original in that it combines on a single 3D object textures derived from real data and objects (such as glasses, eyes, the inside of the mouth or the hair) selected from a database of 3D objects (24). This coupling has the advantage of a better quality of animation. This step can also be fully automatic, thanks to the relevance of the characteristic features determined in the first step. It is important to note that this second step is closely tied to the methods used in the first step.

The third step (36) is the animation of the object built in steps 1 and 2. It consists in creating animated scenes from a generic object (26), which is replaced at production time by the object (35) resulting from steps 1 and 2. These scenes (37) may include a fine animation of the 3D model. This animation is automatically transposed to the manufactured object, giving the illusion that the avatar character is animated directly. In addition, several animations can be provided in parallel in the same scene, with an algorithm choosing at each instant the appropriate animation according to the psychological parameters determined in the first step, leading to a final scene (38) representative of the behavior expected of the character given its psychological profile.

This invention thus offers the greatest ease of use, since it is not necessary to take several photos, to supply the program with additional information, or even to have a correctly taken photo, in order to extract the characteristic features. The invention is also highly personalized, since it derives directly from a single photo a 3D avatar resembling the person in the photo, and stages it in animations with a behavior close to that of said person.

The animations are predefined, without user intervention. Since the full set of movements is known, the design of the animations takes into account possible deformations of the model head, so as to avoid collisions when the scene is computed with the final 3D avatar. Several animations are possible for the same scene. The choice of animations depends on the psychological profile found for the avatar.

This invention is thus particularly useful for shots taken with mobile phones equipped with cameras of average quality, whose user interface must remain very simple. These camera phones today represent the largest volume of cameras sold in the world, and are therefore potentially the leading source of images. The arrival of television reception and other sources of moving images on the same phones gives this invention its full meaning in a remote mode: the phone sends the photo and receives a personalized animated scene.

According to the present invention, and as shown in Figure 18, a photo (16) is first transmitted to a central computer by any means of communication (17) (for example Internet transmission or MMS transmission). The photo is analyzed (18) in search of zones that probably correspond to faces. Each zone corresponding to a face is then isolated and reformatted (19). Each face thus isolated is compared with a previously built database of faces (21), in which the characteristic points (between 30 and 200 in number) have been recorded, together with a number of accessory characteristics such as the beard, the hair, the glasses, the azimuth (angle of rotation about the vertical axis of the face) and the lighting. The comparison methods apply to the elements of the face that are comparable between the database and the face to be analyzed. They make it possible to estimate the most probable position of the characteristic points present (visible) on the face to be analyzed, by statistical analysis of the faces in the database. The elements missing from the face to be analyzed are finally reconstructed from the average of the elements present in the database, weighted by the coefficients found for the elements that are present. This method makes it possible to find and place the characteristic points of an unknown face with great precision, without any manipulation by the user and without sophisticated face models. In addition, it makes it possible to reconstruct hidden parts of the face to be analyzed, for example behind glasses or a beard. It also makes it possible to apply a rotation about a vertical axis to a face that was not photographed from the front. Finally, it allows 3D reconstruction and the correction of expressions.

The first step, face detection in an image, can use different methods: the Hough transform, to detect the ellipses corresponding to presumed faces; a Gabor wavelet network representation; or Gaussian estimation and maximum likelihood. The last is particularly suitable here, since there is a database of faces to which a principal component analysis can be applied, so that the search can be carried out on the principal components of the image by comparing them with Eigenfaces (a method that represents the characteristic elements of a face image from grayscale model images). Variants of Eigenfaces are frequently used as a basis for other recognition methods. The maximum likelihood method then determines the probability that an image belongs to a given class, relying on the Kullback-Leibler divergence.

Each of these methods is applied according to a strategy that scans the image using smaller and smaller zones centered on each pixel. A classifier is then queried on each zone, for example the Fisher classifier or a support vector machine (SVM).

Once the face is detected, it is straightened (about an axis perpendicular to the plane of the photo) and normalized to give a constant number of pixels and a constant height/width ratio. This format must match the format of the faces in the database, which are themselves already indexed by their characteristic points and catalogued for every case (type of beard, glasses, hair and azimuth).

A classifier determines the accessory elements. The core of the method set out below finds the characteristic points that are not hidden by one of the accessories. In the case of a rotated face, this operation can be carried out either after straightening the unknown face once its angle has been determined, or on a subset of the database corresponding to faces with the same rotation. It is then explained how to reconstruct the hidden characteristic points, as well as the third coordinate of all the characteristic points.

To identify the visible characteristic points, a method based on two algorithms is used: one for reducing the representation space and the other for classification. The reduction uses a principal component analysis. The classification uses the k-nearest-neighbors algorithm. To do this, each point is represented by the area surrounding it in the image. The pixels of this area are then laid out to form a first, large vector. The area must be large enough to capture sufficient information, but small enough for the difference between two nearby points to be substantial. Points of this area that fall outside the image can be replaced with gray points. This ensures that the starting dimension is always the same, so that the reduction algorithm needs to be trained only once. Representing the area around each point as a spiral vector avoids too large a number of coordinates, while providing both many nearby coordinates and a few distant ones.
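The spiral representation and the gray padding described above can be sketched as follows. This is only an illustration under assumptions: the image is a list of rows of grayscale values, the spiral is a square spiral running outward from the point, and the padding value 128 is arbitrary. The names `spiral_offsets` and `patch_vector` are hypothetical; the patent does not specify the exact spiral or how distant coordinates are thinned out.

```python
def spiral_offsets(radius):
    """Offsets (dy, dx) of a square spiral, from the center outward."""
    offsets = [(0, 0)]
    y = x = 0
    dy, dx = 0, 1          # first leg goes to the right
    step = 1
    total = (2 * radius + 1) ** 2
    while len(offsets) < total:
        for _ in range(2):             # two legs per step length
            for _ in range(step):
                y, x = y + dy, x + dx
                if abs(y) <= radius and abs(x) <= radius:
                    offsets.append((y, x))
            dy, dx = dx, -dy           # quarter turn
        step += 1
    return offsets


def patch_vector(image, cy, cx, radius, gray=128):
    """Flatten the area around (cy, cx) into a vector in spiral order.

    Pixels falling outside the image are replaced by `gray`, so the
    vector always has (2 * radius + 1) ** 2 coordinates.
    """
    h, w = len(image), len(image[0])
    vector = []
    for dy, dx in spiral_offsets(radius):
        y, x = cy + dy, cx + dx
        inside = 0 <= y < h and 0 <= x < w
        vector.append(image[y][x] if inside else gray)
    return vector
```

Because the coordinates are ordered by distance from the point, truncating or subsampling the tail of the vector is one possible way to keep many nearby coordinates and only a few distant ones.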

The reduction algorithm is trained on a subset of homogeneous characteristic points (or even separately for each point), as well as on points known to be invalid.

The following operations are then carried out for each of the characteristic points sought, preferably in order from the easiest point to detect to the most difficult (in this way, the points already placed with certainty can be used to help place the new, more problematic points):

- Selection of the relevant images in the database: The zone of the point sought is identified (nose, forehead, left or right eye, mouth, chin, left or right side, left or right eyebrow, left or right cheek). The relevant images are those in which this zone is visible. The database is indexed according to these criteria: forehead visible, ears visible, beard or not. Filters can also be applied on various conditions of the image, so that the selected images correspond to conditions similar to those of the new image. The database can therefore be indexed by discrete values of illumination, presence or absence of glasses or of a beard, degree of rotation of the face, and expression. A database containing a sufficient number of cases for each combination of these conditions makes it possible to handle every situation.

- Partitioning of the image: On each image, the rectangular zone in which the eyes or the mouth are most likely to be found is detected. This detection is done by computing the vertical gradients on the image converted to grayscale, then searching for the maxima of the sum of these gradients computed over a rectangular zone of approximately the same size as the eyes or the mouth. The rectangular zones corresponding to the nose, the ears, the jaw or the forehead are deduced from the first ones.

- First positioning attempt: Depending on the zone of the point sought, a mask of relevant points is determined (points whose position is likely to give an idea of the position of the point sought). This mask is intersected with the points already placed with certainty.

If this intersection is empty, the new point is placed at its average position relative to the corresponding rectangular zone in the relevant images of the database. If it contains one element, the new point is placed at its average position in the database relative to that element. If it contains more than one element, the mask of these elements is taken. The mask is a vector representing the distances of these points to their center of gravity. The mask in the database that most closely resembles that of the image is selected; the position of the point sought relative to the barycenter is taken from the corresponding database image, and the new point is placed relative to the barycenter of the mask in the new image, correcting for scale (homothety) if necessary.

- Training of the detection algorithm: In each image of the database, about a hundred points are sampled in the immediate vicinity of the point sought, whose position is known for each image of the database. For each of these points, the area centered on it is taken and converted into a vector, whose dimensionality is reduced using the reduction algorithm. The detection algorithm is then trained by specifying that these points are out-of-class and that the points sought are in-class. The training set therefore contains about 99% out-of-class points and 1% in-class points. Figure 1 (Distribution of the points sampled in each image of the database to train the detection algorithm for the first point sought) gives an idea of the distribution (1) of these points.

- Search around the first positioning attempt: An area around the first attempt is scanned, and the detection algorithm is called on each point of this area. Among the points for which the classification algorithm gives the maximum confidence index, the barycenter is computed, and the point closest to this barycenter is chosen as the most probable.

- Inter-point coherence test: The algorithm searches for the points in a predetermined order that favors the points known to be easy to find. For each zone (each eye, nose, mouth, etc.), the points that are hardest to find are tested once found, to check that their position is coherent with that of the first, easy-to-find points. If it is not, the algorithm discards the point found and looks for a new candidate.

- Global likelihood test: For each zone, the confidence indices found for each point (the number of nearest neighbors known to be in-class) are summed. If the global confidence index is not good enough, the algorithm is run again for the zone, widening the search area of the first point.
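The search step and the global likelihood test described above can be sketched as follows. This is a minimal illustration under assumptions: the feature vectors are taken to be already reduced (the principal component analysis is not shown), the distance is Euclidean, and the acceptance threshold is a free parameter that the patent does not quantify. The names `knn_confidence`, `most_probable_point` and `zone_is_plausible` are hypothetical.

```python
def knn_confidence(vector, training, k=3):
    """Confidence index: number of in-class samples among the k nearest.

    `training` is a list of (vector, in_class) pairs.
    """
    ranked = sorted(
        training,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(vector, item[0])),
    )
    return sum(1 for _, in_class in ranked[:k] if in_class)


def most_probable_point(candidates, describe, training, k=3):
    """Scan candidate positions and keep the most probable one.

    `describe(point)` returns the reduced feature vector of a position.
    Among the candidates with maximum confidence, the one closest to
    their barycenter is returned, together with that confidence.
    """
    scored = [(knn_confidence(describe(p), training, k), p) for p in candidates]
    top = max(score for score, _ in scored)
    best = [p for score, p in scored if score == top]
    cx = sum(p[0] for p in best) / len(best)
    cy = sum(p[1] for p in best) / len(best)
    chosen = min(best, key=lambda p: (p[0] - cx) ** 2 + (p[1] - cy) ** 2)
    return chosen, top


def zone_is_plausible(confidences, threshold):
    """Global test for a zone: sum of per-point confidences vs. a threshold."""
    return sum(confidences) >= threshold
```

When `zone_is_plausible` returns false, the text calls for rerunning the search for the zone with a wider area around the first point; the caller would do so by passing a larger list of `candidates`.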

Figure 2 (An unknown face indexed by its characteristic points) shows an unknown face on which the characteristic points (2) found by this method have been placed, before optimization.
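The three cases of the first positioning attempt described in the method above (no reference point, one reference point, several reference points) can be sketched as follows. The data layout is an assumption: each database face is a dictionary with a `"points"` entry mapping point names to coordinates and a `"zone"` entry giving the rectangle `(x0, y0, width, height)` of the zone of the point sought. Comparing masks after dividing each by its mean distance is one possible way to make the resemblance independent of scale; the patent only says that the homothety may be corrected. All names are hypothetical.

```python
import math


def centroid(points):
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def mask_of(points):
    """Distances of the points to their center of gravity."""
    cx, cy = centroid(points)
    return [math.hypot(x - cx, y - cy) for x, y in points]


def first_guess(target, placed, database, zone=None):
    """placed: {name: (x, y)} for mask points already placed with certainty."""
    names = sorted(placed)
    n = len(database)
    if not names:
        # Case 1: average relative position inside the zone rectangle.
        x0, y0, w, h = zone
        fx = sum((f["points"][target][0] - f["zone"][0]) / f["zone"][2]
                 for f in database) / n
        fy = sum((f["points"][target][1] - f["zone"][1]) / f["zone"][3]
                 for f in database) / n
        return (x0 + fx * w, y0 + fy * h)
    if len(names) == 1:
        # Case 2: average offset from the single reference point.
        ref = names[0]
        dx = sum(f["points"][target][0] - f["points"][ref][0] for f in database) / n
        dy = sum(f["points"][target][1] - f["points"][ref][1] for f in database) / n
        return (placed[ref][0] + dx, placed[ref][1] + dy)
    # Case 3: closest mask in the database, with scale correction.
    new_pts = [placed[name] for name in names]
    new_mask = mask_of(new_pts)
    new_scale = sum(new_mask) / len(new_mask)

    def mismatch(face):
        m = mask_of([face["points"][name] for name in names])
        s = sum(m) / len(m)
        return sum((a / new_scale - b / s) ** 2 for a, b in zip(new_mask, m))

    face = min(database, key=mismatch)
    ref_pts = [face["points"][name] for name in names]
    ref_mask = mask_of(ref_pts)
    ratio = new_scale / (sum(ref_mask) / len(ref_mask))
    rcx, rcy = centroid(ref_pts)
    ncx, ncy = centroid(new_pts)
    tx, ty = face["points"][target]
    return (ncx + ratio * (tx - rcx), ncy + ratio * (ty - rcy))
```

The sketch does not handle the degenerate case where all reference points coincide (mean mask distance of zero), nor any rotation between the database face and the new image.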

Several accessory elements, or characteristic features, are useful for a good definition of the face and a correct deformation of the synthetic head provided for in step 2 of the invention. They include, among others, the hair, the color of the eyes, the inclination of the face relative to a vertical plane orthogonal to the plane of the photo, a forehead zone corresponding to the skin texture of the subject, the beard, and the glasses.

The purpose of haircut detection is to associate with the generated 3D avatar an artificial haircut as close as possible to the real one. The artificial haircuts are catalogued according to 5 criteria: height of hair on the forehead, length of hair on the sides, volume of hair on the sides, percentage of the forehead covered, and symmetry of the hairstyle on the forehead. These 5 parameters are therefore measured on the image of the face in order to associate with it a haircut model, which is carried over to the 3D avatar.
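As an illustration of associating a face with a catalogued haircut, the sketch below picks the template whose 5 criteria are closest to the measured values. It assumes that each criterion has been normalized to the range 0 to 1 and that closeness is measured by squared Euclidean distance; neither choice comes from the patent, and the template names and values are invented for the example.

```python
# Order of the criteria: height on forehead, length on sides,
# volume on sides, fraction of forehead covered, symmetry on forehead.
# Hypothetical templates, values normalized to [0, 1].
TEMPLATES = {
    "short": (0.1, 0.1, 0.1, 0.0, 1.0),
    "long_fringe": (0.3, 0.9, 0.6, 0.8, 0.7),
}


def closest_haircut(measured, templates):
    """Return the name of the template nearest to the measured criteria."""
    return min(
        templates,
        key=lambda name: sum((m - t) ** 2
                             for m, t in zip(measured, templates[name])),
    )
```

For example, `closest_haircut((0.2, 0.8, 0.5, 0.7, 0.8), TEMPLATES)` selects `"long_fringe"`.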

The haircut is characterized by a curve. As shown in Figures 3 and 4, this curve represents the hair thickness measured along the half-line (3) that starts from the center of the characteristic points above the eyebrows and forms an angle φ with the ordinate axis.

φ increases clockwise, with φ = 0 when the line is vertical.

Figure 4 shows an example of a thickness curve (5) associated with the detection of the interface points (4) between forehead and hair shown in Figure 3.

The curves computed for the different types of hair thus have characteristics that are easy to tell apart from one haircut type to another. A further advantage of this method is its low computation time compared with 2D image-processing algorithms, since the computations are carried out on 1D signals rather than 2D signals.

The thickness curve has properties that are directly related to the haircut considered.

Thus its integral gives the volume of the hair zone. Its integral restricted to values of φ between 0 and 45° on the one hand, and between 0 and -45° on the other, gives the symmetry of the hairstyle on the forehead. The other parameter values needed to match the hairstyle against the reference haircuts are derived in the same way, by sorting the values of certain points or by extracting the angles corresponding to maxima or minima.
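As an illustration of the two integrals described above, here is a minimal sketch. It assumes the thickness curve is sampled at increasing angles in degrees, uses the trapezoidal rule, and expresses symmetry as the ratio of the smaller side integral to the larger one; the function names and the ratio itself are illustrative choices, not taken from the patent.

```python
def trapezoid(angles, values):
    """Trapezoidal integral of values sampled at increasing angles."""
    total = 0.0
    for i in range(1, len(angles)):
        total += 0.5 * (values[i] + values[i - 1]) * (angles[i] - angles[i - 1])
    return total


def hair_volume(angles, thickness):
    """Integral of the whole thickness curve (hypothetical 'volume' measure)."""
    return trapezoid(angles, thickness)


def forehead_symmetry(angles, thickness, limit=45.0):
    """Compare the integrals over [0, limit] and [-limit, 0].

    Returns a ratio in [0, 1]; 1.0 means both sides have the same integral.
    The ratio is an assumption: the text only says the two integrals
    'give' the symmetry.
    """
    right = [(a, t) for a, t in zip(angles, thickness) if 0.0 <= a <= limit]
    left = [(a, t) for a, t in zip(angles, thickness) if -limit <= a <= 0.0]
    r = trapezoid([a for a, _ in right], [t for _, t in right])
    l = trapezoid([a for a, _ in left], [t for _, t in left])
    if max(l, r) == 0:
        return 1.0
    return min(l, r) / max(l, r)
```

A curve with more hair on one side of the forehead than the other yields a ratio below 1.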

To define the curve, the information contained in the image is first reduced by removing everything below the eyebrows, as shown in Figures 5, 6 and 7, which illustrate the automatic elimination of the irrelevant zones (6) of the face.

The half-lines are then defined starting from the pixel closest to the center of the segment bounded by the characteristic points at the top of the eyebrows. Denoting these two points A20 and A16, the central point O is defined as follows:

$$O(x) = \operatorname{Entier}\!\left(\frac{\mathrm{A20}(x) + \mathrm{A16}(x)}{2}\right), \qquad O(y) = \operatorname{Entier}\!\left(\frac{\mathrm{A20}(y) + \mathrm{A16}(y)}{2}\right)$$

where P(x) is the abscissa of point P, P(y) is the ordinate of point P, and Entier(a) is the function that returns the smallest integer greater than or equal to the real number a.

The set of lines is defined by a set of discrete values of the angle φ, lying in the interval [-125, 125]. So that the lines cross the pixels in a regular way, the values of φ are defined as follows:

$$\varphi_{n,p} = \sin^{-1}\!\left(\frac{n}{\sqrt{n^2 + p^2}}\right)$$

For n and p in [0, 3], the lines (7) defined in the north-east quadrant are shown in Figure 8. Figure 10 shows the values of the points (9) belonging to the 7 lines (8) described in Figure 9, for φ taking values spaced 45° apart, that is, for n and p in {0, 1}.
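The central point and the discrete angle set can be sketched as follows. This is only an illustration: it assumes that Entier rounds up (ceiling), that points are (x, y) tuples, and that the angle for integer steps (n, p) is asin(n / sqrt(n² + p²)) expressed in degrees. The function names are hypothetical.

```python
import math


def center_point(a20, a16):
    """Midpoint of the two eyebrow-top points, rounded up on each axis."""
    return (math.ceil((a20[0] + a16[0]) / 2),
            math.ceil((a20[1] + a16[1]) / 2))


def quadrant_angles(max_step=3):
    """Distinct angles (degrees, measured from the vertical) for integer
    steps n, p in [0, max_step], excluding the degenerate case n = p = 0."""
    angles = set()
    for n in range(max_step + 1):
        for p in range(max_step + 1):
            if n == 0 and p == 0:
                continue
            ratio = min(1.0, n / math.hypot(n, p))
            # Rounding merges equivalent directions such as (1, 1) and (2, 2).
            angles.add(round(math.degrees(math.asin(ratio)), 6))
    return sorted(angles)
```

With steps restricted to {0, 1} the quadrant contains only the 0°, 45° and 90° directions, which matches the 45° spacing mentioned in the text.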

Hair detection consists in taking the second derivative of the signal. In practice, this amounts to filtering with a Laplacian band-pass filter, whose result is shown in Figure 11. The points where the signal is zero (10) are the amplitude jumps corresponding to the most pronounced contours, that is, the pixels at which a contour is detected.
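A minimal sketch of this zero-crossing detection on a 1D signal sampled along one half-line is given below. The [1, -2, 1] kernel, the `min_jump` parameter and the rule that the two opposite-sign samples must be at most two positions apart are assumptions made for the illustration.

```python
def second_derivative(signal):
    """Discrete Laplacian with kernel [1, -2, 1]; endpoints are left at 0."""
    out = [0.0] * len(signal)
    for i in range(1, len(signal) - 1):
        out[i] = signal[i - 1] - 2 * signal[i] + signal[i + 1]
    return out


def zero_crossings(signal, min_jump=0.0):
    """Indices where the second derivative changes sign.

    A crossing is reported at the second of two opposite-sign samples that
    are adjacent or separated by a single zero sample, so that two distant
    edges separated by a flat region are not paired by mistake.
    """
    d2 = second_derivative(signal)
    hits = []
    prev_val, prev_idx = 0.0, -10
    for i, v in enumerate(d2):
        if v == 0:
            continue
        close = i - prev_idx <= 2
        if prev_val * v < 0 and close and abs(prev_val - v) > min_jump:
            hits.append(i)
        prev_val, prev_idx = v, i
    return hits
```

Raising `min_jump` keeps only the most pronounced contours, in the spirit of the text.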

To determine the most relevant points, the mean hue and brightness are first extracted over a set of pixels representing hair. The zone where this information can be taken is the one located at the first zero of the signal obtained in the previous step on the vertical half-line. The only case in which this zone contains no hair is when the person shown is bald. To handle this special case, the hue of this zone is compared with the hue of the pixels at the top left of the image. If the two hues match, the person is considered bald.

The second step of this processing is to keep the two most relevant "zeros". The previous processing can indeed yield, a priori with low probability, more than two detected contours per half-line (the half-line passing through an ear, a very pronounced fold in the hair, etc.). To keep only the points where the detected contour is due to a face/hair or hair/background transition, the mean hue and brightness are measured on either side of the point. If these values are close to the mean hue and brightness of the hair determined in the previous step, the transition is classified as either a hair/background transition or a face/hair transition.

If a faulty measurement of the hair thickness at a given angle produces a spike, a median filter (statistical filtering) over a window of length 3 preserves the transitions while removing the spikes.
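A length-3 median filter over the thickness curve can be sketched as follows. Leaving the two end samples unchanged is an assumption, since the text does not say how borders are handled.

```python
from statistics import median


def median_filter3(values):
    """Median filter over a sliding window of length 3.

    The first and last samples are copied unchanged (border handling is an
    assumption, not specified in the source).
    """
    if len(values) < 3:
        return list(values)
    out = [values[0]]
    for i in range(1, len(values) - 1):
        out.append(median(values[i - 1:i + 2]))
    out.append(values[-1])
    return out
```

An isolated spike is removed while a genuine step between two plateaus stays in place, which is the property the text relies on.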

The eye color is determined as follows. The mean hue and saturation of the pixels immediately around the central characteristic points of the pupils are measured. On a histogram, any white peaks (white of the eye or reflections on the pupil) or red peaks (flash effect) are located, and the corresponding pixels are discarded. If two peaks remain (edge of the eyelid and color of the pupil), the comparison is made on both values. This hue (or these two hues), which can be averaged over both eyes to reduce noise, is compared with reference hues given by predefined images. The closest average gives the eye color of the unknown photo.
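The final comparison with the reference hues might look like the sketch below. It assumes hues are expressed in degrees on a 0 to 360 color wheel and compared with a circular distance; the reference values in the usage example are made up for illustration and do not come from the patent.

```python
def hue_distance(h1, h2):
    """Shortest distance between two hues on a 360-degree color wheel."""
    d = abs(h1 - h2) % 360.0
    return min(d, 360.0 - d)


def closest_eye_color(hue, references):
    """Name of the reference color whose hue is nearest to the measured hue.

    `references` maps a color name to a reference hue in degrees.
    """
    return min(references, key=lambda name: hue_distance(hue, references[name]))


# Hypothetical reference hues, for illustration only.
REFERENCE_HUES = {"brown": 30.0, "green": 110.0, "blue": 215.0}
```

The circular distance matters for reddish-brown hues near 0°, which would otherwise look far from a reference at, say, 350°.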

The inclination of the face is measured by taking the tangent of the angle between the line through the two characteristic points at the outer corners of the eyes and a horizontal line. This measurement can be refined by averaging it with the tangent of the angle between a vertical line and the line joining the midpoint of the two outer eye corners to the characteristic point at the tip of the nose or the middle of the chin.
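A minimal sketch of both measurements follows, assuming image coordinates with x to the right and y downward, points given as (x, y) tuples, and a sign convention chosen so that the two tangents agree for a rigidly tilted face. The function names are hypothetical.

```python
import math


def tilt_tangent(left_corner, right_corner):
    """Tangent of the angle between the eye-corner line and the horizontal."""
    dx = right_corner[0] - left_corner[0]
    dy = right_corner[1] - left_corner[1]
    return dy / dx


def refined_tilt_tangent(left_corner, right_corner, lower_point):
    """Average of the eye-line tangent and the tangent of the angle between
    the vertical and the line from the eye midpoint to the nose tip or chin."""
    mid_x = (left_corner[0] + right_corner[0]) / 2
    mid_y = (left_corner[1] + right_corner[1]) / 2
    # For a face tilted by theta, the midline direction is (-sin, cos),
    # so -dx/dy equals tan(theta), matching the eye-line tangent.
    t_mid = -(lower_point[0] - mid_x) / (lower_point[1] - mid_y)
    return (tilt_tangent(left_corner, right_corner) + t_mid) / 2


def tilt_degrees(tangent):
    """Convert a tangent value to an angle in degrees."""
    return math.degrees(math.atan(tangent))
```

Averaging the two estimates reduces the effect of a misplaced eye corner, at the cost of relying on one more characteristic point.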

It is useful to locate a relatively neutral skin zone with which to map the 3D head representing the avatar in the zones where the texture of the photo does not apply (ears hidden by hair, forehead hidden by hair, sides of the nose, etc.). The forehead zone needed for the skin texture is defined as a roughly square zone lying between the line through the characteristic points at the tops of the eyebrows and the characteristic points of the junction between forehead and hair.

Beard recognition relies on similarity with a model (template matching). For both the beard and the hair, the method matches the feature against a list of so-called representative beards and hairstyles to determine the closest one. The information passed on by the method is a code identifying the type of hair or beard recognized.

As for the glasses, an example method is given below and illustrated by Figures 13 to 17, which show the results of the successive steps starting from the original image in Figure 12. A contour-detection preprocessing step makes the outline of the glasses stand out from the rest of the image.

This preprocessing consists of several steps:

- Prewitt filtering, illustrated in Figure 13: this is a strong-gradient detector using a 3x3 neighborhood around each pixel. It both detects contours (11) and determines their orientation.
- Thresholding, cleaning and binary dilation, illustrated in Figure 14: to obtain a binary image, the Prewitt image is thresholded so as to keep only the strongest gradients, then a cleaning and a binary dilation are applied. These mathematical-morphology operations remove isolated points and improve the connectivity of the outline of the glasses (12). The structuring element is a disk of radius 3.

After these operations, the pixels belonging to the glasses and to the strongest light reflections on the lenses have been located. To identify the shape of the glasses, two approaches are possible: either predefine types of glasses (for example elliptical, rectangular, etc.), identify the type, and then estimate the characteristic parameters (length of the axes and center of the ellipses, etc.); or allow freer shapes by determining a thin closed contour that best approximates the shape of the glasses. The latter option can be carried out with the active-contour technique (known as snakes). This technique is iterative: the contour is initialized as a circle around the center of the eye (already computed), and at each iteration the contour deforms to approach the strong-gradient zones detected by Prewitt filtering and thresholding.

Finally, to reconstruct the pixels hidden by the glasses, the so-called "Eigenfaces" technique is used. The steps are as follows:

- Construction of the eigenface basis: since the database is labeled, the position of the center of each eye is known. A square of side 151 pixels centered on the center of the left eye is selected, so that the eyes are aligned across all faces. Each eye becomes a column of a matrix on which a singular value decomposition is performed. The result of this decomposition is a basis of the "left-eye space" (after reshaping the columns into 151x151 pixel matrices). The same is done for the right eye.
- Locating the eyes on the new face: the scripts that search for the characteristic points are used to determine the eye centers of the new faces.
- Projection: the 151x151 square around the located center is projected onto the basis of "eigen-eyes". The result is the image closest to the original eye within the space of eyes without glasses.
- Removal of the glasses, illustrated in Figures 15, 16 and 17: each point previously detected as belonging to the glasses is replaced by its counterpart in the reconstructed eye. A small neighborhood around the center of the eye (which contains strong gradients but where the glasses cannot be) is forced not to be replaced. Figure 15 shows the result on the left eye (13) before restoring the pixels recognized as not belonging to the glasses, Figure 16 shows the restoration of the pixels (14) not belonging to the glasses, and Figure 17 shows the result on both eyes with the zone (15) previously occupied by the glasses.
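The Prewitt filtering and thresholding steps of the preprocessing described above can be sketched in plain Python as follows. The image is assumed to be a list of rows of gray levels, border pixels are left at zero, and the threshold level is a free parameter; the cleaning, dilation, active-contour and Eigenfaces steps are not covered.

```python
def prewitt_magnitude(img):
    """Gradient magnitude from the 3x3 Prewitt operator.

    `img` is a list of equal-length rows of gray levels. Border pixels,
    which lack a full 3x3 neighborhood, are left at 0.
    """
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Horizontal gradient: right column minus left column.
            gx = sum(img[y + j][x + 1] - img[y + j][x - 1] for j in (-1, 0, 1))
            # Vertical gradient: bottom row minus top row.
            gy = sum(img[y + 1][x + i] - img[y - 1][x + i] for i in (-1, 0, 1))
            out[y][x] = (gx * gx + gy * gy) ** 0.5
    return out


def threshold(magnitude, level):
    """Binary image keeping only gradients at or above `level`."""
    return [[1 if v >= level else 0 for v in row] for row in magnitude]
```

The orientation mentioned in the text could be obtained from the same `gx` and `gy` values, for example with `math.atan2(gy, gx)`.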

The glasses are then added to the avatar after being compared with a base of typical glasses (for example elliptical, rectangular, etc.), using an estimate of the characteristic parameters (length of the axes and center of the ellipses, etc.).

The azimuth of the face (the angle between the vertical plane of the face, that is, the plane containing the two ears, and the projection plane of the photograph, taken as the reference plane) is handled as follows. For each of the 5 positions retained, front (0°), quarter profile (22°), half profile (45°), three-quarter profile (67°) and profile (90°), a mean face is determined, which is simply the average of the gray levels of all the training images for that position. To classify a new face, its correlation coefficient with each of the 5 mean reference images is computed:

$$\operatorname{corr}(x, y) = \frac{1}{n} \sum_{i} \frac{(x_i - \bar{x})(y_i - \bar{y})}{\sigma_x \sigma_y}$$

where x is a column vector containing the pixels x_i of the image, x̄ its mean and σ_x its standard deviation. The closer this coefficient is to 1, the more correlated the images are. Of the 5 correlation coefficients computed, the highest determines the angle of the new face. A rotation function is then applied to the characteristic points found, to give them their frontal coordinates.
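The classification by correlation can be sketched as below. Images are assumed to be flattened into equal-length lists of gray levels, the normalization uses the population standard deviation (consistent with the 1/n factor), and constant images (zero deviation) are not handled. The function names are illustrative.

```python
from statistics import fmean, pstdev


def correlation(x, y):
    """Correlation coefficient between two flattened images of equal size."""
    mx, my = fmean(x), fmean(y)
    sx, sy = pstdev(x), pstdev(y)
    n = len(x)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n * sx * sy)


def classify_azimuth(face, mean_faces):
    """Angle whose mean reference face correlates best with `face`.

    `mean_faces` maps an azimuth in degrees (0, 22, 45, 67, 90 in the text)
    to the flattened mean image for that position.
    """
    return max(mean_faces, key=lambda angle: correlation(face, mean_faces[angle]))
```

Because the coefficient is invariant to gain and offset, a uniformly brighter or more contrasted photo is classified the same way.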

For the 3-dimensional reconstruction of the coordinates of the characteristic points, the base of reference faces must first have been annotated in all 3 dimensions; that is, each face in this base must have been photographed from two angles, one frontal and one in profile. Each of these faces will have been annotated, meaning that its characteristic points will have been defined in 3 dimensions, for example manually. From a 2D photo, the preceding method reconstructs the projection of the characteristic points in two dimensions. Each characteristic point reconstructed in two dimensions is either an average of points actually annotated in the face base or directly the point corresponding to one of the faces. Since these points have their third coordinate recorded in the face base, reconstructing the third coordinate of the sought point by the same method (averaging or direct association) finally gives, for the whole face, a third dimension computed from the third dimension of the images of the base weighted by the same coefficients as for the two-dimensional reconstruction, or directly the third coordinate of the point found in the base if there is only one. A first global approximation of this method consists in finding, for the 2D projection of the set of characteristic points of the image to be analyzed, the closest projection among a set of N 3D models (N being of the order of 300) representing the diversity of human heads. The relevance of this method rests on the fact that the information of a face projected onto a two-dimensional plane carries information about the third dimension of the face.
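The global approximation (closest projection among N 3D models) can be sketched as follows. The sketch assumes that every model lists its characteristic points in the same order as the photo, that the frontal projection is obtained by dropping z, that all point sets are already normalized to a common scale and position, and that closeness is a sum of squared distances; none of these details is specified in the text.

```python
def projection_distance(points_a, points_b):
    """Sum of squared 2D distances between corresponding points."""
    return sum((ax - bx) ** 2 + (ay - by) ** 2
               for (ax, ay), (bx, by) in zip(points_a, points_b))


def lift_to_3d(points_2d, models_3d):
    """Borrow the third coordinate from the model with the closest projection.

    `points_2d` is a list of (x, y); `models_3d` is a list of models, each a
    list of (x, y, z) in the same point order.
    """
    best = min(
        models_3d,
        key=lambda m: projection_distance(points_2d, [(x, y) for x, y, _ in m]),
    )
    return [(px, py, z) for (px, py), (_, _, z) in zip(points_2d, best)]
```

The weighted variant described in the text would replace the single best model with a weighted average of the z values of several models.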

The invention thus makes it possible to construct, from an imperfectly taken or incomplete photo of a face, the full set of characteristic points and elements needed to create a virtual image as close as possible to the initial face, including in three dimensions and taking any rotation into account. It also makes it possible to characterize the accessory elements of the face, namely the hair, the beard and the glasses. Together, this information makes it possible to create a 3D synthetic head resembling the person in the photo.

To create the synthetic head, one starts from a neutral, bald head on which the full set of characteristic points is represented. This head is defined in a computer-graphics software environment (for example the MAYA software from Autodesk) and is ready for animation (it has the points, parameters and texture needed for a professional to animate it). This neutral head is then deformed by morphing until its characteristic points have the same frontal projection as the characteristic points of the photo. If the base is annotated in 3D, the characteristic points found on the photo also have a third dimension, and the head can be further deformed so that the profile projection matches as well. Otherwise, a so-called average profile is kept. The texture of the face in the photo is then mapped onto the deformed head, and the texture sampled on the forehead is mapped onto the non-visible zones of the head. Finally, the synthetic hairstyle closest to the real one, chosen from a set of synthetic hairstyles, is applied to the head, and the same is done for any beard and glasses.

To account for differing shooting conditions, an illumination correction module (29) is added. It measures the luminance and chrominance over a central zone of the face and modifies the RGB parameters over the whole image so that luminance and chrominance fall within predefined intervals that guarantee a good color temperature and good brightness.
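The luminance part of such a correction might be sketched as below. This is a simplified illustration only: it uses the Rec. 601 luma weights, a single global gain, and an arbitrary target interval of 90 to 160, all of which are assumptions, and it leaves chrominance untouched.

```python
def luminance(pixel):
    """Luma of an (r, g, b) pixel using Rec. 601 weights (an assumption)."""
    r, g, b = pixel
    return 0.299 * r + 0.587 * g + 0.114 * b


def correct_brightness(pixels, face_zone, lo=90.0, hi=160.0):
    """Scale the whole image so the face zone's mean luma falls in [lo, hi].

    `pixels` is a flat list of (r, g, b) tuples and `face_zone` a list of
    indices into it covering the central zone of the face. The interval
    bounds are illustrative values, not taken from the patent.
    """
    mean_l = sum(luminance(pixels[i]) for i in face_zone) / len(face_zone)
    if mean_l == 0 or lo <= mean_l <= hi:
        return list(pixels)
    target = lo if mean_l < lo else hi
    gain = target / mean_l
    # Apply the same gain to every channel of every pixel, clipping at 255.
    return [tuple(min(255, round(c * gain)) for c in p) for p in pixels]
```

A fuller version would also shift the color balance so that the chrominance measured on the face falls within its own predefined interval.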

A refinement of the invention consists, once the characteristic points have been reconstructed, in analyzing the emotion expressed by the face by measuring how far parameters derived from the positions of the characteristic points deviate from a neutral expression, and then correcting the face until it has a neutral expression. The characteristic points are then those of the neutral face, which is best suited to subsequent manipulation by any type of process.

The avatar thus defined is completed by a psychological profile intended to refine its behavior in future scenes. This psychological profile is defined from the analysis of the characteristic points of the face found during the first step of the invention, according to a method borrowed from morphopsychology, as described for example in Visages et Caractères, by Louis Corman, 1985. A classifier is applied to measurements of the distances, angles, and surface and volume integrals between the various characteristic points, the lines that join them, and the surfaces and volumes delimited for the eyes, nose, mouth, forehead, chin and jaws. Each face is thus classified along N axes describing the personality, where N may vary between 5 and 20.
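The measurement and classification described above can be sketched as follows. This is a minimal illustration, not the classifier of the invention: the point names, the three measurements retained, the nearest-prototype rule and the `prototypes` structure (one set of labeled prototype vectors per axis) are all assumptions, and only 2D measurements are shown.

```python
import math

def angle_deg(a, vertex, c):
    """Angle a-vertex-c in degrees, in [0, 180]."""
    a1 = math.atan2(a[1] - vertex[1], a[0] - vertex[0])
    a2 = math.atan2(c[1] - vertex[1], c[0] - vertex[0])
    d = abs(math.degrees(a1 - a2)) % 360
    return 360 - d if d > 180 else d

def polygon_area(vertices):
    """Area enclosed by a list of (x, y) vertices (shoelace formula)."""
    s = 0.0
    for (x1, y1), (x2, y2) in zip(vertices, vertices[1:] + vertices[:1]):
        s += x1 * y2 - x2 * y1
    return abs(s) / 2

def measure(points):
    """Build a feature vector from hypothetical characteristic points."""
    scale = math.dist(points["left_eye"], points["right_eye"])
    outline = [points[k] for k in
               ("left_eye", "right_eye", "right_jaw", "chin", "left_jaw")]
    return [
        math.dist(points["nose_tip"], points["chin"]) / scale,   # a distance
        angle_deg(points["left_jaw"], points["chin"], points["right_jaw"]),  # an angle
        polygon_area(outline) / scale ** 2,                       # a surface
    ]

def classify(features, prototypes):
    """For each axis, pick the class whose prototype vector is nearest."""
    return {
        axis: min(classes, key=lambda label: math.dist(features, classes[label]))
        for axis, classes in prototypes.items()
    }
```

In practice the features would need to be rescaled before computing distances, since angles in degrees otherwise dominate the normalized ratios; with N axes, `prototypes` would hold N entries.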

The classes of a face along these axes are then projected onto a matrix describing the interaction potentials when faced with typical situations or typical profiles. The results of this projection give the typical interactions of the character when faced with typical profiles or typical situations. They are used to choose the most relevant animations among the prepared animations, in order to compute the scene most representative of the character and of his psychology. In a sense, the scenario of the final scene is constructed by successive choices of animations, each choice being driven by its relevance to the typical interactions thus determined.
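The projection and the successive choice of animations can be sketched as a matrix product followed by a per-step maximum. This is a simplified illustration under assumptions: the profile is taken to be a numeric vector of N scores, the matrix has one column per typical situation, and each prepared animation carries a hypothetical relevance vector over those situations.

```python
def project(profile, matrix):
    """profile: N scores; matrix: N rows by M columns. Returns M interaction scores."""
    return [sum(p * row[j] for p, row in zip(profile, matrix))
            for j in range(len(matrix[0]))]

def build_scenario(interactions, steps):
    """steps: one dict per step, mapping animation name -> relevance vector (length M).

    At each step, the animation most relevant to the interactions is chosen.
    """
    scenario = []
    for candidates in steps:
        best = max(
            candidates,
            key=lambda name: sum(w * s for w, s in zip(candidates[name], interactions)),
        )
        scenario.append(best)
    return scenario

# Hypothetical example with N = 3 axes and M = 2 typical situations.
profile = [1, 0, 1]
matrix = [[1, 0], [0, 1], [0.5, -1]]
interactions = project(profile, matrix)
steps = [
    {"wave": [1, 0], "hide": [0, 1]},
    {"laugh": [0.2, 0.1], "frown": [-1, 1]},
]
scenario = build_scenario(interactions, steps)
```

Each entry of `steps` stands for one stage of the final scene with its pre-written alternatives, so the resulting list is the scenario built by successive choices.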

The sources listed in the appendix can be consulted for more information on the proposed methods.

Appendix:

Hierarchical Wavelet Networks for Facial Feature Localization (2001), Rogerio Schmidt Feris, Jim Gemmell, Kentaro Toyama, Volker Krueger.

Face Recognition by Elastic Bunch Graph Matching (1996), Laurenz Wiskott, Jean-Marc Fellous, Norbert Kruger, Christoph von der Malsburg, Proc. 7th Intern. Conf. on Computer Analysis of Images and Patterns, CAIP'97, Kiel.

Face recognition: A literature survey, ACM Computing Surveys (CSUR), Volume 35, Issue 4 (December 2003).

Face detection by Hough transform: http://www.rennes.supelec.fr/ren/persQ/rseguier/pages/detection.html

Ellipse detection, general case: http://www.tsi.enst.fr/tsi/enseignement/ressources/mti/ellipses/Houghol.htm

Human face segmentation and identification, Saad Ahmed Sirohey, University of Maryland.

A subspace method for maximum likelihood target detection, Proceedings of the 1995 International Conference on Image Processing (Vol. 3).

Training Support Vector Machines: an Application to Face Detection, Proceedings of the 1997 Conference on Computer Vision and Pattern Recognition (CVPR '97).

Face segmentation: Human Face Segmentation and Identification, Sirohey, Saad Ahmed, UM Computer Science Department; CS-TR-3176, CAR-TR-695.

Comparison with the database: PCA, general presentation: http://zoonek2.free.fr/UNIX/483/05.html

Eigenface methods: Eigenface-based facial recognition, Dimitri Pissarenko, February 13, 2003, openbio.sourceforge.net/resources/eigenfaces/eigenfaces-html/facesOptions.html

On radial basis function networks, see for example: www.scico.u-bordeaux2.fr/--corsini/Pedagogie/ANN/main/node38.html

3D face processing: modeling, analysis & synthesis (Int. series in video computing, Vol. 8), Wen Zhen, Huang Thomas S., published 07-2004.

On morphopsychology: ABC de la morphopsychologie, Carleen Binet, 2005.

On psychology and interactions: L'ennéagramme, René de Lassus, 1997.

Claims

1 - A method for automatically manufacturing a 3-dimensional synthetic character from a 2-dimensional photograph and immersing it in 3D scenes, characterized in that it comprises: a step of receiving a photo (16) through a transmission channel (17); a step (18) of extracting the face or faces (19) present in the photograph; a step (20) of determining the characteristic points (2) of said face by comparing the zone surrounding each characteristic point with the equivalent zone in the faces of a predefined database (21) and by a consistency test on the relative positions of the points with respect to a model (39); a step (22) of determining the characteristic features (23) of the extracted face or faces, by calculation from the characteristic points (2) found, by comparison with the equivalent parts in the faces of the predefined database (21) or with pre-calculated accessories (24); a step (25) of morphing a neutral 3D head (26) so that the characteristic points of the head match the characteristic points (2) obtained in step (20), taking into account the characteristic features (23); a step (27) of mapping the texture of the extracted face (19), corrected in luminance and chrominance (28) by a process (29), and of applying a sample of uniform texture extracted from the face (28), at a position determined by step (22), onto the hidden parts of the deformed head (30); a step (31) of adding computer-generated accessories, taken from the base (24) and corresponding to the face (19) as detected by the step (22) of determining the features (23), onto the deformed and mapped head (32); a step (33) of constructing a psychological profile (34) defined from the characteristic points (2) and features (23), which is associated with the deformed, mapped and accessorized head (35); and a step (36) of manufacturing an animated scene (38) from the deformed, mapped and accessorized head (35), the psychological profile (34), and an animation base (37).

2 - A method according to claim 1, characterized in that the zone around each characteristic point is represented by a spiral vector.

3 - A method according to claim 1 or 2, characterized in that the identification of the characteristic points (2) combines statistical analysis solutions, namely a reduction of the representation space by means of a principal component analysis followed by a classification based on the k-nearest-neighbors algorithm, with model-based solutions, namely consistency tests that take into account the particular topography of each zone of the face, the consistency of the zones of the face with one another, the consistency of the points of each zone with one another, and the proximity to the barycentre of the statistical cloud of the points considered to be good candidates taken from the database.

4 - A method according to any one of claims 1 to 3, characterized in that the characteristic features include theoretical models of beards, haircuts and pairs of glasses, which are used for recognition and are then used in the synthesis of the final 3D head.

5 - A method according to claim 4, characterized in that the recognition of the hair uses the piecewise integral of a curve characteristic of the thickness of the hair, in sections passing through a central point of the face, to classify the unknown haircut among a finite number of typical haircuts prepared in computer graphics.

6 - A method according to any one of claims 1 to 5, characterized in that it comprises a step of detecting and erasing glasses in the image to be analyzed, by identification with respect to typical glasses by means of contour detection, followed by a reconstruction of the pixels (5) hidden by the glasses, by extending the visible zones immediately surrounding the glasses in the image to be analyzed, by analogy with the equivalent zones in the images of a known database of faces without glasses.

7 - A method according to any one of claims 1 to 6, characterized in that the third dimension of the points detected on the 2D image to be analyzed is calculated by associating all or part of the characteristic points found with all or part of those of the 3D head, taken from a database of 3D heads, whose characteristic points have the closest 2D projection, and by assigning to the characteristic point being computed the third dimension of the corresponding point of that model, or an average of the third dimensions of the closest points.

8 - A method according to any one of claims 1 to 7, characterized in that an emotion detector is applied to the reconstructed face, and the characteristic points (2) are moved to obtain a neutral emotion before the morphing step.

9 - A method according to any one of claims 1 to 8, characterized in that the psychological profile is determined by a morphopsychology technique that uses the coordinates of the characteristic points to evaluate distances, angles, areas and volumes specific to each face, in order to classify each face in a predetermined psychological profile whose interactions with other profiles or in key situations are defined and are used to choose the evolution of a scene divided into several tableaux, with several pre-written possibilities for each tableau, among which one is chosen according to the profile.