FR3085221A1

FR3085221A1 - MULTIMEDIA SYSTEM COMPRISING HUMAN-MACHINE INTERACTION MATERIAL EQUIPMENT AND A COMPUTER

Info

Publication number: FR3085221A1
Application number: FR1857665A
Authority: FR
Inventors: Ludovic Fagot; Emmanuel Bourcet
Original assignee: Pls Experience
Current assignee: Pls Experience
Priority date: 2018-08-24
Filing date: 2018-08-24
Publication date: 2020-02-28
Anticipated expiration: 2038-08-24
Also published as: FR3085221B1; WO2020039152A2; WO2020039152A3

Abstract

The present invention relates to a multimedia method and system comprising human-machine interaction hardware and a computer executing a program that controls the playback of audiovisual sequences, characterized in that it further comprises an associated audio and/or visual acquisition means, a camera for capturing facial images, and a computer program for characterizing the emotional state by automatic analysis of a face identified in said facial images and for parameterizing the sequencing of said audiovisual sequences.

Description

Multimedia system comprising human-machine interaction hardware and a computer

Field of the invention

The present invention relates to the field of multimedia human-machine interaction systems, intended in particular for gaming, simulation, training or learning applications.

In the field of training and learning, such systems often implement a learning platform (sometimes called a "digital learning environment" or "learning management system", LMS) that manages a learning process or an educational pathway.

This kind of computer system provides a digital workspace comprising assessment tests that are either submitted to the teacher for validation or offered as self-assessment regulation activities.

In the field of video games, a computer application manages how scenarios unfold according to the actions of the player, or of the players in shared games, following predetermined rules and processes that take into account only the physical interactions of the player or players with the controls delivering the input signals of the game application.

Recently, researchers have established the importance of taking into account the socio-emotional skills that contribute to the individual and collective effectiveness of teaching. The range of qualities involved is very broad: empathy, respect for others, the ability to seek or offer help, and the ability to adjust one's emotions to situations or to discern those of others. The World Health Organization defines them as "a set of capacities allowing everyone to adopt an adaptable and positive behavior to respond effectively to the demands of everyday life".

More precisely, the invention relates to taking emotional contexts into account in multimedia human-machine interaction systems.

State of the art

Known in the prior art is US patent application US20080254434, which describes a learning management system server hosting a software interface exposing functions for processing learning activities associated with compliant learning objects presented in a participant client application having a first software interface adapter;

and an intelligent tutoring application for selecting and providing learning objects for presentation to a participant via the participant client application, the intelligent tutoring application comprising a second software interface adapter for sending commands associated with the compliant learning object, the intelligent tutoring application further comprising a search and selection software module, a learning style analysis software module and a tutoring engine software module.

Another document, international application WO2008064431A1, describes a method and a system for monitoring changes in a subject's emotional state on the basis of facial expressions. It relies on capturing first and then second facial image data of the subject at a first and then a second instant, and on processing the first and second facial image data to produce emotional state change data relating to the subject for the period between the capture of the first facial image data and that of the second facial image data. In some embodiments, a series of stimuli is applied to the subject and emotional state change data correlated with each stimulus are acquired. The method and the system can be used for modeling the relationships between cognitive and emotional responses. The models and the analysis of the subject's emotional state changes can be used in various contexts such as decision-making in management, recruitment, evaluation, teaching and data mining.

Also known is patent application US20140234816, which describes a network-based educational system for providing social and emotional education, comprising:

- computer storage configured to store a plurality of skills for socio-emotional learning and education scenarios for the plurality of skills;

- one or more servers configured to allow one of the education scenarios to be presented on a user device, wherein the server or servers are configured to receive input from a child user in response to the presentation of one of the education scenarios;

- wherein the computer storage is configured to store the child user's history of viewing the education scenarios, the child user's input during the presentation of the education scenarios, and an assessment of the child user's socio-emotional skills;

- and an intelligence module configured to assess the child user's socio-emotional skills, based in part on the child user's input during the presentation of the education scenarios.

Patent application US20080091515 describes a first-aider training method that presents a plurality of activatable software agents and comprises a table of levels, each associated with a specific behavior, and a main controller that issues a level parameter controlling the modification of the state of said software agents, characterized in that it comprises recording a set level and intermittent steps of acquiring information representative of the emotional state of said first-aider, the level commanded by the controller being decreased if the measured level is higher than the set level and increased if the measured level is lower than the set level.
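The set-level rule summarized above can be illustrated with a minimal sketch. The function name, the step size and the level bounds are assumptions introduced for illustration; the cited application is not quoted as specifying them.

```python
def adjust_level(commanded_level, measured_level, set_level,
                 step=1, min_level=0, max_level=10):
    """Return the commanded level after one intermittent acquisition.

    step, min_level and max_level are hypothetical values chosen
    for this sketch only.
    """
    if measured_level > set_level:
        # Measured emotional level above the set level: decrease.
        commanded_level -= step
    elif measured_level < set_level:
        # Measured emotional level below the set level: increase.
        commanded_level += step
    # Keep the result inside the (assumed) table of levels.
    return max(min_level, min(max_level, commanded_level))
```

Called once per acquisition step, this moves the commanded level by one step in the direction opposite to the measured deviation and leaves it unchanged when the measured level equals the set level.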

Disadvantages of the prior art

The prior-art solutions do not make it possible to dynamically take into account the emotional situation of the user (player, learner or trainee) in order to adapt the game or training process in real time.

Solution provided by the invention

In order to remedy this drawback, the invention relates, in its most general sense, to a multimedia system comprising human-machine interaction hardware and a computer executing a program that controls the playback of audiovisual sequences, characterized in that it further comprises an associated audio and/or visual acquisition means, a camera for capturing facial images, and a computer program for characterizing the emotional state by automatic analysis of a face identified in said facial images and for parameterizing the sequencing of said audiovisual sequences.

According to particular embodiments of the invention, the system further has one or more of the following features:

- it further comprises simulation hardware controlled by a computing unit according to the state of the parameterization of the sequencing of said audiovisual sequences;

- it further comprises a configurable interactive game program;

- said human-machine interaction hardware consists of a virtual reality or augmented reality headset;

- said computer program for characterizing the emotional state comprises means for analyzing the unmasked part of the identified face.

The invention also relates to a multimedia method for controlling human-machine interaction hardware, comprising a step of controlling the playback of audiovisual sequences and steps of characterizing the emotional state by automatic analysis of a face identified in facial images captured by a camera.

Advantageously, the playback of the audiovisual sequences is determined in real time by the result of the characterization of the emotional state.

Preferably, the difficulty parameters are determined in real time by the result of the characterization of the emotional state.

According to a variant, the activation of digital aids is determined in real time by the result of the characterization of the emotional state.
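As an illustration of how the characterized emotional state could drive the next sequence, the difficulty parameters and the digital aids, here is a minimal sketch. The emotion labels, the three-level difficulty scale, the sequence names and the mapping itself are all hypothetical; the description does not specify them.

```python
from dataclasses import dataclass


@dataclass
class SessionState:
    difficulty: int = 2          # hypothetical scale: 1 (easy) to 3 (hard)
    aids_enabled: bool = False   # digital aids shown to the user
    sequence: str = "intro"      # identifier of the audiovisual sequence


# Hypothetical emotion labels produced by the characterization program.
STRESSED = {"fear", "anger", "sadness"}
CALM = {"neutral", "happiness"}


def update_session(state, emotion):
    """Return the session parameters to apply after one characterization."""
    if emotion in STRESSED:
        # Lower the difficulty, enable aids and switch to a guided sequence.
        return SessionState(difficulty=max(1, state.difficulty - 1),
                            aids_enabled=True, sequence="guided")
    if emotion in CALM:
        # Raise the difficulty, disable aids and use the standard sequence.
        return SessionState(difficulty=min(3, state.difficulty + 1),
                            aids_enabled=False, sequence="standard")
    # Unknown label: leave the session unchanged.
    return state
```

Calling `update_session` each time a new emotional state is available is one way of obtaining the real-time behavior described; any other mapping from state to parameters would fit the same structure.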

Detailed description of a non-limiting example of the invention

The present invention will be better understood on reading the following detailed description of a non-limiting example of the invention, with reference to the accompanying drawings, in which:

- Figure 1 is a schematic view of a simulation system for training in cardiopulmonary resuscitation gestures;

- Figure 2 is a schematic view of the emotion recognition process.

The invention is described below for a non-limiting example of application to training in cardiopulmonary resuscitation gestures.

In the event of out-of-hospital cardiac arrest, cardiopulmonary resuscitation performed by a bystander is crucial for survival. Chest compressions and rapid defibrillation are the main determinants of survival after out-of-hospital cardiac arrest, and several data sets show that training the general public improves survival at 30 days and at 1 year.

Training the general public in basic resuscitation is effective in increasing the number of people willing to perform basic resuscitation in a real situation, particularly in high-risk populations (for example, in areas where there is a high risk of cardiac arrest but a low rate of bystander intervention).

Training is generally provided with mannequins that place trainees in a realistic situation to practice cardiopulmonary resuscitation.

These mannequins have realistic anatomical landmarks for training trainees in simulated ventilation and compression gestures, with a resistance corresponding to that of an adult or a child, so that they can practice and memorize the correct techniques. Such mannequins can be combined with virtual reality means that immerse trainees in a realistic situation and teach them the appropriate gestures in response to a cardiac arrest by placing them in a scene where a person suffers a cardiac arrest.

The system comprises a mannequin (1) having the shape and dimensions of a human torso, a virtual reality headset (2), a camera (3) for acquiring the position of the mannequin and of the hands of the operator being trained, and a computer (4) that processes the images captured by the camera (3) and generates the video stream displayed by the virtual reality headset (2).

The mannequin (1) is essentially of a known type.

It comprises a rigid upper plate (10) shaped like a human torso and a rigid lower plate (11). A compression spring (12) is installed between the upper plate (10) and the lower plate (11) to give the torso a resistance comparable to that of a human rib cage.

The mannequin also comprises an inflatable bag (13) for simulating the resumption of breathing.

It also comprises a head (14) having a mouth-shaped orifice (15) for carrying out simulated insufflations.

A sensor (16) connected to the upper plate (10) measures the force applied to the rib cage by means of a strain gauge (18) and provides information on the compression force exerted by the operator, which also makes it possible to estimate the risk of rib fracture.

The position of the hands can also be acquired by means of virtual reality tracking systems (22, 23), for example the components marketed under the trade name HTC Vive Tracker 2.0, which communicate with a fixed base station delivering information on the three-dimensional position and orientation of each of the operator's wrists.

Image processing

The computer (4) provides a video stream corresponding to a realistic scene, viewed by the operator through the virtual reality headset (2), which gives the operator an immersive perception of an environment simulating an intervention context.

The computer also computes the overlay of a digital mannequin in the ambient scene, and recalculates in real time the reference frame of the synthetic images relative to the position of the real mannequin, so as to place the digital mannequin in the synthetic image at a position corresponding to the position of the mannequin (1) relative to the operator.

The camera (3) captures the position of the hands, and the computer (4) analyzes the image of the hands in order to search a library of digital images for an image of a pair of hands matching the position of the operator's hands, and to control the orientation of the digital image of the selected pair of hands relative to the digital mannequin so that it matches the position of the hands on the real mannequin (1).
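The library lookup and the orientation step can be sketched as follows. This is only an illustration under stated assumptions: the two-component pose descriptor, the image file names and the 2D axis vectors are hypothetical, and the description does not say how hand poses are actually encoded or compared.

```python
import math

# Hypothetical library: image name -> pose descriptor of the pair of hands.
POSE_LIBRARY = {
    "hands_stacked.png": (1.0, 0.0),
    "hands_side_by_side.png": (0.0, 1.0),
    "hands_interlocked.png": (0.7, 0.7),
}


def select_pose_image(descriptor, library=POSE_LIBRARY):
    """Return the library image whose descriptor is closest to the observed one."""
    return min(library, key=lambda name: math.dist(descriptor, library[name]))


def relative_orientation_deg(hands_axis, mannequin_axis):
    """Signed angle in degrees, in [-180, 180), from the mannequin axis to the hands axis."""
    hands_angle = math.atan2(hands_axis[1], hands_axis[0])
    mannequin_angle = math.atan2(mannequin_axis[1], mannequin_axis[0])
    diff = math.degrees(hands_angle - mannequin_angle)
    # Wrap the difference so that the rendered rotation stays within one turn.
    return (diff + 180.0) % 360.0 - 180.0
```

The selected image would then be drawn on the digital mannequin, rotated by the returned angle, so that the rendered hands follow the real hands on the mannequin (1).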

The operator is thus plunged into a fully immersive environment, rather than an augmented reality environment, and sees on the display screen the combination of the synthetic images of the environment, the synthetic image of the digital mannequin or the image of a real victim, repositioned taking into account the position of the physical mannequin (1), and the hands, obtained by extracting a synthetic image from a database of digital images and repositioned to match the position of the operator's hands.

The display screen of the headset (2) also shows technical indications, for example on the frequency and level of the compressions and on the conformity of the operator's gestures. The position of this display area is not fixed relative to the reference frame of the synthetic images of the environment, but is moved according to the position and direction of the operator's head and the orientation of the mannequin. The operator can thus turn their head toward the technical information to read it, or turn away to move the display area of this information out of the field of view.
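As an example of how the frequency and level indications might be derived, the sketch below computes a compression rate from the timestamps of detected compressions and formats one line of text for the display area. The function names, the use of newtons for the force readings and the text format are assumptions made for illustration.

```python
def compression_rate_per_min(timestamps_s):
    """Compressions per minute from the timestamps (in seconds) of detected compressions."""
    if len(timestamps_s) < 2:
        return None
    duration = timestamps_s[-1] - timestamps_s[0]
    if duration <= 0:
        return None
    # N timestamps delimit N - 1 intervals between compressions.
    return 60.0 * (len(timestamps_s) - 1) / duration


def hud_text(timestamps_s, peak_forces_n):
    """One line of technical indications (hypothetical format, forces assumed in newtons)."""
    rate = compression_rate_per_min(timestamps_s)
    if rate is None or not peak_forces_n:
        return "compressions: --"
    return f"compressions: {rate:.0f}/min, peak force: {max(peak_forces_n):.0f} N"
```

For instance, four compressions spaced 0.5 s apart give three intervals over 1.5 s, that is, a rate of 120 compressions per minute.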

Camera for acquiring the trainee's facial expressions

A camera (19) is integrated into the mannequin (1) to capture the operator's face and provide images whose analysis makes it possible to determine the operator's emotional state.

The camera (19) captures the image of the operator and of the operator's attitude, and provides images that undergo characterization processing based on a database of facial expressions and a characterization algorithm such as the one described in the publication "Gil Levi and Tal Hassner. Emotion Recognition in the Wild via Convolutional Neural Networks and Mapped Binary Patterns. Proc. ACM International Conference on Multimodal Interaction (ICMI), Seattle, 2015".

This information is used to control the training scenario and the selection of the video sequences displayed in the headset (2).

Emotion detection

The images captured by the camera (19) first undergo processing by a face detection module (100). This module is based, for example, on the Viola-Jones method for object detection in a digital image, a supervised learning method that makes it possible to detect faces in an image efficiently and in real time.

The module (100) consists, for example, of a computer application based on rectangular Haar classifiers and the integral representation of an input image, as described in the article by P. A. Viola and M. J. Jones published in International Journal of Computer Vision, vol. 57, no. 2, pp. 137-154, 2004, incorporated by reference.
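The integral image and a rectangular Haar-like feature can be sketched in a few lines. This illustrates only the building blocks named above, not a complete Viola-Jones detector (no boosting, no cascade, no trained thresholds); the function names are chosen for this sketch.

```python
def integral_image(img):
    """Summed-area table of img (a list of rows), padded with a zero row and column."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of the pixels in the w-by-h rectangle with top-left corner (x, y), in four lookups."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def haar_two_rect_horizontal(ii, x, y, w, h):
    """Two-rectangle Haar-like feature: left half minus right half (w must be even)."""
    half = w // 2
    return rect_sum(ii, x, y, half, h) - rect_sum(ii, x + half, y, half, h)
```

Because any rectangle sum costs four table lookups regardless of its size, a large number of such features can be evaluated per window, which is what makes real-time detection practical.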

An example of face detection using a method based on Viola-Jones has been implemented in an open-source code library available at http://www.mathworks.com/matlabcentral/fileexchange/19912.

The second step (200) concerns the extraction of facial features from the area of the image corresponding to the face. An example of such processing is described in the article "Srinivasan R, Golomb JD, Martinez AM. A Neural Basis of Facial Action Recognition in Humans. The Journal of Neuroscience. 2016; 36(16): 4434-4442. doi: 10.1523/JNEUROSCI.1704-15.2016".

Classification processing is then carried out by an expression classification module (300) using a nearest neighbor (NN) classifier and a support vector machine (SVM), for example as described in the article "Vishnubhotla, S., Support Vector Classification. 2005" or the article "Dasarathy, B. V., Nearest Neighbor (NN) Norms: NN Pattern Classification Techniques. 1991", incorporated by reference.

A Euclidean-NN rule or a cosine-NN rule can also be used.
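As an illustration of the Euclidean-NN and cosine-NN rules, a nearest neighbor classifier with an interchangeable distance can be sketched as follows. This is a sketch under stated assumptions rather than the classifier of the module (300): the names `euclidean_distance`, `cosine_distance` and `nearest_neighbor` are hypothetical, feature vectors are assumed to be plain sequences of floats, and the SVM variant is not shown.

```python
import math
from typing import Callable, Sequence, Tuple

Vector = Sequence[float]


def euclidean_distance(a: Vector, b: Vector) -> float:
    """Straight-line distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a: Vector, b: Vector) -> float:
    """One minus the cosine of the angle between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0  # convention chosen here for zero vectors (assumption)
    return 1.0 - dot / (norm_a * norm_b)


def nearest_neighbor(
    query: Vector,
    training: Sequence[Tuple[Vector, str]],
    distance: Callable[[Vector, Vector], float] = euclidean_distance,
) -> str:
    """Return the label of the training vector closest to the query."""
    _, best_label = min(
        ((distance(query, vector), label) for vector, label in training),
        key=lambda pair: pair[0],
    )
    return best_label
```

The two rules can disagree: the Euclidean rule is sensitive to the magnitude of the feature vector, whereas the cosine rule compares only its direction.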

The facial region to be modeled is annotated automatically by a collection of landmarks. A shape vector is given by the concatenated coordinates of all the landmarks and can be written formally as $s = (x_1, x_2, \ldots, x_L, y_1, y_2, \ldots, y_L)^T$, where $L$ is the number of landmarks.

The shape model can be obtained by applying principal component analysis (PCA) to the set of aligned shapes.
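The construction of the shape vector and of a PCA shape model can be illustrated by the following sketch. It is a simplified example under several assumptions: the shapes are taken to be already aligned, only the first principal component is computed (by power iteration, chosen here to keep the example free of external libraries), and the names `shape_vector`, `mean_shape` and `first_principal_component` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def shape_vector(landmarks: Sequence[Point]) -> List[float]:
    """Concatenate landmark coordinates as (x1, ..., xL, y1, ..., yL)."""
    return [x for x, _ in landmarks] + [y for _, y in landmarks]


def mean_shape(shapes: Sequence[Sequence[float]]) -> List[float]:
    """Component-wise mean of a set of aligned shape vectors."""
    count = len(shapes)
    return [sum(column) / count for column in zip(*shapes)]


def first_principal_component(
    shapes: Sequence[Sequence[float]], iterations: int = 200
) -> Tuple[List[float], List[float]]:
    """Return (mean shape, unit vector of the main mode of variation).

    Power iteration is applied to the scatter of the centered shapes
    without forming the covariance matrix explicitly.
    """
    mean = mean_shape(shapes)
    centered = [[v - m for v, m in zip(s, mean)] for s in shapes]
    dim = len(mean)
    # Start from the centered sample with the largest norm (assumed not
    # orthogonal to the main mode of variation).
    start = max(centered, key=lambda c: sum(v * v for v in c))
    norm = math.sqrt(sum(v * v for v in start))
    if norm == 0.0:
        return mean, [0.0] * dim  # all shapes identical: no variation
    direction = [v / norm for v in start]
    for _ in range(iterations):
        projections = [sum(a * b for a, b in zip(c, direction)) for c in centered]
        updated = [
            sum(p * c[j] for p, c in zip(projections, centered)) for j in range(dim)
        ]
        norm = math.sqrt(sum(v * v for v in updated))
        if norm == 0.0:
            break
        direction = [v / norm for v in updated]
    return mean, direction
```

A new shape can then be approximated as the mean shape plus a weighted multiple of the returned direction; a complete model would retain several components.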

After the creation of a statistical appearance model, an algorithm (400) is used to fit the statistical model to a new image. This determines the best match of the model to the image, making it possible to find the model parameters that generate a synthetic image as close as possible to the target image.

This processing makes it possible to associate the geometric features that are affected when emotions are expressed.

By way of example, in surprise the eyes and mouth open wide, the latter resulting in an elongated chin. Sadness is expressed by frequent blinking of the eyes. In anger, the eyebrows tend to be drawn together.

Each image acquired by the camera (19) is classified in real time by comparing its parameters with a model corresponding to each expression. The model for each class is obtained by taking the mean, or the median, over a succession of images, of the shape parameter vector for each expression.
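The construction of per-expression models and the comparison of a new frame against them can be sketched as follows. This is only an illustrative sketch: the names `build_expression_model` and `classify_frame` are hypothetical, and the comparison by squared Euclidean distance is an assumption, since the description does not specify the comparison measure.

```python
from statistics import mean, median
from typing import Dict, List, Sequence


def build_expression_model(
    frames: Sequence[Sequence[float]], use_median: bool = False
) -> List[float]:
    """Reduce the shape parameter vectors of successive frames to one model.

    Each component is the mean (or the median) of that component over
    the succession of frames showing the expression.
    """
    reducer = median if use_median else mean
    return [reducer(column) for column in zip(*frames)]


def classify_frame(params: Sequence[float], models: Dict[str, Sequence[float]]) -> str:
    """Return the expression whose model is closest to the frame parameters."""

    def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(models, key=lambda label: squared_distance(params, models[label]))
```

The median variant is less affected by an occasional badly fitted frame than the mean, which may matter when the model is built from a short sequence.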

The classifiers are of the expression 1 / expression 2 type, that is to say neutral / non-neutral, sad / non-sad, angry / non-angry, disgusted / non-disgusted, afraid / non-afraid, surprised / non-surprised, and happy / non-happy, and of the expression / non-expression type.

The result of this processing defines a combination of parameters representative of the emotional state of the trainee or player.

These parameters are used by the simulation or game program to control the sequencing of the human-machine interactions in real time.

For example, in a simulation application, neutral emotional parameters will trigger the selection of more difficult sequences. Emotional parameters indicating fear will trigger the selection of less anxiety-inducing sequences.
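The selection rule of this example can be sketched as follows. This is a deliberately simplified sketch: it assumes that difficulty and anxiety level are merged into a single integer intensity scale, that the emotional state is reduced to one label, and that other emotions leave the level unchanged; the name `next_intensity` and the labels "neutral" and "fear" are hypothetical.

```python
def next_intensity(emotion: str, current: int, lowest: int, highest: int) -> int:
    """Choose the intensity level of the next audiovisual sequence.

    A neutral state raises the level, fear lowers it, and any other
    state keeps it (assumption); the result stays within the bounds.
    """
    if emotion == "neutral":
        step = 1
    elif emotion == "fear":
        step = -1
    else:
        step = 0
    return max(lowest, min(highest, current + step))
```

For instance, with levels ranging from 1 to 3, a trainee at level 2 who remains neutral would move to level 3, whereas a trainee showing fear would move to level 1.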

The aim is to place the trainee or player in varied contextual situations suited to improving his mastery of the gestures under conditions close to real environments, taking into account his ability to learn under stress.

Claims

Multimedia system comprising hardware equipment for human-machine interaction and a computer executing a program controlling the progress of audiovisual sequences, characterized in that it further comprises an associated sound and/or visual acquisition means, a camera for capturing facial images, and a computer program for characterizing the emotional state by automatic analysis of a face identified in said facial images and for configuring the sequencing of said audiovisual sequences.

Multimedia system according to claim 1, characterized in that it further comprises hardware simulation equipment controlled by a computer as a function of the state of the configuration of the sequencing of said audiovisual sequences.

Multimedia system according to claim 1 characterized in that it further comprises a configurable interactive game program.

Multimedia system according to claim 1 characterized in that said hardware equipment for human-machine interaction consists of a virtual reality or augmented reality headset.

Multimedia system according to the preceding claim, characterized in that said computer program for characterizing the emotional state comprises means for analyzing the unmasked part of the identified face.

Multimedia method for controlling hardware equipment for human-machine interaction, comprising a step of controlling the progress of audiovisual sequences and steps of characterizing the emotional state by automatic analysis of a face identified in facial images captured by a camera.

Multimedia method according to claim 6 characterized in that the progress of the audiovisual sequences is determined in real time by the result of the characterization of the emotional state.

Multimedia method according to claim 6 characterized in that the difficulty parameters are determined in real time by the result of the characterization of the emotional state.

Multimedia method according to claim 6 characterized in that the activation of digital aids is determined in real time by the result of the characterization of the emotional state.