FR2944402A1

FR2944402A1 - COMMUNICABLE MOBILE TERMINAL, DEVICE AND METHOD FOR RECOGNIZING SHAPE FOR ROBOT

Info

Publication number: FR2944402A1
Application number: FR0901741A
Authority: FR
Inventors: Pierre Rouanet; Pierre Yves Oudeyer
Original assignee: Institut National de Recherche en Informatique et en Automatique INRIA
Current assignee: Institut National de Recherche en Informatique et en Automatique INRIA
Priority date: 2009-04-08
Filing date: 2009-04-08
Publication date: 2010-10-15
Anticipated expiration: 2029-04-08
Also published as: FR2944402B1; WO2010116057A1

Abstract

The invention describes a communicating mobile terminal arranged to communicate with a robot through a communication means. The terminal comprises a display capable of displaying image data, a selector for designating an image portion, an input member, a communication tool arranged to cooperate with a robot and a computer unit, and firmware arranged to produce on the display a working image from video data corresponding to a shot taken by a robot, to receive selection data for an image portion designated with the selector, to transmit the selection data to the computer unit in order to generate an image model corresponding to the designated image portion, and to transmit to the computer unit a unique intelligible identifier, defined with the input member, for storage in correspondence with the image model. The invention also relates to a cybernetic device comprising the terminal, and to a cybernetic method.

Description

INRIA108.FRD Communicating mobile terminal, device and shape recognition method for a robot

The invention relates to personal robotics. More particularly, it relates to a communicating mobile terminal arranged to communicate with a robot, to a cybernetic device and to a cybernetic method.

Personal robotics, and domestic robotics in particular, whether for service or entertainment, is currently developing rapidly. Robotics generally involves many highly complex techniques. The state of the art is diverse: alongside domestic robotics, it also includes work on humanoid robots. Entertainment robots include the companion robot dog AIBO and the humanoid companion robot QRIO (both developed by Sony Corporation). Service robots include the humanoid robot ASIMO (developed by Honda Motor Company, Ltd.). There is also work on android-type robots. Patent application WO 98/49629 describes a universal epistemological machine for creating a synthetic form of existence. Patent application EP 1 477 277 describes a two-legged humanoid robot capable of fluid movements.

In personal robotics there is a need to develop intuitive techniques for interaction between user and robot. In entertainment robotics in particular, simple interaction is increasingly in demand, for example to teach new words to a domestic companion robot. The same need exists in service robotics: teaching a robot a new service task to assist disabled, elderly or sick people must be as simple as possible and accessible to the general public. The teaching must therefore be intuitive and simple enough for a user with no technical knowledge of robotics and/or computer programming.

Complex shape-learning approaches that rely on segmentation algorithms are known today. In these approaches the interaction between user and robot is limited, particularly as regards teaching a robot new words associated with objects or shapes, or new tasks. Examples can be found in the scientific publication: Luc Steels and Frederic Kaplan, "Aibo's first words: The social learning of language and meaning", Evolution of Communication, 4(1):3-32, 2000.

State-of-the-art segmentation algorithms are almost impossible to use in an unconstrained environment when no a priori model of an object exists. The identification of specific objects or shapes in a given three-dimensional space by means of segmentation algorithms depends on uniform colors and/or textures in that space. Thus, in an unconstrained space such as the real world, identification depends in particular on the lighting or the camera angle. These algorithms therefore suffer from robustness problems in unconstrained environments.

In addition, there is a need to make robot control, and the interaction between robot and user, easier. Work is known that uses mediating objects to interact with a robot. In particular, T. W. Fong, C. Thorpe, and B. Glass, "Pdadriver: A handheld system for remote driving," in IEEE International Conference on Advanced Robotics 2003, IEEE, July 2003, describes remote driving of a robot via a mobile mediator. However, the controls of the prior art do not give simple, intuitive access to social, entertainment or service interactions between user and robot. Nor do they allow a robot to be taught additional capabilities, in particular the ability to recognize real-world objects, in a simple, intuitive and robust manner.

The present invention improves the situation.

To this end, the invention provides a communicating mobile terminal arranged to communicate with a robot through a protocol-based communication means, characterized in that the communicating mobile terminal comprises:
- a display capable of displaying image data,
- a selector for designating an image portion,
- an input member,
- a communication tool arranged to cooperate with a robot and a computer unit, and
- firmware arranged to:
a. produce on the display a working image from video data received from the robot and corresponding to a shot taken by that robot,
b. receive selection data for an image portion designated with the selector,
c. transmit the selection data to the computer unit in order to generate an image model corresponding to the designated image portion, and
d. transmit to the computer unit a unique intelligible identifier, defined with the input member, for storage in correspondence with the image model.

According to one embodiment, the communicating mobile terminal further comprises a touch screen to facilitate interaction between user and terminal. The input member of the mobile terminal may comprise a physical or virtual keyboard. It may also comprise a microphone associated with a speech recognition program. In addition, according to another embodiment, the terminal comprises a search command input for controlling the robot.

The invention also relates to a cybernetic device comprising a robot, a communicating mobile terminal and a processing computer unit, connected to one another by a protocol-based communication means, characterized in that the device comprises:
- a camera arranged on the robot to capture video data,
- a display on the communicating mobile terminal for forming a working image from the video data received by the camera,
- a selector for designating an image portion on the working image,
- an image analyzer in the processing computer unit for generating an image model from the image portion designated with the selector,
- an input member for entering a unique intelligible identifier in correspondence with the image model, and
- a dedicated memory area for storing the image model and its unique intelligible identifier.

According to one embodiment, the communicating mobile terminal included in the device comprises a touch screen. According to another embodiment, the selector of the device is arranged on the communicating mobile terminal. According to another embodiment, the input member of the device is arranged on the communicating mobile terminal. The input member may comprise a physical or virtual keyboard. According to a preferred embodiment of the invention, the input member comprises a microphone associated with a speech recognition program to make input easier for the user.

The robot of the device of the invention may comprise a means of locomotion. The means of locomotion can be controlled from the communicating mobile terminal by means of a suitable command.

According to one embodiment, the device of the invention comprises a search command input and an environment scanner. In this embodiment, the search command input can be arranged on the communicating mobile terminal and the scanner can be arranged on the robot. Preferably, the scanner is arranged in the camera.

In addition, the invention relates to a cybernetic method comprising the following steps:
a. capturing video data with a camera arranged on a robot, said data corresponding to a shot taken by that robot,
b. forming a working image on a communicating mobile terminal (TMC) from the video data captured in step a.,
c. designating an image portion on the working image formed in step b.,
d. generating an image model from the image portion designated in step c.,
e. entering a unique intelligible identifier corresponding to the image model generated in step d.,
f. storing the image model and its unique intelligible identifier in a dedicated memory.

According to one embodiment, the method further comprises the following steps:
g. accessing the dedicated memory to select the unique intelligible identifier,
h. sending, through the selection made in step g., a scan command to a robot,
i. scanning the environment of the robot, the environment comprising objects,
j. selecting an object of the environment,
k. establishing an object model of the object selected in step j.,
l. comparing the object model of step k. with the image model generated in step d., and establishing a correspondence factor according to the identity between the two models,
m. repeating steps j. to l. until the correspondence factor reaches a threshold value indicating that the object model corresponds to the image model.

The designation in step c. can be performed on the communicating mobile terminal. According to one embodiment, the designation in step c. is performed by touch. The input in step e. can be performed on the communicating mobile terminal. According to one embodiment, the input in step e. is performed by speech recognition of a user's voice. According to another embodiment, the input in step e. is performed on a physical or virtual keyboard.
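The search steps g. to m. above can be sketched as follows. This is a minimal illustration, not the implementation of the invention: the names `ModelStore`, `match_factor` and `search`, the use of histogram intersection as the correspondence factor, and the 0.8 threshold are all assumptions introduced here. The text only requires that some correspondence factor be compared against a threshold value.

```python
from dataclasses import dataclass, field


@dataclass
class ModelStore:
    """Hypothetical dedicated memory: identifier -> image model."""
    models: dict = field(default_factory=dict)

    def teach(self, identifier, image_model):
        # Steps e. and f.: store the model under the user-chosen identifier.
        self.models[identifier] = image_model


def match_factor(object_model, image_model):
    """Assumed correspondence factor: intersection of two normalized
    histograms, between 0 (no overlap) and 1 (identical)."""
    return sum(min(a, b) for a, b in zip(object_model, image_model))


def search(store, identifier, candidate_models, threshold=0.8):
    """Steps g. to m.: walk through candidate object models until one
    reaches the threshold. Returns (index, factor), or None if none does."""
    target = store.models[identifier]                 # step g.
    for index, object_model in enumerate(candidate_models):  # steps j., k.
        factor = match_factor(object_model, target)   # step l.
        if factor >= threshold:                       # step m.
            return index, factor
    return None


store = ModelStore()
store.teach("ball", [0.5, 0.5, 0.0])
result = search(store, "ball", [[0.0, 0.0, 1.0], [0.4, 0.6, 0.0]])
```

Here the second candidate is returned because its overlap with the stored "ball" model exceeds the assumed threshold, while the first candidate shares no histogram mass with it.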

Other advantages and features will become apparent on reading the detailed description below and from the appended figures, in which:
- Figure 1 is a functional representation of the communicating mobile terminal of the invention according to one embodiment,
- Figure 2 is a functional representation of the communicating mobile terminal of the invention according to another embodiment,
- Figure 3 is a functional representation of the communicating mobile terminal of the invention according to another embodiment,
- Figure 4 is a schematic representation of the device of the invention,
- Figure 5 is a flowchart of the method according to the invention,
- Figure 6 is a flowchart of the method according to another embodiment of the invention,
- Figure 7 is an illustrative diagram of an example of a designation step according to one embodiment of the invention,
- Figure 8 is an illustrative diagram of an example of a contextual menu accessible during the method according to one embodiment of the invention,
- Figure 9 is an illustrative diagram of an example of an input step of the method according to one embodiment of the invention,
- Figure 10 is an illustrative diagram of an example of a memory access step according to one embodiment of the invention, and
- Figure 11 is an illustrative diagram of an example of a selection menu accessible during the method according to one embodiment of the invention.

The drawings and the description below essentially contain elements of a definite nature. The drawings form an integral part of the description and may therefore serve not only to make the present invention better understood, but also to contribute to its definition, where appropriate.

Figure 1 shows a functional representation of a communicating mobile terminal TMC 100 according to one embodiment of the invention. The mobile terminal 100 comprises a protocol-based communication tool 110 (for example WiFi) in order to communicate (OUTC) with a robot 102. The data exchange D1 that takes place between the robot 102 and the terminal 100 allows, in particular, the transmission of video data captured by the robot 102. In addition, the communicating mobile terminal 100 can use the data exchange D1 to transmit movement commands or other mechanical commands to the robot 102. For this purpose, one embodiment provides a search command input on the mobile terminal 100 (not shown here).
According to the embodiment described, a data exchange D2 between the communication tool 110 and firmware 114 passes the data transmitted by the robot 102 on to the firmware 114. This is generally digital video data. From the data D2, the firmware 114 can produce (VISU) a working image on a display 104. This working image corresponds to a shot taken by the robot 102. However, the working image does not necessarily correspond to the video data transmitted in the data D1. In practice, the working image is generally a simplification of the images actually captured by the robot 102. The display 104 is advantageously a touch display, to facilitate direct interaction between the working image and the user (detailed below).

A selector 106 is arranged in the mobile terminal 100 to allow a user to make a selection (SEL) of an image portion on the working image. When the display 104 is a touch display, the selection SEL can be made by manually designating a chosen portion of the working image, for example with a selection stylus. The portion designated on the working image corresponds to digital data D4, which is a subset of the digital data D3 forming the working image. The firmware 114 is arranged to receive selection data D5 and to transmit this data to a computer unit 112. The data may be altered before transmission, for example by filters or the like. A distinction is therefore made between the selection data D5 received by the firmware and the selection data D6 transmitted to the computer unit 112. Nevertheless, D5 and D6 may be identical.
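The relationship between the working image (D3), the designated portion (D4) and the selection data before and after optional filtering (D5, D6) can be illustrated with a short sketch. Everything in it is an assumption made for illustration: the source does not say how the selection is encoded, so a rectangular box over an image stored as a list of pixel rows is used here, and the function names `crop_selection` and `prepare_selection` are hypothetical.

```python
def crop_selection(working_image, box):
    """Return the designated portion (D4) of the working image (D3).

    working_image: list of rows, each row a list of pixel values.
    box: (left, top, right, bottom) in pixel coordinates, an assumed
    encoding of what the user designated on the touch display.
    """
    left, top, right, bottom = box
    height = len(working_image)
    width = len(working_image[0]) if height else 0
    # Normalize and clamp, so a stroke drawn right-to-left or running
    # off the edge of the screen still yields a valid region.
    left, right = max(0, min(left, right)), min(width, max(left, right))
    top, bottom = max(0, min(top, bottom)), min(height, max(top, bottom))
    return [row[left:right] for row in working_image[top:bottom]]


def prepare_selection(selection, filters=()):
    """Turn received selection data (D5) into transmitted data (D6).

    With no filters, D6 is identical to D5, as the text allows.
    """
    for apply_filter in filters:
        selection = apply_filter(selection)
    return selection


# A 4x4 test image whose pixel value encodes its position (row*10 + col).
image = [[row * 10 + col for col in range(4)] for row in range(4)]
d5 = crop_selection(image, (1, 1, 3, 3))
d6_same = prepare_selection(d5)
d6_inverted = prepare_selection(
    d5, [lambda sel: [[255 - p for p in row] for row in sel]]
)
```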

The computer unit 112 then generates (UI) an image model corresponding to the image portion designated with the selector on the working image. A histogram-generating algorithm, implemented in the computer unit 112, can be used for this purpose. Note that the generation can be carried out in direct cooperation with the firmware 114 or independently of it.

The mobile terminal 100 also comprises an input member 108. This may in particular be a physical or virtual keyboard to facilitate input (ORGS). When the input member 108 is a virtual keyboard, it can be arranged together with the display 104, since in that case it is generally a keyboard accessed through a touch screen. For design and practical reasons, the display 104 and the input member 108 are then arranged together. It is particularly advantageous to provide an input member 108 comprising a microphone and a speech recognition program. This makes input easier for the end user, who does not need to write in order to enter a unique intelligible identifier. Children and people with dyslexia, in particular, will especially benefit from this embodiment.
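The histogram-based image model mentioned above can be sketched as follows. The source names only "a histogram-generating algorithm", so the specifics here are assumptions: a joint RGB color histogram, 8-bit channels, four bins per channel, and normalization so that models of regions of different sizes stay comparable. The function name `color_histogram` is hypothetical.

```python
def color_histogram(pixels, bins_per_channel=4):
    """Build a normalized joint RGB histogram.

    pixels: iterable of (r, g, b) tuples with values in 0..255.
    Returns a list of bins_per_channel**3 floats that sum to 1.0
    (or all zeros if no pixels were given).
    """
    n = bins_per_channel
    histogram = [0.0] * (n ** 3)
    count = 0
    for r, g, b in pixels:
        # Map each 8-bit channel to a bin index in 0..n-1.
        r_bin, g_bin, b_bin = r * n // 256, g * n // 256, b * n // 256
        histogram[(r_bin * n + g_bin) * n + b_bin] += 1.0
        count += 1
    if count:
        histogram = [value / count for value in histogram]
    return histogram


# Three pure-red pixels and one pure-blue pixel, two bins per channel.
model = color_histogram([(255, 0, 0)] * 3 + [(0, 0, 255)], bins_per_channel=2)
```

A color histogram discards the spatial layout of the selected portion, which is one reason it tolerates changes in viewing angle; whether the described system relies on that property is not stated in the text.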
According to one embodiment an implementation of a calculation algorithm of the dynamic comparison (Dynamic Time Warping) type for speech recognition is provided. The microprogram 114 is arranged to receive data D7 formed with the input device 108. This data D7 comprises a unique identifier intelligible, generally chosen by the user during the ORGS input. The unique intelligible identifier is in direct correspondence with the selection made by means of the selector 106, namely with the image portion chosen by the user. The microprogram 114 is arranged to transmit D8 to the computer unit 112 a unique intelligible identifier defined with the input device 108. The data transmission D8 is performed to store the unique intelligible identifier in correspondence with the image model generated. by the computer unit 112. The storage can be done in a memory area of the RAM type and can be arranged in a given relational database. Of course, the storage is not necessarily arranged in the computer unit 112. A separate memory can be used and stored in the choices in the mobile terminal 100, the robot 102 or in the computer unit 112. In addition, the Figure 1 shows a computer unit 112 arranged in the communicating mobile terminal 100. This is not necessarily the case.
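The Dynamic Time Warping comparison mentioned above for speech recognition can be illustrated with a minimal sketch. The description does not specify the acoustic features or the template format, so the sketch assumes, purely for illustration, that each spoken identifier is reduced to a sequence of numbers and compared against stored reference sequences; the names `dtw_distance`, `recognize` and `templates` are hypothetical.

```python
from math import inf


def dtw_distance(a, b):
    """Dynamic time warping distance between two numeric sequences."""
    n, m = len(a), len(b)
    # cost[i][j] is the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def recognize(utterance, templates):
    """Return the label of the stored template closest to the utterance.

    `templates` maps a label to a reference sequence (assumed format).
    """
    return min(templates, key=lambda label: dtw_distance(utterance, templates[label]))
```

Because the alignment may stretch or compress either sequence in time, the same word spoken faster or slower can still match its reference template.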

FIG. 2 shows a functional representation of a communicating mobile terminal (TMC) 100 of the invention according to another embodiment. In this embodiment, the unit 112 can be physically disconnected from the mobile terminal 100. The communication tool 110 then handles the data transfer, in particular of D6 and D8. The virtual connection between the microprogram 114 and the computer unit 112 is also provided by the communication tool 110. In this embodiment the computer unit 112 is arranged in a personal computer, which can be advantageous in terms of computing power and storage.

FIG. 3 shows a functional representation of a communicating mobile terminal (TMC) 100 of the invention according to another embodiment, in which the computer unit 112 is arranged in the robot 102. In this embodiment, the unit 112 can be physically disconnected from the mobile terminal 100. The communication tool 110 then handles the data transfer, in particular of D6 and D8. The virtual connection between the microprogram 114 and the computer unit 112 is also provided by the communication tool 110. A robot generally always comprises a computer processing unit; in particular, this unit drives the motors of the robot. It may be advantageous to place the computer unit 112 in the robot 102 for reasons of design and ease of use for the end user.

FIG. 4 is a schematic representation of the cybernetic device 400 of the invention. According to the embodiment described here, the cybernetic device 400 comprises a robot 102, a communicating mobile terminal 100 and a computer unit 112. The robot 102, the terminal 100 and the computer unit 112 are connected to one another by a communication tool 110 using a protocol of the WiFi type or another protocol.

The robot 102 of the device 400 comprises a camera 116 for capturing (CAM) video data. The video data generally correspond to images shot by said robot 102.

The communicating mobile terminal 100 comprises a display 104 for forming a working image. The working image is formed from the video data captured by the camera 116 arranged on the robot 102, which are transmitted to the terminal 100 via the communication tool 110. The working image can be processed as needed and, for example, subjected to one or more filtering operations. The cybernetic device 400 comprises a selector 106 for designating an image portion on the working image. The selector is preferably arranged on the mobile terminal 100, because this makes it easier for the end user to use. However, this is not necessarily so: the selector can in particular be arranged in a personal computer that shows the working image on a screen, for example. For these reasons, the link between the display 104 and the selector 106 is represented by a broken line in FIG. 4.

Advantageously, the component (computer, TMC or other) forming the selector 106 comprises touch-sensitive means for the selection (SEL) of the image portion of the working image. The cybernetic device 400 comprises a computer unit 112 in which an image analyzer 118 is arranged. The analyzer 118 generates an image model from the image portion designated with the selector 106. The analyzer 118 may comprise an implementation of an image processing program of the histogram generator type.
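As an illustration of what a histogram-generator image model could look like, the following minimal sketch builds a normalized color histogram from the pixels of a designated image portion. The description does not say which kind of histogram is used; the choice of a joint RGB histogram, the bin count and the name `color_histogram` are assumptions made for this example only.

```python
def color_histogram(pixels, bins=4):
    """Normalized joint RGB histogram of an image portion.

    `pixels` is an iterable of (r, g, b) tuples with components in 0..255.
    Returns a list of bins**3 values that sum to 1 (all zeros if empty).
    """
    hist = [0.0] * (bins ** 3)
    for r, g, b in pixels:
        # Map each 0..255 component to a bin index in 0..bins-1.
        ri, gi, bi = r * bins // 256, g * bins // 256, b * bins // 256
        hist[(ri * bins + gi) * bins + bi] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist
```

Normalizing by the pixel count makes models of differently sized image portions comparable with one another.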

The device comprises a memory area 120 for storing the image model generated with the analyzer 118 in the computer unit 112. It can be a RAM-type memory area. The memory area may be arranged, as desired, in the computer unit 112, in the communicating mobile terminal 100, in the robot 102, or separately, for example on a portable hard disk. The device 400 further comprises an input member 108 for entering a unique intelligible identifier in correspondence with the image model. Generally, it is the end user who defines this identifier. Preferably, the input member 108 comprises a microphone and a speech recognition program to make it easier to use. The input member may be arranged in the mobile terminal 100 or separately from it. In particular, the input member 108 may be arranged in a personal computer. The memory area 120 is arranged to store, alongside the image model generated by the analyzer 118, the unique intelligible identifier defined with the input member 108. The memory area 120 may comprise a relational database for storing the correspondence between each unique intelligible identifier entered and its own image model.
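The correspondence between identifiers and image models held in a relational database can be sketched as follows. This is only one possible arrangement: the use of SQLite, the table name `image_model`, the JSON encoding of the model and the helper names are all assumptions introduced for the example, not details given in the description.

```python
import json
import sqlite3


def open_store(path=":memory:"):
    """Open the store; ':memory:' keeps the database in RAM."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS image_model ("
        "identifier TEXT PRIMARY KEY, histogram TEXT NOT NULL)"
    )
    return conn


def save_model(conn, identifier, histogram):
    """Store (or overwrite) the image model associated with an identifier."""
    conn.execute(
        "INSERT OR REPLACE INTO image_model (identifier, histogram) VALUES (?, ?)",
        (identifier, json.dumps(histogram)),
    )
    conn.commit()


def load_model(conn, identifier):
    """Return the image model stored for an identifier, or None if unknown."""
    row = conn.execute(
        "SELECT histogram FROM image_model WHERE identifier = ?", (identifier,)
    ).fetchone()
    return json.loads(row[0]) if row else None
```

Making the identifier the primary key enforces its uniqueness, so each identifier maps to exactly one image model.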

According to one embodiment, the cybernetic device 400 comprises a search command input and an environment scanner (not shown). In this embodiment, the search command input can be arranged on the communicating mobile terminal 100 and the scanner can be arranged on the robot 102.

Preferably, the scanner is arranged in the camera 116.

FIG. 5 shows a functional flowchart of a method implemented with the device of the invention. The cybernetic method of the invention comprises a capture operation 500 (CAPT) in which video data are captured by a camera 116 arranged on a robot 102. The video data correspond to images shot by the robot 102.

The next operation 502, image formation (FORM IMG W), comprises forming a working image. The working image may be formed by means of a microprogram 114 on a display 104 arranged on the communicating mobile terminal 100. The working image is formed from the video data captured in the preceding operation 500.

A following designation operation 504 (DES) comprises designating an image portion on the working image formed in the preceding operation 502. The designation can be made with a selector 106. The image portion is generally chosen by the end user and corresponds to an object located in the environment of the robot 102. Indeed, the working image corresponds indirectly to images shot by the robot 102 and therefore contains objects located in the environment of the robot 102.

The following generation operation 506 (GEN MOD IMG) comprises generating an image model from the image portion designated in the designation operation 504. According to one example embodiment, the generation is performed with software of the histogram generator type. The image model is therefore directly linked to the real appearance of the object initially designated by the user on the working image.

The following input operation 508 (IDENT UNI INTEL) comprises entering a unique intelligible identifier corresponding to the image model generated in operation 506. The identifier is generally chosen by the end user to give a logical name to the image portion previously chosen in the designation operation 504. The entry is made by means of an input member 108. Advantageously, this input member comprises a microphone and a speech recognition program using a Dynamic Time Warping algorithm.

A storage operation 510 (SVE MEM) then stores the image model generated in the generation operation 506 (GEN MOD IMG) in association with the unique intelligible identifier defined in the input operation 508 (IDENT UNI INTEL). The storage can take place in the memory 120 in particular.

The cybernetic method makes it possible, in particular by means of the communicating mobile terminal 100 and/or the device 400 of the invention, to circumvent the problems encountered with segmentation algorithms.

FIG. 6 shows another embodiment of the method of the invention, with additional operations. In this embodiment, the cybernetic method further comprises a memory access operation 600 (ACC MEM) for accessing the memory area 120. In the mode described here, access is obtained by connecting to the apparatus containing the memory area 120 (TMC, computer or robot). This access allows, in operation 602 (SEL ID UNI INTEL), the selection by means of the selector 106 of a unique intelligible identifier previously defined and stored in the memory area 120. The selection 602 can be made by speaking the identifier when the input member comprises a microphone and a speech recognition program. The selection 602 of a unique intelligible identifier is directly tied to the association, in the memory area 120, between this identifier and the corresponding image model generated in operation 506.

In the following command-sending operation 604 (SCAN COM), a scan command is transmitted to a robot 102. In a following scanning operation 606 (SCAN MR), the robot 102 scans the real world around it. In the case of a service robot, the environment may be a hospital or a nuclear plant, for example. In the case of a toy or domestic robot, the environment may be the inside of a house, a car interior or a garden, for example. The environment of the robot 102, namely the real world, contains objects.

In a following operation 608, the robot 102 randomly selects an object of the real world and, in a generation operation 610 (GEN MOD OBJ), generates an object model based on the shape of the selected object. The object model generation operation 610 is carried out with software identical to that used for the image model generation operation 506. A following comparison operation 612 (ALIGN MOD OBJ VS MOD IMG) compares the object model generated in operation 610 with the image model generated in operation 506 and establishes a correspondence factor reflecting the degree of identity between the two models. This can be done by superimposing the respective histograms. When, in the comparison operation 612, the identity of the two models exceeds a predetermined threshold (for example 90% identity), the robot 102 stops scanning the environment and gives the user an affirmative response indicating that it has found the object. When, in the comparison operation 612, the identity of the two models does not exceed the predetermined threshold, the robot 102 resumes the scanning operation 606 (SCAN MR) to scan the real world and select (operation 608) another object for comparison (operation 612). The calculation operation 614 determines the identity between the object model and the image model. Generally, operation 614 is included in the comparison operation 612.
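The comparison by superimposing histograms and the threshold test can be sketched as below. The description does not define the correspondence factor precisely; this example assumes histogram intersection on normalized histograms as the measure of identity, and it iterates over a prepared list of candidate object models instead of selecting objects at random during a live scan. The names `histogram_overlap` and `search` are hypothetical.

```python
def histogram_overlap(h1, h2):
    """Histogram intersection of two normalized histograms, in the range 0..1."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def search(image_model, object_models, threshold=0.9):
    """Return the index of the first object model matching the image model.

    A match is an overlap strictly above `threshold` (0.9 mirrors the 90%
    example in the text). Returns None when no candidate matches.
    """
    for index, object_model in enumerate(object_models):
        if histogram_overlap(image_model, object_model) > threshold:
            return index  # the scan stops at the first match
    return None
```

Returning `None` corresponds to the case in which the scan would simply continue with further objects.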

An example embodiment of the invention in practical use by a user is described below. This example is given for guidance, to help the reader's understanding, and is presented with a non-limiting software implementation. The example below refers to all of FIGS. 1 to 11. The invention thus allows, in particular, a user unfamiliar with robotics to teach new words to a robot by using a mediating object: a communicating mobile terminal 100. Advantageously, this communicating mobile terminal can have a touch screen to facilitate interaction with the user. Examples of a communicating mobile terminal 100 include an iPhone or iPod Touch from Apple Inc.

The communicating mobile terminal 100 comprises a display 104 capable of displaying image data.

Thus, the terminal 100 can display a video feed from a camera 116 arranged on a robot 102. The video feed generally consists of digital video data captured by the camera 116 arranged on the robot 102 and corresponds to images shot by this robot 102. The camera 116 thus captures the physical environment of the robot 102. The physical environment is generally the real world and contains objects. Examples of real-world objects include chairs, tables, televisions, trees, cars, toys (balls, etc.) and others. Of course, the objects encountered depend on the environment of the robot 102. The image data displayed on the communicating mobile terminal 100 make it possible to form a working image (502). On this working image, the user can designate (504) an image portion. This image portion generally shows a real-world object. The designation 504 therefore makes it possible to show real-world objects to the robot 102.

FIG. 7 is an illustrative diagram of the designation (DES) 504 of an image portion on a working image containing a real-world object, here a ball. When the mobile terminal 100 comprises a touch screen, the designation DES 504 is generally made by encircling an image portion on the working image, the image portion containing an object. The encircling can be done by hand or with a stylus suitable for touch screens. Encircling the object not only allows the user to show the object to the robot 102 but also performs a manual segmentation of the image without using segmentation algorithms. This contributes in particular to the robustness of the method of the invention. In the example embodiment described here, the mobile terminal 100 comprises an implementation of an algorithm called Navidget. The algorithm is described in the scientific publication "Navidget for 3D Interaction: Camera Positioning and Further Uses", Martin Hachet, Fabrice Dècle, Sebastian Knödel, Pascal Guitton, Int. J. Human-Computer Studies, 2008, and allows a sensitive designation.

According to the invention, an image model is then generated (506) from the image portion designated in step 504. The image model generated depends directly on the image portion and on the shape elements that make it up. Roughly speaking, the image model therefore corresponds to a real-world object. The generation 506 of the image model can in particular be carried out by means of software comprising an algorithm of the histogram generator type. The software may be held in a memory area located in a computer unit 112 that is either physically separate from or included in the communicating mobile terminal 100. When the computer unit 112 is physically separate from the communicating mobile terminal 100, it can be housed in a personal computer or in the robot 102.

According to an optional embodiment, the mobile terminal 100 is arranged to display a contextual menu (MENU) on the screen after the designation 504. This contextual menu can in particular offer the user different interaction choices depending on the encircled object.
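The manual segmentation obtained by encircling an object, as described above, can be illustrated with a minimal sketch that keeps only the pixels lying inside the closed stroke drawn by the user. The description does not say how the encircled region is extracted; the ray-casting point-in-polygon test, the treatment of the stroke as a closed polygon of touch points, and the names `point_in_polygon` and `pixels_inside` are assumptions made for this example.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: is the point (x, y) inside the closed polygon?"""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def pixels_inside(image, stroke):
    """Return the pixels of `image` whose centers lie inside the stroke.

    `image` is a list of rows of pixel values; `stroke` is the list of
    (x, y) touch points, implicitly closed from the last point to the first.
    """
    return [
        image[y][x]
        for y in range(len(image))
        for x in range(len(image[0]))
        if point_in_polygon(x + 0.5, y + 0.5, stroke)
    ]
```

The pixels returned by such a step are what a histogram generator would then turn into the image model.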

FIG. 8 is an illustrative diagram of a contextual menu MENU offering different command options. In the example embodiment described, the contextual menu MENU offers the user four options:
- "Name",
- "Zoom",
- "Approach" and
- "What is it?".

The first option, "Name", is intended to produce a response from the robot 102 in association with the selected object. This option is detailed below.

The second option, "Zoom", allows direct interaction with the camera arranged on the robot 102. This option mechanically actuates the lens of the camera 116 to give a close-up view of the selected object. The interaction with the camera 116 can of course be of any other kind and, for example, include a function for photographing the object, altering the color of the view, adjusting the precision, etc.

The contextual menu MENU may also include other commands, such as commands for moving the robot 102. Thus, an "Approach" command may cause the robot 102 to advance mechanically towards the selected object. Other movement commands may consist of going around the selected object, or of a command to photograph the selected object from another angle.

Figure 9 shows an illustrative diagram of an entry ENTR 508 of a unique intelligible identifier.

When the "What is it?" option is selected, the user can, according to the invention, enter ENTR 508 on an input member 108 a unique intelligible identifier corresponding to the image model generated by the computer unit 112. For this purpose, the mobile terminal 100 may comprise a physical or virtual keyboard. A virtual keyboard is understood here to mean a keyboard displayed on and accessible through a touch screen. On a virtual keyboard, the entry ENTR 508 can be made by hand or with a stylus suitable for a touch screen. According to another embodiment, the communicating mobile terminal 100 is arranged to communicate with a handwriting recognition program, as shown in Figure 9. A user can thus write directly, freehand or with a stylus, on the touch screen of the terminal 100. According to a preferred embodiment, the communicating mobile terminal 100 comprises a microphone in association with a voice recognition program.

This makes the entry 508 easier for the user. The voice recognition program may in particular comprise a computation algorithm of the dynamic time warping type. The image model can thus be associated with a unique intelligible identifier. A unique intelligible identifier is understood here to mean an identifier that is associated solely with one specific object, namely with a single generated image model. The unique intelligible identifier may be a sound, a word written on the screen, or a word entered on a physical or virtual keyboard. In the embodiment described here, this identifier is the word "ball".
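By way of illustration only, the dynamic time warping comparison mentioned above can be sketched as follows. The description does not specify the features, the local distance or the data layout; the one-dimensional numeric sequences, the absolute-difference cost and the names `dtw_distance` and `rank_identifiers` are assumptions made for this example.

```python
from math import inf


def dtw_distance(a, b):
    """Dynamic time warping distance between two 1-D numeric sequences.

    Assumption: the local cost is the absolute difference between samples.
    """
    n, m = len(a), len(b)
    # cost[i][j] holds the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # sample of a repeated against b
                cost[i][j - 1],      # sample of b repeated against a
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


def rank_identifiers(query, templates):
    """Sort stored identifiers from the closest to the farthest template.

    `templates` is a hypothetical mapping {identifier: reference sequence}.
    """
    return sorted(templates, key=lambda name: dtw_distance(query, templates[name]))
```

A spoken entry that is merely stretched in time, such as `[1, 2, 2, 3]` against a stored `[1, 2, 3]`, yields a distance of zero under this cost, which is the property that makes this family of algorithms suitable for comparing utterances of different durations.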

The invention comprises a dedicated memory area 120 for storing 510 the association between the image model and the unique intelligible identifier. This memory area 120 can either be included directly in the mobile terminal 100 or be arranged in a personal computer or in the robot 102. The association can be stored in a relational database. Once the association between the image model and the unique intelligible identifier has been stored in the dedicated memory area 120, the "Name" option mentioned above becomes available with a positive return. Indeed, for recognition of the selected object to be possible, the image model corresponding to the object must have been stored beforehand.
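A minimal sketch of such a relational store is given below, assuming SQLite and an image model serialized as bytes. The table name `image_models`, its columns and the helper functions are hypothetical and are not taken from the description.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) the dedicated store; the schema is an assumption."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS image_models ("
        " identifier TEXT PRIMARY KEY,"  # the unique intelligible identifier
        " model BLOB NOT NULL)"          # the serialized image model
    )
    return conn


def store_association(conn, identifier, model_bytes):
    """Store (or overwrite) the image model associated with an identifier."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO image_models (identifier, model) VALUES (?, ?)",
            (identifier, model_bytes),
        )


def lookup(conn, identifier):
    """Return the stored model, or None when nothing was stored beforehand."""
    row = conn.execute(
        "SELECT model FROM image_models WHERE identifier = ?", (identifier,)
    ).fetchone()
    return None if row is None else row[0]
```

The primary key on `identifier` reflects the requirement that one identifier is associated with a single image model; a `None` result corresponds to the case in which no recognition is possible because nothing has been stored.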

When this has not been done (nothing has been stored), no recognition is possible, other than an erroneous one. The return would then be an empty set or an erroneous set. When an image model corresponding to the selected object is available in the memory area 120, a positive return of the unique intelligible identifier is available. According to an advanced embodiment, the robot 102 may have an object search command 604. Thus, when a unique intelligible identifier previously stored in the dedicated memory area 120 is supplied as input through a search entry SRCH 602 on the communicating mobile terminal 100, the robot 102 can search for image models corresponding to that identifier. Figure 10 shows the search entry SRCH 602 made by handwriting.

The user accesses 600 the dedicated memory to select 602 the unique intelligible identifier. When the search entry SRCH is made by means of handwriting or voice recognition, several unique intelligible identifiers may correspond to the entry 602. This depends directly on the sensitivity of the recognition means (handwriting or voice). A multiple return of image models that may correspond to the entered identifier can thus be obtained.
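The effect of recognition sensitivity on the number of returned identifiers can be illustrated with a short sketch. It assumes the recognized entry is available as text and substitutes the approximate string matching of Python's standard `difflib` module for the recognition means of the description; the function name `candidate_identifiers` and the `sensitivity` parameter are hypothetical.

```python
import difflib


def candidate_identifiers(entry, stored_identifiers, sensitivity=0.6, limit=5):
    """Return stored identifiers that plausibly match a recognized entry.

    `sensitivity` is passed to difflib as the similarity cutoff (0.0 to 1.0):
    a lower value returns more candidates, a higher value returns fewer.
    Results are ordered from the most to the least similar.
    """
    return difflib.get_close_matches(
        entry, stored_identifiers, n=limit, cutoff=sensitivity
    )
```

With a permissive cutoff, an imperfectly recognized entry may match several stored identifiers, which is the multiple-return case; with a strict cutoff it may match none.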

According to a preferred embodiment, the search entry SRCH 602 is made by voice recognition to make entry easier for the user. Figure 11 shows the case of a multiple return. The display of the communicating mobile terminal 100 shows graphic choices, each corresponding to an object and thus to an image model. The user can then select CHO the image-model object for which a search is to be made. According to one embodiment, the image-model objects can be ranked in order of relevance on the display (for example by means of an algorithm of the dynamic time warping type).

Once the selection CHO has been made, a scan command 604 is transmitted to the robot 102. The robot then scans 606, by means of a scanner, its environment, which comprises real-world objects. The robot 102 is arranged to select 608 an object from the environment and to establish 610 an object model of that object. The object model is then compared 612 with the image model stored in the dedicated memory in order to establish a correspondence factor 614 according to the identity between the two models. The identity can be established by means of histogram superposition algorithms. If the correspondence factor reaches a threshold value indicating that the object model corresponds to the image model, then the robot 102 has found the object sought.
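The comparison and threshold test described above can be sketched as follows. The description does not define the histogram algorithm; this example assumes that both models are normalized intensity histograms and uses histogram intersection as the correspondence factor. The bin count, the threshold of 0.8 and all function names are assumptions made for this example.

```python
def histogram(values, bins=8, max_value=256):
    """Normalized histogram of integer intensities in the range [0, max_value)."""
    counts = [0] * bins
    for v in values:
        counts[v * bins // max_value] += 1
    total = len(values) or 1  # avoid dividing by zero on an empty object
    return [c / total for c in counts]


def correspondence_factor(h1, h2):
    """Histogram intersection: 1.0 for identical histograms, 0.0 for disjoint ones."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def search(image_model, scanned_objects, threshold=0.8):
    """Examine scanned objects in turn until one reaches the threshold.

    Returns the index of the first matching object, or None if none matches.
    """
    for index, pixels in enumerate(scanned_objects):
        object_model = histogram(pixels)  # step 610: establish the object model
        if correspondence_factor(image_model, object_model) >= threshold:  # 612, 614
            return index
    return None
```

The loop mirrors the sequence of selecting an object, establishing its model and comparing it with the stored image model, and it stops as soon as the correspondence factor reaches the threshold value.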

Conversely, the device of the invention can start directly from an image model captured by the camera of the robot 102 in the environment and propose a single choice or multiple choices of unique intelligible identifiers. The communicating mobile terminal according to the invention allows a user to teach new words to a robot interactively, without requiring any prerequisites or particular training in robotics and/or computing. A member of the general public can thus give a robot good-quality lessons, associating visual representations of objects (image model) with representations of the referents (unique intelligible identifier). The invention is particularly suited to use in social robotics and therefore to service or recreational use in a domestic environment. The invention provides an interaction that makes possible, in practice and in an unconstrained environment, intuitive and simple learning that was not possible until now. In particular, the invention avoids segmentation problems through the designation, according to the invention, of a real-world object, notably by touch. In addition, the invention makes it possible to present visually to the end user, on the communicating mobile terminal, hypotheses corresponding to unique intelligible identifiers previously entered on the communicating mobile terminal, which helps to avoid errors and to refine the artificial intelligence of a robot.

Claims

1. A communicating mobile terminal (100) arranged to communicate with a robot (102) by a communication means using a protocol, characterized in that the communicating mobile terminal (100) comprises: a display (104) capable of displaying image data, a selector (106) for designating an image portion, an input member (108), a communication tool (110) arranged to cooperate with a robot (102) and a computer unit (112), and a microprogram (114) arranged to: a. produce on the display (104) a working image from video data received from the robot (102) and corresponding to a shot taken by that robot, b. receive selection data for an image portion designated with the selector (106), c. transmit the selection data to the computer unit (112) to generate an image model corresponding to the designated image portion, and d. transmit to the computer unit (112) a unique intelligible identifier defined with the input member (108) for storage in correspondence with the image model.

2. Communicating mobile terminal (100) according to claim 1, further comprising a touch screen.

3. Communicating mobile terminal (100) according to one of the preceding claims, wherein the input member (108) comprises a physical or virtual keyboard.

4. Communicating mobile terminal (100) according to one of the preceding claims, wherein the input member (108) comprises a microphone in association with a voice recognition program.

5. Communicating mobile terminal (100) according to one of the preceding claims, further comprising a search control input for controlling the robot (102).

6. A cybernetic device (400) comprising a robot (102), a communicating mobile terminal (100) and a computer unit (112), connected together by a communication tool (110) using a protocol, characterized in that the device comprises: a camera (116) disposed on the robot (102) for capturing video data, a display (104) on the communicating mobile terminal (100) for forming a working image from the video data received by the camera (116), a selector (106) for designating an image portion on the working image, an image analyzer (118) in the processing computer unit for generating an image model according to the image portion designated with the selector, an input member (108) for entering a unique intelligible identifier in correspondence with the image model, and a dedicated memory area (120) for storing the image model and its unique intelligible identifier.

7. Device (400) according to claim 6, wherein the communicating mobile terminal (100) comprises a touch screen.

8. Device (400) according to one of claims 6 to 7, wherein the selector (106) is arranged on the communicating mobile terminal (100).

9. Device according to one of claims 6 to 8, wherein the input member (108) is arranged on the communicating mobile terminal (100).

10. Device (400) according to one of claims 6 to 9, wherein the input member (108) comprises a physical or virtual keyboard.

11. Device (400) according to one of claims 6 to 10, wherein the input member (108) comprises a microphone in combination with a voice recognition program.

12. Device according to one of claims 6 to 11, wherein the robot (102) comprises a means of locomotion.

13. Device (400) according to one of claims 6 to 12, wherein the communicating mobile terminal (100) comprises a command for the means of locomotion of the robot.

14. Device (400) according to one of claims 6 to 13, further comprising: - a search control input, - an environment scanner.

15. Device (400) according to claim 14, wherein the search control input is arranged on the communicating mobile terminal (100).

16. Device according to one of claims 14 or 15, wherein the scanner is arranged on the robot (102).

17. Device (400) according to one of claims 14 to 16, wherein the scanner is arranged in the camera (116).

18. Cybernetic method comprising the following steps: a. capturing (500) video data with a camera disposed on a robot, said data corresponding to a shot taken by that robot, b. forming (502) a working image on a communicating mobile terminal from the video data captured in step a., c. designating (504) an image portion on the working image formed in step b., d. generating (506) an image model from the image portion designated in step c., e. entering (508) a unique intelligible identifier corresponding to the image model generated in step d., f. storing (510) the image model and its unique intelligible identifier in a dedicated memory.

19. Method according to claim 18, further comprising the following steps: g. accessing (600) the dedicated memory to select (602) the unique intelligible identifier, h. sending (604), by the selection made in step g., a scan command to a robot, i. scanning (606) the environment of the robot, the environment comprising objects, j. selecting (608) an object from the environment, k. establishing (610) an object model of the object selected in step j., l. comparing (612) the object model of step k. with the image model generated in step d., and establishing a correspondence factor according to the identity between the two models, m. repeating steps i. to l. until a threshold value of the correspondence factor is reached, indicating that the object model corresponds to the image model.

20. Method according to one of claims 18 to 19, wherein the designation in step c. is performed on the communicating mobile terminal.

21. Method according to one of claims 18 to 20, wherein the designation in step c. is performed by touch.

22. Method according to one of claims 18 to 21, wherein the entry in step e. is performed on the communicating mobile terminal.

23. Method according to one of claims 18 to 22, wherein the entry in step e. is performed by voice recognition of a user's voice.

24. Method according to one of claims 18 to 23, wherein the entry in step e. is performed on a physical or virtual keyboard.