WO2010116057A1

WO2010116057A1 - Mobile communication terminal, and method and device for recognizing shapes for a robot

Info

Publication number: WO2010116057A1
Application number: PCT/FR2010/000289
Authority: WO
Inventors: Pierre Rouanet; Pierre-Yves Oudeyer
Original assignee: Inria Institut National De Recherche En Informatique Et En Automatique
Priority date: 2009-04-08
Filing date: 2010-04-06
Publication date: 2010-10-14
Also published as: FR2944402A1; FR2944402B1

Abstract

The invention relates to a mobile communication terminal (100) arranged to communicate with a robot (102) by a communication means. The terminal includes a display (104) capable of displaying image data, a selector (106) for designating a portion of an image, an input unit (108), a communication tool (110) arranged to cooperate with a robot (102) and a computer unit (112), and firmware (114) arranged to: produce a working image on the display (104) from video data corresponding to footage shot by the robot (102); receive data selecting a portion of the image designated with the selector; send the selection data to the computer unit in order to generate an image model corresponding to the designated portion of the image; and send the computer unit an intelligible unique identifier defined with the input unit, for storage in correspondence with the image model. The invention also relates to a cybernetic device including the terminal and to a cybernetic method.

Description

Communicating mobile terminal, device and method of robot pattern recognition

The invention relates to personal robotics.

More particularly, the invention relates to a communicating mobile terminal arranged to communicate with a robot, a cybernetic device and a cybernetic process.

Personal robotics, especially domestic robotics, whether for service or entertainment, is currently undergoing strong development. In general, robotics involves many highly complex techniques. The state of the art is diverse: alongside domestic robotics, there is also work on humanoid robots.

Entertainment robots include the pet robot dog "AIBO" and the humanoid robot "QRIO" (developed by Sony Corporation).

Service robots include the humanoid robot "ASIMO" (developed by Honda Motor Company, Ltd.).

There is also work on android robots. Patent application WO 98/49629 describes a universal epistemological machine for the creation of a synthetic form of existence. Patent application EP 1 477 277 describes a two-legged humanoid robot capable of fluid movements.

In personal robotics there is a need to develop intuitive interaction techniques between user and robot. In entertainment robotics in particular, simple interaction is increasingly in demand, for example to teach new words to a domestic companion robot. The need is also present in service robotics: teaching a robot a new service task to assist disabled, elderly or sick people must be as simple as possible and accessible to the "general public".

The learning must be sufficiently intuitive and simple for a user with no technical knowledge of robotics and/or computer programming.

Complex learning techniques based on segmentation algorithms are known today. With these techniques, interaction between user and robot is limited, especially as regards teaching a robot new words associated with objects or shapes, or new tasks.

Examples can be found notably in the scientific publication: "Luc Steels and Frédéric Kaplan. Aibo's first words: The social learning of language and meaning. Evolution of Communication, 4(1):3-32, 2000".

State-of-the-art segmentation algorithms are almost impossible to implement in an unconstrained environment when there is no prior model of an object. Indeed, identifying specific objects or shapes in a given three-dimensional space by means of segmentation algorithms depends on uniform colors and/or textures in that space. Thus, in an unconstrained space such as the real world, identification depends in particular on the lighting or the viewing angle. These algorithms therefore have robustness problems in unconstrained environments.

In addition, there is a need to facilitate robot control as well as interaction between the robot and a user. Work using mediating objects to interact with a robot is known. In particular, the publication by T. W. Fong, C. Thorpe, and B. Glass, "Pdadriver: A Handheld System for Remote Driving," in IEEE International Conference on Advanced Robotics 2003, IEEE, July 2003, describes remote control of a robot via a mobile mediator.

However, the controls of the prior art do not give intuitive and simple access to social, entertainment or service interactions between user and robot. In addition, the commands of the prior art do not make it possible to teach a robot additional capabilities, especially real-world object recognition capabilities, in a simple, intuitive and robust manner.

The present invention improves the situation.

For this purpose, the invention is directed to a communicating mobile terminal arranged to communicate with a robot by a protocol communication means, characterized in that the communicating mobile terminal comprises:

- a display able to display image data,

- a selector for designating a portion of an image,

- an input device,

- a communication tool arranged to cooperate with a robot and a computer unit, and

- a microprogram arranged for:

a. producing on the display a working image from video data received from the robot and corresponding to a shot taken by this robot,

b. receiving selection data of an image portion designated with the selector,

c. transmitting the selection data to the computer unit to generate an image model corresponding to the designated image portion, and

d. transmitting to the computer unit a unique intelligible identifier defined with the input device, for storage in correspondence with the image model.

According to one embodiment the communicating mobile terminal further comprises a touch screen to facilitate the interaction between user and terminal.

The input device of the mobile terminal may comprise a physical or virtual keyboard. The input device may also include a microphone in association with a voice recognition program. In addition, according to another embodiment, the terminal includes a search control input for controlling the robot.

The invention also relates to a cybernetic device comprising a robot, a communicating mobile terminal and a computing processing unit, connected together by a protocol communication means characterized in that the device comprises:

a camera arranged on the robot for capturing video data,

a display on the communicating mobile terminal for the formation of a working image from the video data captured by the camera,

a selector for designating an image portion on the work image,

an image analyzer in the processing computer unit for generating an image model according to the portion of image designated with the selector,

an input device for entering a unique intelligible identifier in correspondence with the image model, and

a dedicated memory area for storing the image model and its unique intelligible identifier.

According to one embodiment, the communicating mobile terminal included in the device comprises a touch screen.

According to another embodiment, the selector of the device is arranged on the communicating mobile terminal.

According to another embodiment, the input device of the device is arranged on the communicating mobile terminal. The input device may comprise a physical or virtual keyboard to enable input. According to a preferred embodiment of the invention, the input device comprises a microphone in association with a voice recognition program to facilitate input by the user.

The robot of the device of the invention may comprise a means of locomotion. The means of locomotion can be controlled by the communicating mobile terminal by means of a suitable command.

According to one embodiment, the device of the invention comprises a search control input, and an environment scanner. In this embodiment, the search control input can be arranged on the communicating mobile terminal and the scanner can be arranged on the robot. Preferably, the scanner is arranged in the camera.

In addition, the invention provides a cybernetic method comprising the following steps:

a. capturing video data with a camera disposed on a robot, said data corresponding to a shot taken by this robot,

b. forming a working image on a communicating mobile terminal (TMC) from the video data captured in step a.,

c. designating an image portion on the working image formed in step b.,

d. generating an image model from the image portion designated in step c.,

e. entering a unique intelligible identifier corresponding to the image model generated in step d.,

f. storing the image model and its unique intelligible identifier in a dedicated memory.

According to one embodiment, the method further comprises the following steps:

g. accessing the dedicated memory to select the unique intelligible identifier,

h. sending, by the selection made in step g., a scan command to a robot,

i. scanning the environment of the robot, the environment including objects,

j. selecting an object from the environment,

k. establishing an object model of the object selected in step j.,

l. comparing the object model of step k. to the image model generated in step d., and establishing a correspondence factor according to the identity between the two models,

m. repeating steps j. to l. until a threshold value of the correspondence factor is reached, indicating that the object model corresponds to the image model.
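Steps g. to m. amount to a scan-and-compare loop. The following is only a minimal sketch of that loop, not the implementation of the application: the function name `search_object`, the callbacks `build_model` and `correspondence`, and the default threshold of 0.9 are all assumptions introduced for illustration, and a plain iterable stands in for the objects the robot encounters while scanning.

```python
from typing import Callable, Iterable, Optional, Sequence, Tuple

Model = Sequence[float]


def search_object(
    image_model: Model,
    scanned_objects: Iterable[object],
    build_model: Callable[[object], Model],
    correspondence: Callable[[Model, Model], float],
    threshold: float = 0.9,  # assumed value, for illustration only
) -> Optional[Tuple[object, float]]:
    """Return the first scanned object whose model reaches the threshold.

    All names here are hypothetical; the callbacks stand in for the
    model generation and comparison described in the text.
    """
    for obj in scanned_objects:                              # steps i. and j.
        object_model = build_model(obj)                      # step k.
        factor = correspondence(object_model, image_model)   # step l.
        if factor >= threshold:                              # step m.
            return obj, factor
    return None  # every scanned object was tried without a match
```

Passing the comparison as a callback keeps the loop independent of how the models are built, which reflects the fact that the text leaves the model generator open.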

The designation in step c. can be performed on the communicating mobile terminal.

According to one embodiment, the designation in step c. is performed by touch.

The entry in step e. can be performed on the communicating mobile terminal.

According to one embodiment, the input in step e. is performed by voice recognition of a user voice.

According to another embodiment, the entry in step e. is performed on a physical or virtual keyboard.

Other advantages and characteristics will appear on reading the detailed description below and the appended figures, in which:

FIG. 1 is a functional representation of the communicating mobile terminal of the invention according to one embodiment,

FIG. 2 is a functional representation of the communicating mobile terminal of the invention according to another embodiment,

FIG. 3 is a functional representation of the communicating mobile terminal of the invention according to another embodiment,

FIG. 4 is a schematic representation of the device of the invention,

FIG. 5 is a flowchart of the method according to the invention,

FIG. 6 is a flowchart of the method according to another embodiment of the invention,

FIG. 7 is an illustrative diagram of an exemplary embodiment of a designation step according to an embodiment of the invention,

FIG. 8 is an illustrative diagram of an exemplary embodiment of a contextual menu accessible during the method according to one embodiment of the invention,

FIG. 9 is an illustrative diagram of an exemplary embodiment of an input step of the method according to one embodiment of the invention,

FIG. 10 is an illustrative diagram of an exemplary embodiment of a memory access step according to one embodiment of the invention, and

FIG. 11 is an illustrative diagram of an exemplary embodiment of a selective menu accessible during the method according to one embodiment of the invention.

The drawings and the description hereafter contain, for the most part, elements of a certain nature. The drawings are an integral part of the description and can therefore not only serve to better understand the present invention, but also contribute to its definition, if any.

FIG. 1 shows a functional representation of a communicating mobile terminal TMC 100 according to one embodiment of the invention.

The mobile terminal 100 comprises a protocol communication tool 110 (for example "WiFi") for communicating (OUTC) with a robot 102.

The data exchange D1 which takes place between the robot 102 and the terminal 100 allows in particular the transmission of video data captured by the robot 102. In addition, the communicating mobile terminal 100 can, via the data exchange D1, transmit commands to the robot 102. For this, one embodiment provides a search control input on the mobile terminal 100 (not shown here).

According to the embodiment described, a data exchange D2 between the communication tool 110 and a microprogram 114 makes it possible to pass the data transmitted by the robot 102 to the microprogram 114. This is typically digital video data.

From the data D2, the microprogram 114 can produce (VISU) a working image on a display 104. This working image corresponds to a shot taken by the robot 102. However, the working image does not necessarily correspond to the video data transmitted in the data D1. Indeed, in practice the working image generally corresponds to a simplification of the images actually captured by the robot 102.

The display 104 is advantageously a touch screen, to facilitate direct interaction between the working image and the user (detailed below).

A selector 106 is disposed in the mobile terminal 100 to allow a user to make a selection SEL of an image portion on the working image. When the display 104 is a touch screen, the selection SEL can be performed by manually designating a chosen portion on the working image, for example with a stylus. The portion designated on the working image corresponds to partial digital data D4 of the digital data D3 forming the working image.

The microprogram 114 is arranged to receive selection data D5 and to transmit these data to a computer unit 112. These data can be altered before transmission, for example by filters. A distinction is therefore made between the selection data D5 received by the microprogram and the selection data D6 transmitted to the computer unit 112. Nevertheless, D5 and D6 may be identical. The computer unit 112 then generates (UI) an image model corresponding to the image portion designated with the selector on the working image. For this, a histogram generator algorithm can be used, implemented in the computer unit 112. Note that the generation can be performed in direct cooperation with the microprogram 114 or independently.
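The description does not say which histogram is generated. As a hedged illustration only, the sketch below assumes a normalized joint RGB color histogram computed over the pixels of the designated portion; the function name `color_histogram`, the pixel representation (tuples of 0-255 integers) and the choice of 4 bins per channel are assumptions, not details taken from the application.

```python
from typing import Iterable, List, Tuple

Pixel = Tuple[int, int, int]  # (red, green, blue), each in 0..255


def color_histogram(pixels: Iterable[Pixel], bins: int = 4) -> List[float]:
    """Normalized joint RGB histogram with `bins` levels per channel.

    Hypothetical stand-in for the "histogram generator" of the text.
    """
    counts = [0] * (bins ** 3)
    total = 0
    for r, g, b in pixels:
        # Map each 0..255 channel value to a bin index in 0..bins-1.
        ri, gi, bi = (min(c * bins // 256, bins - 1) for c in (r, g, b))
        counts[(ri * bins + gi) * bins + bi] += 1
        total += 1
    if total == 0:
        raise ValueError("the selected image portion contains no pixels")
    # Normalizing makes models comparable across portions of different size.
    return [c / total for c in counts]
```

For example, a portion made of three pure-red pixels and one pure-blue pixel yields a model with weight 0.75 in the red bin and 0.25 in the blue bin.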

The mobile terminal 100 also comprises an input device 108. This can be in particular a physical or virtual keyboard to facilitate an input ORGS. When the input device 108 is a virtual keyboard, it can be arranged together with the display 104; in this case it is usually a keyboard accessible through a touch screen. For design and practical reasons, the display 104 and the input device 108 are then arranged together.

It is particularly advantageous to provide an input device 108 comprising a microphone and a voice recognition program. This facilitates input for the end user, who does not need to write in order to enter a unique intelligible identifier. Children and people with dyslexia in particular will benefit from this embodiment. According to one embodiment, the voice recognition uses an implementation of a dynamic comparison algorithm (Dynamic Time Warping).
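Dynamic Time Warping aligns two sequences that may be spoken at different speeds and returns the cost of the best alignment. The sketch below is the textbook dynamic-programming form applied to one-dimensional feature sequences. How the application extracts features from the voice signal is not stated, so the numeric sequences, the function names `dtw_distance` and `recognize`, and the nearest-template decision rule are assumptions made for illustration.

```python
from typing import Dict, Sequence


def dtw_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Classic DTW cost between two 1-D sequences (absolute difference)."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best alignment cost of a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # a advances alone
                                 cost[i][j - 1],      # b advances alone
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]


def recognize(utterance: Sequence[float],
              templates: Dict[str, Sequence[float]]) -> str:
    """Return the word whose stored template is closest under DTW.

    `templates` is a hypothetical word -> feature-sequence mapping.
    """
    return min(templates, key=lambda w: dtw_distance(utterance, templates[w]))
```

With these definitions, an utterance `[1, 2, 2, 3]` matches a template `[1, 2, 3]` at zero cost, because the repeated value is absorbed by the warping.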

The microprogram 114 is arranged to receive data D7 formed with the input device 108. This data D7 comprises a unique intelligible identifier, generally chosen by the user during the input ORGS. The unique intelligible identifier is in direct correspondence with the selection made by means of the selector 106, namely with the image portion chosen by the user.

The microprogram 114 is arranged to transmit (D8) to the computer unit 112 a unique intelligible identifier defined with the input device 108. The data transmission D8 is performed in order to store the unique intelligible identifier in correspondence with the image model generated by the computer unit 112. The storage can be done in a memory area of the RAM type and can be organized in a relational database.
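As an illustration of such storage, the sketch below keeps the association in an in-memory SQLite database from the Python standard library. The table name `image_models`, the column names, the JSON encoding of the model and the function names are all assumptions; the application only states that a relational database may be used.

```python
import json
import sqlite3
from typing import List, Optional


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open the store; ':memory:' keeps the whole database in RAM."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS image_models ("
        " identifier TEXT PRIMARY KEY,"   # one row per identifier
        " model TEXT NOT NULL)"           # model serialized as JSON
    )
    return conn


def store_model(conn: sqlite3.Connection, identifier: str,
                model: List[float]) -> None:
    """Store (or overwrite) the model associated with an identifier."""
    conn.execute(
        "INSERT OR REPLACE INTO image_models (identifier, model) VALUES (?, ?)",
        (identifier, json.dumps(model)),
    )
    conn.commit()


def load_model(conn: sqlite3.Connection,
               identifier: str) -> Optional[List[float]]:
    """Return the stored model, or None if the identifier is unknown."""
    row = conn.execute(
        "SELECT model FROM image_models WHERE identifier = ?", (identifier,)
    ).fetchone()
    return json.loads(row[0]) if row else None
```

Declaring the identifier as the primary key is one simple way to guarantee that each identifier maps to a single image model.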

Of course, the storage is not necessarily arranged in the computer unit 112. A separate memory can be used, located as desired in the mobile terminal 100, the robot 102 or the computer unit 112.

In addition, Figure 1 shows a computer unit 112 arranged in the communicating mobile terminal 100. This is not necessarily the case.

FIG. 2 shows a functional representation of a communicating mobile terminal TMC 100 of the invention according to another embodiment of the invention. In this embodiment, the unit 112 can be physically disconnected from the mobile terminal 100. The communication tool 110 then provides data transfer, including D6 and D8. The virtual connection of the microprogram 114 with the computer unit 112 is also provided by the communication tool 110.

In this embodiment the computer unit 112 is arranged in a personal computer which can be advantageous in terms of computing power and storage.

FIG. 3 shows a functional representation of a communicating mobile terminal TMC 100 of the invention according to another embodiment in which the computer unit 112 is arranged in the robot 102. In this embodiment, the unit 112 can be physically disconnected from the mobile terminal 100. The communication tool 110 then transfers the data, in particular D6 and D8. The virtual connection of the microprogram 114 with the computer unit 112 is also provided by the communication tool 110.

A robot generally includes a computer processing unit. In particular, this unit provides the motor control functions of said robot. It may be advantageous to have the computer unit 112 in the robot 102 for reasons of design and ease of use for the end user.

Figure 4 is a schematic representation of the cybernetic device 400 of the invention.

According to the embodiment described here, the cybernetic device 400 comprises a robot 102, a communicating mobile terminal 100 and a computer unit 112. The robot 102, the terminal 100 and the computer unit 112 are interconnected by a protocol communication tool 110 of the "WiFi" type or other.

The robot 102 of the device 400 includes a camera 116 for capturing (CAM) video data. The video data generally correspond to a shot taken by said robot 102. The communicating mobile terminal 100 includes a display 104 for the formation of a working image. The working image is formed from the video data captured by the camera 116 disposed on the robot 102, which are transmitted to the terminal 100 via the communication tool 110.

The working image can be processed according to the needs and be subjected to one or more filtering for example.

The cybernetic device 400 includes a selector 106 for designating an image portion on the working image. The selector is preferably arranged on the mobile terminal 100 because this makes it easier for the end user to use. However, this is not necessarily so. Indeed, the selector may in particular be arranged in a personal computer which shows the working image on a screen, for example. For these reasons, the link between the display 104 and the selector 106 is represented by a broken line in the figure.

Advantageously, the component (computer, TMC, or other) forming the selector 106 comprises touch-sensitive means for the selection SEL of the image portion of the working image.

The cybernetic device 400 comprises a computer unit 112 in which an image analyzer 118 is arranged. The analyzer 118 generates an image model according to the image portion designated with the selector 106.

The analyzer 118 may include an implementation of an image processing program of the histogram generator type. The device comprises a memory area 120 for storing the image model generated with the analyzer 118 in the computer unit 112. It can be a memory area of the RAM type. The memory area may be arranged in the computer unit 112, the communicating mobile terminal 100, the robot 102 or separately, for example on a portable hard disk.

The device 400 further comprises an input member 108 for the input of a unique intelligible identifier in correspondence with the image model. Generally, it is the end user who will define this identifier.

Preferably, the input device 108 comprises a microphone and a voice recognition program to facilitate its use. The input member may be arranged in the mobile terminal 100 or separately. In particular, the input member 108 may be arranged in a personal computer.

The memory area 120 is arranged to store, next to the image model generated by the analyzer 118, the unique intelligible identifier defined with the input device 108. The memory area 120 can comprise a relational database to store the correspondence between each unique intelligible identifier entered and its own image model.

According to one embodiment, the cybernetic device 400 comprises a search control input and an environment scanner (not shown). In this embodiment, the search control input can be arranged on the communicating mobile terminal 100 and the scanner can be arranged on the robot 102. Preferably, the scanner is arranged in the camera 116.

Figure 5 shows a functional flowchart of a method implemented with the device of the invention. The cybernetic method of the invention comprises a capturing operation 500 CAPT for capturing video data by a camera 116 disposed on a robot 102. The video data corresponds to a shot by the robot 102.

The next operation 502 (FORM IMG W) comprises the formation of a working image. The working image may be formed by means of a microprogram 114 on a display 104 arranged on the communicating mobile terminal 100. The working image is formed from the video data captured during the previous operation 500.

A subsequent designation operation 504 (DES) comprises designating an image portion on the working image formed in the previous operation 502. The designation can be made with a selector 106. The portion of the working image is generally chosen by the end user and corresponds to an object in the environment of the robot 102. Indeed, the working image corresponds indirectly to a shot taken by the robot 102. The working image thus includes objects in the environment of the robot 102.

The following generation operation 506 (GEN MOD IMG) comprises the generation of an image model from the image portion designated in the designation operation 504. According to an exemplary embodiment, the generation is done with software of the histogram generator type. The image model is therefore directly related to the actual appearance of the object initially designated by the user on the working image.

The following input operation 508 (IDENT UNI INTEL) comprises the input of a unique intelligible identifier corresponding to the image model generated in the operation 506. The identifier is generally chosen by the end user to logically define the image portion previously chosen during the designation operation 504. The input is carried out by means of an input device 108. Advantageously, this input device comprises a microphone and a speech recognition program using a calculation algorithm of the dynamic comparison type ("Dynamic Time Warping").

A storage operation 510 (SVE MEM) then stores the image model generated during the generation operation 506 (GEN MOD IMG) in association with the unique intelligible identifier defined during the input operation 508 (IDENT UNI INTEL). Storage can be performed in the memory 120 in particular.

The cybernetic method makes it possible in particular by means of the communicating mobile terminal 100 of the invention and / or the device 400 of the invention to circumvent the problems encountered with the segmentation algorithms.

Figure 6 shows another embodiment of the method of the invention with additional operations.

In this embodiment, the cybernetic method further comprises a memory access operation 600 (ACC MEM) to access the memory area 120. In the mode described here, access is made by connecting to the apparatus comprising the memory area 120 (TMC, computer or robot). The access allows, in the operation 602 (SEL ID UNI INTEL), the selection with the selector 106 of a unique intelligible identifier previously defined and stored in the memory area 120. The selection 602 can be made by spoken input of the identifier when the input device comprises a microphone and a voice recognition program.

The selection 602 of a unique intelligible identifier is directly related to the association in the memory area 120 between this identifier and a corresponding image template generated in the operation 506.

In the following operation 604 (SCAN COM), a scan command is sent to a robot 102. The robot 102 then moves on to a scanning operation 606 (SCAN MR) to scan the surrounding real world. In the case of a service robot, the environment may be a hospital or a nuclear plant, for example. In the case of an entertainment or domestic robot, the environment may be the interior of a house, a car interior or a garden, for example. The environment of the robot 102, namely the real world, includes objects.

In a subsequent operation 608, the robot 102 randomly selects a real-world object and generates, in a generation operation 610 (GEN OBJ), an object model based on the shape of the selected object. The object model generation operation 610 is performed with software identical to that used for the image model generation operation 506.

A following comparison operation 612 compares the object model generated in the operation 610 with the image model generated in the operation 506, and establishes a correspondence factor according to the identity between the two models. This can be achieved by overlaying the respective histograms. When, during the comparison operation 612, the identity of the two models exceeds a predetermined threshold (for example 90% identity), the robot 102 ends the scanning of the environment and gives the user an affirmative return indicating that it has found the object. When, during the comparison operation 612, the identity of the two models does not exceed the predetermined threshold, the robot 102 resumes the scanning operation 606 (SCAN MR) to scan the real world and select (operation 608) another object on which to perform a comparison (operation 612). The calculation operation 614 makes it possible to define the identity between the object model and the image model. Generally, operation 614 is included in the comparison operation 612.
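One common way to "overlay" two normalized histograms is histogram intersection, which sums the bin-wise minimum and yields 1.0 for identical models and 0.0 for disjoint ones. The application does not name the measure it uses, so the sketch below is an assumed example; the function names `correspondence_factor` and `is_match` are hypothetical, and the 0.9 default simply mirrors the "90% of identity" example in the text.

```python
from typing import Sequence


def correspondence_factor(object_model: Sequence[float],
                          image_model: Sequence[float]) -> float:
    """Histogram intersection of two normalized histograms (0.0 to 1.0)."""
    if len(object_model) != len(image_model):
        raise ValueError("models must have the same number of bins")
    # The overlap of the two histograms is the sum of bin-wise minima.
    return sum(min(p, q) for p, q in zip(object_model, image_model))


def is_match(object_model: Sequence[float],
             image_model: Sequence[float],
             threshold: float = 0.9) -> bool:
    """True when the overlap exceeds the predetermined threshold."""
    return correspondence_factor(object_model, image_model) > threshold
```

For instance, models `[0.5, 0.5]` and `[0.75, 0.25]` overlap by 0.75, which stays below the 90% example threshold, so the robot would continue scanning.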

An embodiment of the invention in practical use by a user is described below. This example is given as an indication to facilitate the reader's understanding and is presented with a non-limiting software implementation. The following example refers to all of Figures 1 to 11.

Thus, the invention makes it possible, in particular, for a user unfamiliar with robotics to teach new words to a robot by using a mediating object: a communicating mobile terminal 100. Advantageously, this communicating mobile terminal can be touch-sensitive to facilitate interaction with the user. Examples of a communicating mobile terminal 100 include an "iPhone" or "iPod Touch" from the company "Apple Inc.".

The communicating mobile terminal 100 comprises a display 104 able to display image data. Thus, the terminal 100 can display a video feed from a camera 116 arranged on a robot 102. The video feed is generally composed of digital video data captured by the camera 116 disposed on the robot 102, and corresponds to a shot taken by this robot 102. Thus, the camera 116 captures the physical environment of the robot 102. The physical environment is generally the real world and includes objects. Examples of real-world objects include chairs, tables, televisions, trees, cars and toys (balls etc.). Of course, the objects encountered depend on the environment of the robot 102.

The image data displayed on the communicating mobile terminal 100 makes it possible to form a working image 502. In this working image, the user can designate 504 an image portion. This image portion generally shows a real-world object. The designation 504 therefore makes it possible to "show" real-world objects to the robot 102.

Figure 7 shows an illustrative diagram of the designation DES 504 of an image portion on a working image including a real-world object, here a ball.

When the mobile terminal 100 comprises a touch screen, the designation DES 504 is generally made by encircling an image portion on the working image, the image portion comprising an object. The encircling can be done by hand or with a stylus suited to the touch screen.

Encircling the object not only allows the user to "show" the object to the robot 102, but also performs a manual segmentation of the image without using segmentation algorithms. This in particular makes the method of the invention more robust.
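To illustrate how a traced outline can act as a manual segmentation, the sketch below treats the stroke as a closed polygon and keeps the pixels whose centers fall inside it, using the standard ray-casting (even-odd) test. This is an assumed illustration rather than the method of the application: the function names, the representation of the stroke as a list of (x, y) points, and the pixel-center convention are all hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def point_in_polygon(x: float, y: float, polygon: Sequence[Point]) -> bool:
    """Even-odd ray-casting test against a closed polygon."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]  # wrap around to close the outline
        # Does a horizontal ray cast from (x, y) cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def select_pixels(width: int, height: int,
                  stroke: Sequence[Point]) -> List[Tuple[int, int]]:
    """Return the pixels whose centers lie inside the traced outline."""
    return [
        (px, py)
        for py in range(height)
        for px in range(width)
        if point_in_polygon(px + 0.5, py + 0.5, stroke)
    ]
```

On a 6 x 6 image, a square stroke with corners (1, 1) and (4, 4) selects the nine pixels from (1, 1) to (3, 3); only those pixels would then feed the image model.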

In the embodiment described here, the mobile terminal 100 includes an implementation of an algorithm called "Navidget". The algorithm is described in the scientific publication "Navidget for 3D Interaction: Camera Positioning and Further Uses. Martin Hachet, Fabrice Decle, Sebastian Knödel, Pascal Guitton. Int. J. Human-Computer Studies - 2008", and allows a sensitive designation.

According to the invention, an image template is then generated 506 from the designated image portion (at step 504). The image model generated is directly dependent on the image portion, as well as the shape elements constituting it.

Roughly, the image model corresponds to a real-world object. The generation 506 of the image model may in particular be carried out using software comprising a histogram generator type algorithm. The software may be held in a memory area located in a computer unit 112 that is either physically distinct from or included in the communicating mobile terminal 100. When the computer unit 112 is physically distinct from the communicating mobile terminal 100, the computer unit can be housed in a personal computer or in the robot 102.

According to an optional embodiment, the mobile terminal 100 is arranged to display a contextual menu MENU on the screen after the designation 504. This contextual menu may in particular offer the user different choices of interaction depending on the encircled object. Figure 8 shows an illustrative diagram of a contextual menu MENU offering different control options.

In the embodiment described, the contextual menu MENU offers four options to the user:

- "Name",

- "Zoom",

- "Approach" and

- "What is that?".

The first option, "Name", is for a return by the robot 102 in association with the selected object. This option is detailed below.

The second option, "Zoom", allows direct interaction with the camera disposed on the robot 102. This option mechanically actuates the lens of the camera 116 to give a close-up view of the selected object. The interaction with the camera 116 may of course be of any other nature and may, for example, include a function to photograph the object, alter the color of the view, adjust the accuracy, etc.

The contextual menu MENU may also include other commands, such as functions for moving the robot 102. Thus, an "Approach" command may cause the robot 102 to advance mechanically towards the selected object. Other movement commands may consist of going around the selected object or photographing the selected object from another angle.

Figure 9 shows an illustrative diagram of an ENTR entry of a unique intelligible identifier 508. When the "What is it? Is selected, according to the invention the user can enter ENTR 508 on an input member 108 a unique intelligible identifier corresponding to the image model generated by the computer unit 112. For this, the mobile terminal 100 may include a keyboard physical or virtual. By virtual keyboard is meant here a keyboard shown and accessible by a touch screen. On a virtual keyboard, input ENTR 508 can be done by hand or with a pen adapted for touch screen.

According to another embodiment, the communicating mobile terminal 100 is arranged to communicate with a handwriting recognition program, as shown in Figure 9. Thus, a user can write directly, freehand or with a stylus, on the touch screen of the terminal 100.

According to a preferred embodiment, the communicating mobile terminal 100 comprises a microphone in association with a voice recognition program. This makes the entry 508 easier for the user. The voice recognition program may include a dynamic comparison algorithm (Dynamic Time Warping).
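As an illustration of the dynamic comparison mentioned above, the following minimal sketch computes a Dynamic Time Warping distance between two sequences and picks the stored identifier whose reference sequence is closest. It rests on simplifying assumptions: utterances are represented as one-dimensional sequences of numbers (a real recognizer would typically compare sequences of feature vectors), and the names `dtw_distance` and `closest_identifier` are hypothetical.

```python
from typing import Dict, Sequence


def dtw_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Classic O(len(a) * len(b)) Dynamic Time Warping distance."""
    if not a or not b:
        raise ValueError("both sequences must be non-empty")
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] is the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # a advances alone
                                     cost[i][j - 1],      # b advances alone
                                     cost[i - 1][j - 1])  # both advance
    return cost[n][m]


def closest_identifier(query: Sequence[float],
                       references: Dict[str, Sequence[float]]) -> str:
    """Return the identifier whose reference sequence is nearest to the query."""
    return min(references, key=lambda name: dtw_distance(query, references[name]))
```

Because the alignment may stretch or compress either sequence, the same word spoken more slowly or more quickly still yields a small distance to its reference.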

It is thus possible to associate the image model with a unique intelligible identifier. A unique intelligible identifier is understood here to mean an identifier that is associated with only one specific object, namely a single generated image model. The unique intelligible identifier may be a sound, a word written on the screen or a word entered on a physical or virtual keyboard. In the exemplary embodiment described here, this identifier is the word "ball". The invention comprises a dedicated memory area 120 for storing 510 the association between the image model and the unique intelligible identifier. This memory area 120 can either be included directly in the mobile terminal 100, or be arranged in a personal computer or in the robot 102. The association can be stored in a relational database.
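Purely as a sketch of how such an association could be kept in a relational database, the example below uses Python's standard `sqlite3` module. The table name `association`, its columns, the JSON encoding of the image model and the helper names are assumptions made for the example; the source does not specify a schema.

```python
import json
import sqlite3
from typing import List, Optional


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the hypothetical dedicated memory area."""
    conn = sqlite3.connect(path)
    # PRIMARY KEY enforces that an identifier maps to a single image model.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS association ("
        " identifier TEXT PRIMARY KEY,"
        " image_model TEXT NOT NULL)"
    )
    return conn


def store_association(conn: sqlite3.Connection, identifier: str,
                      image_model: List[float]) -> None:
    """Store the image model under its unique intelligible identifier."""
    conn.execute(
        "INSERT INTO association (identifier, image_model) VALUES (?, ?)",
        (identifier, json.dumps(image_model)),
    )
    conn.commit()


def lookup(conn: sqlite3.Connection, identifier: str) -> Optional[List[float]]:
    """Return the stored image model, or None when nothing was stored."""
    row = conn.execute(
        "SELECT image_model FROM association WHERE identifier = ?",
        (identifier,),
    ).fetchone()
    return None if row is None else json.loads(row[0])
```

In this sketch, attempting to store a second image model under an identifier that already exists raises `sqlite3.IntegrityError`, which reflects the uniqueness of the identifier, and a lookup of an unknown identifier returns `None`, which corresponds to the empty return described for objects that have not yet been taught.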

After the association between the image model and the unique intelligible identifier has been stored in the dedicated memory area 120, the "Name" option mentioned above becomes available and returns a positive result. Indeed, for recognition of the selected object to be possible, the image model corresponding to the object must have been stored beforehand.

When this is not the case (no storage has been performed), no recognition is possible, or at best an erroneous one. The return would then be an empty set or an erroneous set.

When an image model corresponding to the selected object is available in the memory area 120, a positive return of the unique intelligible identifier is possible.

According to an advanced embodiment, the robot 102 may have an object search command 604. Thus, when a search entry SRCH 602 is made on the communicating mobile terminal 100 with a unique intelligible identifier previously stored in the dedicated memory area 120, the robot 102 can search for image models corresponding to that identifier. Figure 10 shows the search entry SRCH 602 made by handwriting. The user accesses 600 the dedicated memory to select 602 the unique intelligible identifier. When the search entry SRCH 602 is made by means of handwriting or voice recognition, several unique intelligible identifiers may correspond to the entry. This depends directly on the sensitivity of the recognition means (handwriting or voice). It is thus possible to obtain a multiple return of image models that may correspond to the identifier entered.

According to a preferred embodiment, the search entry SRCH 602 is performed by voice recognition to make the entry easier for the user.

Figure 11 shows the case of a multiple return. The display of the communicating mobile terminal 100 shows graphic choices, each corresponding to an object and thus to an image model. The user can then select (CHO) the object, and hence the image model, to be searched for. According to one embodiment, the candidate objects can be ranked in order of relevance on the display, for example by means of a dynamic comparison algorithm (Dynamic Time Warping).

Once the selection CHO has been made, a scan command 604 is transmitted to the robot 102. The robot then scans 606 its environment, which includes real-world objects, by means of a scanner. The robot 102 is arranged to select 608 an object from the environment and to establish 610 an object model of said object. The object model is then compared 612 with the image model stored in the dedicated memory to establish a correspondence factor 614 according to the identity between the two models. Identity can be established by means of histogram superposition algorithms. If the correspondence factor reaches a threshold value indicating that the object model corresponds to the image model, then the robot 102 has found the sought object.
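The comparison and threshold test described above can be sketched as follows. The sketch assumes that both models are normalized histograms of equal length and uses histogram intersection as one possible superposition measure; the source does not name a specific measure. The threshold of 0.8 and the names `histogram_intersection` and `find_object` are likewise assumptions.

```python
from typing import Iterable, Optional, Sequence, Tuple


def histogram_intersection(h1: Sequence[float], h2: Sequence[float]) -> float:
    """Correspondence factor in [0, 1] for two normalized histograms."""
    if len(h1) != len(h2):
        raise ValueError("histograms must have the same number of bins")
    return sum(min(x, y) for x, y in zip(h1, h2))


def find_object(image_model: Sequence[float],
                scanned: Iterable[Tuple[str, Sequence[float]]],
                threshold: float = 0.8) -> Optional[Tuple[str, float]]:
    """Examine scanned objects in turn until one reaches the threshold.

    `scanned` yields (label, object_model) pairs; the label is a hypothetical
    handle on the scanned object, for example its position in the scene.
    Returns None when no scanned object matches.
    """
    for label, object_model in scanned:
        factor = histogram_intersection(image_model, object_model)
        if factor >= threshold:
            return label, factor
    return None
```

Accepting `scanned` as an iterable lets the robot stop scanning as soon as a match is found, which mirrors the repetition of the scan, select and compare steps until the threshold is reached.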

Conversely, the device of the invention can start directly from an image model captured in the environment by the camera of the robot 102 and offer a single choice or multiple choices of unique intelligible identifiers.

The communicating mobile terminal according to the invention allows a user to teach new words to a robot interactively, without requiring any prerequisites or specific training in robotics and/or computing.

A member of the general public can thus give a robot good-quality lessons, associating visual representations of objects (image models) with representations of referents (unique intelligible identifiers). The invention is particularly suitable for use in a social robotics setting and therefore for service or recreational use in a home environment.

The invention provides a practical and uncomplicated interaction that makes possible an intuitive and simple form of learning that was not possible until now. In particular, the invention avoids segmentation problems through the designation, according to the invention, of a real-world object, in particular by tactile means. In addition, the invention makes it possible to present visually to the end user, on the communicating mobile terminal, hypotheses corresponding to unique intelligible identifiers previously entered on the communicating mobile terminal, thereby avoiding errors and refining the artificial intelligence of a robot.

Claims

1. Communicating mobile terminal (100) arranged to communicate with a robot (102) by a protocol communication means, characterized in that the communicating mobile terminal (100) comprises:

a display (104) capable of displaying image data,

a selector (106) for designating an image portion,

an input member (108),

a communication tool (110) arranged to cooperate with a robot (102) and a computer unit (112), and

a microprogram (114) arranged for:

a. producing on the display (104) a working image from video data received from the robot (102) and corresponding to a shot taken by that robot,

b. receiving selection data for an image portion designated with the selector (106),

c. transmitting to the computer unit (112) the selection data for generating an image model corresponding to the designated image portion, and

d. transmitting to the computer unit (112) a unique intelligible identifier defined with the input member (108) for storage in correspondence with the image model.

2. Communicating mobile terminal (100) according to claim 1, further comprising a touch screen.

3. Communicating mobile terminal (100) according to one of the preceding claims, wherein the input member (108) comprises a physical or virtual keyboard.

4. Communicating mobile terminal (100) according to one of the preceding claims, wherein the input member (108) comprises a microphone in combination with a voice recognition program.

5. Communicating mobile terminal (100) according to one of the preceding claims, further comprising a search command input for controlling the robot (102).

6. Cybernetic device (400) comprising a robot (102), a communicating mobile terminal (100) and a computer unit (112), connected together by a protocol communication tool (110), characterized in that the device comprises:

a camera (116) disposed on the robot (102) for capturing video data,

a display (104) on the communicating mobile terminal (100) for forming a working image from the video data captured by the camera (116),

a selector (106) for designating an image portion on the working image,

an image analyzer (118) in the computer unit (112) for generating an image model from the image portion designated with the selector,

an input member (108) for entering a unique intelligible identifier in correspondence with the image model, and

a dedicated memory area (120) for storing the image model and its unique intelligible identifier.

7. Device (400) according to claim 6, wherein the communicating mobile terminal (100) comprises a touch screen.

8. Device (400) according to one of claims 6 to 7, wherein the selector (106) is arranged on the communicating mobile terminal (100).

9. Device according to one of claims 6 to 8, wherein the input member (108) is arranged on the communicating mobile terminal (100).

10. Device (400) according to one of claims 6 to 9, wherein the input member (108) comprises a physical or virtual keyboard.

11. Device (400) according to one of claims 6 to 10, wherein the input member (108) comprises a microphone in combination with a voice recognition program.

12. Device according to one of claims 6 to 11, wherein the robot (102) comprises a means of locomotion.

13. Device (400) according to one of claims 6 to 12, wherein the communicating mobile terminal (100) comprises a command for the means of locomotion of the robot.

14. Device (400) according to one of claims 6 to 13, further comprising:

a search command input, and

an environment scanner.

15. Device (400) according to claim 14, wherein the search command input is arranged on the communicating mobile terminal (100).

16. Device according to one of claims 14 or 15, wherein the scanner is arranged on the robot (102).

17. Device (400) according to one of claims 14 to 16, wherein the scanner is arranged in the camera (116).

18. Cybernetic process comprising the following steps:

a. capturing (500) video data by a camera disposed on a robot, said data corresponding to a shot taken by this robot,

b. forming (502) a working image on a communicating mobile terminal from the video data captured in step a.,

c. designating (504) an image portion on the working image formed in step b.,

d. generating (506) an image model from the image portion designated in step c.,

e. entering (508) a unique intelligible identifier corresponding to the image model generated in step d., and

f. storing (510) the image model and its unique intelligible identifier in a dedicated memory.

19. Method according to claim 18, further comprising the following steps:

g. accessing (600) the dedicated memory to select (602) the unique intelligible identifier,

h. sending (604), following the selection made in step g., a scan command to a robot,

i. scanning (606) the environment of the robot, the environment including objects,

j. selecting (608) an object from the environment,

k. establishing (610) an object model of the object selected in step j.,

l. comparing (612) the object model of step k. with the image model generated in step d., and establishing a correspondence factor according to the identity between the two models,

m. repeating steps i. to l. until a threshold value of the correspondence factor is reached, indicating that the object model corresponds to the image model.

20. Method according to one of claims 18 to 19, wherein the designation in step c. is performed on the communicating mobile terminal.

21. Method according to one of claims 18 to 20, wherein the designation in step c. is performed by touch.

22. Method according to one of claims 18 to 21, wherein the entry in step e. is performed on the communicating mobile terminal.

23. Method according to one of claims 18 to 22, wherein the entry in step e. is performed by voice recognition of a user's voice.

24. Method according to one of claims 18 to 23, wherein the entry in step e. is performed on a physical or virtual keyboard.