GB2533789A - User interface for augmented reality

User interface for augmented reality

Info

Publication number
GB2533789A
Authority
GB
United Kingdom
Prior art keywords
hand
virtual image
image
user
head mounted
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
GB1423328.2A
Other versions
GB201423328D0 (en)
Inventor
Fan Lixin
Roimela Kimmo
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nokia Technologies Oy
Original Assignee
Nokia Technologies Oy
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nokia Technologies Oy filed Critical Nokia Technologies Oy
Priority to GB1423328.2A priority Critical patent/GB2533789A/en
Publication of GB201423328D0 publication Critical patent/GB201423328D0/en
Publication of GB2533789A publication Critical patent/GB2533789A/en

Classifications

    • GPHYSICS
    • G02OPTICS
    • G02BOPTICAL ELEMENTS, SYSTEMS OR APPARATUS
    • G02B27/00Optical systems or apparatus not provided for by any of the groups G02B1/00 - G02B26/00, G02B30/00
    • G02B27/01Head-up displays
    • G02B27/017Head mounted
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G06F3/012Head tracking input arrangements
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/017Gesture based interaction, e.g. based on a set of recognized hand gestures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/03Arrangements for converting the position or the displacement of a member into a coded form
    • G06F3/041Digitisers, e.g. for touch screens or touch pads, characterised by the transducing means
    • G06F3/042Digitisers, e.g. for touch screens or touch pads, characterised by the transducing means by opto-electronic means
    • G06F3/0425Digitisers, e.g. for touch screens or touch pads, characterised by the transducing means by opto-electronic means using a single imaging device like a video camera for tracking the absolute position of a single or a plurality of objects with respect to an imaged reference surface, e.g. video camera imaging a display or a projection screen, a table or a wall surface, on which a computer generated image is displayed or projected
    • GPHYSICS
    • G02OPTICS
    • G02BOPTICAL ELEMENTS, SYSTEMS OR APPARATUS
    • G02B27/00Optical systems or apparatus not provided for by any of the groups G02B1/00 - G02B26/00, G02B30/00
    • G02B27/01Head-up displays
    • G02B27/0101Head-up displays characterised by optical features
    • G02B2027/0138Head-up displays characterised by optical features comprising image capture systems, e.g. camera
    • GPHYSICS
    • G02OPTICS
    • G02BOPTICAL ELEMENTS, SYSTEMS OR APPARATUS
    • G02B27/00Optical systems or apparatus not provided for by any of the groups G02B1/00 - G02B26/00, G02B30/00
    • G02B27/01Head-up displays
    • G02B27/0179Display position adjusting means not related to the information to be displayed
    • G02B2027/0187Display position adjusting means not related to the information to be displayed slaved to motion of at least a part of the body of the user, e.g. head, eye

Landscapes

  • Engineering & Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Optics & Photonics (AREA)
  • Multimedia (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

An image IMGk of a first hand H1 is captured and a virtual image VMG1 is displayed using a display DISP1 of a head mounted device 500 such that the virtual image overlaps the first hand when viewed by a user U1 of the head mounted device. User input is obtained by detecting movement of a hand. The virtual image may be moved or modified based on the detected movement. The virtual image may be moved from the first hand to be associated with a real object. First and second virtual images may be displayed at first and second positions on the first hand, the positions being associated with first and second options. The position of a second hand with respect to the first hand is detected and one of the options is selected based on the detected position of the second hand. The head mounted device may be used for obtaining user input for an augmented reality application.

Description

USER INTERFACE FOR AUGMENTED REALITY
FIELD
Some versions may relate to providing user input by using a head mounted device, which comprises a display and a camera.
BACKGROUND
A head mounted device may comprise a display and a camera. The device may display virtual images to a user of the device, and the device may receive user input from the user by gesture recognition. The head mounted device may be used e.g. in augmented reality applications.
SUMMARY
Some versions may relate to a method for providing user input. Some versions may relate to a portable device. Some versions may relate to a computer program for receiving user input. Some versions may relate to a computer program product for receiving user input. Some versions may relate to a method for displaying information on a display of a portable device. Some versions may relate to a user interface.
According to an aspect, there is provided a method comprising: - capturing an image of a first hand, - displaying a virtual image by using a display of a head mounted device such that the virtual image overlaps the first hand when viewed by a user of the head mounted device, and - obtaining user input by detecting a movement of a hand.
According to an aspect, there is provided an apparatus, comprising: - a head mounted device having a display, which is configured to display a virtual image such that the virtual image overlaps a hand of a user when viewed by the user of the head mounted device, - a camera to capture an image of the hand of the user, and - a control unit configured to detect a movement of a hand by analyzing an image captured by the camera.
According to an aspect, there is provided a computer program comprising computer program code configured to, when executed on at least one processor, cause an apparatus or a system to: - obtain an image of a first hand, - display a virtual image by using a display of a head mounted device such that the virtual image overlaps the first hand when viewed by a user of the head mounted device, and - obtain user input by detecting a movement of a hand.
According to an aspect, there is provided means for controlling operation of an apparatus, comprising: - means for obtaining an image of a first hand, - means for displaying a virtual image by using a display of a head mounted device such that the virtual image overlaps the first hand when viewed by a user of the head mounted device, and - means for obtaining user input by detecting a movement of a hand.
The head mounted device may be used for obtaining user input for an augmented reality application. A communication system or apparatus may comprise the head mounted device. The apparatus may be configured to detect the presence of a hand, and to determine the position of the hand with respect to the head mounted device. Obtaining user input from the user may involve tracking or recognizing interactions between displayed content and the gestures of the user. However, touching or manipulating a virtual object in an empty space may be unnatural and troublesome for an untrained user. Displaying the virtual object on a hand of the user may facilitate touching and/or manipulating the virtual object. Said associating may provide an intuitive and easy user interface for a user to interact with digital augmented content.
The head mounted device may provide a see-through glass user interface, which utilizes palm/hand recognition. The user may use his hand to define a hand region for manipulating and displaying augmented reality contents. A virtual object may be displayed on the hand region such that the user may manipulate the virtual object almost like a real object.
The environment of a user of the device may comprise one or more real objects. A communication system or apparatus may comprise the head mounted device. The head mounted device may be referred to e.g. as "augmented reality glasses" or as a "see-through glass type device". The head mounted device may also be referred to as wearable glasses. When the user explores the environment with the see-through glass type of device, the apparatus may be configured to search information related to an object, to associate information to an object, and/or to display the information to the user. The head mounted device may communicate with one or more other devices e.g. by wireless communication. For example, the head mounted device may retrieve data from a cloud service based on the user input.
The apparatus may retrieve digital contents from a database, and display the digital contents to the user. The apparatus may define and associate digital contents with real world objects. The apparatus may modify and/or use the digital contents for different rendering purposes.
The hand region may define a working area on which augmented reality contents may be inputted e.g. by drawing with the second hand, by typing, and/or by voice recognition.
The user may use one or more fingers of his second hand to draw a picture or to write text on the hand region. The picture and/or text may be subsequently associated with a real object of the environment.
Augmented reality contents may be associated with a real object by gesture recognition. For example, turning of the hand and subsequent pointing to a real object may be interpreted as gesture input, which may cause associating a virtual image with the real object. The real object may be e.g. a building. Providing user input by using said gestures may allow a simple and intuitive way to associate augmented reality contents with the real object.
Obtaining user input from the user may involve tracking or recognizing interactions between displayed content and the gestures of the user. These operations may also be performed without detecting the presence of five fingertips. These operations may also be performed without determining the three-dimensional position of the camera with respect to the hand.
Rendering of augmented contents in a brightly illuminated environment may sometimes be difficult, because the displayed virtual image may be very dim. Displaying the virtual image on the hand region may improve visibility of the virtual image. The hand region may block high background illumination, and may improve visibility of a virtual image displayed on the hand region. This may facilitate working with augmented reality contents.
BRIEF DESCRIPTION OF THE DRAWINGS
In the following examples, several variations will be described in more detail with reference to the appended drawings, in which
Fig. 1 shows, by way of example, in a three dimensional view, displaying a virtual image by using a head mounted device,
Fig. 2 shows, by way of example, in a three dimensional view, providing user input,
Fig. 3a shows, by way of example, an image of a hand,
Fig. 3b shows, by way of example, determining the position of a hand,
Fig. 3c shows, by way of example, controlling displaying of the virtual image,
Fig. 3d shows, by way of example, an image of a first hand and an image of a second hand,
Fig. 3e shows, by way of example, determining the position of a first hand and determining the position of a second hand,
Fig. 3f shows, by way of example, controlling displaying of the virtual image,
Fig. 4 shows, by way of example, in a three dimensional view, the head mounted device, the hands of a user, and an external object,
Fig. 5 shows, by way of example, in a three dimensional view, associating a virtual image with the external object,
Fig. 6 shows, by way of example, drawing graphics by tracking movements of a hand,
Fig. 7a shows, by way of example, a gesture for associating a virtual image with an external object,
Fig. 7b shows, by way of example, selecting an option associated with a virtual image,
Fig. 8 shows, by way of example, units of an apparatus comprising a head mounted device,
Fig. 9 shows, by way of example, method steps for obtaining user input, and
Fig. 10 shows, by way of example, a communication system, which comprises a head mounted device.
DETAILED DESCRIPTION
Referring to Fig. 1, a user U1 may wear a head mounted device 500. The head mounted device 500 may comprise a virtual display DISP1 for displaying one or more virtual images VMG1 to the user U1. The device 500 may comprise a camera CAM1 for monitoring the environment ENV1 of the device 500.
The virtual display DISP1 may be at least partly transparent such that the user U1 may view external objects through the virtual display DISP1. The virtual display DISP1 may comprise e.g. a first optical engine 10a and an exit pupil extender (EPE) 20a for expanding the exit pupil of the optical engine 10a. The exit pupil extender 20a may be at least partly transparent such that the user U1 may view external objects through the exit pupil extender 20a. The exit pupil extender may also be referred to as a beam expander.
The virtual display DISP1 may generate one or more light beams B1, which may impinge on an eye E1 of a user U1 in order to form an optical image on the retina of the eye E1 of the user U1. The light beams B1 may form an optical image on the retina. The virtual display DISP1 may be arranged to display one or more virtual images VMG1 to the user U1.
The device 500 may optionally comprise a second optical engine 10b and/or a second exit pupil extender 20b for displaying virtual images to the other eye E2 of the user U1. The optical engines 10a, 10b may be arranged to display the same virtual image for both eyes E1, E2. Alternatively, the optical engine 10a may be arranged to display a first virtual image to the eye E1, and the optical engine 10b may be arranged to generate a second different virtual image to the second eye E2, in order to display a stereo image. The second optical engine 10b and the second exit pupil extender 20b may provide a light beam B2, which may impinge on the eye E2 of the user U1.
The device 500 may comprise a camera CAM1, which may be arranged to capture an image of external objects. The captured images may be referred to as image frames or digital images. The device 500 may comprise a camera CAM1, which may be arranged to capture an image frame of one or more real objects. The camera CAM1 may be arranged to capture an image frame of objects, which are in the field of view of the user of the head mounted device 500.
The device 500 may be arranged to determine the position of an external object by analyzing one or more image frames captured by the camera CAM1. Image frames captured by the camera CAM1 may be analyzed e.g. locally in a data processing unit CNT1 mounted to the head mounted device 500.
The device 500 may optionally comprise e.g. one or more earpieces 91a, 91b to facilitate wearing the device 500. The device 500 may also be attached e.g. to a headgear or to a helmet.
The display DISP1 of the device 500 may also cover only one eye E1 of the user U1, and the user may view external objects by his second eye E2. In this case, the display DISP1 does not need to be transparent.
SX, SY and SZ denote orthogonal directions.
Referring to Fig. 2, the user U1 may see his first hand H1 through the virtual display DISP1 of the head mounted device 500. The device 500 may be arranged to detect the presence of the first hand H1 e.g. by image recognition and/or by color recognition. The device 500 may be arranged to determine the position of the first hand H1 e.g. by analyzing an image frame IMGk captured by the camera CAM1. The device 500 may be arranged to display a virtual image VMG1 to the user U1 such that the virtual image VMG1 may overlap the region of the first hand H1, when viewed by the user U1.
The user U1 may use his second hand H2 for making a gesture. The user U1 may move his second hand H2 in the vicinity of the first hand H1. The user U1 may move his second hand H2 with respect to the first hand H1. The user U1 may touch the first hand H1 with the second hand H2. The user U1 may touch the first hand H1 with one or more fingers of the second hand H2. The device 500 may be arranged to obtain user input by detecting a movement of the second hand H2.
The movement may be detected e.g. by analyzing a video sequence SEQ1 captured by the camera CAM1. The video sequence SEQ1 may comprise a plurality of image frames IMGk-1, IMGk, IMGk+1, .... The movement may be detected by analyzing image frames IMGk-1, IMGk, ... of the video sequence SEQ1.
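As an illustration of this kind of movement detection, the following minimal sketch compares two consecutive greyscale frames by pixel differencing. It is an illustrative sketch only; the frame size and the two thresholds are arbitrary assumptions, not values taken from the device 500.

```python
import numpy as np

def movement_detected(frame_prev, frame_curr,
                      pixel_threshold=25, count_threshold=500):
    """Return True if enough pixels changed between two greyscale frames.

    frame_prev, frame_curr: 2-D numpy arrays (greyscale image frames
    IMGk-1 and IMGk captured by the camera CAM1).
    """
    diff = np.abs(frame_curr.astype(np.int16) - frame_prev.astype(np.int16))
    changed = np.count_nonzero(diff > pixel_threshold)
    return changed > count_threshold

# Example with synthetic frames: a small bright patch "moves" by 20 pixels.
prev_frame = np.zeros((480, 640), dtype=np.uint8)
curr_frame = np.zeros((480, 640), dtype=np.uint8)
prev_frame[200:240, 300:340] = 255
curr_frame[200:240, 320:360] = 255
print(movement_detected(prev_frame, curr_frame))  # True
```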
The first hand H1 may be e.g. the left hand or the right hand of the user U1. The second hand H2 may be the other hand of the user, respectively.
The displayed image VMG1 may coincide with the region of the first hand H1 such that the user U1 may view the movements of the second hand H2 with respect to the image VMG1. The user U1 may view the movements of the second hand H2 with respect to the virtual image VMG1 displayed on the first hand H1. The person U1 using the device 500 is typically accustomed to moving his hand with respect to the other hand. Consequently, the user U1 may also be capable of moving his second hand H2 accurately with respect to the virtual image VMG1 displayed on his first hand H1.
The user U1 may also use his first hand H1 for making a gesture. The device 500 may obtain user input by detecting a movement of a hand H1, H2 of the user U1.
The virtual image VMG1 may be displayed to the user U1 such that the virtual image VMG1 is not projected on the hand H1. Consequently, other persons cannot see the displayed virtual image VMG1 without using a display device.
The method may comprise moving a virtual image VMG1 based on the detected movement. The method may comprise detecting a change of position of a finger with respect to the first hand H1, and moving the virtual image VIMG1 according to the detected change of position. For example, a virtual map VIMG1 may be displayed on the first hand H1. The user U1 may subsequently manipulate the map with the second hand H2. For example, the user U1 may scroll the map by sliding a finger of the second hand H2 on the palm region of the first hand H1.
The method may comprise modifying a virtual image VIMG1 based on the detected movement. The method may comprise detecting a change of distance between a first finger and a second finger, and changing the size of the virtual image according to the detected change of distance. For example, the user U1 may zoom a digital map by making a pinching movement with fingers of the second hand in the vicinity of the hand region. For example, the user may select a position on the map VIMG1 by touching the hand region H1 with a finger of the second hand H2.
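The scrolling and zooming interactions described above can be sketched as simple computations on tracked fingertip positions. In the sketch below the fingertip coordinates are assumed to be given relative to the first-hand region, and the function names are illustrative assumptions only.

```python
import math

def scroll_offset(finger_prev, finger_curr):
    """Scroll amount from the change of a fingertip position (u, v),
    measured relative to the first-hand region."""
    du = finger_curr[0] - finger_prev[0]
    dv = finger_curr[1] - finger_prev[1]
    return (du, dv)

def zoom_factor(f1_prev, f2_prev, f1_curr, f2_curr):
    """Zoom factor from the change of distance between two fingertips."""
    d_prev = math.dist(f1_prev, f2_prev)
    d_curr = math.dist(f1_curr, f2_curr)
    return d_curr / d_prev if d_prev > 0 else 1.0

# Pinching the fingers apart from 40 to 60 pixels zooms the map by 1.5x.
print(zoom_factor((100, 100), (140, 100), (90, 100), (150, 100)))  # 1.5
```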
User input may be provided e.g. by changing the distance between a first hand H1 and a second hand H2. User input may be provided e.g. by moving the second hand H2 to a predetermined position with respect to the first hand H1. A gesture may comprise moving the second hand H2 to a predetermined position with respect to the first hand H1. The gesture may be detected by analyzing an image or a video sequence captured by a camera CAM1 of the device 500. Based on more complex hand gesture recognition, e.g. an interaction between the two hands H1, H2, augmented reality contents may be manipulated according to the detected gesture.
The device 500 may be arranged to obtain user input by: - capturing an image IMGk of a first hand H1, - displaying a virtual image VIMG1 by using a display DISP1 of the head mounted device 500 such that the virtual image VIMG1 overlaps the first hand H1 when viewed by a user U1 of the head mounted device 500, and - obtaining user input by detecting a movement of a hand H1, H2.
The user input may be obtained by detecting a movement of the second hand H2 with respect to the first hand H1. The user input may be obtained by detecting a movement of the first hand H1, e.g. by comparing the position of the first hand H1 with a previous position of the first hand H1.
The user U1 may sometimes carry a wristwatch, a ring, or a bracelet. The wristwatch, the ring, or the bracelet may be used as a reference object REF1. The user U1 may use relative motion between the reference object REF1 and a finger of his second hand H2 to control more complex interactions, e.g. to perform a scrolling operation or to perform a zooming operation.
The user may even wear a coded item such as a coded ring or a coded bracelet. The coded item may comprise one or more marks so that the orientation of the item may be detected by analyzing an image captured by the camera of the device. User input may be provided by changing the orientation of the coded item.
For example, the user may view several pages of information by rotating the coded item. The device may display a first virtual page to the user when the coded item has a first orientation, and the device may display a second page when the coded item has a second orientation.
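A minimal sketch of this page-selection idea is given below; it assumes that the orientation angle of the coded item has already been estimated from the camera image, and the number of pages is an arbitrary assumption.

```python
def page_for_orientation(angle_deg, n_pages=4):
    """Select a virtual page index from the detected orientation of the
    coded item. The angle is assumed to be in degrees, 0..360."""
    sector = 360.0 / n_pages
    return int((angle_deg % 360.0) // sector)

print(page_for_orientation(10.0))    # page 0 (first orientation)
print(page_for_orientation(100.0))   # page 1 (second orientation)
```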
Referring to Fig. 3a, the camera CAM1 may capture an image frame IMGk. The image frame may be captured e.g. at a time tk. The image frame IMGk may comprise an image SUB1 of the first hand H1 of the user U1. The image frame IMGk may be referred to e.g. as a digital image, and the image SUB1 of the hand H1 may be referred to e.g. as a sub-image of the digital image. SU and SV may denote orthogonal directions of an image coordinate system. umax may denote the width of the digital image IMGk, and vmax may denote the height of the digital image IMGk.
Referring to Fig. 3b, the device 500 may be configured to determine the position of the hand H1. The device 500 may be configured to measure the position of the hand H1 by image analysis. Determining the position may optionally comprise detecting the presence of a human hand H1. This may be useful e.g. when several objects are simultaneously in view of the camera CAM1. The device 500 may be configured to detect the presence of a human hand H1 e.g. by using shape recognition. The device 500 may be configured to compare the shape of the sub-image SUB1 with one or more reference shapes in order to determine whether the sub-image SUB1 matches with a human hand or not. The device 500 may be configured to compare the size of the sub-image SUB1 with a predetermined size range in order to determine whether the sub-image SUB1 matches with a human hand or not. If the size of the sub-image SUB1 is not in the predetermined size range, the device 500 may determine that the sub-image SUB1 is not a human hand. The device 500 may be configured to detect the presence of a human hand H1 e.g. by using shape recognition, by using color recognition and/or based on analysis of movements. The device 500 may be configured to compare the color of the sub-image SUB1 with a predetermined color range in order to determine whether the sub-image SUB1 matches with a human hand or not. If the color of the sub-image SUB1 is not in the predetermined color range, the device 500 may determine that the sub-image SUB1 is not a human hand. The method may comprise a step for determining the color range. For example, the device may be configured to measure the color of the skin of the user. For example, the device may be configured to capture a reference image of the hand of the user in order to measure the color of the skin of the user. The hand of the human user U1 may typically make very small random movements, even when the user tries to hold his hand stationary. The device may be arranged to monitor the movements of the sub-image SUB1 in order to determine whether the sub-image SUB1 matches with a human hand or not. The presence of small random movements of the sub-image SUB1 may be interpreted to increase the probability that the sub-image SUB1 matches with a human hand.
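The colour-range and size-range tests described above may be sketched, for example, with OpenCV as follows. The HSV skin range and the area limits are illustrative assumptions; as noted above, a practical device would calibrate the colour range from a reference image of the user's hand.

```python
import cv2
import numpy as np

# Assumed skin-colour range in HSV; in practice the range would be
# calibrated from a reference image of the user's hand.
SKIN_LOW = np.array([0, 40, 60], dtype=np.uint8)
SKIN_HIGH = np.array([25, 255, 255], dtype=np.uint8)

def find_hand_region(frame_bgr, min_area=5000, max_area=200000):
    """Return the bounding box (u, v, w, h) of a hand-like sub-image, or None.

    A sub-image is accepted only if its colour falls inside the assumed
    skin range and its pixel count falls inside the assumed size range.
    """
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, SKIN_LOW, SKIN_HIGH)
    vs, us = np.nonzero(mask)
    if us.size == 0:
        return None
    area = us.size
    if not (min_area <= area <= max_area):
        return None                       # wrong size: not treated as a hand
    u1, v1 = int(us.min()), int(vs.min())
    w, h = int(us.max()) - u1, int(vs.max()) - v1
    return (u1, v1, w, h)                 # hand region REG1 in image coordinates
```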
Referring to Fig. 3b, the position of the hand H1 may be determined by analyzing the image frame IMGk. The position of the hand H1 may be determined after the presence of the hand H1 has been detected. The position of the sub-image SUB1 may be determined after the presence of the hand H1 has been detected. The position of the sub-image SUB1 in the image frame IMGk may be specified e.g. by coordinates (u1,v1). The device 500 may be arranged to determine a hand region REG1 such that the position and the size of the hand region REG1 matches with the position of the first hand H1. The device 500 may be arranged to determine a hand region REG1 such that the position and the size of the hand region REG1 matches with the position (u1,v1) of sub-image SUB1 of the first hand H1.
Referring to Fig. 3c, the display DISP1 of the head mounted device 500 may receive image data, which comprises a digital image frame DIMG1. The image frame DIMG1 may be communicated to the display DISP1 in order to display the virtual image VMG1. The image frame DIMG1 may comprise an image element E1, which defines the shape of the virtual image VMG1. The image frame DIMG1 may be formed e.g. by a data processing unit CNT1. The image frames IMGk captured by the camera do not comprise the virtual image VMG1. The image frame DIMG1 does not directly represent an image frame captured by the camera CAM1.
The device 500 may be configured to display the virtual image VMG1 by using the display DISP1 such that the virtual image VMG1 overlaps the first hand H1, when viewed by the user U1. The device 500 may be configured to determine the position (u3,v3) of a primary image E1 such that the position (u3,v3) of the primary image E1 in the image frame DIMG1 matches with the position of the hand region REG1. The device 500 may be configured to determine the position (u3,v3) of the primary image E1 such that the position (u3,v3) of the primary image E1 in the image frame DIMG1 matches with the position (u1,v1) of the sub-image SUB1 of the first hand H1 in the image frame IMGk.
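Determining the position (u3,v3) of the primary image E1 from the detected position (u1,v1) can be illustrated by the sketch below, which assumes a pre-calibrated linear mapping between camera coordinates and display coordinates; the scale and offset values are placeholders for a real camera-to-display calibration of the head mounted device.

```python
def camera_to_display(u1, v1, scale=(0.9, 0.9), offset=(30.0, 20.0)):
    """Map a position (u1, v1) in the camera frame IMGk to a position
    (u3, v3) in the display frame DIMG1.

    The scale and offset values are illustrative assumptions standing in
    for a proper camera-to-display calibration.
    """
    u3 = scale[0] * u1 + offset[0]
    v3 = scale[1] * v1 + offset[1]
    return (u3, v3)

# Place the primary image E1 so that the virtual image VMG1 overlaps the
# hand region detected at (u1, v1) = (250, 310) in the camera frame.
u3, v3 = camera_to_display(250, 310)
print(u3, v3)
```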
Referring to Fig. 3d, the image frame IMGk captured by the camera CAM1 may comprise a first sub-image SUB1 of the first hand H1 of the user U1, and a second sub-image SUB2 of the second hand H2 of the user U1. The device 500 may be configured to detect the presence of the hands H1, H2 e.g. by using shape recognition, by using color recognition and/or based on analysis of movements.
Referring to Fig. 3e, the device 500 may be configured to determine the position (u1,v1) of the sub-image SUB1 of the first hand H1, and the position (u2,v2) of the sub-image SUB2 of the second hand H2. In particular, the device 500 may be configured to determine the position (u1,v1) of the sub-image SUB1 of the first hand H1, and the position (u2,v2) of the sub-image SUB2 of a finger of the second hand H2.
The device 500 may be arranged to determine a hand region REG1 such that the position and the size of the hand region REG1 matches with the position of the first hand H1.
Referring to Fig. 3f, the device 500 may be configured to display the virtual image VMG1 by using the display DISP1 such that the virtual image VMG1 overlaps the first hand H1, when viewed by the user U1. The device 500 may be configured to determine the position (u3,v3) of the primary image E1 such that the position (u3,v3) of the primary image E1 in the image frame DIMG1 matches with the position (u1,v1) of the sub-image SUB1 of the first hand H1 in the image frame IMGk.
The second hand H2 may partly cover the first hand H1. The second hand H2 may partly cover the hand region REG1. The covered part POR2 may be included in the hand region REG1 or excluded from the hand region REG1. If the hand region REG1 comprises the covered part POR2, then a part of the virtual image VMG1 may overlap the second hand H2.
If the covered part POR2 is excluded from the hand region REG1, then the displayed virtual image VMG1 does not overlap the second hand H2.
Consequently, the user U1 may see the position of his second hand H2 more clearly. The device 500 may control displaying the virtual image VMG1 such that the virtual image VMG1 is not formed on an image portion POR2 covered by the second hand H2. The device 500 may form the image frame DIMG1 such that the virtual image VMG1 is displayed on the first hand H1 but not on the second hand H2, when viewed by the user U1.
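A simple way to exclude the covered part POR2 when composing the display frame DIMG1 is sketched below; the two binary masks are assumed to come from the hand-detection step, and the function is an illustration rather than the actual rendering pipeline of the device 500.

```python
import numpy as np

def compose_display_frame(display_frame, element_e1, hand1_mask, hand2_mask):
    """Draw the primary image E1 only where the first hand is visible.

    display_frame: H x W x 3 array (image frame DIMG1 sent to DISP1)
    element_e1:    H x W x 3 array holding the rendered virtual image
    hand1_mask:    H x W boolean mask of the first-hand region REG1
    hand2_mask:    H x W boolean mask of the second hand
    """
    visible = hand1_mask & ~hand2_mask      # exclude covered part POR2
    out = display_frame.copy()
    out[visible] = element_e1[visible]
    return out
```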
Referring to Fig. 4, the device 500 may be used in an environment ENV1. The environment ENV1 may comprise one or more real objects 01. The real object 01 may be e.g. a building. The environment ENV1 may comprise one or more external objects 01, in addition to the hands H1, H2.
Referring to Fig. 5, the device 500 may be arranged to move the virtual image VMG1 from the hand H1 to the real object 01. The device 500 may be arranged to associate the virtual image VMG1 with the real object 01 according to a gesture made by the hand H1 or H2. For example, the user U1 may point at the object 01 with a finger of the hand H2. The camera CAM1 may capture a video sequence SEQ1, and the device 500 may detect the movement of the hand H2 by analyzing image frames IMGk-1, IMGk, IMGk+1, ... of the video sequence SEQ1.
The method may comprise moving the virtual image VIMG1 from the position of the first hand H1 to a second position where the virtual image is associated with a real object 01.
The camera CAM1 may capture a sub-image SUB2 of the hand H2, the device 500 may detect the position of the finger by analyzing the sub-image SUB2 of the hand H2, and the device 500 may change the position of the virtual image VMG1 according to the detected position of the finger. The device 500 may detect the orientation of the finger by analyzing the sub-image SUB2 of the hand H2, and the device 500 may change the position of the virtual image VMG1 according to the detected orientation of the finger. The orientation of the finger may be e.g. vertical or horizontal.
Augmented reality contents may be associated with the real object 01 by gesture recognition. For example, turning of the hand H1 or H2 and subsequent pointing to the real object 01 may be interpreted as gesture input, which may cause associating a virtual image VMG1 with the real object 01. The real object 01 may be e.g. a building. Providing user input by using said gestures may allow a simple and intuitive way to associate augmented reality contents with the real object 01. The environment ENV1 may comprise several real objects. The virtual image VMG1 may be associated with a selected real object 01 based on the detected movement of the hand H1 and/or H2.
The apparatus 500 may be configured to use e.g. a three dimensional map of the environment to identify the real object 01, and to determine the position of said real object 01. The augmented reality contents may be subsequently displayed by the device 500 such that the augmented reality content is associated with the position of said real object 01.
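The association of a virtual image with a pointed-at real object can be approximated in image space as sketched below: the object whose direction best matches the pointing direction of the finger is selected. The example object positions are invented for illustration; a full implementation would use the three dimensional map mentioned above.

```python
import math

def pick_pointed_object(finger_base, finger_tip, objects):
    """Return the name of the object best aligned with the pointing finger.

    finger_base, finger_tip: (u, v) positions of the finger in the image
    objects: dict mapping object name -> (u, v) position in the image
    (in the full system these would come from the 3-D map of ENV1).
    """
    fx = finger_tip[0] - finger_base[0]
    fy = finger_tip[1] - finger_base[1]
    f_angle = math.atan2(fy, fx)
    best, best_err = None, math.pi
    for name, (ou, ov) in objects.items():
        o_angle = math.atan2(ov - finger_tip[1], ou - finger_tip[0])
        err = abs(math.atan2(math.sin(o_angle - f_angle),
                             math.cos(o_angle - f_angle)))
        if err < best_err:
            best, best_err = name, err
    return best

objects = {"building 01": (600, 120), "car": (80, 400)}   # illustrative data
print(pick_pointed_object((300, 300), (340, 270), objects))  # building 01
```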
Referring to Fig. 6, the user U1 may draw a virtual picture VMG1 on his hand H1 by moving the second hand H2 with respect to the first hand H1. The device 500 may track the movements of the second hand H2 by analyzing a sequence SEQ1 of image frames IMGk-1, IMGk, IMGk+1, ... captured by the camera CAM1, and the device 500 may alter the virtual picture VMG1 according to one or more detected movements of the second hand H2. In particular, the device 500 may alter the virtual picture VMG1 according to one or more detected movements of a finger of the second hand H2. The virtual picture VMG1 may be e.g. graphics or text. The virtual image VMG1 may be displayed to the user U1 by the display DISP1 of the device 500 such that the position of the virtual image VMG1 overlaps the hand H1. The user U1 may view the virtual picture VMG1 by looking at his hand H1 through the display DISP1.
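Drawing on the hand region can be sketched as accumulating tracked fingertip positions into strokes, as below. The class and method names are illustrative assumptions; the positions are assumed to be expressed relative to the hand region REG1 so that the picture stays aligned with the hand H1.

```python
class HandCanvas:
    """Accumulates fingertip positions, expressed relative to the hand
    region REG1, into strokes that form the virtual picture VMG1."""

    def __init__(self):
        self.strokes = []        # list of strokes, each a list of (u, v)
        self._current = None

    def pen_down(self, point):
        self._current = [point]
        self.strokes.append(self._current)

    def pen_move(self, point):
        if self._current is not None:
            self._current.append(point)

    def pen_up(self):
        self._current = None

canvas = HandCanvas()
canvas.pen_down((10, 10))
canvas.pen_move((20, 15))   # fingertip of the second hand H2 slides on H1
canvas.pen_up()
print(canvas.strokes)       # [[(10, 10), (20, 15)]]
```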
Referring to Fig. 7a, the user may make a gesture with his hand H1 and/or H2 in order to provide gesture input. For example, the user U1 may swing his hand H1 or H2 towards a real object 01. For example, the user U1 may change the position of his hand H2 from a first position POS1 to a second position POS2. The device 500 may associate the virtual image VMG1 with the real object 01 according to the detected gesture.
Referring to Fig. 7b, an option may be selected by detecting the position of the second hand H2 with respect to the first hand H1.
A plurality of virtual images VIMG1, VIMG2 may be displayed on the hand H1 such that each virtual image VIMG1, VIMG2 is associated with an option.
For example, the device 500 may simultaneously display a first virtual image VIMG1 at a first position POSA of the first hand H1, and a second virtual image VIMG2 at a second different position POSB of the first hand H1. The first virtual image VIMG1 may be displayed at the first position POSA and the second virtual image VIMG2 may be displayed at the second position POSB when viewed by the user U1 of the device 500. Another person cannot view the virtual images VIMG1, VIMG2 without using a display device.
The first virtual image VIMG1 may be associated with a first option OPTA, and the second virtual image VIMG2 may be associated with a second option OPTB. The first virtual image VIMG1 may represent the first option OPTA, and the second virtual image VIMG2 may represent the second option OPTB. The first option OPTA may be associated with the first position POSA, and the second option OPTB may be associated with the second position POSB.
The first option OPTA may represent e.g. a first digital contents, and the second option OPTB may represent e.g. a second digital contents. The first or second digital contents may be retrieved e.g. from a memory of the device or from an internet server based on the selection. Selecting the first option OPTA may e.g. start displaying a first digital photograph, and selecting the second option OPTB may start displaying a second digital photograph. Selecting the first option OPTA may e.g. start displaying a first video clip, and selecting the second option OPTB may start displaying a second video clip. Selecting the first option OPTA may e.g. start displaying a first web document, and selecting the second option OPTB may start displaying a second web document.
The user U1 may select the option OPTA or the option OPTB by pointing at the position POSA or at the position POSB. The user U1 may select the option OPTA or the option OPTB by pointing at the position POSA of the virtual image VIMG1 or at the position POSB of the virtual image VIMG2.
The device 500 may be configured to detect whether the hand H2 points at the position POSA or at the position POSB. The device 500 may even be configured to detect whether the second hand H2 touches the first hand H1 at the position POSA or at the position POSB. Touching of the first hand with the second hand at a position may be determined as selection of an option associated with said position. Touching of the first hand with the second hand may be determined as confirmation of a selection of the option associated with said position.
User input for selecting an option may be obtained by a method, which comprises: - displaying a first virtual image VIMG1 at a first position POSA on the first hand H1, the first position POSA being associated with a first option OPTA, - displaying a second virtual image VIMG2 at a second position POSB on the first hand H1, the second position POSB being associated with a second option OPTB, - detecting the position of a second hand H2 with respect to the first hand H1, and - selecting the first option OPTA or the second option OPTB based on the detected position of the second hand H2.
The position of a second hand H2 with respect to the first hand H1 may be determined by analyzing the image frame IMGk. The position of a second hand H2 with respect to the first hand H1 may be determined by detecting the position (u2,v2) of the sub-image SUB2 of the second hand H2, and by detecting the position (u1,v1) of the sub-image SUB1 of the first hand H1.
The user U1 may select an option from among a plurality of candidate options by pointing at a position of his hand H1 with his second hand H2, wherein each candidate option may be associated with a virtual image displayed on his hand H1.
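A minimal sketch of this selection step is given below: the detected fingertip position of the second hand is compared with the positions POSA and POSB, and the nearest option within an assumed touch radius is selected. The coordinate values and the radius are illustrative only.

```python
import math

def select_option(fingertip, options, touch_radius=40.0):
    """Return the option whose position is closest to the fingertip of the
    second hand, or None if no option is within the touch radius.

    fingertip: (u, v) position of the second hand relative to the first hand
    options:   dict mapping option name -> (u, v) position on the first hand
    """
    best, best_d = None, touch_radius
    for name, pos in options.items():
        d = math.dist(fingertip, pos)
        if d <= best_d:
            best, best_d = name, d
    return best

options = {"OPTA": (60, 80), "OPTB": (160, 80)}   # positions POSA, POSB
print(select_option((150, 90), options))          # OPTB
print(select_option((300, 300), options))         # None: nothing selected
```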
Referring to Fig. 8, the device 500 may comprise a control unit CNT1 for analyzing image frames IMGk and/or for controlling operation of the device 500. The control unit CNT1 may comprise one or more data processors.
The device 500 may comprise a camera CAM1 for capturing image frames IMGk of the hands H1, H2. The camera may also capture an image of one or more external objects 01. The camera CAM1 may comprise imaging optics and an image sensor. The image sensor may comprise a two-dimensional array of detector pixels. The image sensor DET1 may be e.g. a CMOS device (Complementary Metal Oxide Semiconductor) or a CCD device (Charge-Coupled Device).
The device 500 may optionally comprise a sensor SEN1, which may comprise e.g. an acceleration sensor, a gyroscope, and/or a magnetic compass. The gyroscope may be e.g. a three-axis gyroscope. The acceleration sensor may be a three-axis acceleration sensor. The acceleration sensor may also provide information for determining the direction of gravity. The sensor SEN1 may be used for determining the position and/or orientation of the device 500 with respect to the environment ENV1.
The device 500 may comprise a user interface UIF1. The user interface UIF1 may be at least partly implemented by using the display DISP1 and the camera CAM1. The device 500 may obtain user input from the user U1 by using the display DISP1 and the camera CAM1.
The user interface UIF1 may optionally comprise e.g. a touch screen, one or more keys, a mouse, a gaze tracking unit, and/or a voice recognition unit (and a microphone). User input may also be determined by using the sensor SEN1. For example, shaking of the device 500 in a certain way may be detected by an acceleration sensor or gyroscope of the sensor SEN1, and may be determined to represent a predetermined user input.
The device 500 may comprise a communication unit RXTX1 for receiving data from a second device and for transmitting data to a second device. The second device may be e.g. an internet server SERV1. COM1 denotes a communication signal.
The device 500 may comprise a memory MEM1 for storing computer program PROG1 for controlling operation of the device 500. The device 500 may comprise a memory MEM2 for storing data DATA1. The data DATA1 may comprise e.g. image data for displaying the virtual image VMG1.
An apparatus 501 or a device 501 may comprise the head mounted device 500, and optionally one or more auxiliary units. Not all auxiliary units of the apparatus need to be mounted to the head mounted device 500. For example, the apparatus 501 may comprise an external memory and/or an external data processing unit, which may be arranged to communicate with the head mounted device 500 in a wireless manner, via an optical cable, or via an electrical cable. The one or more auxiliary units may be located such that the weight of the auxiliary units is not supported by the head of the user U1. For example, an auxiliary unit may be configured to analyze an image frame captured by the camera CAM1. The one or more auxiliary units may be arranged to communicate with the head mounted device 500 e.g. in a wireless manner, via an optical cable, or via an electrical cable. An auxiliary unit may be located e.g. in a pocket of the user U1, on a table, or in an internet server.
Fig. 9 shows, by way of example, method steps for obtaining user input.
One or more image frames IMGk of the first hand H1 may be captured in step 805.
In step 810, the position of the first hand H1 may be determined. The position of the first hand H1 may be determined by analyzing one or more image frames IMGk captured in step 805. Determining the position of the first hand H1 may comprise determining the position of the sub-image SUB1 of the hand H1.
Determining the position of the first hand H1 may optionally comprise determining whether the sub-image SUB1 corresponds to a human hand or not. Determining the presence of a human hand may be used e.g. when the image frame IMGk comprises sub-images of several real objects.
In step 815, a virtual image VMG1 may be displayed by the head mounted device such that the virtual image VMG1 overlaps the first hand H1.
User input may be obtained in step 820 by detecting one or more movements of a hand. User input may be obtained by detecting a movement of the first hand H1. User input may be obtained by comparing a position of the first hand H1 with a previous position of the first hand H1. User input may be obtained by detecting a movement of the second hand H2. User input may be obtained by comparing a position of the second hand H2 with a previous position of the second hand H2.
User input may be obtained by detecting a movement of the second hand H2 with respect to the first hand H1.
In step 825, the operation of an apparatus may be controlled according to the user input obtained in step 820. For example, the position of the displayed virtual image VMG1 may be changed according to the movement detected in step 820. For example, the shape of the displayed virtual image VMG1 may be changed according to the movement detected in step 820. For example, the size of the displayed virtual image VMG1 may be changed according to the movement detected in step 820.
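The steps 805-825 can be summarised as a simple processing loop, sketched below. All of the callables passed in are placeholders for the detection and rendering operations described above; the sketch is illustrative and not the actual control flow of the device 500.

```python
def input_loop(camera, display, render_virtual_image, detect_hand,
               detect_movement, apply_user_input):
    """Illustrative loop over steps 805-825 of Fig. 9. All arguments are
    assumed callables standing in for the operations described above."""
    previous_frame = None
    while True:
        frame = camera.capture()                           # step 805
        hand_region = detect_hand(frame)                   # step 810
        if hand_region is not None:
            render_virtual_image(display, hand_region)     # step 815
        if previous_frame is not None:
            movement = detect_movement(previous_frame, frame)  # step 820
            if movement is not None:
                apply_user_input(movement)                 # step 825
        previous_frame = frame
```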
Fig. 10 shows, by way of example, a communication system 1000, which may comprise the device 500 and/or the apparatus 501. The system 1000 may comprise a plurality of devices 500, 600, which may be arranged to communicate with each other and/or with a server 1240. The devices 500, 600 may be portable.
One or more devices 500 may comprise a user interface UIF1 for receiving user input. One or more devices 500 and/or a server 1240 may comprise one or more data processors. The system 1000 may comprise end-user devices such as one or more portable devices 500, 600, mobile phones or smart phones 600, Internet access devices (Internet tablets), personal computers 1260, a display or an image projector 1261 (e.g. a television), and/or a video player 1262. One or more of the devices 500 or portable cameras may comprise an image sensor DET1 for capturing image data. A server, a mobile phone, a smart phone, an Internet access device, or a personal computer may be arranged to distribute image data, and/or other information. Distribution and/or storing of data may be implemented in the network service framework with one or more servers 1240, 1241, 1242 and one or more user devices. As shown in the example of Fig. 10, the different devices of the system 1000 may be connected via a fixed network 1210 such as the Internet or a local area network (LAN). The devices may be connected via a mobile communication network 1220 such as the Global System for Mobile communications (GSM) network, 3rd Generation (3G) network, 3.5th Generation (3.5G) network, 4th Generation (4G) network, Wireless Local Area Network (WLAN), Bluetooth®, or other contemporary and future networks. Different networks may be connected to each other by means of a communication interface 1280. A network (1210 and/or 1220) may comprise network elements such as routers and switches to handle data (not shown). A network may comprise communication interfaces such as one or more base stations 1230 and 1231 to provide access for the different devices to the network. The base stations 1230, 1231 may themselves be connected to the mobile communications network 1220 via a fixed connection 1276 and/or via a wireless connection 1277. There may be a number of servers connected to the network. For example, a server 1240 for providing a network service such as a social media service may be connected to the network 1210. The server 1240 may generate and/or distribute additional information for an augmented reality application running on the device 500. A second server 1241 for providing a network service may be connected to the network 1210. A server 1242 for providing a network service may be connected to the mobile communications network 1220. Some of the above devices, for example the servers 1240, 1241, 1242 may be arranged such that they make up the Internet with the communication elements residing in the network 1210. The devices 500, 600, 1260, 1261, 1262 can also be made of multiple parts. One or more devices may be connected to the networks 1210, 1220 via a wireless connection 1273. Communication COM1 between a device 500 and a second device of the system 1000 may be fixed and/or wireless. One or more devices may be connected to the networks 1210, 1220 via communication connections such as a fixed connection 1270, 1271, 1272 and 1280. One or more devices may be connected to the Internet via a wireless connection 1273. One or more devices may be connected to the mobile network 1220 via a fixed connection 1275. A device 500, 600 may be connected to the mobile network 1220 via a wireless connection COM1, 1279 and/or 1282. The connections 1271 to 1282 may be implemented by means of communication interfaces at the respective ends of the communication connection.
A user device 500, 600 or 1260 may also act as web service server, just like the various network devices 1240, 1241 and 1242. The functions of this web service server may be distributed across multiple devices.
Application elements and libraries may be implemented as software components residing on one device. Alternatively, the software components may be distributed across several devices. The software components may be distributed across several devices so as to form a cloud.
A video sequence SEQ1 captured by the camera CAM1 may be processed, stored and/or communicated by using a data compression codec, e.g. by using MPEG-4 Part 2 codec, H.264/MPEG-4 AVC codec, H.265 codec, Windows Media Video (WMV), DivX Pro codec, or a future codec (e.g. High Efficiency Video Coding, HEVC, H.265). The video data VDATA1, VDATA2 may be encoded and/or decoded e.g. by using MPEG-4 Part 2 codec, H.264/MPEG-4 AVC codec, H.265 codec, Windows Media Video (WMV), DivX Pro codec, or a future codec (e.g. High Efficiency Video Coding, HEVC, H.265). The video data may also be encoded and/or decoded e.g. by using a lossless codec.
For the person skilled in the art, it will be clear that modifications and variations of the devices and the methods according to the present invention are perceivable. The figures are schematic. The particular embodiments described above with reference to the accompanying drawings are illustrative only and not meant to limit the scope of the invention, which is defined by the appended claims.

Claims (18)

  1. A method, comprising: - obtaining an image of a first hand, - displaying a virtual image by using a display of a head mounted device such that the virtual image overlaps the first hand when viewed by a user of the head mounted device, and - obtaining user input by detecting a movement of a hand.
  2. The method of claim 1 comprising moving the virtual image based on the detected movement.
  3. The method of claim 1 or 2 comprising modifying the virtual image based on the detected movement.
  4. The method according to any of the claims 1 to 3 comprising moving the virtual image from the position of the first hand to a second position where the virtual image is associated with a real object.
  5. The method according to any of the claims 1 to 4 wherein the virtual image is a digital map.
  6. The method according to any of the claims 1 to 5 comprising detecting a change of position of a first finger with respect to the first hand, and moving the virtual image according to the detected change of position.
  7. The method according to any of the claims 1 to 6 comprising detecting a change of distance between a first finger and a second finger, and changing the size of the virtual image according to the detected change of distance.
  8. The method according to any of the claims 1 to 7 comprising: - displaying a first virtual image at a first position on the first hand, the first position being associated with a first option, - displaying a second virtual image at a second position on the first hand, the second position being associated with a second option, - detecting the position of a second hand with respect to the first hand, and - selecting the first option or the second option based on the detected position of a second hand.
  9. An apparatus, comprising: - a head mounted device having a display, which is configured to display a virtual image such that the virtual image overlaps a hand of a user when viewed by the user of the head mounted device, - a camera to capture an image of the hand of the user, and - a control unit configured to detect a movement of a hand by analyzing an image captured by the camera.
  10. The apparatus of claim 9, wherein the head mounted device comprises the camera.
  11. The apparatus of claim 9 or 10, wherein the apparatus is configured to move the virtual image based on the detected movement.
  12. The apparatus according to any of the claims 9 to 11, wherein the apparatus is configured to modify the virtual image based on the detected movement.
  13. The apparatus according to any of the claims 9 to 12, wherein the apparatus is configured to detect a change of distance between a first finger and a second finger, and to change the size of the virtual image according to the detected change of distance.
  14. The apparatus according to any of the claims 9 to 13, wherein the apparatus is configured to: - display a first virtual image at a first position on the first hand, when viewed by the user, the first position being associated with a first option, - display a second virtual image at a second position on the first hand of the user, when viewed by the user, the second position being associated with a second option, - detect the position of a second hand with respect to the first hand, and - select the first option or the second option based on the detected position of a second hand.
  15. A computer program comprising computer program code configured to, when executed on at least one processor, cause an apparatus or a system to: - obtain an image of a first hand, - display a virtual image by using a display of a head mounted device such that the virtual image overlaps the first hand when viewed by a user of the head mounted device, and - obtain user input by detecting a movement of a hand.
  16. The computer program of claim 15, comprising code for moving the virtual image based on the detected movement.
  17. A computer program product embodied on a non-transitory computer readable medium, comprising computer program code configured to, when executed on at least one processor, cause an apparatus or a system to: - obtain an image of a first hand, - display a virtual image by using a display of a head mounted device such that the virtual image overlaps the first hand when viewed by a user of the head mounted device, and - obtain user input by detecting a movement of a hand.
  18. Means for controlling operation of an apparatus, comprising: - means for obtaining an image of a first hand, - means for displaying a virtual image by using a display of a head mounted device such that the virtual image overlaps the first hand when viewed by a user of the head mounted device, and - means for obtaining user input by detecting a movement of a hand.
GB1423328.2A 2014-12-30 2014-12-30 User interface for augmented reality Withdrawn GB2533789A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
GB1423328.2A GB2533789A (en) 2014-12-30 2014-12-30 User interface for augmented reality

Publications (2)

Publication Number Publication Date
GB201423328D0 GB201423328D0 (en) 2015-02-11
GB2533789A true GB2533789A (en) 2016-07-06

Family

ID=52471611

Family Applications (1)

Application Number Title Priority Date Filing Date
GB1423328.2A Withdrawn GB2533789A (en) 2014-12-30 2014-12-30 User interface for augmented reality

Country Status (1)

Country Link
GB (1) GB2533789A (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20220050527A1 (en) * 2020-08-12 2022-02-17 Himax Technologies Limited Simulated system and method with an input interface

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6771294B1 (en) * 1999-12-29 2004-08-03 Petri Pulli User interface
US20120249797A1 (en) * 2010-02-28 2012-10-04 Osterhout Group, Inc. Head-worn adaptive display
US20140068526A1 (en) * 2012-02-04 2014-03-06 Three Bots Ltd Method and apparatus for user interaction
US20140266988A1 (en) * 2013-03-15 2014-09-18 Eyecam, LLC Autonomous computing and telecommunications head-up displays glasses
US20140306891A1 (en) * 2013-04-12 2014-10-16 Stephen G. Latta Holographic object feedback

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2573639A (en) * 2018-03-26 2019-11-13 Lenovo Singapore Pte Ltd Message location based on limb location
GB2573639B (en) * 2018-03-26 2021-05-26 Lenovo Singapore Pte Ltd Message location based on limb location
EP4137917A1 (en) * 2021-08-16 2023-02-22 Apple Inc. Visualization of a knowledge domain

Also Published As

Publication number Publication date
GB201423328D0 (en) 2015-02-11

Similar Documents

Publication Publication Date Title
CN109635621B (en) System and method for recognizing gestures based on deep learning in first-person perspective
US20200209961A1 (en) Visibility improvement method based on eye tracking, machine-readable storage medium and electronic device
US11262835B2 (en) Human-body-gesture-based region and volume selection for HMD
US11636644B2 (en) Output of virtual content
CN104871214B (en) For having the user interface of the device of augmented reality ability
US11170580B2 (en) Information processing device, information processing method, and recording medium
CN105528066B (en) Method and apparatus for processing picture using device
US9857589B2 (en) Gesture registration device, gesture registration program, and gesture registration method
US20220100265A1 (en) Dynamic configuration of user interface layouts and inputs for extended reality systems
US10210664B1 (en) Capture and apply light information for augmented reality
US20180329209A1 (en) Methods and systems of smart eyeglasses
JP5765019B2 (en) Display control apparatus, display control method, and program
US20190333478A1 (en) Adaptive fiducials for image match recognition and tracking
US20140240225A1 (en) Method for touchless control of a device
US20160195849A1 (en) Facilitating interactive floating virtual representations of images at computing devices
US9442571B2 (en) Control method for generating control instruction based on motion parameter of hand and electronic device using the control method
US10254847B2 (en) Device interaction with spatially aware gestures
US10372229B2 (en) Information processing system, information processing apparatus, control method, and program
US10019140B1 (en) One-handed zoom
CN110622110B (en) Method and apparatus for providing immersive reality content
EP3172721B1 (en) Method and system for augmenting television watching experience
GB2533789A (en) User interface for augmented reality
US20160189341A1 (en) Systems and methods for magnifying the appearance of an image on a mobile device screen using eyewear
WO2019114092A1 (en) Image augmented reality method and apparatus, and augmented reality display device and terminal
US20240061496A1 (en) Implementing contactless interactions with displayed digital content

Legal Events

Date Code Title Description
COOA Change in applicant's name or ownership of the application

Owner name: NOKIA TECHNOLOGIES OY

Free format text: FORMER OWNER: NOKIA CORPORATION

WAP Application withdrawn, taken to be withdrawn or refused ** after publication under section 16(1)