WO2009059716A1 - Pointing device and method for operating the pointing device - Google Patents


Info

Publication number
WO2009059716A1
Authority
WO
WIPO (PCT)
Prior art keywords
6dof
pointing
mode
button
clutch
Prior art date
Application number
PCT/EP2008/009106
Other languages
French (fr)
Inventor
Sebastian Repetzki
Original Assignee
Sebastian Repetzki
Priority date
Filing date
Publication date
Application filed by Sebastian Repetzki filed Critical Sebastian Repetzki
Publication of WO2009059716A1 publication Critical patent/WO2009059716A1/en

Classifications

    • B - PERFORMING OPERATIONS; TRANSPORTING
    • B25 - HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25J - MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J13/00 - Controls for manipulators
    • B25J13/02 - Hand grip control means
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 - Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 - Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/03 - Arrangements for converting the position or the displacement of a member into a coded form
    • G06F3/0304 - Detection arrangements using opto-electronic means
    • G06F3/0325 - Detection arrangements using opto-electronic means using a plurality of light emitters or reflectors or a plurality of detectors forming a reference frame from which to derive the orientation of the object, e.g. by triangulation or on the basis of reference deformation in the picked up image
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 - Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 - Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/03 - Arrangements for converting the position or the displacement of a member into a coded form
    • G06F3/033 - Pointing devices displaced or positioned by the user, e.g. mice, trackballs, pens or joysticks; Accessories therefor
    • G06F3/0346 - Pointing devices displaced or positioned by the user, e.g. mice, trackballs, pens or joysticks; Accessories therefor with detection of the device orientation or free movement in a 3D space, e.g. 3D mice, 6-DOF [six degrees of freedom] pointers using gyroscopes, accelerometers or tilt-sensors
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 - Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 - Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/03 - Arrangements for converting the position or the displacement of a member into a coded form
    • G06F3/033 - Pointing devices displaced or positioned by the user, e.g. mice, trackballs, pens or joysticks; Accessories therefor
    • G06F3/0354 - Pointing devices displaced or positioned by the user, e.g. mice, trackballs, pens or joysticks; Accessories therefor with detection of 2D relative movements between the device, or an operating part thereof, and a plane or surface, e.g. 2D mice, trackballs, pens or pucks
    • G06F3/03543 - Mice or pucks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 - Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 - Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/048 - Interaction techniques based on graphical user interfaces [GUI]
    • G06F3/0481 - Interaction techniques based on graphical user interfaces [GUI] based on specific properties of the displayed interaction object or a metaphor-based environment, e.g. interaction with desktop elements like windows or icons, or assisted by a cursor's changing behaviour or appearance
    • G06F3/04815 - Interaction with a metaphor-based environment or interaction object displayed as three-dimensional, e.g. changing the user viewpoint with respect to the environment or object
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/20 - Image preprocessing
    • G06V10/24 - Aligning, centring, orientation detection or correction of the image
    • G06V10/245 - Aligning, centring, orientation detection or correction of the image by locating a pattern; Special marks for positioning

Definitions

  • the present invention relates to input devices for all six spatial degrees of freedom at the human-computer-interface. Furthermore, it relates to model-based real-time object tracking methods using single-camera image processing techniques.
  • 3D software is defined as a range of computer software dealing with models of objects in three-dimensional space, also referred to as "virtual 3D objects".
  • Engineering sciences make use of 3D models for tasks like mechanical design or the programming of machine tools or robots.
  • 3D software also has been developed for chemistry, architecture, entertainment - e.g. computer games, cartoon and movie production - fashion industry, graphical user interfaces, Internet services and much more.
  • commonly used input devices at the human-computer-interface - such as mouse, digitizer, joystick or game console - detect only two spatial degrees of freedom.
  • Pointing devices with six degrees-of-freedom enable new and more intuitive interaction styles between man and computer.
  • the device has to meet additional demands such as augmented productivity at low cost, precision, robustness, low CPU load, general purpose usability, comfort and compatibility with existing computer workplaces.
  • the future standard input device might consist of a 2D pointing device that is convertible into a 6DOF device but only needs very little extra hardware.
  • pose refers to the six coordinates - three translations: x, y and z, and three rotations: roll, pitch and yaw - that describe the position and orientation of a rigid body with respect to a Cartesian coordinate system with the origin lying in a camera's objective.
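As a sketch of this definition, the six pose coordinates can be assembled into a 4x4 homogeneous transform. The rotation order used below (yaw about Z, then pitch about Y, then roll about X) is an assumption for illustration only; the text does not fix a convention:

```python
import math

def pose_to_matrix(x, y, z, roll, pitch, yaw):
    """Build a 4x4 homogeneous transform from the six pose coordinates.

    Rotation convention R = Rz(yaw) @ Ry(pitch) @ Rx(roll) is an assumption;
    the patent does not specify one. Angles are in radians.
    """
    cr, sr = math.cos(roll), math.sin(roll)
    cp, sp = math.cos(pitch), math.sin(pitch)
    cy, sy = math.cos(yaw), math.sin(yaw)
    # Rows of the combined rotation matrix, with the translation appended.
    return [
        [cy * cp, cy * sp * sr - sy * cr, cy * sp * cr + sy * sr, x],
        [sy * cp, sy * sp * sr + cy * cr, sy * sp * cr - cy * sr, y],
        [-sp,     cp * sr,                cp * cr,                z],
        [0.0, 0.0, 0.0, 1.0],
    ]
```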
  • camera will be used as a placeholder for any type of device that is able to transform a real-world scene into digital images.
  • a camera projects visible objects within the three-dimensional space onto a two-dimensional image in a deterministic and well understood way.
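A minimal sketch of this deterministic projection is the pinhole-camera model (function name and parameters are illustrative; real lenses add distortion, which the text does not discuss here):

```python
def project(point, focal_px, cx, cy):
    """Pinhole projection of a camera-frame 3D point (X, Y, Z) onto the image.

    focal_px is the focal length expressed in pixels; (cx, cy) is the
    principal point. Assumes Z > 0, i.e. the point lies in front of the camera.
    """
    X, Y, Z = point
    u = cx + focal_px * X / Z
    v = cy + focal_px * Y / Z
    return u, v
```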
  • the image acquisition frequency of a camera must be sufficiently high, so that movements in an image series appear as continuous to the human eye.
  • the term "6DOF pointing device" denotes an input device that quasi-continuously determines the pose of a physical body conducted by the hand of a computer user in order to interact with models of 3D geometry within 3D software applications.
  • the term "mouse" will be used alternatively to "2D pointing device" for a device that tracks its movement w.r.t. a flat surface, usually the desktop.
  • Roberts L.G., "Machine perception of three-dimensional solids", in Optical and Electronical Information Processing, J. Tippet, Ed. (Cambridge, Mass.: MIT Press, 1966), 159-197.
  • 6DOF pointing devices have been developed using ultrasound time-of-flight sensors, angle encoders in articulated arms or transducers in electromagnetic fields, each of them having specific advantages such as resolution, haptic feedback or working volume.
  • none of these products has become a standard device due to disadvantages like physical size, sensitivity to walls or metal, latency, high investment costs, angle and volume restrictions on the device's pose, lack of freedom due to weight, cables or mechanical links, and resulting fatigue.
  • 6DOF pointing devices are based on a stereo setup using infrared cameras in combination with spherical reflectors.
  • Stereo setups, i.e. two cameras with overlapping viewing volumes (see also [DE 10 2005 011 432 A1]), need to be calibrated each time a camera might have moved. They demand more CPU time than single-camera solutions and reduce the user's working volume due to the two simultaneously active lines-of-sight.
  • Object tracking methods based on a single camera have advantages over stereo setups, such as the use of less hardware, a simplified calibration process and a simple, pyramidal working volume that only depends on the camera's viewing angles.
  • for depth measurement, the sizes of the real object and of its projection must be available in combination with the projection parameters of the camera.
  • camera objectives with short focal lengths augment the precision of depth measurement.
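The depth recovery described above follows directly from the pinhole relation; a minimal sketch (names are illustrative):

```python
def depth_from_size(real_size_m, projected_size_px, focal_px):
    """Estimate the distance Z between camera and object.

    Uses the pinhole relation  projected = focal * real / Z,
    rearranged to  Z = focal * real / projected.  The real object size,
    the measured projection size and the focal length (in pixels) must
    all be known, as the text states.
    """
    return focal_px * real_size_m / projected_size_px
```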
  • the recognition of visual features is limited by the image resolution.
  • edge detection,
  • corner and blob detection,
  • marker detection and pattern matching.
  • David G. Lowe extracts straight lines using edge detection algorithms in images containing polyhedral objects, i.e. rigid bodies delimited by flat surfaces and straight edges [Lowe].
  • lines are grouped into pairs that meet perceptual conditions like parallelism, collinearity or proximity.
  • Lowe uses higher order grouping, probabilistic methods and a geometric model of the expected object to create pose estimates. Least-squares error minimization and repeated line collecting are balanced such that the probability of finding objects and their precise poses becomes a maximum.
  • Lowe also developed SIFT [US 6 711 296 B1], a 3D pattern matching technique which enables 6DOF tracking of objects that the algorithm had learned previously from reference images.
  • [VacLepFua] aims at tracking general textured objects with known geometry under natural lighting conditions whereas the present invention tracks a specific, non-textured object under controlled lighting conditions.
  • [Dickmanns] used a black-and-white cuboctahedron that went into orbit on board a Space Shuttle in 1993 and was the first free-floating object to be caught semi-autonomously by a robot.
  • Single camera tracking, edge detection and Kalman filtering were used to estimate and predict object poses. Despite its obvious advantages, this principle has never been used again in the field of object tracking.
  • the patent [US6526166] proposes single camera tracking based on a cube that is covered with three different colours.
  • the use of three colours allows covering the cube such that adjacent surfaces have different colours for each edge.
  • An ideal camera projects those edges as zones of colour contrast along straight line segments.
  • from these line segments the cube's orientation is determined. If, in addition, at least one line segment's length is measurable, one can calculate the relative pose between camera and cube, provided that the camera projection parameters are known, too. However, under some orientations only two or just one square surface is visible to the camera, and the estimated cube pose lacks precision.
  • a 6DOF pose can be calculated if at least three features were found.
  • Marker-based tracking algorithms avoid the disadvantages of colour segmentation by juxtaposing two highly distinctive colours, for convenience: black and white.
  • the colours are applied onto plane square surfaces following well defined motifs.
  • the ARTools software [MagicMouse] is able to track the pose of quadratic markers in real-time using a single camera, but rotations around any axis lying within the marker plane are limited to angles of less than 90°. If different markers are fixed onto the faces of a cube [AugmentedChemistry], the cube's pose can be tracked from all viewing angles.
  • the tracking algorithm needs to recognise visual features of very different sizes: the surrounding square and the motif contained by the square. This reduces the depth range of the method compared with methods based on edge or blob search.
  • D. Bradley and G. Roth [Sphere] use small, green and orange marker circles that are distributed over a blue sphere.
  • the sphere's translational coordinates are calculated by segmenting a blue circle out of the camera image. This step is sensitive to lighting conditions and to blue content in the background. Then coloured blobs are detected within the blue circle and compared to the known marker distribution on the sphere, thus allowing all three spatial orientation angles to be determined.
  • Bradley and Roth developed this principle into a 6DOF input device for 3D software.
  • Another combined 2D and 6DOF input device is commercialised by the company [Motion4u].
  • the mouse body's pose is being 6DOF tracked.
  • Automatic mode change from 2D to 3D tracking occurs whenever the device is lifted above an adjustable height from the supporting surface.
  • the 3D software requires object selection to be done in 2D mode, whereas object positioning is done in 3D mode.
  • the provided device does not meet the criterion for a 6DOF pointing device.
  • the task of the present invention consists of providing the most compact possible, general purpose pointing device, having two distinctive pointing modes, 2D and 6DOF, and a method to easily and intuitively change between these modes.
  • the 6DOF pointing mode has to be defined in analogy to the 2D pointing mode of a standard computer mouse, which comprises a left and right mouse button, a scroll wheel and a pointer displayed on the computer screen.
  • the invention aims to make fast, robust and precise 6DOF pointing available to standard computers by adding only a minimum of extra hardware components to the standard computer mouse and further aims to render the 6DOF pointing mode as easy and productive as is already the 2D pointing mode on standard computers.
  • the "ideal" 6DOF pointing device would consist of just one lightweight body that senses its own pose changes accurately, fast enough and without any external hardware as [DE3223896A1] proposes in claim 2. Such a device is not yet available.
  • the “optimal” solution is according to the invention defined as a solution that necessitates the smallest possible number of hardware components while offering the best usability for the desktop computer user.
  • the present invention provides such an "optimal" solution for a combined 2D/6DOF pointing device and defines its particular operating modes.
  • the present invention provides a pointing device for the human-computer-interface which combines a standard 2D pointing device with a six-degrees-of-freedom (6DOF) tracking method based on a target body and a real-time, single-camera image processing algorithm.
  • in 6DOF pointing mode the user moves the device freely inside the camera's viewing volume, while the tracking algorithm quasi-continuously and quasi-instantaneously detects the six spatial coordinates of the device's pose with respect to the camera and applies them to a 6DOF pointer on the screen.
  • the invented device has separate operating modes: 2D pointing, transition to 6DOF pointing, 6DOF pointing, 6DOF clutch, return to 2D pointing and abort.
  • Standard mouse buttons and scroll wheel are operational in either mode.
  • To change between 2D and 6DOF mode the 2D pointing device's body provides one additional button. The same button also temporarily allows interrupting the link between the 6DOF pointing device held by the user's hand and the 6DOF pointer projected onto the screen.
  • the present invention aims to make 6DOF pointing available to any desktop computing workplace, by extending the common computer mouse.
  • Target body, algorithm, operating method and the overall setup of the device are designed to achieve the maximum possible compactness, usability and productivity in particular for interaction with 3D application software.
  • the device of the present invention consists of a standard 2D pointing device in combination with a single target body that is fixed onto the device, furthermore a single camera and an adequate 6DOF tracking algorithm.
  • the algorithm determines the relative pose between target body and camera by processing images from that camera.
  • Target body and algorithm are particularly appropriate to cope with cluttered backgrounds, poor image quality, low cost cameras and / or inhomogeneous lighting conditions.
  • Target body and method are designed to track the spatial input of a human hand by using standard PC hard- and software with only minor changes to the user's desktop.
  • the proposed geometric configuration between target body, mouse, user, screen and camera gives a maximum of freedom to user movements, be it in 2D or in 6DOF pointing mode.
  • the tracking modes, 2D and 6DOF are accompanied by a set of operating modes, defined in the present invention, which include the use of standard mouse buttons and scroll wheel as well as a specific "clutch"-button, used to transit from 2D mode to 6DOF mode and vice versa.
  • a preferred embodiment of the invention relates to a combined 2D and 6DOF Pointing Device consisting of:
  • a 2D pointing device having at least a right button, a left button and a scroll wheel
  • 6DOF clutch mode: press the clutch-button to interrupt 6DOF pointer movement, move the device to the next starting pose, then press again or release the clutch-button to reactivate 6DOF pointing mode
  • the pointing device is a wireless pointing device.
  • the target body is fixed onto the 2D pointing device by means of a rigid mechanical link consisting of some lightweight material and being small compared to the target body and the 2D pointing device.
  • the mechanical link is realised as a thin, bar-shaped extension of the housing of the 2D device.
  • the clutch-button consists of an electrical switch or any other sensor having two distinct states which is able to detect an intentional movement of a member of the user's hand with respect to the 2D pointing device.
  • the state of the clutch-button will be transmitted to the connected computer in coded form via the same communication links employed for transmitting the states of mouse buttons, scroll wheel and the 2D tracking unit, i.e. a cable or a wireless communication link.
  • Driver software on the computer is in charge of decoding the clutch-button's state from incoming data flows using methods well known by those skilled in the art of peripheral hardware.
  • the clutch-button is realized as a switch sensitive to moderate pressure exerted by one of the user's fingers that is in contact with the invented device, the same finger being in charge of keeping the device linked to the hand while the hand guides it over the desktop or in free space.
  • the clutch button is preferably placed in a zone where the thumb or ring finger naturally touches the mouse body. After a brief learning period, the user will press the clutch button in a natural way by applying a stronger grip to the mouse body.
  • the invention furthermore relates to a method for operating a pointing device having at least a right button, a left button and a scroll wheel, an additional clutch-button and a target body fixed onto the 2D pointing device, characterized in that means enable a selection and/or activation of the following operating modes: a) 2D pointing mode: the device slides on the desktop surface; right and left button and scroll wheel are functional; a 2D pointer on the screen represents the current 2D position and movement of the device. b) Transition from 2D to 6DOF mode: the device is lifted from the desktop to a starting pose, followed by a click on the clutch-button. c) 6DOF pointing mode: the device moves freely in space with fully functional right and left button and scroll wheel.
  • a 6DOF pointer on the screen represents the current spatial pose and movement of the device.
  • d) 6DOF clutch mode: press the clutch-button to interrupt 6DOF pointer movement, move the device to the next starting pose, then press again or release the clutch-button to reactivate 6DOF pointing mode
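The operating modes and transitions described here can be summarised as a small state machine. The event names and the table-driven dispatch below are an illustrative sketch, not part of the claimed method:

```python
# States: '2d', 'transition', '6dof', 'clutch'
# Events: 'lift'    - the device leaves the desktop
#         'clutch'  - the clutch-button is actuated
#         'contact' - the 2D tracking unit senses the desktop surface
TRANSITIONS = {
    ('2d', 'lift'): 'transition',       # mode b: lift towards a starting pose
    ('transition', 'clutch'): '6dof',   # mode b completed: 6DOF pointing active
    ('transition', 'contact'): '2d',    # put back down: an ordinary 2D clutch
    ('6dof', 'clutch'): 'clutch',       # mode d: freeze the 6DOF pointer
    ('clutch', 'clutch'): '6dof',       # second actuation: resume 6DOF pointing
    ('clutch', 'contact'): '2d',        # mode e: controlled return to 2D
    ('6dof', 'contact'): '2d',          # mode f: abort straight back to 2D
}

def step(state, event):
    """Return the next operating mode; unknown events leave the state unchanged."""
    return TRANSITIONS.get((state, event), state)
```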
  • the invented device provides the user with operation modes - 2D pointing, transition to 6DOF, 6DOF pointing, 6DOF clutch, return to 2D and abort mode - and with means to select and activate these modes.
  • Operating mode a, the 2D pointing mode, is defined exactly like the well-known functionality of a standard computer mouse.
  • the device is placed on top of and in contact with a supporting surface, usually the desktop, and a 2D tracking unit within the device detects movements between device and desktop surface.
  • a graphical symbol referred to as "2D pointer", usually a small arrow, is displayed on the computer screen and moves according to the detected movements thus representing the device's position.
  • the user selects and activates the 2D pointing mode by bringing the device into contact with the desktop surface.
  • the user may actuate a left button, a right button and/or a scroll wheel provided with the device.
  • Software running on the computer, in particular the operating system, the device driving software and application software having a graphical user interface, responds to these actions in a deterministic manner. In particular, the response depends on the pointer's position and movement on the screen when the user's action occurs.
  • Operating mode b allows changing from 2D pointing mode to 6DOF pointing mode.
  • the user selects and activates the transition mode by a sequence of actions. Identically to the 2D clutch, the user begins by lifting the device from the desktop. Instead of putting it down, he/she then moves the device to a desired pose in space and finally actuates the clutch-button, by which the 6DOF pointing mode is activated.
  • Operating mode c, referred to as "6DOF pointing mode", is defined in perfect analogy to the 2D pointing mode, in that
  • the 6DOF pointer may replace the 2D pointer in representing the current device's pose on the screen
  • the 6DOF pointer quasi-synchronously follows the spatial movement of the device
  • the user may actuate the left, right button and scroll wheel simultaneously to any spatial movement
  • Operating system, device driving software and currently running application software with graphical user interfaces may respond to the user's spatial movements and button states.
  • the responses depend on the actual pose and spatial movement of the 6DOF pointer when the user's action occurs.
  • a 3D application software projects a number of geometrical objects onto the screen that were created within a virtual 3D space.
  • the 6DOF pointing mode is active.
  • the user holds and conducts the invented device freely in space.
  • the application software is continuously informed about the device's pose.
  • the application software could use this information to move the 6DOF pointer within the virtual space according to the user's movement and to project the 6DOF pointer onto the screen at its actual pose.
  • the application software could interpret this action in perfect analogy to the 2D pointing device, as a voluntary act of selection of that object by the user.
  • the application software could interpret this action - and react accordingly - as the user's intention to drag that object in space and to release it at a new pose, namely the pose where the left mouse button is finally released.
  • an application software could interpret this action as the user's intention to change the projection parameters of the virtual space onto the screen: viewpoint, zoom etc.
  • any combination between the 6DOF pointer and other mouse buttons, the scroll wheel or any other user actions could be interpreted as a specific intention of the user that needs a specific response of 3D application software.
  • a computer screen shows many two-dimensional elements like buttons, menu bars, windows containing 2D application user interfaces, icons on the virtual desktop etc. If the 6DOF pointer leaves the screen area containing a projected virtual 3D space, it will change back into the 2D pointer's shape and behaviour although the user still moves the 6DOF pointing device freely in space. Thus, the user may interact with the computer over the entire screen without losing any productivity of the 2D pointing device. Whenever the 2D pointer slides again over the projected 3D space, it will transform back into a 6DOF pointer with its shape and behaviour.
  • Operating mode d, referred to as "6DOF clutch mode", occurs by analogy to 2D pointing devices if the invented device hits an obstacle or the user feels uncomfortable while moving the 6DOF pointer to a desired position.
  • a 6DOF pointing device has no physical contact with its reference - the entire 3D space - and thus cannot lose that contact.
  • one additional button referred to as “clutch-button” needs to be added to the pointing device.
  • the user selects and activates the 6DOF clutch mode by actuating the clutch-button. Now he/she can displace the device into a more convenient pose without losing the achieved pose of the 6DOF pointer on the screen.
  • the user actuates the clutch-button a second time and the 6DOF-pointer will immediately restart following the spatial movement of the device.
  • the act of "actuating the button" may be realized as a simple push-down, a click - i.e. a push-down and immediate release - or a double click on the button.
  • Operating mode e allows the user to exit 6DOF pointing mode in a well-controlled manner. He/she selects and activates the "return mode" by actuating the clutch-button, thus interrupting movements of the 6DOF pointer. Instead of pressing that button again, the user lowers the device to the desktop so that the 2D tracking unit gets into contact with the desktop surface. When the 2D tracking unit senses the presence of a contact surface or some relative movement, the 2D pointing mode will be activated. The 6DOF tracking algorithm may then go into a stand-by state to save computer resources.
  • Operating mode f, the abort mode, forces the reactivation of the 2D pointing mode without prior interruption of the 6DOF pointing mode by means of the clutch-button.
  • A) The position of greylevel gradients within digital images can be determined with sub-pixel precision, whereas blob and marker detection algorithms usually start with thresholding at a given greylevel, which deletes most of the sub-pixel information contained in the original image.
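One common way to exploit this sub-pixel information is to fit a parabola through three neighbouring gradient-magnitude samples around a maximum. This is a standard technique offered here as an illustration, not necessarily the patent's exact method:

```python
def subpixel_peak(g_prev, g_peak, g_next):
    """Refine a gradient-magnitude maximum to sub-pixel precision.

    Fits a parabola through the samples at pixels (i-1, i, i+1) and returns
    the offset of the parabola's vertex from the centre pixel, a value in
    [-0.5, 0.5] when g_peak is a true local maximum.
    """
    denom = g_prev - 2.0 * g_peak + g_next
    if denom == 0.0:
        return 0.0  # flat neighbourhood: no refinement possible
    return 0.5 * (g_prev - g_next) / denom
```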
  • the target body must be highly distinctive from any other image content within the viewing volume and over a large depth range. To avoid self-occlusion, the target body needs to have a convex shape.
  • the surface of an optimal target body is composed of flat faces, coloured in either black or white or in any two types of light that the camera distinguishes. All borders between faces of different colours have to be designed as straight line segments that coincide with edges. Thus, projected faces appear as polygons and edges appear as straight lines in first order gradient images.
  • the preferred target body of the present invention is a cuboctahedron with triangles attributed to the first colour and squares attributed to the second colour.
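For reference, the cuboctahedron's geometry is simple to generate: its 12 vertices are the distinct permutations of (+-r, +-r, 0), and the solid has 8 triangular and 6 square faces. A small sketch (function name is illustrative):

```python
from itertools import permutations, product

def cuboctahedron_vertices(radius=1.0):
    """Return the 12 vertices of a cuboctahedron centred at the origin.

    The vertices are all distinct permutations of (+-r, +-r, 0); each lies
    at distance sqrt(2)*r from the centre. The solid's 8 triangles and
    6 squares carry the two colours described in the patent.
    """
    verts = set()
    for sx, sy in product((-1, 1), repeat=2):
        for p in permutations((sx * radius, sy * radius, 0.0)):
            verts.add(p)
    return sorted(verts)
```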
  • the preferred target body colours are realized as black and white, in order to project face borders onto images at maximum contrast available.
  • one preferred embodiment of the invention consists of a target body with transparent faces and a light source which is supplied with electric power and built into the target body.
  • faces are covered with a reflective substance and a light source is placed in such a way that its light is sent back to the camera's objective.
  • the user moves the device on a flat surface, the desktop.
  • in 6DOF mode the device has to leave the desktop and move within the viewing volume of the camera. Only the smallest possible effort shall be necessary to transit between the two modes.
  • the hand's position on the 2D pointing device should not change in order to have full access to mouse buttons and scroll wheel in 2D as well as in 6DOF mode.
  • the user should be able to move the target body freely in space without touching its faces.
  • the constellation between 2D pointing device, target body and camera should be chosen such that, for almost any hand and arm poses, the target body is neither obstructed by the 2D pointing device nor by any part of the user's body.
  • the device should be symmetric so that right-handed and left-handed users can use it at the same comfort level.
  • the target body is fixed in front of the 2D pointing device by means of a mechanical link, thus allowing a computer user - without changing the hand position on the mouse - to either move the target body in space or to use it as a normal 2D pointing device on the desktop.
  • Translations are limited by the camera's viewing volume whereas rotations around any axis are allowed to more than 90°.
  • the camera is placed such that its optical axis is perpendicular to the screen. Furthermore the camera is placed close to screen and on the same side of the screen as the mouse.
  • This constellation considerably reduces the complexity of calibration between the reference systems of the real space and virtual objects on the screen.
  • the working volume of the target body coincides as closely as possible with both the reachable volume of a user's hand and the natural limits of the hand's orientation angles.
  • the preferred embodiment of the combined 2D/6DOF pointing device consists of a cuboctahedron that is fixed onto a standard 2D desktop computer mouse. Single camera images are processed quasi-continuously in real-time to determine the target body's pose. By means of a clutch button on the mouse body, the device can be operated alternatively as a standard 2D pointing device on top of the desktop or as a 6DOF pointing device within the viewing volume of the camera.
  • a 6DOF pointer appears on the screen and mouse interactivity known from 2D pointing devices becomes available inside the virtual 3D space that is projected by the software on the screen.
  • Fig. 1 depicts the preferred embodiment of the present invention as part of the human- computer-interface of a standard desktop workplace.
  • Fig. 2 depicts two alternative embodiments of the invented device that apply different methods to control lighting conditions.
  • Fig. 3 shows alternative shapes of the target body according to the present invention.
  • Fig. 4 depicts the architecture of hard- and software according to the present invention.
  • Fig. 5 is a flow chart of the algorithm of the present invention.
  • Fig. 6 shows different steps of the algorithm: cutting straight lines out of edge point chains, recombination to polygons, evaluation of the image region enclosed by a polygon.
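The first step named in Fig. 6, cutting straight lines out of edge point chains, can be sketched as a recursive split at the point of maximum deviation (a Ramer-Douglas-Peucker-style scheme; the patent does not name its exact algorithm):

```python
import math

def split_chain(points, tol=1.0):
    """Split a chain of edge points into straight segments.

    If every interior point lies within tol (pixels) of the line joining
    the chain's endpoints, the chain is kept as one segment; otherwise it
    is split at the most distant point and both halves are processed
    recursively.
    """
    if len(points) <= 2:
        return [points]
    (x0, y0), (x1, y1) = points[0], points[-1]
    length = math.hypot(x1 - x0, y1 - y0) or 1e-12

    def dist(p):
        # Perpendicular distance of p to the line through the endpoints.
        return abs((x1 - x0) * (y0 - p[1]) - (x0 - p[0]) * (y1 - y0)) / length

    i, d = max(((i, dist(p)) for i, p in enumerate(points[1:-1], 1)),
               key=lambda t: t[1])
    if d <= tol:
        return [points]
    return split_chain(points[:i + 1], tol) + split_chain(points[i:], tol)
```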
  • Fig. 7 explains the relationship between an arbitrary image triangle and two regular triangles in space that are projected onto the said image triangle.
  • Fig. 8 explains the transition from 2D to 6DOF mode and back to 2D mode.
  • Fig. 9 depicts an embodiment of the 6DOF pointer and its usage in a virtual 3D space and/or the 2D graphical user interface.
  • Fig. 1a shows an embodiment of the present invention as part of a desktop computer workplace.
  • The workplace typically comprises a central unit 101, a screen 102, a keyboard 103, a 2D pointing device (computer mouse) 104, a camera 105 which is connected to the computer, as well as the target body 106, which is held by the user's hand either in 2D mode on the desktop 107 or in 6DOF mode in front of the camera 108.
  • The camera position is chosen such that the target body is visible to the camera while within reach of the hand of a seated user looking at the screen.
  • Fig. 1b shows the preferred embodiment of the present invention including the preferred target body: a cuboctahedron 111 with surface properties chosen such that light is either completely reflected or absorbed thus producing the colours white 112 and black 113 on images taken by a camera.
  • the target body is linked to a standard mouse 114 via a mechanical link 115 to one of the black faces.
  • the standard mouse provides at least a right 116 and a left button 117 as well as a scroll wheel 118.
  • Squares 113 are coloured black, triangles 112 white. Nevertheless, the scope of the present invention includes any combination of colours that cameras are able to distinguish reliably.
  • Fig. 2a shows a preferred embodiment of the target body 211 with white faces made out of translucent material and containing a light source 212 which is supplied with electrical power 213 from batteries or via a cable.
  • The light source augments the contrast in camera images both between black and white faces and between white faces and the remaining content of camera images, referred to as background.
  • Fig. 2b depicts another embodiment of the present invention using directed light sources 221 centred by the camera's viewing direction 222 in combination with highly reflective surfaces 223, in particular infrared light sources and retro-reflective surfaces.
  • the light sources are preferably arranged close to the camera's objective and illuminate the entire viewing volume of the camera 224.
  • the scope of the present invention includes all polyhedral bodies covered with two different colours such that two faces of each colour meet in each edge.
  • Fig. 3 shows examples of such bodies.
  • Fig. 3a is an octahedron 310 covered with black and white colour. Like the cuboctahedron, the octahedron provides the property that exactly four edges 312 meet in each corner 311. Thus, each pair of adjacent faces can form a border line of two distinct colours along their common edge 312.
  • the cuboctahedron 330 shown in Fig. 3c has the advantage that at least four faces and their four common edges are visible under any viewing angle, unless occlusion or out-of-sight situations occur. These four edges project with maximum available contrast onto the image whereas the surrounding edges 331 contrast with an - a priori - unknown background.
  • Fig. 3d shows a Rhombicuboctahedron 340 and Fig. 3e an Icosidodecahedron 350, which also comply with the rule that four edges meet in each corner. However, compared to the cuboctahedron, these polyhedra have shorter edges 341, 351 with respect to the body's outer diameter, which might reduce the distance range covered by the tracking algorithm.
  • Fig. 4 shows the overall information flow for the present invention, including the tracking algorithm, the hard- and software of a computer, its human-computer-interface and the computer user.
  • the user 401 with his/her hand 402 guides the target body 403 inside the viewing volume 404 of a camera 405. Digital images taken by the camera are transferred via communication links 406 to an appropriate hardware component 407.
  • The operating system 408 checks whether a new image is available and provides mechanisms allowing other processes to access the image.
  • the algorithm 409 of the present invention accesses and loads the image into an internal memory providing fast access.
  • The algorithm, also referred to as the "driver software" of the invented device, is realized as a computer programme which can be executed in parallel with other programmes on computer hardware 407 under the control of an operating system 408. The results of the algorithm are made accessible and readable to other programmes and processes.
  • An "application programme" is defined as a type of programme which acts in a way perceptible to the user and may be employed consciously in her/his own interest, e.g. for solving problems or for her/his entertainment.
  • Application programme 410 can access and process the results of algorithm 409 for example to change the pose of a virtual object in a geometric model and to project its graphical representation on the screen 411 according to the pose of the target body.
  • Application programmes could also control machines, e.g. a robot, or execute any other action that the user would be able to perceive or interpret as being physically linked to the movement of her/his hand.
  • Fig. 5 shows in detail the present invention's algorithm for the determination of a target body pose relative to a camera.
  • the first objective of the algorithm is to find out whether or not the camera image contains at least one projection of the target body. If this is the case, the target body pose has to be determined from this projection with the highest possible accuracy.
  • the algorithm achieves both objectives by successive steps, alternating the creation and evaluation of candidates, wherein the nature of candidates becomes more and more complex and specific in each step.
  • Step 503 executes a standard edge detection algorithm, comprising: smoothing of image noise, calculation of the magnitude and direction of image gradients, edge point creation with sub-pixel precision, edge point chaining and thresholding. This algorithm creates line segments containing the visible border lines between faces of the target body as well as other visible lines.
  • In step 504 the algorithm cuts straight line segments out of the edge point chains, thus augmenting the probability of finding the projected target body's edges.
  • Step 505 combines these straight line segments to a list of triangles and quadrangles which are then considered as candidates of projected target body faces.
  • Step 506 evaluates the 2D polygon candidates by comparing the colour distribution inside a polygon to the colour of the corresponding target body face, i.e. triangle or square. This step takes advantage of the coincidence of colour borders and polyhedral edges on target bodies chosen by the present invention.
  • Step 507 calculates target body poses that project the target body onto the camera image in such a way that one face projects exactly onto the said 2D polygon, thus creating a list of target body pose candidates.
  • Step 508 tries to find as many straight lines as possible in the image that fit projections of the target body's other edges.
  • the candidates are evaluated using criteria like the number of fitting lines and the colour distribution within projected faces.
  • Step 509 applies appropriate thresholds to decide whether or not at least one promising target body candidate exists in the image. If this is not the case the algorithm returns to step 501.
  • Step 510 chooses the best target body candidate from the remaining list and executes a least-squares minimization, referred to as "Best-Fit", to improve the pose of the target body by reducing the error between its projected edges and the straight line segments attached to them by step 508.
  • the determined pose 512 is made accessible to other programmes before the algorithm starts a new loop beginning with step 501.
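The loop of steps 501 to 512 can be summarised in a short sketch. The following Python skeleton is illustrative only and not part of the patent; the callables passed in (grab_image, find_polygons, poses_from_polygon, score_pose) are hypothetical stand-ins for the stages described above.

```python
def track_once(grab_image, find_polygons, poses_from_polygon,
               score_pose, threshold):
    """One pass of the Fig. 5 loop: acquire an image, generate candidates
    of increasing specificity, and return the best pose or None."""
    image = grab_image()                            # step 501: image acquisition
    best_pose, best_score = None, threshold
    for polygon in find_polygons(image):            # steps 503-506: edges -> polygons
        for pose in poses_from_polygon(polygon):    # step 507: two poses per polygon
            score = score_pose(image, pose)         # step 508: fit further edges
            if score >= best_score:                 # step 509: thresholding
                best_pose, best_score = pose, score
    return best_pose                                # step 510/512, or None -> step 501
```

A caller would wrap `track_once` in an endless loop, publishing each non-None result before grabbing the next frame.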
  • Fig. 6 details steps 503 to 506 of the algorithm of the present invention.
  • Fig. 6a shows a part of a camera image 611 containing a projected cuboctahedron. Chains of dots 612 drawn onto the image represent the result of the edge chain detection step 503.
  • Fig. 6b illustrates step 504 which cuts straight segments out of the edge point chains 621 for a given linearity threshold 622 and calculates regression lines 623.
  • In Fig. 6c, the regression lines 623 are delimited near the first 631 and last 632 edge point of the corresponding edge point chain, thus delivering line segments 633.
  • the arrow head on each line indicates a line direction.
  • white faces project on the right and black faces on the left side w.r.t. the line direction.
  • Fig. 6b further shows the search for pairs consisting of a starting point 631 and an endpoint 632 belonging to two different line segments 633 whose relative distance is lower than a certain threshold 634.
  • Step 505 of the algorithm finds three pairs of endpoints 641 belonging to the same three line segments and creates the corresponding triangle 642. It will be appreciated by those skilled in the art that the directions of subsequent triangle lines rotate clockwise, which may help to reduce the number of distances to be calculated.
  • The evaluation step 506, Fig. 6e, projects triangle candidates 651 onto the original image and compares the colour distribution inside the triangular region 652 to the given colour of the triangular faces of the target body. The more non-white pixels a triangular region contains, the less probable it is that the region belongs to a target body projection within the actual image.
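The segment-cutting of step 504 can be sketched as follows. The chord-deviation criterion and function names below are assumptions for illustration; the patent specifies only a linearity threshold 622, not the exact test.

```python
import math

def _chord_deviation(points):
    """Maximum perpendicular distance of interior points from the chord
    joining the first and last point of the run."""
    (x0, y0), (x1, y1) = points[0], points[-1]
    dx, dy = x1 - x0, y1 - y0
    length = math.hypot(dx, dy) or 1.0
    return max((abs(dy * (x - x0) - dx * (y - y0)) / length
                for x, y in points[1:-1]), default=0.0)

def split_into_straight_segments(chain, tol):
    """Greedily cut an edge point chain into maximal straight runs,
    returning (start, end) point pairs -- a stand-in for step 504."""
    segments, i = [], 0
    while i < len(chain) - 1:
        j = i + 1
        # extend the run while the chain stays within the linearity threshold
        while j + 1 < len(chain) and _chord_deviation(chain[i:j + 2]) <= tol:
            j += 1
        segments.append((chain[i], chain[j]))
        i = j
    return segments
```

An L-shaped chain, for instance, splits into two segments meeting at the corner point; in the real algorithm a regression line (Fig. 6b, 623) would then be fitted through each run.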
  • Fig. 7 explains step 507 of the algorithm which consists of attaching two 3D triangles to each 2D triangle found in the image.
  • all triangles covering the cuboctahedron are regular triangles.
  • the target body's pose has to equal one out of two candidate poses that are determinable directly from that triangular region.
  • Squares project approximately as parallelograms onto images.
  • the algorithm draws a triangle into each parallelogram found in the image such that the perspective distortion of the triangle with respect to a regular triangle is the same as the perspective distortion of the found parallelogram with respect to a square.
  • the triangles are treated by step 507.
  • the resulting poses of regular triangles in space are then transferred back to squares, thus creating two 3D poses of squares with a given side length for each parallelogram.
  • the algorithm places the 3D reference system 701 into the camera's objective and the 2D image reference system 702 in front of the camera into a plane perpendicular to the camera's z-axis. Centred to the z axis, the image plane contains a rectangle 703 which represents the camera's viewing window into space. Starting at the 3D origin, three beams 704 go through the three corners 705 of a triangle 706 on the image plane. The algorithm following the present invention places two regular triangles 707 and 708 of given side length such that each corner touches a different beam.
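Placing a regular triangle of known side length so that each corner touches one of the three beams amounts to solving three distance equations in the unknown depths along the beams; in general two solutions exist, matching the two candidate poses of step 507. The sketch below solves these equations numerically with Newton's method and is an illustrative reconstruction, not the patent's actual implementation; all names are mine, and the initial depth guesses could, for example, come from the previous frame of a tracking loop.

```python
import math

def _solve3(A, b):
    """Solve a 3x3 linear system by Cramer's rule (A assumed non-singular)."""
    det = lambda m: (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
                     - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
                     + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))
    d = det(A)
    out = []
    for k in range(3):
        M = [row[:] for row in A]
        for i in range(3):
            M[i][k] = b[i]
        out.append(det(M) / d)
    return out

def triangle_depths(corners_2d, side, focal, t0):
    """Find depths t_i along the beams through three image corners such that
    the 3D points t_i * d_i form a regular triangle with the given side.
    corners_2d: image-plane corners (u, v); t0: three initial depth guesses."""
    dirs = []                     # unit beam directions through the corners
    for u, v in corners_2d:
        n = math.sqrt(u * u + v * v + focal * focal)
        dirs.append((u / n, v / n, focal / n))
    dot = lambda a, b: sum(x * y for x, y in zip(a, b))
    c = [dot(dirs[i], dirs[(i + 1) % 3]) for i in range(3)]  # c01, c12, c20
    t = list(t0)
    for _ in range(50):           # Newton iterations on the distance equations
        r, J = [], [[0.0] * 3 for _ in range(3)]
        for k in range(3):
            i, j = k, (k + 1) % 3
            r.append(t[i] ** 2 + t[j] ** 2 - 2 * t[i] * t[j] * c[k] - side ** 2)
            J[k][i] = 2 * t[i] - 2 * t[j] * c[k]
            J[k][j] = 2 * t[j] - 2 * t[i] * c[k]
        if max(abs(x) for x in r) < 1e-12:
            break
        step = _solve3(J, r)
        t = [ti - si for ti, si in zip(t, step)]
    return t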
  • The flow chart of Fig. 8 depicts the operating modes of the invented device and the mode changes that occur on user request. Initially the device works in standard mouse mode 801; 2D tracking is active in combination with mouse buttons and the scroll wheel. In general, whenever the 2D tracking sensor detects a movement, the device goes into 2D tracking mode.
  • The device's 2D tracking mode 803 is inactive. Should the user place the mouse back onto its support 804 so that the device senses some movement, 2D tracking mode reactivates. So far the device behaves like any standard computer mouse.
  • the user may press the clutch-button 805, thus activating the 6DOF tracker 806.
  • The term "press a button" may be realized as a simple push down, a click, i.e. a push down and immediate release, or a double click onto the button.
  • an application programme may produce some feedback to the user responding to the user's hand motion in space. User clicks on the right or left button and the scroll wheel rotations continue to be reported to the application programme, hence providing the same productivity as in 2D tracking mode.
  • The user can stop 6DOF tracking 808 and go into the clutch mode, thus being able to displace the target body without the application programme knowing about it.
  • Another push on the clutch-button 805 stops the clutch mode and reactivates the 6DOF tracking.
  • the user may lower the device on its supporting surface 809 and reutilize it as 2D mouse. If the user lowers the mouse without previously pushing the clutch-button 810, the 2D tracking sensor will detect some movement and the device will go back to 2D pointing mode without explicit user request. In 2D pointing mode, pressing the clutch-button 811 does not change the function mode of the present invention.
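The mode changes of Fig. 8 can be modelled as a small state machine. The sketch below is an illustrative reading of the flow chart; the state and event names are mine, not the patent's.

```python
# Operating modes of Fig. 8: 2D tracking on the desk, lifted with 2D tracking
# inactive, 6DOF tracking, and the 6DOF clutch mode.
MODE_2D, MODE_LIFTED, MODE_6DOF, MODE_CLUTCH = "2D", "LIFTED", "6DOF", "CLUTCH"

# Transition table; events are hypothetical names:
#   "lift"           - the mouse leaves its supporting surface
#   "clutch_press"   - the user presses the additional clutch-button
#   "surface_motion" - the 2D sensor detects movement on the desktop
_TRANSITIONS = {
    (MODE_2D, "lift"): MODE_LIFTED,            # 802/803: 2D tracking inactive
    (MODE_LIFTED, "surface_motion"): MODE_2D,  # 804: placed back on the desk
    (MODE_LIFTED, "clutch_press"): MODE_6DOF,  # 805/806: 6DOF tracker active
    (MODE_6DOF, "clutch_press"): MODE_CLUTCH,  # 808: 6DOF pointer frozen
    (MODE_CLUTCH, "clutch_press"): MODE_6DOF,  # 805: tracking resumes
    (MODE_6DOF, "surface_motion"): MODE_2D,    # 810: implicit return to 2D
    (MODE_CLUTCH, "surface_motion"): MODE_2D,  # 809: lowered onto the desk
}

def next_mode(mode, event):
    """Return the new operating mode; unknown (mode, event) pairs leave the
    mode unchanged, e.g. 811: a clutch press in 2D mode has no effect."""
    return _TRANSITIONS.get((mode, event), mode)
```

Note that any desktop movement sensed by the 2D tracker returns the device to 2D pointing regardless of the current mode, which matches the "without explicit user request" behaviour described above.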
  • Figure 9 illustrates the utility and usage of a 6DOF pointer as defined by the present invention.
  • the user's hand 911 translates the invented device 912 along a path in space 913 within the camera's 914 viewing volume.
  • the 6DOF pointer 916 synchronously follows an identical path 917 to a new pose 918 where the 6DOF pointer partly penetrates a 3D object 919.
  • the shape of the 6DOF pointer is chosen as a simplified teapot. In general, any finite shape is appropriate to represent the 6DOF pointer provided that, for any pair of distinct poses, the projections of that shape onto the screen are distinct too.
  • In Fig. 9b the user pushes the left mouse button 921 with his/her index finger and conducts the pointing device along a spatial trajectory 922.
  • 6DOF pointer 923 and 3D geometry 924 are tightly coupled together so that the 3D geometry follows the movement of the 6DOF pointer 925, which itself follows the movement along the pointing device's trajectory.
  • the screen in Fig. 9c is divided into a 3D zone 931 containing a projected virtual 3D space and, separated by a line 932, a 2D zone 933 presenting non-3D objects.
  • coexisting 3D and 2D screen contents may be of any shape and be separated by any type of boundary line.
  • the 6DOF pointer shape changes into a 2D pointer shape 937. This indicates to the user that the device in the particular zone of the screen behaves like a normal 2D pointing device, except that the device is actually free-flying above the desktop surface, guided by the user's hand.

Abstract

A pointing device for the human-computer-interface which combines a standard 2D pointing device with a six-degrees-of-freedom (6DOF) tracking method based on a target body and a real-time, single-camera image processing algorithm. In 6DOF pointing mode, the user moves the device freely inside the camera's viewing volume while the tracking algorithm quasi-continuously and quasi-instantaneously detects the six spatial coordinates of the device's pose with respect to the camera and applies them to a 6DOF pointer onto the screen. The invented device has separate operating modes: 2D pointing, transition to 6DOF pointing, 6DOF pointing, 6DOF clutch, return to 2D pointing and abort. Standard mouse buttons and scroll wheel are operational in either mode. To change between 2D and 6DOF mode the 2D pointing device's body provides one additional button. The same button also temporarily allows interrupting the link between the 6DOF pointing device held by the user's hand and the 6DOF pointer projected onto the screen.

Description

POINTING DEVICE AND METHOD FOR OPERATING THE POINTING DEVICE
Field of the invention
The present invention relates to input devices for all six spatial degrees of freedom at the human-computer-interface. Furthermore it relates to model-based real-time object tracking methods using single camera image processing techniques.
Background of the invention
3D software is defined as a range of computer software dealing with models of objects in the three-dimensional space, also referred to as "virtual 3D objects". Engineering sciences make use of 3D models for tasks like mechanical design or the programming of machine tools or robots. 3D software has also been developed for chemistry, architecture, entertainment - e.g. computer games, cartoon and movie production - the fashion industry, graphical user interfaces, Internet services and much more. Despite the abundance of 3D software, commonly used input devices at the human-computer-interface - such as mouse, digitizer, joystick or game console - detect only two spatial degrees of freedom.
Pointing devices with six degrees-of-freedom, further referred to as "6DOF", enable new and more intuitive interaction styles between man and computer. However, in order to become widely accepted, the device has to meet additional demands such as augmented productivity at low cost, precision, robustness, low CPU load, general purpose usability, comfort and compatibility with existing computer workplaces. Hence, the future standard input device might consist of a 2D pointing device that is convertible into a 6DOF device but only needs very little extra hardware.
Throughout this text the term "pose" refers to the six coordinates - three translations: x, y and z, and three rotations: roll, pitch and yaw - that describe the position and orientation of a rigid body with respect to a Cartesian coordinate system with its origin lying in a camera's objective.
The term "camera" will be used as a placeholder for any type of device that is able to transform a real-world scene into digital images. A camera projects visible objects within the three-dimensional space onto a two-dimensional image in a deterministic and well understood way. For the purpose of real-time object tracking, the image acquisition frequency of a camera must be sufficiently high, so that movements in an image series appear as continuous to the human eye.
Within this text, the term "6DOF pointing device" denotes an input device that quasi-continuously determines the pose of a physical body that is conducted by the hand of a computer user in order to interact with models of 3D geometry within 3D software applications. The term "mouse" will be used alternatively to "2D pointing device" for a device that tracks its movement w.r.t. a flat surface, usually the desktop.
The present invention relates to the state of the art in the following fields:
• 6DOF input and pointing devices
• Real-time object tracking.
The following patents and publications will be cited for their contributions to the state of the art:
• DE 32 23 896 A1
• DE 36 11 337 A1
• DE 10 2005011432 A1
• EP 0903 684 A1
• WO 2004/114112 A1
• WO 2005/075936 A1
• WO 2007/04431 A2
• US 6 526 166
• US 6 711 293 B1
• US 2004 0002642 A1
• US 2005 201613
[Augmented Chemistry]: M. Fjeld, B. M. Voegtli: Augmented Chemistry: an interactive educational workbench. In: Proceedings, International Symposium on Mixed and Augmented Reality (ISMAR 2002), pages 259-321.
[Dickmanns]: C. Fagerer, D. Dickmanns, E. D. Dickmanns: Visual Grasping with Long Delay Time of a Free Floating Object in Orbit. Autonomous Robots 1, 53-68 (1994), Kluwer Academic Publishers.
[Lepetit&Fua]: Vincent Lepetit and Pascal Fua: Monocular Model-Based 3D Tracking of Rigid Objects: A Survey. Foundations and Trends in Computer Graphics and Vision, Vol. 1, No. 1 (2005), 1-89.
[Lowe]: David G. Lowe: Three-Dimensional Object Recognition from Single Two-Dimensional Images. In: Artificial Intelligence, 31, 3 (March 1987), pp. 355-395.
[MagicMouse]: E. Woods, P. Mason, M. Billinghurst: MagicMouse: an inexpensive 6-degree-of-freedom mouse. In: Proceedings of the 1st International Conference on Computer Graphics and Interactive Techniques in Australasia and South East Asia, Melbourne, Australia, pages 285-286, 2003.
[Motion4u]: www.motion4u.org. Trademarks of the company Motion4u LLC: BurstMouse, OptiBurst, BurstHardware, BurstManager and BurstPlugin for the Maya software package.
[Roberts]: Roberts, L. G.: "Machine perception of three-dimensional solids", in Optical and Electro-Optical Information Processing, J. Tippett, Ed. (Cambridge, Mass.: MIT Press, 1965), 159-197.
[Sphere]: Bradley, D., Roth, G.: Tracking a Sphere with Six Degrees of Freedom. Published as NRC/ERB-1115, October 28, 2004. NRC 47397.
[VacLepFua]: Luca Vacchetti, Vincent Lepetit, Pascal Fua: Combining Edge and Texture Information for Real-Time Accurate 3D Camera Tracking. In: ISMAR 2004, Third IEEE and ACM International Symposium on Mixed and Augmented Reality, pp. 48-56.
State-of-the-art on 6DOF input devices
Today, a few 6DOF input devices are commercially available for standard computers such as the PC. The most widespread type is based on the relative pose measurement between a handle and its base [DE 3611337 A1], where the displacement of the handle is limited mechanically. The driver software interprets these displacements as translational and rotational speeds. If a 3D software user intends to virtually move to a specific location in space, he/she will act on the handle like a pilot inside a flying aircraft controlling the speed of translation and rotation rather than the pose itself. In order to reach a specific location, the pilot continuously anticipates and adjusts the result of the movement over a certain time. With the present invention, the user will feel as if staying on fixed ground like an air traffic controller, and taking the aircraft virtually into his/her hand guiding it along a spatial trajectory to its destination.
Other 6DOF pointing devices have been developed using ultrasound time-of-flight sensors, angle encoders in articulated arms or transducers in electromagnetic fields, each of them having specific advantages such as resolution, haptic feedback or working volume. However, none of these products has become a standard device due to disadvantages like physical size, sensibility to walls or metal, latency, high investment costs, angle and volume restrictions to the devices pose, lack of freedom due to weight, cables or mechanical links and resulting fatigue.
Inertial tracking sensors, gyroscopes and accelerometers as proposed in [DE 32 23 896 A1] still suffer from sensor noise and other limitations preventing their integration into a 6DOF pointing device.
In the field of Virtual Reality Systems, 6DOF pointing devices are based on a stereo setup using infrared cameras in combination with spherical reflectors. Stereo setups, i.e. two cameras with overlapping viewing volumes, see also [DE 10 2005 011 432 A1], need to be calibrated each time a camera might have moved. They demand more CPU time than single camera solutions and reduce the user's working volume due to the two simultaneously active lines-of-sight.
Another reason for the limited success of these devices is lack of general purpose usability. The average day of a computer worker might contain some hours of 6DOF pointing but will always consist of hundreds or thousands of mouse operations such as screen pointing, clicking and scrolling. Thus, an ideal 6DOF pointing device should be just like a standard mouse with a simple 6DOF extension. The extension literally shouldn't exist in the user's consciousness when doing 2D operations, but can be activated instantly when the user needs 6DOF functionality.
A detailed state-of-the-art on 6DOF pointing devices can be found in papers from different authors, e.g. Bill Buxton (Microsoft), Shumin Zhai (IBM), B. Fröhlich and J. Hochstrate (University of Weimar).
State-of-the-art on single camera real-time 6DOF object tracking
Object tracking methods based on a single camera have advantages w.r.t. stereo setups like: use of less hardware, simplified calibration process and a simple, pyramidal working volume that only depends on the camera's viewing angles. On the other hand, to measure the distance between object and camera, further referred to as "depth", the sizes of the real object and its projection must be available in combination with the projection parameters of the camera. Thus, camera objectives with short focal lengths augment the precision of depth measurement. Towards shorter depths visual features become bigger and tend to leave the viewing window. At long distances the recognition of visual features is limited by the image resolution.
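Under the pinhole camera model, the depth relation described above is a simple proportion between the object's real size, its projected size and the focal length. The following one-line sketch (function and variable names are mine, for illustration only) makes this explicit.

```python
def depth_from_projection(real_size, projected_size_px, focal_length_px):
    """Pinhole camera model: an object of known real size S whose projection
    spans s pixels lies at depth z = f * S / s, with f the focal length
    expressed in pixels."""
    return focal_length_px * real_size / projected_size_px
```

For example, a 5 cm target feature spanning 100 pixels under an 800-pixel focal length lies at 40 cm; since the projected size shrinks with distance, the measurable depth range is bounded below by the viewing window and above by the image resolution, as described above.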
A number of apparatuses and methods have been proposed to determine and track an object's pose from images taken by one camera. Yet, 6DOF input devices using such algorithms are far from being popular compared to the number of computers installed worldwide and the abundance of 3D software. [Lepetit&Fua] gives a survey on single camera object tracking techniques.
The following principles can be distinguished to recognize visual patterns in images: edge detection, corner detection, blob detection, marker detection and pattern matching.
As early as 1963, L.G. Roberts realized a computer programme that was able to detect polyhedral objects in photographs [Roberts]. In a first step, his method finds edges in the image. Using angular and proximity relationships he compares combinations of edges with a group of known objects in order to find out, which one is shown and under which viewing angle.
Similarly, David G. Lowe extracts straight lines using edge detection algorithms in images containing polyhedral objects, i.e. rigid bodies delimited by flat surfaces and straight edges [Lowe]. In the following steps, lines are grouped into pairs that meet perceptual conditions like parallelism, collinearity or proximity. Lowe uses higher order grouping, probabilistic methods and a geometric model of the expected object to create pose estimates. Least-squares error minimization and repeated line collecting are balanced such that the probability of finding objects and their precise poses becomes maximal. Lowe also developed SIFT [US 6 711 293 B1], a 3D pattern matching technique which enables 6DOF tracking of objects that the algorithm has learned previously from reference images.
More recent object tracking methods using edge detection techniques can be found in [US 2004/0002642 A1]. The patented method searches for intersection points between four or six edge lines along black and white patches but needs a stereo vision setup.
[VacLepFua] aims at tracking general textured objects with known geometry under natural lighting conditions, whereas the present invention tracks a specific, non-textured object under controlled lighting conditions. [Dickmanns] used a black-and-white cuboctahedron that went into orbit on board a Space Shuttle in 1993 and was the first free-floating object to have been caught semi-autonomously by a robot. Single camera tracking, edge detection and Kalman filtering were used to estimate and predict object poses. Despite its obvious advantages, this principle has never been used again in the field of object tracking.
If spherical reflectors are used in combination with a single camera such as [WO 2004/114112 A1], either the tracking accuracy suffers while spheres hide each other or a bigger number of spheres becomes necessary which makes the pointing device more cumbersome, or more cameras are needed and complicate the tracking setup even more.
To detect a known object in a single camera image, other inventions apply colour segmentation algorithms. This approach is sensitive to uneven or changing lighting conditions and shade, and inevitably imposes restrictions onto the image background which are difficult to uphold in home and office environments.
The patent [US 6 526 166] proposes single camera tracking based on a cube that is covered with three different colours. The use of three colours allows covering the cube such that adjacent surfaces have different colours at each edge. An ideal camera projects those edges as zones of colour contrast along straight line segments. Thus, if three lines can be found in the image, the cube's orientation is determined. If, in addition, at least one line segment's length is measurable, one can calculate the relative pose between camera and cube, provided that the camera projection parameters are known, too. However, under some orientations only two or just one square surface is visible to the camera and the estimated cube's pose lacks precision.
Single camera solutions are applied in industrial environments to measure small pose deviations with respect to a reference pose, see [WO 2005 / 075936 A1]. In a first step, known features of a known object are found in the image. A 6DOF pose can be calculated if at least three features were found.
Marker-based tracking algorithms avoid the disadvantages of colour segmentation by juxtaposing two highly distinctive colours, for convenience black and white. The colours are applied onto plane square surfaces following well defined motifs. The ARToolKit software [MagicMouse] is able to track the pose of quadratic markers in real-time using a single camera, but rotations around an axis lying within the marker plane are limited to angles of less than 90°. If different markers are fixed onto the faces of a cube [AugmentedChemistry], the cube's pose can be tracked from all viewing angles. The tracking algorithm needs to recognise visual features of very different sizes: the surrounding square and the motif contained by the square. This reduces the depth range of the method w.r.t. methods based on edge or blob search.
In the patents [US 2005 201613] and [WO 2007 / 04431 A2], small marker circles are fixed onto a cube or a wand. As with the cube, a grasping hand can easily obstruct these markers.
D. Bradley and G. Roth [Sphere] use small green and orange marker circles that are distributed over a blue sphere. In a first step, the sphere's translational coordinates are calculated by segmenting a blue circle out of the camera image. This step is sensitive to lighting conditions and to blue in the background. Then coloured blobs are detected within the blue circle and compared to the known marker distribution on the sphere, thus allowing all three spatial orientation angles to be determined. Bradley and Roth developed this principle into a 6DOF input device for 3D software.
None of the previously cited patents and publications proposes a device that integrates standard mouse functionality and/or 2D pointing into a 6DOF pointing device.
The patent application [EP 0 903 684 A1] describes a device providing both 2D and 6DOF input modes with automatic mode change from 2D to 6DOF whenever the device leaves its supporting surface. The patent claims the use of electromagnetic tracking hardware for both 2D and 6DOF mode, which necessitates a cable between the device and a computer. Three buttons are provided on the device's body. The text does not provide specific operating modes or new interaction techniques for the device, in particular a pointing or selection technique in 6DOF mode.
Another combined 2D and 6DOF input device is commercialised by the company [Motion4u]. By means of three infrared cameras, three retro-reflective spheres attached to an off-the-shelf computer mouse and image processing software, the mouse body's pose is 6DOF tracked. Automatic mode change from 2D to 3D tracking occurs whenever the device lifts above an adjustable height from the supporting surface. The provided interaction technique with 3D software requires object selection to be done in 2D mode, whereas object positioning is done in 3D mode. Since it does not allow selecting 3D objects in 6DOF tracking mode, the provided device does not meet the criterion for a 6DOF pointing device.
Task
The task of the present invention consists of providing the most compact possible, general purpose pointing device, having two distinctive pointing modes, 2D and 6DOF, and a method to easily and intuitively change between these modes. The 6DOF pointing mode has to be defined in analogy to the 2D pointing mode of a standard computer mouse, which comprises a left and right mouse button, a scroll wheel and a pointer displayed on the computer screen. The invention aims to make fast, robust and precise 6DOF pointing available to standard computers by adding only a minimum of extra hardware components to the standard computer mouse and further aims to render the 6DOF pointing mode as easy and productive as is already the 2D pointing mode on standard computers.
Summary of the invention
The "ideal" 6DOF pointing device would consist of just one lightweight body that senses its own pose changes accurately, fast enough and without any external hardware, as [DE3223896A1] proposes in claim 2. Such a device is not yet available.
While the "ideal" solution remains unachieved, the "optimal" solution is according to the invention defined as a solution that necessitates the smallest possible number of hardware components while offering the best usability for the desktop computer user. The present invention provides such an "optimal" solution for a combined 2D/6DOF pointing device and defines its particular operating modes.
The present invention provides a pointing device for the human-computer-interface which combines a standard 2D pointing device with a six-degrees-of-freedom (6DOF) tracking method based on a target body and a real-time, single-camera image processing algorithm. In 6DOF pointing mode, the user moves the device freely inside the camera's viewing volume while the tracking algorithm quasi-continuously and quasi-instantaneously detects the six spatial coordinates of the device's pose with respect to the camera and applies them to a 6DOF pointer on the screen.
The invented device has separate operating modes: 2D pointing, transition to 6DOF pointing, 6DOF pointing, 6DOF clutch, return to 2D pointing and abort. The standard mouse buttons and scroll wheel are operational in either mode. To change between 2D and 6DOF mode, the 2D pointing device's body provides one additional button. The same button also allows temporarily interrupting the link between the 6DOF pointing device held by the user's hand and the 6DOF pointer projected onto the screen.
The present invention aims to make 6DOF pointing available to any desktop computing workplace, by extending the common computer mouse. Target body, algorithm, operating method and the overall setup of the device are designed to achieve the maximum possible compactness, usability and productivity in particular for interaction with 3D application software.
Device
The device of the present invention consists of a standard 2D pointing device in combination with a single target body that is fixed onto the device, furthermore a single camera and an adequate 6DOF tracking algorithm. The algorithm determines the relative pose between target body and camera by processing images from that camera.
Target body and algorithm are particularly appropriate to cope with cluttered backgrounds, poor image quality, low cost cameras and / or inhomogeneous lighting conditions. Target body and method are designed to track the spatial input of a human hand by using standard PC hard- and software with only minor changes to the user's desktop.
The proposed geometric configuration between target body, mouse, user, screen and camera gives a maximum of freedom to user movements, be it in 2D or in 6DOF pointing mode.
The tracking modes, 2D and 6DOF, are accompanied by a set of operating modes, defined in the present invention, which include the use of the standard mouse buttons and scroll wheel as well as a specific "clutch"-button, used to transition from 2D mode to 6DOF mode and vice versa.
A preferred embodiment of the invention relates to a combined 2D and 6DOF Pointing Device consisting of:
1) a 2D pointing device having at least a right button, a left button and a scroll wheel,
2) an additional "clutch"-button on the 2D pointing device's body,
3) a target body fixed onto the 2D pointing device,
4) a camera and
5) an algorithm capable of tracking the relative 6DOF pose of the target body from images taken by the camera, providing operating modes:
a) 2D pointing mode: the device slides on the desktop surface. Right and left button and scroll wheel are functional. A 2D pointer on the screen represents the current 2D position and movement of the device,
b) Transition from 2D to 6DOF mode: the device is lifted from the desktop to a starting pose, followed by a click on the clutch-button,
c) 6DOF pointing mode: the device moves freely in space with fully functional right and left button and scroll wheel. A 6DOF pointer on the screen represents the current spatial pose and movement of the device,
d) 6DOF clutch mode: press the clutch-button to interrupt the 6DOF pointer movement, move the device to the next starting pose, press again or release the clutch-button to reactivate 6DOF pointing mode,
e) Return from 6DOF to 2D mode: click the clutch-button and lower the device onto the desktop,
f) Abort 6DOF mode: whenever the 2D pointing device detects movement relative to the desktop, change automatically from 6DOF to 2D pointing mode.
Preferably, the pointing device is a wireless pointing device.
The target body is fixed onto the 2D pointing device by means of a rigid mechanical link made of a lightweight material and small in size compared to the target body and the 2D pointing device. Preferably, the mechanical link is realised as a thin, bar-shaped extension of the housing of the 2D device. The clutch-button consists of an electrical switch or any other sensor with two distinct states which is able to detect an intentional movement of a member of the user's hand with respect to the 2D pointing device. The state of the clutch-button is transmitted to the connected computer in coded form via the same communication link employed for transmitting the states of the mouse buttons, the scroll wheel and the 2D tracking unit, i.e. a cable or a wireless communication link. Driver software on the computer is in charge of decoding the clutch-button's state from the incoming data flow using methods well known to those skilled in the art of peripheral hardware.
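As an illustration of such driver-side decoding, the sketch below parses a hypothetical 4-byte report in which one spare bit of the button byte carries the clutch-button state. The report layout is an assumption for illustration only; it is not a format defined by the invention.

```python
def decode_report(report: bytes) -> dict:
    """Decode a hypothetical 4-byte mouse report: [buttons, dx, dy, wheel].

    Assumed bit layout of the button byte (illustrative, device-specific in
    practice): bit 0 = left button, bit 1 = right button, bit 2 = clutch.
    dx, dy and wheel are signed 8-bit relative movements.
    """
    buttons = report[0]
    return {
        "left":   bool(buttons & 0x01),
        "right":  bool(buttons & 0x02),
        "clutch": bool(buttons & 0x04),
        "dx": int.from_bytes(report[1:2], "big", signed=True),
        "dy": int.from_bytes(report[2:3], "big", signed=True),
        "wheel": int.from_bytes(report[3:4], "big", signed=True),
    }
```

In a real driver, the exact report descriptor would be negotiated with the device; the point here is only that the clutch state travels over the same link as the ordinary button states.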
Preferably, the clutch-button is realized as a switch sensitive to moderate pressure exerted by one of the user's fingers that is in contact with the invented device and in charge of keeping the device linked to the hand while the hand guides it over the desktop or in free space. The clutch-button is preferably placed in a zone where the thumb or ring finger naturally touches the mouse body. After a brief learning period, the user will press the clutch-button in a natural way by applying a stronger grip to the mouse body.
Operating method:
The invention furthermore relates to a method for operating a pointing device having at least a right button, a left button and a scroll wheel, an additional clutch-button and a target body fixed onto the 2D pointing device, characterized in that means enable a selection and/or activation of the operating modes:
a) 2D pointing mode: the device slides on the desktop surface. Right and left button and scroll wheel are functional. A 2D pointer on the screen represents the current 2D position and movement of the device,
b) Transition from 2D to 6DOF mode: the device is lifted from the desktop to a starting pose, followed by a click on the clutch-button,
c) 6DOF pointing mode: the device moves freely in space with fully functional right and left button and scroll wheel. A 6DOF pointer on the screen represents the current spatial pose and movement of the device,
d) 6DOF clutch mode: press the clutch-button to interrupt the 6DOF pointer movement, move the device to the next starting pose, press again or release the clutch-button to reactivate 6DOF pointing mode,
e) Return from 6DOF to 2D mode: click the clutch-button and lower the device onto the desktop,
f) Abort 6DOF mode: whenever the 2D pointing device detects movement relative to the desktop, change automatically from 6DOF to 2D pointing mode.
The invented device provides the user with operation modes - 2D pointing, transition to 6DOF, 6DOF pointing, 6DOF clutch, return to 2D and abort mode - and with means to select and activate these modes. At any time, for maximum productivity of the device, at least one right and one left mouse button and a scroll wheel are available to the user's input.
Operating mode a), the 2D pointing mode, is defined exactly like the well-known functionality of a standard computer mouse. The device is placed on top of and in contact with a supporting surface, usually the desktop, and a 2D tracking unit within the device detects movements between device and desktop surface. A graphical symbol, referred to as "2D pointer", usually a small arrow, is displayed on the computer screen and moves according to the detected movements, thus representing the device's position.
The user selects and activates the 2D pointing mode by bringing the device into contact with the desktop surface. In addition to the 2D pointing mode being activated, the user may actuate a left, a right mouse button and/or a scroll wheel provided with the device. Software running on the computer, in particular the operating system, the device driving software and application software having a graphical user interface, respond to these actions in a deterministic manner. In particular the response depends on the pointer's position and movement on the screen when the user's action occurs.
If a computer mouse hits an obstacle or the user feels uncomfortable while moving the pointer to a desired position, he/she usually lifts the mouse from its support, moves it to a more appropriate position and puts it down to continue the movement. For its analogy to the clutch of a car, this action, which interrupts 2D tracking of the device, is also referred to as "2D clutch".
Operating mode b), referred to as "transition mode", allows changing from 2D pointing mode to 6DOF pointing mode. The user selects and activates the transition mode by a sequence of actions. Identically to the 2D clutch, the user begins with lifting the device from the desktop. Instead of putting it down, he/she then moves the device on to a desired pose in space and eventually actuates the clutch-button by which the 6DOF pointing mode is activated.
Operating mode c), referred to as "6DOF pointing mode", is defined in perfect analogy to the 2D pointing mode, in that
■ the device's 6DOF pose is being tracked by specific means provided by the invention,
■ a geometric symbol, the 6DOF pointer, may replace the 2D pointer in representing the current device's pose on the screen,
■ the 6DOF pointer quasi-synchronously follows the spatial movement of the device,
■ the user may actuate the left button, the right button and the scroll wheel simultaneously to any spatial movement,
■ the operating system, the device driving software and currently running application software with graphical user interfaces may respond to the user's spatial movements and button states.
In particular, the responses depend on the actual pose and spatial movement of the 6DOF pointer when the user's action occurs.
Given the following scenario: a 3D application software projects a number of geometrical objects onto the screen that were created within a virtual 3D space. The 6DOF pointing mode is active. The user holds and conducts the invented device freely in space. The application software is continuously informed about the device's pose. The application software could use this information to move the 6DOF pointer within the virtual space according to the user's movement and to project the 6DOF pointer onto the screen at its actual pose.
If the user succeeds in moving the 6DOF pointer towards one of the existing geometrical objects, continues until the 6DOF pointer partially penetrates that object and then pushes the left mouse button (for a right-handed user), the application software could interpret this action, in perfect analogy to the 2D pointing device, as a voluntary act of selection of that object by the user.
If the user subsequently, without releasing the left mouse button, moves the device in space and consequently the 6DOF pointer in the virtual space, the application software could interpret this action as the user's intention to drag that object in space and to release it at the pose where the left mouse button is eventually released, and react accordingly.
Otherwise, if the 6DOF pointer doesn't penetrate any of the displayed geometrical objects and the user pushes the left mouse button while moving the device in space, application software could interpret this action as the user's intention to change the projection parameters of the virtual space onto the screen: viewpoint, zoom etc.
Similarly inspired by the classical 2D pointing device, any combination of the 6DOF pointer with other mouse buttons, the scroll wheel or any other user actions could be interpreted as a specific intention of the user that calls for a specific response of 3D application software.
Besides the projection of a virtual space created by 3D application software, a computer screen shows many two-dimensional elements like buttons, menu bars, windows containing 2D application user interfaces, icons on the virtual desktop etc. If the 6DOF pointer leaves the screen area containing a projected virtual 3D space, it changes back into the 2D pointer's shape and behaviour although the user still moves the 6DOF pointing device freely in space. Thus, the user may interact with the computer over the entire screen without losing any productivity of the 2D pointing device. Whenever the 2D pointer slides again over the projected 3D space, it transforms back into a 6DOF pointer with its shape and behaviour.
Operating mode d), referred to as "6DOF clutch mode", occurs, by analogy to 2D pointing devices, if the invented device hits an obstacle or the user feels uncomfortable while moving the 6DOF pointer to a desired position. A 6DOF pointing device has no physical contact with its reference - the entire 3D space - and thus cannot lose that contact. To allow a user to stop the 6DOF pointer and displace the device in space, one additional button, referred to as "clutch-button", needs to be added to the pointing device.
The user selects and activates the 6DOF clutch mode by actuating the clutch-button. Now he/she can displace the device into a more convenient pose without losing the achieved pose of the 6DOF pointer on the screen. To deactivate the 6DOF clutch mode, the user actuates the clutch-button a second time and the 6DOF pointer will immediately restart following the spatial movement of the device. The action of "actuating the button" may be realized as a simple push down, a click - i.e. a push down and immediate release - or a double click on the button.
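The three actuation variants could, for instance, be distinguished by driver software from the timing of press and release events. The sketch below is a minimal illustration; the event representation and the time thresholds are assumptions, not values specified by the invention.

```python
def classify_actuation(events, hold_threshold=0.3, double_click_gap=0.4):
    """Classify a clutch-button event sequence.

    `events` is a list of (timestamp_seconds, is_pressed) tuples, ordered in
    time.  Threshold values are illustrative assumptions.
    """
    presses = [t for t, down in events if down]
    releases = [t for t, down in events if not down]
    # two presses in quick succession -> double click
    if len(presses) >= 2 and presses[1] - presses[0] <= double_click_gap:
        return "double-click"
    if presses and not releases:
        return "push-down"          # button is still held
    if presses and releases and releases[0] - presses[0] <= hold_threshold:
        return "click"              # short push down and immediate release
    return "hold-release"
```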
Operating mode e), referred to as "return mode", allows the user to exit 6DOF pointing mode in a well-controlled manner. He/she selects and activates the "return mode" by actuating the clutch-button, thus interrupting movements of the 6DOF pointer. Instead of pushing that button again, the user lowers the device to the desktop such that the 2D tracking unit gets into contact with the desktop surface. When the 2D tracking unit senses the presence of a contact surface or some relative movement, the 2D pointing mode is activated. The 6DOF tracking algorithm may then go into a stand-by state to save computer resources.
Operating mode f), referred to as "abort mode", forces the reactivation of 2D pointing mode without prior interruption of 6DOF pointing mode by means of the clutch-button. To select the abort mode, the user just lowers the device onto the desktop surface until the 2D tracking unit senses its presence or relative movement.
Choice of the target body
A) The position of greylevel gradients within digital images can be determined with sub-pixel precision, whereas blob and marker detection algorithms usually start with thresholding at a given greylevel, which deletes most of the sub-pixel information contained in the original image.
B) Given an ideal camera, visible straight lines in the real world transform into straight lines on the image plane. With imperfect cameras, image lines are distorted into curves but this effect can be eliminated in a straightforward way.
C) The target body must be highly distinctive from any other image content within the viewing volume and over a large depth range. To avoid self-occlusion, the target body needs to have a convex shape.
As a result of A, B and C, the surface of an optimal target body is composed of flat faces, coloured in either black or white or in any two types of light that the camera distinguishes. All borders between faces of different colours have to be designed as straight line segments that coincide with edges. Thus, projected faces appear as polygons and edges appear as straight lines in first order gradient images.
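The distortion elimination mentioned under criterion B can be sketched, for a single-coefficient radial model, as a fixed-point inversion of the distortion function. Both the model and the coefficient value are illustrative assumptions, not part of the invention.

```python
def undistort_point(xd, yd, k1, iterations=20):
    """Invert the radial distortion model x_d = x_u * (1 + k1 * r_u^2)
    by fixed-point iteration (single-coefficient model, for illustration).

    (xd, yd) are distorted normalized image coordinates; the returned
    (xu, yu) are the undistorted coordinates, so that straight world lines
    again project onto straight image lines.
    """
    xu, yu = xd, yd
    for _ in range(iterations):
        r2 = xu * xu + yu * yu
        factor = 1.0 + k1 * r2
        xu, yu = xd / factor, yd / factor
    return xu, yu
```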
The preferred target body of the present invention is a cuboctahedron with triangles attributed to the first colour and squares attributed to the second colour. For use with standard colour cameras, the preferred target body colours are realized as black and white, in order to project face borders onto images at maximum contrast available.
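For illustration, the twelve vertices of a cuboctahedron are the permutations of (±1, ±1, 0), and its 24 edges connect vertex pairs at distance √2. A short sketch generating and checking this geometry (coordinates and scale are conventional, not prescribed by the invention):

```python
from itertools import product

def cuboctahedron_vertices():
    """All permutations of (±1, ±1, 0): the 12 vertices of a cuboctahedron."""
    verts = set()
    for sx, sy in product((-1, 1), repeat=2):
        verts.update({(sx, sy, 0), (sx, 0, sy), (0, sx, sy)})
    return sorted(verts)

def edges(verts):
    """Vertex pairs separated by the edge length sqrt(2) (squared: 2)."""
    return [(a, b) for i, a in enumerate(verts) for b in verts[i + 1:]
            if sum((p - q) ** 2 for p, q in zip(a, b)) == 2]
```

The degree-4 property of every vertex (exactly four edges meet in each corner) is what lets two faces of each colour meet along every edge, as exploited by the target-body colouring.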
In order to augment and to homogenize the intensity of light received by the camera and to reduce colour alterations caused by external light sources, one preferred embodiment of the invention consists of a target body with transparent faces and a light source which is supplied with electric power and built into the target body.
In another preferred embodiment of the invention, faces are covered with a reflective substance and a light source is placed in such a way that its light is sent back to the camera's objective.
Choice of the constellation
In 2D mode, the user moves the device on a flat surface, the desktop. In 6DOF mode, the device has to leave the desktop and move within the viewing volume of the camera. Only the smallest possible effort shall be necessary to transition between the two modes. In particular, the hand's position on the 2D pointing device should not change, in order to have full access to mouse buttons and scroll wheel in 2D as well as in 6DOF mode. However, it is indispensable that the user indicates his/her intention to change the mode.
The user should be able to move the target body freely in space without touching its faces. The constellation between 2D pointing device, target body and camera should be chosen such that, for almost any hand and arm poses, the target body is neither obstructed by the 2D pointing device nor by any part of the user's body. The device should be symmetric so that right-handed and left-handed users can use it at the same comfort level.
On the preferred embodiment of the invention, the target body is fixed in front of the 2D pointing device by means of a mechanical link, thus allowing a computer user - without changing the hand position on the mouse - to either move the target body in space or to use it as a normal 2D pointing device on the desktop. Translations are limited by the camera's viewing volume whereas rotations around any axis are allowed to more than 90°.
Within the preferred constellation, the camera is placed such that its optical axis is perpendicular to the screen. Furthermore, the camera is placed close to the screen and on the same side of the screen as the mouse. This constellation considerably reduces the complexity of calibration between the reference systems of the real space and virtual objects on the screen. Furthermore, within this constellation, the working volume of the target body coincides as closely as possible with both the reachable volume of a user's hand and the natural limits of a hand's orientation angles.
Résumé
The preferred embodiment of the combined 2D/6DOF pointing device consists of a cuboctahedron that is fixed onto a standard 2D desktop computer mouse. Single camera images are processed quasi-continuously in real-time to determine the target body's pose. By means of a clutch button on the mouse body, the device can be operated alternatively as a standard 2D pointing device on top of the desktop or as a 6DOF pointing device within the viewing volume of the camera. In combination with 3D application software, a 6DOF pointer appears on the screen and mouse interactivity known from 2D pointing devices becomes available inside the virtual 3D space that is projected by the software on the screen.
Brief description of the drawings
Fig. 1 depicts the preferred embodiment of the present invention as part of the human- computer-interface of a standard desktop workplace. Fig. 2 depicts two alternative embodiments of the invented device that apply different methods to control lighting conditions.
Fig. 3 shows alternative shapes of the target body according to the present invention. Fig. 4 depicts the architecture of hard- and software according to the present invention. Fig. 5 is a flow chart of the algorithm of the present invention. Fig. 6 shows different steps of the algorithm: cutting straight lines out of edge point chains, recombination to polygons, evaluation of the image region enclosed by a polygon.
Fig. 7 explains the relationship between an arbitrary image triangle and two regular triangles in space that are projected onto the said image triangle.
Fig. 8 explains the transition from 2D to 6DOF mode and back to 2D mode.
Fig. 9 depicts an embodiment of the 6DOF pointer and its usage in a virtual 3D space and/or the 2D graphical user interface.
Detailed description of the invention
Fig. 1a shows an embodiment of the present invention as part of a desktop computer workplace. The workplace typically comprises a central unit 101, a screen 102, a keyboard 103, a 2D pointing device (computer mouse) 104, a camera 105 which is connected to the computer, as well as the target body 106 which is held by the user's hand either in 2D mode on the desktop 107 or in 6DOF mode in front of the camera 108. The camera position is chosen such that the target body is visible to the camera within reach of the user's hand for a sitting person looking at the screen.
Fig. 1b shows the preferred embodiment of the present invention including the preferred target body: a cuboctahedron 111 with surface properties chosen such that light is either completely reflected or absorbed thus producing the colours white 112 and black 113 on images taken by a camera. The target body is linked to a standard mouse 114 via a mechanical link 115 to one of the black faces. The standard mouse provides at least a right 116 and a left button 117 as well as a scroll wheel 118.
In the depicted embodiment, squares 113 are attributed to the colour black, triangles 112 to white. Nevertheless, the scope of the present invention includes any combination of colours that cameras are able to distinguish reliably.
Fig. 2a shows a preferred embodiment of the target body 211 with white faces made out of translucent material and containing a light source 212 which is supplied with electrical power 213 from batteries or via a cable. The light source augments the contrast in camera images both between black and white faces and between white faces and the remaining content of camera images, referred to as background.
Fig. 2b depicts another embodiment of the present invention using directed light sources 221 centred by the camera's viewing direction 222 in combination with highly reflective surfaces 223, in particular infrared light sources and retro-reflective surfaces. The light sources are preferably arranged close to the camera's objective and illuminate the entire viewing volume of the camera 224.
The scope of the present invention includes all polyhedral bodies covered with two different colours such that two faces of each colour meet in each edge. Fig. 3 shows examples of such bodies.
Fig. 3a is an octahedron 310 covered with black and white colour. Like the cuboctahedron, the octahedron provides the property that exactly four edges 312 meet in each corner 311. Thus, each pair of adjacent faces can form a border line of two distinct colours along their common edge 312.
On a cube as shown in Fig. 3b, only three edges meet in each corner. In order to cover the cube completely with only two colours and to make each edge a colour border, front and rear surface were covered with a pattern of four triangles.
Compared to the octahedron, the cuboctahedron 330 shown in Fig. 3c has the advantage that at least four faces and their four common edges are visible under any viewing angle, unless occlusion or out-of-sight situations occur. These four edges project with maximum available contrast onto the image, whereas the surrounding edges 331 contrast with an - a priori - unknown background.
Fig. 3d shows a Rhombicuboctahedron 340 and Fig. 3e an Icosidodecahedron 350, which also comply with the rule that four edges meet in each corner. However, compared to the cuboctahedron, these polyhedra have shorter edges 341, 351 with respect to the body's outer diameter, which might reduce the distance range covered by the tracking algorithm.
Fig. 4 shows the overall information flow for the present invention, including the tracking algorithm, the hard- and software of a computer, its human-computer-interface and the computer user. The user 401 with his/her hand 402 guides the target body 403 inside the viewing volume 404 of a camera 405. Digital images taken by the camera are transferred via communication links 406 to an appropriate hardware component 407. The operating system 408 checks whether a new image is available and provides mechanisms allowing other processes to access the image.
The algorithm 409 of the present invention accesses and loads the image into an internal memory providing fast access. The algorithm, also referred to as "driver software" of the invented device, is realized as a computer programme which can be executed in parallel with other programmes on computer hardware 407 under the control of an operating system 408. The results of the algorithm are made accessible and readable to other programmes and processes.
For one of ordinary computer skills, an "application programme" is defined as a type of programme which acts in a way perceptible to the user and may be employed consciously in her/his own interest, e.g. for solving problems or for her/his entertainment. Application programme 410 can access and process the results of algorithm 409, for example to change the pose of a virtual object in a geometric model and to project its graphical representation on the screen 411 according to the pose of the target body. Application programmes could also control machines, e.g. a robot, or execute any other action that the user would be able to perceive or interpret as being physically linked to the movement of her/his hand.
Fig. 5 shows in detail the present invention's algorithm for the determination of a target body pose relative to a camera.
The first objective of the algorithm is to find out whether or not the camera image contains at least one projection of the target body. If this is the case, the target body pose has to be determined from this projection with the highest possible accuracy. The algorithm achieves both objectives by successive steps, alternating the creation and evaluation of candidates, wherein the nature of candidates becomes more and more complex and specific in each step.
When a new camera image 500 arrives, the algorithm accesses the image in step 501 and transforms it in step 502 into a greylevel image in such a way that the first colour of the target body transforms into the minimal grey value and the second colour transforms into the maximum grey value. Step 503 executes a standard edge detection algorithm, comprising: smoothing of image noise, calculation of the magnitude and direction of image gradients, edge point creation with sub-pixel precision, edge point chaining and thresholding. This algorithm creates line segments containing the visible border lines between faces of the target body as well as other visible lines.
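A minimal sketch of the gradient part of step 503, using plain NumPy central differences; smoothing, sub-pixel edge-point creation and chaining are omitted for brevity, and the threshold value is an illustrative assumption:

```python
import numpy as np

def gradient_edges(grey, threshold=50.0):
    """Compute gradient magnitude and direction of a greylevel image and
    threshold the magnitude to obtain an edge mask (sketch of step 503)."""
    gy, gx = np.gradient(grey.astype(float))   # derivatives along rows, cols
    magnitude = np.hypot(gx, gy)
    direction = np.arctan2(gy, gx)
    edge_mask = magnitude > threshold
    return magnitude, direction, edge_mask
```

On the preferred black-and-white target body, the strongest responses of such a filter lie exactly on the face borders, which is why the algorithm favours greylevel gradients over blob detection.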
In step 504, the algorithm cuts straight line segments out of the edge point chains, thus augmenting the probability of finding the projected target body's edges.
Step 505 combines these straight line segments to a list of triangles and quadrangles which are then considered as candidates of projected target body faces.
Step 506 evaluates the 2D polygon candidates by comparing the colour distribution inside a polygon to the colour of the corresponding target body face, i.e. triangle or square. This step takes advantage of the coincidence of colour borders and polyhedral edges on target bodies chosen by the present invention.
For each 2D polygon candidate having passed the previous evaluation, step 507 calculates target body poses that project the target body onto the camera image in such a way that one face projects exactly onto that said 2D polygon, thus creating a list of target body pose candidates.
For each target body pose candidate, step 508 tries to find as many straight lines as possible in the image that fit to projections of other target body's edges. The candidates are evaluated using criteria like the number of fitting lines and the colour distribution within projected faces.
Step 509 applies appropriate thresholds to decide whether or not at least one promising target body candidate exists in the image. If this is not the case the algorithm returns to step 501.
Otherwise, step 510 chooses the best target body candidate from the remaining list and executes a least-squares minimization, referred to as "Best-Fit", to improve the pose of the target body by reducing the error between its projected edges and the straight line segments attached to them by step 508. In step 511, the determined pose 512 is made accessible to other programmes before the algorithm starts a new loop beginning with step 501.
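The Best-Fit idea of step 510 can be illustrated in a reduced, linearized form: for fixed line normals, the translation minimizing the squared point-to-line distances follows directly from the normal equations. The 2D, translation-only reduction below is a simplification for illustration; the actual minimization runs over all six pose coordinates.

```python
import numpy as np

def best_fit_translation(points, normals, offsets):
    """Find the in-plane translation t minimizing
    sum_i (n_i . (p_i + t) - d_i)^2, where each image line is given in
    Hessian normal form by a unit normal n_i and offset d_i, and p_i is the
    projected model point currently attached to that line (toy Best-Fit)."""
    N = np.asarray(normals, float)
    P = np.asarray(points, float)
    d = np.asarray(offsets, float)
    r = (N * P).sum(axis=1) - d              # current signed distances
    t, *_ = np.linalg.lstsq(N, -r, rcond=None)
    return t
```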
Fig. 6 details steps 503 to 506 of the algorithm of the present invention.
Fig. 6a shows a part of a camera image 611 containing a projected cuboctahedron. Chains of dots 612 drawn onto the image represent the result of the edge chain detection step 503.
Fig. 6b illustrates step 504 which cuts straight segments out of the edge point chains 621 for a given linearity threshold 622 and calculates regression lines 623.
In Fig. 6c the regression lines 623 are delimited near the first 631 and last 632 edge point of the corresponding edge point chain, thus delivering line segments 633. The arrow head on each line indicates a line direction. By convention, white faces project on the right and black faces on the left side w.r.t. the line direction. Fig. 6c further shows the search for pairs consisting of a starting point 631 and an endpoint 632 belonging to two different line segments 633 whose relative distances are lower than a certain threshold 634.
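Step 504 can be sketched as a recursive split of an edge-point chain wherever the maximum point deviation from the chord exceeds the linearity threshold. The split criterion below is a common simplification; the per-segment regression-line fit of Fig. 6b is omitted.

```python
import numpy as np

def cut_straight_segments(chain, linearity_threshold=1.0):
    """Split an edge-point chain into straight sub-chains (sketch of 504).

    A chain is accepted as straight when no point deviates from the chord
    between first and last point by more than `linearity_threshold` pixels;
    otherwise it is split at the point of maximum deviation.
    """
    chain = np.asarray(chain, float)
    if len(chain) < 3:
        return [chain]
    a, b = chain[0], chain[-1]
    ab = b - a
    norm = np.linalg.norm(ab)
    if norm == 0:
        return [chain]
    # perpendicular distance of every chain point from the chord a-b
    dist = np.abs(ab[0] * (chain[:, 1] - a[1])
                  - ab[1] * (chain[:, 0] - a[0])) / norm
    k = int(np.argmax(dist))
    if dist[k] <= linearity_threshold:
        return [chain]
    return (cut_straight_segments(chain[:k + 1], linearity_threshold)
            + cut_straight_segments(chain[k:], linearity_threshold))
```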
Shown in Fig. 6d, step 505 of the algorithm finds three pairs of endpoints 641 belonging to the same three line segments, and creates the corresponding triangle 642. It will be appreciated by those skilled in the art that the directions of subsequent triangle lines rotate clockwise, which may help to reduce the number of distances to be calculated.
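The endpoint pairing of step 505 can be sketched as chaining directed segments into closed triangles whenever the endpoint of one segment lies close to the start point of the next. The clockwise-rotation pruning mentioned above is not implemented in this sketch, and the gap threshold is an illustrative assumption.

```python
import math

def find_triangles(segments, gap_threshold=5.0):
    """Chain directed line segments, given as ((x0, y0), (x1, y1)) pairs,
    into closed triangles: the end of each segment must lie within
    `gap_threshold` of the start of the next (sketch of step 505)."""
    def close(p, q):
        return math.dist(p, q) <= gap_threshold

    triangles = []
    n = len(segments)
    for i in range(n):
        for j in range(n):
            for k in range(n):
                if len({i, j, k}) == 3 and i < j and i < k:
                    (a0, a1) = segments[i]
                    (b0, b1) = segments[j]
                    (c0, c1) = segments[k]
                    if close(a1, b0) and close(b1, c0) and close(c1, a0):
                        triangles.append((i, j, k))
    return triangles
```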
The evaluation step 506, Fig. 6e, projects triangle candidates 651 onto the original image and compares the colour distribution inside the triangular region 652 to the given colour of triangular faces of the target body. The more non-white pixels a triangular region contains, the less probably it belongs to a target body projection within the actual image.
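The evaluation of step 506 can be sketched as a point-in-triangle test over the image grid followed by counting bright pixels; the brightness threshold of 128 is an illustrative assumption.

```python
import numpy as np

def white_fraction(grey, triangle):
    """Fraction of bright ('white') pixels inside a triangle candidate
    (sketch of step 506).  A low fraction makes the candidate unlikely to
    be a white target-body face."""
    h, w = grey.shape
    ys, xs = np.mgrid[0:h, 0:w]
    (x0, y0), (x1, y1), (x2, y2) = triangle

    # signed area for the half-plane test against one triangle side
    def side(xa, ya, xb, yb):
        return (xb - xa) * (ys - ya) - (yb - ya) * (xs - xa)

    s0 = side(x0, y0, x1, y1)
    s1 = side(x1, y1, x2, y2)
    s2 = side(x2, y2, x0, y0)
    inside = (((s0 >= 0) & (s1 >= 0) & (s2 >= 0))
              | ((s0 <= 0) & (s1 <= 0) & (s2 <= 0)))
    if not inside.any():
        return 0.0
    return float((grey[inside] > 128).mean())
```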
For target body faces with square shape, the algorithm works similarly, except that subsequent square lines rotate counter-clockwise and that squares on the preferred target body are coloured black.
Fig. 7 explains step 507 of the algorithm which consists of attaching two 3D triangles to each 2D triangle found in the image. Those skilled in the art will appreciate that all triangles covering the cuboctahedron are regular triangles. By taking advantage of the inherent symmetries of regular triangles and for a given triangle side length, there exist in general two 3D poses that project a regular triangle onto a given 2D image triangle. In other words, given a white triangular region in an image, the target body's pose has to equal one out of two candidate poses that are determinable directly from that triangular region.
Squares project approximately as parallelograms onto images. The algorithm draws a triangle into each parallelogram found in the image such that the perspective distortion of the triangle with respect to a regular triangle is the same as the perspective distortion of the found parallelogram with respect to a square. The triangles are treated by step 507. The resulting poses of regular triangles in space are then transferred back to squares, thus creating two 3D poses of squares with a given side length for each parallelogram.
For convenience, the algorithm places the 3D reference system 701 in the camera's objective and the 2D image reference system 702 in front of the camera, in a plane perpendicular to the camera's z-axis. Centred on the z-axis, the image plane contains a rectangle 703 which represents the camera's viewing window into space. Starting at the 3D origin, three beams 704 pass through the three corners 705 of a triangle 706 on the image plane. The algorithm of the present invention places two regular triangles 707 and 708 of given side length such that each corner touches a different beam.
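The geometric condition of Fig. 7 can be expressed as a residual function over the depths of the three corners along their beams; a root finder applied to these residuals would, in general, return the two pose candidates 707 and 708. The sketch below only evaluates the residuals for a candidate depth triple and does not implement the solver; all names are illustrative:

```python
import math

def corner(t, d):
    """3D corner at depth t along unit beam direction d from the origin."""
    return tuple(t * di for di in d)

def side_length_residuals(ts, dirs, s):
    """ts: depths along the three beams 704; dirs: unit beam directions
    from the 3D origin 701 through the image-triangle corners 705;
    s: prescribed side length of the regular triangle.
    Returns the three deviations of the corner distances from s; a depth
    triple is a valid pose candidate when all residuals vanish."""
    pts = [corner(t, d) for t, d in zip(ts, dirs)]
    return [math.dist(pts[0], pts[1]) - s,
            math.dist(pts[1], pts[2]) - s,
            math.dist(pts[2], pts[0]) - s]
```

Three equations in the three unknown depths generally admit two admissible solutions for a regular triangle, which is the two-pose ambiguity the description refers to.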
The flow chart, Fig. 8, depicts the operating modes of the invented device and the mode changes that occur on user request. Initially the device works in standard mouse mode 801: 2D tracking is active in combination with the mouse buttons and the scroll wheel. In general, whenever the 2D tracking sensor detects a movement, the device goes into 2D tracking mode.
If the user lifts the mouse 802 from its support, or the 2D tracking sensor no longer detects any movement w.r.t. the support, the device's 2D tracking mode becomes inactive 803. Should the user place the mouse back onto its support 804 and the device sense some movement, 2D tracking mode reactivates. So far the device behaves like any standard computer mouse.
Instead of putting the mouse down (804), the user may press the clutch-button 805, thus activating the 6DOF tracker 806. The term "press a button" may be realized as a simple push down, a click, i.e. a push down and immediate release, or a double click on the button. Whenever the target body is visible to the camera and the tracking algorithm correctly determines its pose from images received by the camera, an application programme may produce feedback to the user responding to the user's hand motion in space. User clicks on the right or left button and scroll wheel rotations continue to be reported to the application programme, hence providing the same productivity as in 2D tracking mode.
By pushing the clutch-button 807 again, the user can stop 6DOF tracking 808 and enter the clutch mode, thus being able to displace the target body without letting the application programme know about it. Another push on the clutch-button 805 stops the clutch mode and reactivates 6DOF tracking. As an alternative to 808, the user may lower the device onto its supporting surface 809 and reuse it as a 2D mouse. If the user lowers the mouse without previously pushing the clutch-button 810, the 2D tracking sensor will detect some movement and the device will go back to 2D pointing mode without explicit user request. In 2D pointing mode, pressing the clutch-button 811 does not change the function mode of the present invention.
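The mode changes of Fig. 8 can be summarized as a small state machine. The state and event names below are illustrative, not part of the original disclosure:

```python
# Events: "2d_motion" -- the 2D tracking sensor sees movement on the
# support; "lift" -- the device is lifted / support movement stops;
# "clutch" -- the clutch-button is pressed.

def next_mode(mode, event):
    """Return the follow-up operating mode for a user event (Fig. 8)."""
    if event == "2d_motion":          # 804, 810: support movement always
        return "2D_TRACKING"          # returns the device to 2D pointing
    if mode == "2D_TRACKING":
        if event == "lift":           # 802, 803: lifted, 2D tracking idle
            return "2D_INACTIVE"
        return mode                   # 811: clutch has no effect in 2D mode
    if mode == "2D_INACTIVE":
        if event == "clutch":         # 805, 806: activate the 6DOF tracker
            return "6DOF_TRACKING"
        return mode
    if mode == "6DOF_TRACKING":
        if event == "clutch":         # 807, 808: enter the clutch mode
            return "6DOF_CLUTCH"
        return mode
    if mode == "6DOF_CLUTCH":
        if event == "clutch":         # 805: resume 6DOF tracking
            return "6DOF_TRACKING"
        return mode
    return mode
```

Note that "2d_motion" dominates every state, reproducing the automatic fall-back to 2D pointing mode described above.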
Figure 9 illustrates the utility and usage of a 6DOF pointer as defined by the present invention. In Fig. 9a, the user's hand 911 translates the invented device 912 along a path in space 913 within the camera's 914 viewing volume. On the screen 915 the 6DOF pointer 916 synchronously follows an identical path 917 to a new pose 918 where the 6DOF pointer partly penetrates a 3D object 919. In the depicted example, the shape of the 6DOF pointer is chosen as a simplified teapot. In general, any finite shape is appropriate to represent the 6DOF pointer provided that, for any pair of distinct poses, the projections of that shape onto the screen are distinct too.
In the following step, Fig. 9b, the user pushes the left mouse button 921 with his/her index finger and guides the pointing device along a spatial trajectory 922. Upon the button push, the 6DOF pointer 923 and the 3D geometry 924 are tightly coupled together, so that the 3D geometry follows the movement of the 6DOF pointer 925, which itself follows the movement of the pointing device along its trajectory.
The screen in Fig. 9c is divided into a 3D zone 931 containing a projected virtual 3D space and, separated by a line 932, a 2D zone 933 presenting non-3D objects. In reality, coexisting 3D and 2D screen contents may be of any shape and be separated by any type of boundary line. If the user guides the 6DOF pointing device along a direction 934 such that the 6DOF pointer 935, following the direction 936, crosses over a 3D/2D boundary line, the 6DOF pointer shape changes into a 2D pointer shape 937. This indicates to the user that, in this particular zone of the screen, the device behaves like a normal 2D pointing device, except that the device is actually free-flying above the desktop surface, guided by the user's hand.
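The shape switching at the 3D/2D boundary can be sketched as follows, assuming for illustration that zone membership is given by a predicate on the screen position; the shape names and functions are hypothetical:

```python
def pointer_shape(pos, in_3d_zone):
    """Return the pointer representation for a screen position:
    the 6DOF pointer inside the 3D zone 931, otherwise the 2D
    pointer shape 937 of the 2D zone 933."""
    return "6DOF_POINTER" if in_3d_zone(pos) else "2D_POINTER"

def shape_changes(path, in_3d_zone):
    """Return the screen positions at which the pointer shape switches
    while the pointer moves along `path` (a list of positions), i.e.
    where the path crosses the 3D/2D boundary line 932."""
    changes = []
    prev = None
    for pos in path:
        shape = pointer_shape(pos, in_3d_zone)
        if prev is not None and shape != prev:
            changes.append((pos, shape))
        prev = shape
    return changes
```

Because the zones "may be of any shape", the predicate abstracts the boundary: a vertical line, a rectangle, or any other region test can be supplied without changing the switching logic.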
List of names
101 Central processing unit
102 Screen
103 Keyboard
104 Computer mouse
105 Camera
106 Target body
107 Hand of the user in 2D mode
108 Hand of the user in 6DOF mode
111 Cuboctahedron
112 White triangle
113 Black square
114 Mouse body
115 Bridge
116 Right mouse button
117 Left mouse button
118 Scroll wheel
211 Target body
212 Light source
213 Electric power supply
221 Directed light source
222 Viewing direction
223 Retro-reflective surface
224 Viewing volume
310 Octahedron
311 Octahedron corner
312 Octahedron edge
320 Cube
321 Cube corner
322 Cube edge
323 Front face of a cube
324 Triangle on the cube's front face
330 Cuboctahedron
331 Outline of projected cuboctahedron
340 Rhombicuboctahedron
341 Rhombicuboctahedron edge
350 Icosidodecahedron
351 Icosidodecahedron edge
401 Computer user
402 Hand of the user
403 Target body
404 Viewing window of the camera
405 Camera
406 Data communication line
407 Computer hardware
408 Operating system
409 Algorithm of the present invention
410 Application programme
411 Screen
500 Digital image taken by a camera
501...511 Steps 501 to 511 of the algorithm
512 6DOF pose
611 Cuboctahedron image
612 Edge point chain
621 Edge point chain
622 Tolerance strip evaluating straightness
623 Regression line
631 Starting point of a line segment
632 Endpoint of a line segment
633 Line segment
634 Distance threshold
641 Pair of starting and endpoint
642 Triangle
651 Region covered by a triangle
652 Greylevels inside the region
701 Camera reference system
702 Image reference system
703 Image rectangle and viewing window
704 Beam
705 Triangle corner
706 Triangle
707 First regular triangle
708 Second regular triangle
801...811 Device modes, choices, actions and resulting changes
911 User's hand
912 6DOF pointing device
913 Translation in space
914 Camera
915 Screen
916 6DOF pointer at start pose
917 Virtual translation of the 6DOF pointer
918 Final pose of the 6DOF pointer
919 Virtual 3D geometry
921 Index finger on the left mouse button
922 Spatial trajectory of the invented device
923 6DOF pointer
924 Virtual 3D geometry
925 Spatial trajectory of the 6DOF pointer
931 Screen area containing projected virtual 3D geometry
932 Boundary between 3D and 2D area
933 Screen area containing non-3D elements
934 Trajectory of the 6DOF pointing device
935 6DOF pointer within the 3D area
936 Trajectory of a pointer crossing over the 3D/2D boundary
937 2D pointer within the 2D area

Claims
1. A pointing device consisting of
1) a 2D pointing device having at least a right button, a left button and a scroll wheel,
2) an additional clutch-button on the 2D pointing device's body,
3) a target body fixed onto the 2D pointing device,
4) a camera, and
5) an algorithm capable of tracking the relative 6DOF pose of the target body from images taken by the camera,
comprising means which are capable of providing a selection and/or activation of operating modes:
a) 2D pointing mode: the device slides on the desktop surface. Right and left button and scroll wheel are functional. A 2D pointer on the screen represents the current 2D position and movement of the device.
b) Transition from 2D to 6DOF mode: the device lifts from the desktop to a starting pose, followed by a click on the clutch-button.
c) 6DOF pointing mode: the device moves freely in space with fully functional right and left button and scroll wheel. A 6DOF pointer on the screen represents the current spatial pose and movement of the device.
d) 6DOF clutch mode: press the clutch-button to interrupt 6DOF pointer movement, move the device to the next starting pose, press again or release the clutch-button to reactivate 6DOF pointing mode.
e) Return from 6DOF to 2D mode: click the clutch-button and lower the device onto the desktop.
f) Abort 6DOF mode: whenever the 2D pointing device detects movement relative to the desktop, change automatically from 6DOF to 2D pointing mode.
2. A device according to claim 1 wherein the 2D pointing device is a wireless pointing device.
3. A device according to claim 1 or claim 2, wherein the target body is a cuboctahedron with white triangles and black squares, or any other regular polyhedron in which four edges meet in each corner and whose faces are coloured white and black such that any pair of adjacent faces contains both colours.
4. A device according to any of the preceding claims wherein the brightness of white faces is enhanced by one of the following means: A) white faces materialized with translucent material and a light source placed inside the target body, or B) white faces materialized with retro-reflective substance and a light source placed close to the camera.
5. A device according to any of the preceding claims, wherein the camera is placed close to the screen with its viewing axis perpendicular to the screen plane and on the same side w.r.t. the user as the device of the present invention.
6. A device according to any of the preceding claims, wherein the target body is placed at the front of the 2D pointing device's body, i.e. between the wrist of the pointing hand and the camera, thus remaining visible for almost all possible poses of a hand of a person sitting in front of a camera.
7. A device according to any of the preceding claims, wherein the 6DOF pose of the device is determined at real-time frequency in order to allow 3D software applications running on a computer or machine controller to respond quasi-instantaneously and quasi-continuously to the user's input.
8. Method for operating a pointing device having at least a right button, a left button and a scroll wheel, an additional clutch-button and a target body fixed onto the 2D pointing device, characterized in that means enable a selection and/or activation of operating modes:
a) 2D pointing mode: the device slides on the desktop surface. Right and left button and scroll wheel are functional. A 2D pointer on the screen represents the current 2D position and movement of the device.
b) Transition from 2D to 6DOF mode: the device lifts from the desktop to a starting pose, followed by a click on the clutch-button.
c) 6DOF pointing mode: the device moves freely in space with fully functional right and left button and scroll wheel. A 6DOF pointer on the screen represents the current spatial pose and movement of the device.
d) 6DOF clutch mode: press the clutch-button to interrupt 6DOF pointer movement, move the device to the next starting pose, press again or release the clutch-button to reactivate 6DOF pointing mode.
e) Return from 6DOF to 2D mode: click the clutch-button and lower the device onto the desktop.
f) Abort 6DOF mode: whenever the 2D pointing device detects movement relative to the desktop, change automatically from 6DOF to 2D pointing mode.
PCT/EP2008/009106 2007-11-05 2008-10-29 Pointing device and method for operating the pointing device WO2009059716A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
DE102007053008A DE102007053008A1 (en) 2007-11-05 2007-11-05 Target body and method for determining its spatial position
DE102007053008.2 2007-11-05

Publications (1)

Publication Number Publication Date
WO2009059716A1 true WO2009059716A1 (en) 2009-05-14

Family

ID=40090481

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2008/009106 WO2009059716A1 (en) 2007-11-05 2008-10-29 Pointing device and method for operating the pointing device

Country Status (2)

Country Link
DE (1) DE102007053008A1 (en)
WO (1) WO2009059716A1 (en)

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2011075226A1 (en) * 2009-12-18 2011-06-23 Sony Computer Entertainment Inc. Locating camera relative to a display device
WO2012129958A1 (en) * 2011-03-25 2012-10-04 Hu Shixi Finger mouse
KR20130094128A (en) * 2012-02-15 2013-08-23 삼성전자주식회사 Tele-operation system and control method thereof
WO2013177520A1 (en) * 2012-05-25 2013-11-28 Surgical Theater LLC Hybrid image/scene renderer with hands free control
CN104647390A (en) * 2015-02-11 2015-05-27 清华大学 Multi-camera combined initiative object tracking method for teleoperation of mechanical arm
JP2015152973A (en) * 2014-02-10 2015-08-24 レノボ・シンガポール・プライベート・リミテッド Input device, input method, and program which computer can execute
CN105278700A (en) * 2015-11-09 2016-01-27 深圳Tcl新技术有限公司 Wireless mouse working mode switching method and device
JPWO2015020134A1 (en) * 2013-08-08 2017-03-02 国立大学法人京都大学 Control command generation method and control command generation device based on body movement
US9788905B2 (en) 2011-03-30 2017-10-17 Surgical Theater LLC Method and system for simulating surgical procedures
US10178155B2 (en) 2009-10-19 2019-01-08 Surgical Theater LLC Method and system for simulating surgical procedures
CN110763141A (en) * 2019-08-29 2020-02-07 北京空间飞行器总体设计部 Precision verification method and system of high-precision six-degree-of-freedom measurement system
US10843067B1 (en) * 2019-10-04 2020-11-24 Varjo Technologies Oy Input device, system, and method
US10861236B2 (en) 2017-09-08 2020-12-08 Surgical Theater, Inc. Dual mode augmented reality surgical system and method
US10866652B2 (en) 2017-11-13 2020-12-15 Samsung Electronics Co., Ltd. System and method for distributed device tracking
US11197722B2 (en) 2015-10-14 2021-12-14 Surgical Theater, Inc. Surgical navigation inside a body
US11547499B2 (en) 2014-04-04 2023-01-10 Surgical Theater, Inc. Dynamic and interactive navigation in a surgical environment

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE102010016964B4 (en) * 2010-05-17 2014-05-15 Krauss-Maffei Wegmann Gmbh & Co. Kg Method and device for controlling a computer-generated display of a virtual object
FR3014348B1 (en) * 2013-12-06 2016-01-22 Commissariat Energie Atomique MULTIDIRECTIONAL EFFORT RETENTION CONTROL DEVICE
PL3316066T3 (en) * 2016-10-26 2020-04-30 Einhell Germany Ag Method for determining at least a part of a border

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2388418A (en) * 2002-03-28 2003-11-12 Marcus James Eales Input or pointing device with a camera
US20040212589A1 (en) * 2003-04-24 2004-10-28 Hall Deirdre M. System and method for fusing and displaying multiple degree of freedom positional input data from multiple input sources

Family Cites Families (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE3223896C2 (en) * 1982-06-26 1984-07-05 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V., 8000 München Method and device for acquiring control data for rail-controlled industrial robots
JP3100233B2 (en) * 1992-07-27 2000-10-16 三菱重工業株式会社 3D position locating device
DE19536295C2 (en) * 1995-09-29 2000-12-14 Daimler Chrysler Ag Spatially designed signal mark
US6973202B2 (en) 1998-10-23 2005-12-06 Varian Medical Systems Technologies, Inc. Single-camera tracking of an object
US6711293B1 (en) 1999-03-08 2004-03-23 The University Of British Columbia Method and apparatus for identifying scale invariant features in an image and use of same for locating an object in an image
US6417836B1 (en) * 1999-08-02 2002-07-09 Lucent Technologies Inc. Computer input device having six degrees of freedom for controlling movement of a three-dimensional object
US6526166B1 (en) 1999-12-29 2003-02-25 Intel Corporation Using a reference cube for capture of 3D geometry
DE10112732C2 (en) * 2001-03-14 2003-02-06 Boochs Frank Method for determining the position of measurement images of an object relative to the object
SG115546A1 (en) 2003-06-23 2005-10-28 Affineon Technologies Pte Ltd Computer input device tracking six degrees of freedom
DE102004005380A1 (en) 2004-02-03 2005-09-01 Isra Vision Systems Ag Method for determining the position of an object in space
US7474318B2 (en) * 2004-05-28 2009-01-06 National University Of Singapore Interactive system and method
DE102005011432B4 (en) 2005-03-12 2019-03-21 Volkswagen Ag Data glove
WO2007004431A1 (en) 2005-07-04 2007-01-11 Elfo-Tec Co., Ltd. Method of forming high-resolution pattern and apparatus therefor

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2388418A (en) * 2002-03-28 2003-11-12 Marcus James Eales Input or pointing device with a camera
US20040212589A1 (en) * 2003-04-24 2004-10-28 Hall Deirdre M. System and method for fusing and displaying multiple degree of freedom positional input data from multiple input sources

Non-Patent Citations (8)

* Cited by examiner, † Cited by third party
Title
"BurstMouse, Motion Capture on your desktop!", 26 October 2006, XP002510264 *
"GO 2.4 GHz CORDLESS OPTICAL AIR MOUSE", 2005, THOMSON INC., INDIANAPOLIS, IN, USA, XP002510134 *
"http://www.3dlinks.com/Rating.cfm?linkid=3794: sheet with text of the webpage which is not legible on printout, and publication date evidence from the Wayback Machine", 14 January 2009 *
"The BurstManager", 2008, MOTION4U *
"THE TRACKING CUBE: A THREE-DIMENSIONAL INPUT DEVICE", IBM TECHNICAL DISCLOSURE BULLETIN, IBM CORP. NEW YORK, US, vol. 32, no. 3B, 1 August 1989 (1989-08-01), pages 91 - 95, XP000029910, ISSN: 0018-8689 *
FAGERER ET AL: "Visual Grasping with Long Delay Time of a Free Floating Object in Orbit", AUTONOMOUS ROBOTS, vol. 1, 1994, pages 53 - 68, XP002510262, Retrieved from the Internet <URL:http://www.springerlink.com/content/m70m110121j23767/fulltext.pdf> [retrieved on 20090114] *
SHEERIN: "Product Review: RoninWorks BurstMouse", 23 June 2004 *
SHEERIN: "RoninWorks Ships BurstMouse--a 3D Mouse/Mocap Device", 24 March 2004, XP002510263 *

Cited By (27)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10178155B2 (en) 2009-10-19 2019-01-08 Surgical Theater LLC Method and system for simulating surgical procedures
US10178157B2 (en) 2009-10-19 2019-01-08 Surgical Theater LLC Method and system for simulating surgical procedures
US8497902B2 (en) 2009-12-18 2013-07-30 Sony Computer Entertainment Inc. System for locating a display device using a camera on a portable device and a sensor on a gaming console and method thereof
WO2011075226A1 (en) * 2009-12-18 2011-06-23 Sony Computer Entertainment Inc. Locating camera relative to a display device
WO2012129958A1 (en) * 2011-03-25 2012-10-04 Hu Shixi Finger mouse
US11024414B2 (en) 2011-03-30 2021-06-01 Surgical Theater, Inc. Method and system for simulating surgical procedures
US9788905B2 (en) 2011-03-30 2017-10-17 Surgical Theater LLC Method and system for simulating surgical procedures
KR20130094128A (en) * 2012-02-15 2013-08-23 삼성전자주식회사 Tele-operation system and control method thereof
KR101978740B1 (en) * 2012-02-15 2019-05-15 삼성전자주식회사 Tele-operation system and control method thereof
US10943505B2 (en) 2012-05-25 2021-03-09 Surgical Theater, Inc. Hybrid image/scene renderer with hands free control
US10056012B2 (en) 2012-05-25 2018-08-21 Surgical Theatre LLC Hybrid image/scene renderer with hands free control
WO2013177520A1 (en) * 2012-05-25 2013-11-28 Surgical Theater LLC Hybrid image/scene renderer with hands free control
JPWO2015020134A1 (en) * 2013-08-08 2017-03-02 国立大学法人京都大学 Control command generation method and control command generation device based on body movement
US9870061B2 (en) 2014-02-10 2018-01-16 Lenovo (Singapore) Pte. Ltd. Input apparatus, input method and computer-executable program
JP2015152973A (en) * 2014-02-10 2015-08-24 レノボ・シンガポール・プライベート・リミテッド Input device, input method, and program which computer can execute
US11547499B2 (en) 2014-04-04 2023-01-10 Surgical Theater, Inc. Dynamic and interactive navigation in a surgical environment
CN104647390B (en) * 2015-02-11 2016-02-10 清华大学 For the multiple-camera associating active tracing order calibration method of mechanical arm remote operating
CN104647390A (en) * 2015-02-11 2015-05-27 清华大学 Multi-camera combined initiative object tracking method for teleoperation of mechanical arm
US11197722B2 (en) 2015-10-14 2021-12-14 Surgical Theater, Inc. Surgical navigation inside a body
CN105278700A (en) * 2015-11-09 2016-01-27 深圳Tcl新技术有限公司 Wireless mouse working mode switching method and device
WO2017080194A1 (en) * 2015-11-09 2017-05-18 深圳Tcl新技术有限公司 Air mouse working mode switching method and device
US10861236B2 (en) 2017-09-08 2020-12-08 Surgical Theater, Inc. Dual mode augmented reality surgical system and method
US11532135B2 (en) 2017-09-08 2022-12-20 Surgical Theater, Inc. Dual mode augmented reality surgical system and method
US10866652B2 (en) 2017-11-13 2020-12-15 Samsung Electronics Co., Ltd. System and method for distributed device tracking
CN110763141A (en) * 2019-08-29 2020-02-07 北京空间飞行器总体设计部 Precision verification method and system of high-precision six-degree-of-freedom measurement system
CN110763141B (en) * 2019-08-29 2021-09-03 北京空间飞行器总体设计部 Precision verification method and system of high-precision six-degree-of-freedom measurement system
US10843067B1 (en) * 2019-10-04 2020-11-24 Varjo Technologies Oy Input device, system, and method

Also Published As

Publication number Publication date
DE102007053008A1 (en) 2009-05-14

Similar Documents

Publication Publication Date Title
WO2009059716A1 (en) Pointing device and method for operating the pointing device
US9606630B2 (en) System and method for gesture based control system
JP6116064B2 (en) Gesture reference control system for vehicle interface
US5512920A (en) Locator device for control of graphical objects
US8923562B2 (en) Three-dimensional interactive device and operation method thereof
Fong et al. Novel interfaces for remote driving: gesture, haptic, and PDA
CA2880052C (en) Virtual controller for visual displays
Leibe et al. The perceptive workbench: Toward spontaneous and natural interaction in semi-immersive virtual environments
Leibe et al. Toward spontaneous interaction with the perceptive workbench
CN106030495A (en) Multi-modal gesture based interactive system and method using one single sensing system
US5982353A (en) Virtual body modeling apparatus having dual-mode motion processing
JP2007323660A (en) Drawing device and drawing method
TWI553508B (en) Apparatus and method for object sensing
JP2004246814A (en) Indication movement recognition device
WO2003003185A1 (en) System for establishing a user interface
JP2005322071A (en) Indication operation recognition device, indication operation recognition method and indication operation recognition program
JP5788853B2 (en) System and method for a gesture-based control system
Kolaric et al. Direct 3D manipulation using vision-based recognition of uninstrumented hands
Yuan Visual tracking for seamless 3d interactions in augmented reality
TWI796022B (en) Method for performing interactive operation upon a stereoscopic image and system for displaying stereoscopic image
Ma et al. Remote Object Taking and Movement in Augment Reality
Leibe et al. Integration of wireless gesture tracking, object tracking, and 3D reconstruction in the perceptive workbench
Matulic et al. Above-Screen Fingertip Tracking with a Phone in Virtual Reality
Riccardi et al. A framework for inexpensive and unobtrusive tracking of everyday objects and single-touch detection
CN116243786A (en) Method for executing interaction on stereoscopic image and stereoscopic image display system

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 08847535

Country of ref document: EP

Kind code of ref document: A1

DPE1 Request for preliminary examination filed after expiration of 19th month from priority date (pct application filed from 20040101)
122 Ep: pct application non-entry in european phase

Ref document number: 08847535

Country of ref document: EP

Kind code of ref document: A1