AU2013206686A1 - Adaptive and passive calibration - Google Patents

Adaptive and passive calibration

Info

Publication number
AU2013206686A1
Authority
AU
Australia
Prior art keywords
pointer
image
brightness
light source
calibration line
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
AU2013206686A
Inventor
Cameron Murray Edwards
Richard Ling
Ben Yip
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Canon Inc
Original Assignee
Canon Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Canon Inc filed Critical Canon Inc
Priority to AU2013206686A priority Critical patent/AU2013206686A1/en
Publication of AU2013206686A1 publication Critical patent/AU2013206686A1/en
Abandoned legal-status Critical Current

Abstract

ADAPTIVE AND PASSIVE CALIBRATION

A method (500) of calibrating a user interface (101), having a light source (120) and a camera (110) for capturing an image (150') of a pointer (150) illuminated by the light source (120), the method comprising: measuring a brightness (540) of the image of the pointer and a position (560) of the pointer during a touch gesture; determining (820) a distance (707) between the position of the pointer and a calibration line (705) extending from the light source, the calibration line being located to one side of the pointer; estimating (820) a brightness of the pointer at the calibration line (705) by adjusting (820) the measured brightness of the pointer using a model (710) of light intensity variation of the light source and the determined distance to the calibration line; and updating (580) calibration parameters (585) for the user interface system using the estimated brightness at the predetermined calibration line.

Description

ADAPTIVE AND PASSIVE CALIBRATION

TECHNICAL FIELD

The current invention relates to touch detection systems, and in particular to a system and method of adaptive and passive calibration that enables the determination of finger position relative to an electronic device.

BACKGROUND

Touchscreens are becoming increasingly common in many electronic devices such as mobile phones, tablets and portable computers. The predominant touchscreen technology comprises a transparent electronic sensing layer which is overlaid on a rigid electronic display panel and which can directly sense the touch of a physical pointer, such as a finger or stylus. A touchscreen can support richer user interaction if it can sense the presence of a physical pointer that is close to, but not actually in contact with, the touchscreen. Such interaction is called "hover". Because the pointer is not in contact with an electronic sensing layer, the system must incorporate sensors able to remotely detect the pointer. Various parties have designed and built such non-conventional touchscreen systems.

One current system comprises multiple cameras having fields of view looking generally across a surface, where the camera images are analysed to detect a pointer and the pointer's reflection in the surface. Triangulation between the X positions reported by the cameras at their different viewpoints gives the absolute pointer position, from which the pointer's distance from each camera follows; for each camera, triangulating that distance with the Y separation between the pointer and its reflection then gives the elevation (with multiple redundancy). However, it is not desirable to use multiple cameras.

Another current system emits structured light across a region that is monitored by a camera. Images from the camera are analysed to identify a geometric characteristic of a light pattern formed on a pointer. The geometric characteristic is used to derive the depth of the pointer. However, such a system does not determine the height of a pointer above a surface.

Another current method determines the distance of an object using infrared illumination. The system requires prior calibration by positioning the object at several known distances and angles relative to an illumination source. Subsequently, sensor data is analysed to determine distance to the object based on the Phong illumination model. However, such a system does not adapt to changes in external factors.
SUMMARY OF THE INVENTION

It is an object of the present invention to substantially overcome, or at least ameliorate, one or more disadvantages of existing arrangements. Disclosed are arrangements, referred to as Dynamic Touch Based Calibration (DTBC) arrangements, which seek to address the above problems by using pointer touches to a projection surface to continuously calibrate a number of calibration curve models that are used to map pointer brightness in a three dimensional coordinate space to a three dimensional position of the pointer.

According to a first aspect of the present invention, there is provided a method of calibrating a user interface system, having a light source and a camera at a predetermined position and orientation relative to a surface for capturing an image of a pointer illuminated by the light source, the method comprising the steps of: measuring a brightness of the image of the pointer and a position of the pointer during a touch gesture, a position of the pointer corresponding to a point of contact between the pointer and the surface; determining a distance between the position of the pointer and a predetermined calibration line extending from the light source, the calibration line being located to one side of the pointer; estimating a brightness of the pointer at the predetermined calibration line by adjusting the measured brightness of the pointer using a predetermined model of light intensity variation of the light source and the determined distance to the calibration line; and updating calibration parameters for the user interface system using the estimated brightness at the predetermined calibration line.

According to another aspect of the present invention, there is provided an apparatus for implementing any one of the aforementioned methods.

According to another aspect of the present invention, there is provided a computer program product including a computer readable medium having recorded thereon a computer program for implementing any one of the methods described above.

Other aspects of the invention are also disclosed.

BRIEF DESCRIPTION OF THE DRAWINGS

One or more embodiments of the invention will now be described with reference to the following drawings, in which:

Fig. 1 is a touch system according to the present DTBC arrangement;
Fig. 2 is a system architecture diagram of a device according to the preferred DTBC arrangement;
Fig. 3 is a side view of the preferred DTBC arrangement when the fingertip is above the desk surface;
Fig. 4 is an image captured by the camera in the configuration shown in Fig. 3;
Fig. 5 is a flow chart illustrating the preferred DTBC arrangement;
Fig. 6 is an image captured by the camera when a fingertip is touching the surface;
Fig. 7 is a plot of the relationship of θ and L in calibration space for a given d;
Fig. 8 is a flow chart illustrating the detailed steps of updating the curve model;
Figs. 9A and 9B collectively form a schematic block diagram representation of an electronic device upon which described arrangements can be practised.

DETAILED DESCRIPTION INCLUDING BEST MODE

Where reference is made in any one or more of the accompanying drawings to steps and/or features, which have the same reference numerals, those steps and/or features have for the purposes of this description the same function(s) or operation(s), unless the contrary intention appears.
It is to be noted that the discussions contained in the "Background" section and that above relating to prior art arrangements relate to discussions of documents or devices which may form public knowledge through their respective publication and/or use. Such discussions should not be interpreted as a representation by the present inventor(s) or the patent applicant that such documents or devices in any way form part of the common general knowledge in the art. The present DTBC arrangement relates to a touch system and method able to determine the position of a finger (or stylus) relative to the system. In particular, this DTBC arrangement relates to a calibration technique that enables the determination of the finger's position in the presence of changing external factors, such as ambient illumination or the reflectivity of the finger. The system includes a surface, a camera having a field of view looking generally across the surface, a source of illumination, and a processor. The camera captures images of a finger lit by the source of illumination, and the processor processes the images to detect the image coordinates (xf, yf) of the finger, and measures the brightness L of the finger in the camera image. The DTBC arrangement uses this information (xf, yf, L) to derive the three dimensional physical coordinates (X, Y, Z) of the position of the finger relative to the system.
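As a rough sketch of that data flow (the type name and the two helper methods below are illustrative assumptions, not terminology from this specification), the per-frame mapping from an image-space observation to a physical position can be pictured as:

```python
from dataclasses import dataclass


@dataclass
class Observation:
    """Per-frame measurement taken from the infrared camera image."""
    xf: float  # image x coordinate of the detected fingertip (pixels)
    yf: float  # image y coordinate of the detected fingertip (pixels)
    L: float   # measured brightness of the finger in the image


def locate_finger(obs: Observation, calibration):
    """Map (xf, yf, L) to physical coordinates (X, Y, Z).

    `calibration` is assumed to expose a brightness-to-depth mapping
    (maintained from touch samples, as described later) and the camera
    model needed to turn a pixel plus a depth into a 3D point.
    """
    Z = calibration.depth_from_brightness(obs.xf, obs.L)  # hypothetical helper
    X, Y = calibration.back_project(obs.xf, obs.yf, Z)    # hypothetical helper
    return X, Y, Z
```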
4 Accordingly, (X, Y, Z) coordinates are physical three-dimensional coordinates established by a set of axes 170 in Fig. 1. In contrast, (xf, yrf) coordinates are two-dimensional coordinates referred to an image such as 400 that is captured by a camera 110 in Fig. 1. The origin position of the image coordinate space is located at the top left of an image. Fig. 1 shows a user interface system 101 according to the present DTBC arrangement. In this DTBC arrangement, a portable electronic device 100 contains a projector 130, the infrared camera 110, an infrared light source 120, and a processor 240 (see Fig. 2). The apparatus 100 rests on a passive, flat surface 140, for example, a table or desk. The projector 130 projects, as depicted by dashed lines such as 102, interactive content 104 (for example, World Wide Web content, photo albums, video content) onto the surface 140, forming a projected display 160. The infrared light source 120 illuminates, as depicted by dot-dashed lines such as 103, a region around and above the projected display 160 with invisible infrared light. The infrared camera 110 captures images of the region around and above the projected display 160. A finger 150 of a user (not shown) can be used as a pointer to interact with the interactive content 104 shown on the projected display 160. Other similarly-shaped objects, such as a stylus, pen, or pencil, can also be used as a pointer. The terms "finger" and "pointer" are used interchangeably in this specification unless indicated otherwise explicitly or by context. A frequency of the light emitted by the infrared light source 120 is chosen to be distinguishable from frequencies of light in the surrounding environment. If the projector 130 is present, the frequency of light emitted by the light source 120 is chosen to be different from those frequencies of light emitted by projector 130. The infrared camera 110 comprises a filter that is matched to the frequency of light emitted by infrared light source 120. The filter makes the infrared camera 110 sensitive only to the frequency of light emitted by the infrared light source 120. The infrared camera 110 forms a grayscale image in which pixel brightness corresponds to the brightness of infrared light received. If the projector 130 is present, the surface 140 must permit the user to clearly see the projected display 160. For example, a white surface satisfies this requirement, and allows high contrast viewing of the projected display 160. The DTBC arrangement operates within a three dimensional physical coordinate system represented by the axes 170. In this description an uppercase (X, Y, Z) notation is used to refer to three dimensional physical coordinates. The origin position (0, 0, 0) of the physical coordinate system is located at the base 105 of the device 100 and on the surface 140, directly underneath the camera 110. Positive X is taken to be a direction projecting to the right of the device when observing the projected display 160 from the origin position (0, 0, 0). Positive Y is taken to be a 5 direction projecting upwards above and perpendicular to the surface. Positive Z is taken to be a direction projecting towards the projected display 160 relative to origin position (0, 0, 0) and parallel to the surface. Fig. 2 is a diagram of a system architecture 200 of the device 100, described hereinafter in more detail with reference to Figs. 9A and 9B. The components of the device 100 are connected by a system bus 260. 
The processor 240 reads and executes instructions from memory storage 250, and controls the infrared camera 110, the infrared light source 120 and the projector 130. The memory storage 250 stores data including instructions for the processor 240, images captured by the infrared camera 110, calibration data, and the interactive content for display by the projector 130. The processor 240 of the device 100 receives images captured by the infrared camera 110 and processes the images to detect the finger 150. Figs. 9A and 9B collectively form a schematic block diagram of a general purpose electronic device 901 including embedded components, upon which the DTBC methods to be described are desirably practiced. The electronic device 901 may be, for example, a mobile phone, a portable media player or a digital camera, in which processing resources are limited. Nevertheless, the methods to be described may also be performed on higher-level devices such as desktop computers, server computers, and other such devices with significantly larger processing resources. As seen in Fig. 9A, the electronic device 901 comprises an embedded controller 902. Accordingly, the electronic device 901 may be referred to as an "embedded device." In the present example, the controller 902 has the processing unit (or processor) 240 which is bi directionally coupled to the internal storage module 250. The storage module 250 may be formed from non-volatile semiconductor read only memory (ROM) 960 and semiconductor random access memory (RAM) 970, as seen in Fig. 9B. The RAM 970 may be volatile, non volatile or a combination of volatile and non-volatile memory. The electronic device 901 includes a display controller 907, which is connected to a video display 914, such as a liquid crystal display (LCD) panel or the like. The display controller 907 is configured for displaying graphical images on the video display 914 and/or the projected display 160 in accordance with instructions received from the embedded controller 902, to which the display controller 907 is connected. The electronic device 901 also includes user input devices 913 which are typically formed by keys, a keypad or like controls. Alternately, the user can control the device by using his or her finger 150 to interact with the DTBC arrangement by means of display status and 6 control information 106 which is projected by the projector 130 onto the surface 140 together with the interactive content 104. In some implementations, the user input devices 913 may include a touch sensitive panel physically associated with the display 914 to collectively form a touch-screen. Such a touch-screen may thus operate as one form of graphical user interface (GUI) as opposed to a prompt or menu driven GUI typically used with keypad-display combinations. Other forms of user input devices may also be used, such as a microphone (not illustrated) for voice commands or a joystick/thumb wheel (not illustrated) for ease of navigation about menus. As seen in Fig. 9A, the electronic device 901 also comprises a portable memory interface 906, which is coupled to the processor 240 via a connection 919. The portable memory interface 906 allows a complementary portable memory device 925 to be coupled to the electronic device 901 to act as a source or destination of data or to supplement the internal storage module 250. 
Examples of such interfaces permit coupling with portable memory devices such as Universal Serial Bus (USB) memory devices, Secure Digital (SD) cards, Personal Computer Memory Card International Association (PCMIA) cards, optical disks and magnetic disks. The electronic device 901 also has a communications interface 908 to permit coupling of the device 901 to a computer or communications network 920 via a connection 921. The connection 921 may be wired or wireless. For example, the connection 921 may be radio frequency or optical. An example of a wired connection includes Ethernet. Further, an example of wireless connection includes BluetoothTm type local interconnection, Wi-Fi (including protocols based on the standards of the IEEE 802.11 family), Infrared Data Association (IrDa) and the like. Typically, the electronic device 901 is configured to perform some special function. The embedded controller 902, possibly in conjunction with further special function components 910, is provided to perform that special function. For example, where the device 901 is a digital camera, the components 910 may represent a lens, focus control and image sensor of the camera. In the present DTBC arrangements, the special function components are the infrared camera 110, the infrared LED 120, and the projector 130. The special function components 910 are connected to the embedded controller 902. As another example, the device 901 may be a mobile telephone handset. In this instance, the components 910 may represent those components required for communications in a cellular telephone environment. Where the device 901 is a portable device, the special function 7 components 910 may represent a number of encoders and decoders of a type including Joint Photographic Experts Group (JPEG), (Moving Picture Experts Group) MPEG, MPEG-1 Audio Layer 3 (MP3), and the like. The DTBC methods described hereinafter may be implemented using the embedded controller 902, where the processes of Figs. 5 and 8 may be implemented as one or more software application programs 933 executable within the embedded controller 902. The electronic device 901 of Fig. 9A implements the described methods. In particular, with reference to Fig. 9B, the steps of the described methods are effected by instructions in the software 933 that are carried out within the controller 902. The software instructions may be formed as one or more code modules, each for performing one or more particular tasks. The software may also be divided into two separate parts, in which a first part and the corresponding code modules performs the described methods and a second part and the corresponding code modules manage a user interface between the first part and the user. The DTBC software 933 of the embedded controller 902 is typically stored in the non volatile ROM 960 of the internal storage module 250. The software 933 stored in the ROM 960 can be updated when required from a computer readable medium. The software 933 can be loaded into and executed by the processor 240. In some instances, the processor 240 may execute software instructions that are located in RAM 970. Software instructions may be loaded into the RAM 970 by the processor 240 initiating a copy of one or more code modules from ROM 960 into RAM 970. Alternatively, the software instructions of one or more code modules may be pre-installed in a non-volatile region of RAM 970 by a manufacturer. 
After one or more code modules have been located in RAM 970, the processor 240 may execute software instructions of the one or more code modules. The DTBC application program 933 is typically pre-installed and stored in the ROM 960 by a manufacturer, prior to distribution of the electronic device 901. However, in some instances, the application programs 933 may be supplied to the user encoded on one or more CD-ROM (not shown) and read via the portable memory interface 906 of Fig. 9A prior to storage in the internal storage module 250 or in the portable memory 925. In another alternative, the software application program 933 may be read by the processor 240 from the network 920, or loaded into the controller 902 or the portable storage medium 925 from other computer readable media. Computer readable storage media refers to any non-transitory tangible storage medium that participates in providing instructions and/or data to the controller 902 for execution and/or processing. Examples of such storage media include floppy disks, magnetic tape, CD-ROM, a 8 hard disk drive, a ROM or integrated circuit, USB memory, a magneto-optical disk, flash memory, or a computer readable card such as a PCMCIA card and the like, whether or not such devices are internal or external of the device 901. Examples of transitory or non-tangible computer readable transmission media that may also participate in the provision of software, application programs, instructions and/or data to the device 901 include radio or infra-red transmission channels as well as a network connection to another computer or networked device, and the Internet or Intranets including e-mail transmissions and information recorded on Websites and the like. A computer readable medium having such software or computer program recorded on it is a computer program product. The second part of the application programs 933 and the corresponding code modules mentioned above may be executed to implement one or more graphical user interfaces (GUIs) to be rendered or otherwise represented upon the display 914 of Fig. 9A and/or the projected display 160. Through manipulation of the user input device 913 (e.g., the keypad), and/or the projected display 160 a user of the device 901 and the application programs 933 may manipulate the interface in a functionally adaptable manner to provide controlling commands and/or input to the applications associated with the GUI(s). Other forms of functionally adaptable user interfaces may also be implemented, such as an audio interface utilizing speech prompts output via loudspeakers (not illustrated) and user voice commands input via the microphone (not illustrated). Fig. 9B illustrates in detail the embedded controller 902 having the processor 240 for executing the application programs 933 and the internal storage 250. The internal storage 250 comprises read only memory (ROM) 960 and random access memory (RAM) 970. The processor 240 is able to execute the application programs 933 stored in one or both of the connected memories 960 and 970. When the electronic device 901 is initially powered up, a system program resident in the ROM 960 is executed. The application program 933 permanently stored in the ROM 960 is sometimes referred to as "firmware". Execution of the firmware by the processor 240 may fulfil various functions, including processor management, memory management, device management, storage management and user interface. 
The processor 240 typically includes a number of functional modules including a control unit (CU) 951, an arithmetic logic unit (ALU) 952 and a local or internal memory comprising a set of registers 954 which typically contain atomic data elements 956, 957, along with internal buffer or cache memory 955. One or more internal buses 959 interconnect these 9 functional modules. The processor 240 typically also has one or more interfaces 958 for communicating with external devices via system bus 981, using a connection 961. The application program 933 includes a sequence of instructions 962 though 963 that may include conditional branch and loop instructions. The program 933 may also include data, which is used in execution of the program 933. This data may be stored as part of the instruction or in a separate location 964 within the ROM 960 or RAM 970. In general, the processor 240 is given a set of instructions, which are executed therein. This set of instructions may be organised into blocks, which perform specific tasks or handle specific events that occur in the electronic device 901. Typically, the application program 933 waits for events and subsequently executes the block of code associated with that event. Events may be triggered in response to input from a user, via the user input devices 913 of Fig. 9A and/or the projected display 160 as detected by the processor 240. Events may also be triggered in response to other sensors and interfaces in the electronic device 901. The execution of a set of the instructions may require numeric variables to be read and modified. Such numeric variables are stored in the RAM 970. The disclosed method uses input variables 971 that are stored in known locations 972, 973 in the memory 970. The input variables 971 are processed to produce output variables 977 that are stored in known locations 978, 979 in the memory 970. Intermediate variables 974 may be stored in additional memory locations in locations 975, 976 of the memory 970. Alternatively, some intermediate variables may only exist in the registers 954 of the processor 240. The execution of a sequence of instructions is achieved in the processor 240 by repeated application of a fetch-execute cycle. The control unit 951 of the processor 240 maintains a register called the program counter, which contains the address in ROM 960 or RAM 970 of the next instruction to be executed. At the start of the fetch execute cycle, the contents of the memory address indexed by the program counter is loaded into the control unit 951. The instruction thus loaded controls the subsequent operation of the processor 240, causing for example, data to be loaded from ROM memory 960 into processor registers 954, the contents of a register to be arithmetically combined with the contents of another register, the contents of a register to be written to the location stored in another register and so on. At the end of the fetch execute cycle the program counter is updated to point to the next instruction in the system program code. Depending on the instruction just executed this may involve incrementing the address contained in the program counter or loading the program counter with a new address in order to achieve a branch operation.
10 Each step or sub-process in the processes of the methods described below is associated with one or more segments of the application program 933, and is performed by repeated execution of a fetch-execute cycle in the processor 240 or similar programmatic operation of other independent processor blocks in the electronic device 901. Fig. 3 shows a side view 300 of the preferred DTBC arrangement, in which the finger 150 (i.e. the pointer) is above the surface 140. A reflection 410 of the finger 150 is reflected in the surface 140. The device 100 is resting on the surface 140. The light source 120 emits infrared light rays 301, 301' and illuminates the finger. Fig. 4 depicts the image 400 captured by the camera 110. The image shows an image 150' of the finger 150. The surface 140 is also lit by the light source 120, so an image 140' of the surface 140 is observable in the image 400. An image 410' of the reflection 410 of the finger 150 is also observable in the image 400. In the present DTBC arrangement, the sensor 110 is an infrared camera of image resolution 320 pixels width by 240 pixels height. Each pixel of the camera 110 captures an associated brightness of the scene as seen by the pixel in question. The brightness corresponding to each pixel can be represented as an integer ranging from 0 (darkest) to 255 (brightest). In the camera image 400, the image 410' of the reflection 410 of the finger 150 is observable provided there is sufficient surface reflectivity of the surface 140 and a high angle of incidence of light rays meeting the surface 140 after being reflected off the finger 150. Provided that the aforementioned conditions are met, a portion of light emitted by the light source 120 is reflected off the finger 150, reflected off the surface 140 and finally detected by the camera 110, thus forming the observable image 410' of the reflection 410 of the finger 150 in the camera image 400. Let image coordinates (xf, yf) (having respective reference numerals 401, 402) denote image coordinates of an image pixel 420 of a tip 302 of the finger 150 observed in the image 400. Given (xf, yf) only, it is not possible to determine a position (X, Y, Z) of the tip 302 of the finger 150 in three-dimensional physical coordinates. However, given the image coordinates (xf, yf), a line 310 in the three-dimensional physical coordinate space can be constructed, defining a range of possible physical coordinate fingertip positions corresponding to the image coordinate (xf, yf). In other words, the tip 302 of the finger 150 can lie anywhere along the line 310 in physical coordinate space and the image 420 of the tip 302 of the finger 150 will still have the image coordinates (xf, yf). This line is known as the line of projection constraint 310 of the given pixel coordinates (xf, yf), as depicted in Fig. 3. In general, the geometry of the line 310 11 depends on the lens properties of the camera. The present DTBC arrangement adopts a pinhole camera model, and hence the line 310 is a straight line. Definition of calibration In order to determine the position of the tip 302 of the finger 150 on the line of projection constraint 310 in three dimensional coordinates, the present DTBC arrangement determines a brightness of the image of the finger 150' in the camera image 400, and operates on the basis that the brighter the image of the finger 150', the closer the finger 150 is to the device 100. 
To obtain a quantifiable understanding of how close the finger 150 is to the device 100, it is necessary to gain knowledge of the Z value (in three dimensional space) that corresponds to a pixel intensity (L) of the image of the finger 150' in the camera image 400 for each possible pixel coordinate (xf, yf). Thus, for example, data can be collected by placing the finger 150 at a number of known three dimensional physical coordinate positions (Xi, Yi, Zi), a pixel intensity Li can be determined at each such three dimensional physical coordinate, and a sufficient number of samples (xf, yf, L, Z) can be stored as calibration data. In this description, the process of obtaining such knowledge is called calibration. Once the system is calibrated, i.e. once a sufficient number of (xf, yf, L, Z) values have been determined and stored, assuming that external factors remain unchanged, it is possible to determine a value for the Z coordinate of the fingertip 302 given (xf, yf, L) for the fingertip 420, where the values of (xf, y) are determined from an image such as 400 in Fig. 4, and the value of L is determined by a step 540 in Fig. 5 (described hereinafter in more detail with reference to Fig. 5). Given the Z value, and given the line of projection constraint 310, it is then possible to determine a position of the finger 150 in physical coordinates (X, Y, Z) relative to the device 100. However in typical usage scenarios, external factors can change after calibration has occurred. For example, the device 100 may be used indoors, or outdoors, or may be used by different people with different finger reflectivity. To address this problem, the end user of the system could calibrate the system before each use of the system, and each time there is a change in external factors. However, this would be contrary to end users' expectations that the device should function reliably regardless of external factors. It is not possible to predict when external factors will change. For this reason, a desirable calibration technique should continuously monitor for changes to external factors, and adapt to those changes as the end user uses the device 100 for its intended purpose. Moreover, a desirable calibration technique should passively and continuously adapt to changes in external factors, without direct user involvement.
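As a concrete, if simplistic, illustration of calibration data as a collection of (xf, yf, L, Z) samples, the sketch below stores such samples and answers a depth query with the nearest stored sample. The distance metric and its 0.1 brightness weight are arbitrary choices made only for illustration; as described later, the DTBC arrangement replaces this kind of brute-force lookup with fitted curve models.

```python
import math


class CalibrationTable:
    """Toy calibration store: (xf, yf, L) -> Z via the nearest stored sample."""

    def __init__(self):
        self.samples = []  # list of (xf, yf, L, Z) tuples

    def add_sample(self, xf, yf, L, Z):
        self.samples.append((xf, yf, L, Z))

    def estimate_z(self, xf, yf, L):
        """Return the Z of the stored sample closest to (xf, yf, L)."""
        if not self.samples:
            raise ValueError("system is not calibrated yet")

        def distance(sample):
            px, py, pL, _ = sample
            # Weight brightness so it is roughly comparable to pixel
            # distances; the 0.1 factor is an arbitrary illustrative choice.
            return math.hypot(px - xf, py - yf) + 0.1 * abs(pL - L)

        _, _, _, Z = min(self.samples, key=distance)
        return Z
```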
12 The quality of calibration Collecting inaccurate information during calibration can result in a system that is poorly calibrated. As described earlier, calibration requires knowledge of which Z value corresponds to which pixel intensity (L). The Z value of the tip 302 of the finger 150 can be determined most accurately when the tip 302 of the finger 150 is touching the surface 140 during a touch gesture, rather than when the tip 302 of the finger 150 is hovering above the surface 140. This is because when the tip 302 of the finger 150 is touching the surface 140, the image coordinates of the image 420 of the tip 302 of the finger 150 (xf, yf) are sufficient to derive Z using a pinhole camera model. Assuming that (xf, yf) and the pinhole camera model are sufficiently accurate, the derived Z value will also be reasonably accurate. In contrast, to obtain the Z value of the fingertip when the fingertip is above the surface, the finger needs to either hover above a known (X, Z) position on the surface 140, or hover above the surface at a known Y elevation. To hover the finger precisely above a known position or at a specific elevation requires human judgment and hand-eye coordination, and is particularly difficult because the finger cannot rest against the surface. For this reason, the Z value obtained is not as accurate compared to when the finger is in contact with the surface. The essence of this DTBC arrangement is about collecting calibration data when the finger is touching the surface, and utilising the calibration data for the situation when the finger is hovering above the surface. Fig. 5 shows a flow chart of an example 500 of a process for implementing the DTBC arrangement. Each step of the process 500 is stored as instructions residing in the DTBC software application 933 in the memory storage 250. Each step of the process 500 is executed by the processor 240. The process starts at a step 510 when the device 100 is powered on. Following an arrow 501, in a step 520 the processor 240 directs the camera 110 to capture an image such as 400 in Fig, 4, and the captured image is stored in the memory 250. Following an arrow 502, in a step 530 the processor 240 determines if an image 150' of a finger 150 is present in the image. The step 530 can use a process known in the art as the Viola-Jones Object Detection Framework, published in "Robust real-time face detection", P. Viola and M. Jones, 2004. This process learns the visual characteristics of a class of objects (such as faces or fingers or other pointers) from a set of example images of the class such as the image 400, and encodes the visual characteristics in a data structure. After learning is complete, the process can use the data structure to detect objects of the class in new (previously unseen) images or video. The Viola- 13 Jones Object Detection Framework has several attributes that make it suitable for use in the preferred DTBC arrangement. It is able to detect objects of the class that vary in size, overall brightness, or overall contrast. It is able to generalize well (for example, to differences in users' finger shapes). It is also able to process images in real time, creating a smooth interactive experience for the user. Alternatively, the step 530 may use a method known in the art as template matching. 
According to this detection method, a template image (a normalized example image of a finger, previously calculated and stored in the memory of the device) is superimposed on the captured image at every possible pixel position. At each pixel position, the pixels of the captured image that are overlaid by the template are themselves normalized and then the sum of absolute differences (SAD) between the pixels of the template and the corresponding normalized pixels of the captured image is calculated. The more the pixels of the template differ from the pixels of the captured image at the given pixel position, the greater the SAD will be. The pixel position having the minimum SAD (i.e., the greatest similarity to the template) is deemed to be the position (xf, yf) of the tip 420 of the image 150' of the finger 150 in the captured image 400. If the minimum SAD is greater than a threshold value (i.e., sufficiently dissimilar to the template), the finger is deemed to be absent. Here, normalization refers to adjusting pixel values to use the entire available dynamic range, which assists in detecting fingers that vary in brightness or contrast. In order to detect fingers at various distances from the camera, it is necessary to repeat the detection process with multiple templates of varying sizes. Other processes with suitable attributes can be used with the DTBC arrangement. If no finger presence is detected, then following a NO arrow 503, the process is directed back to the step 520 to capture the next image from camera 110. Returning to the step 530, if the presence of a finger is detected, the detection process (such as the Viola-Jones Object Detection Framework) returns a sub-image of the finger, in which the finger pixels (those pixels corresponding to the observed finger) have pixel values of one, and non-finger pixels have pixel values of zero. The size of the sub-image is large enough to contain the tip 420 of the image 150' of the finger 150 and at least NUMBRIGHTEST (being a parameter with a predefined value) finger pixels above the tip of the finger 150'. The size of the sub-image can vary depending on the size of the image 150' of the finger 150. Using a camera 110 of image resolution 320 x 240 pixels, the parameter NUMBRIGHTEST typically equals 200 and the sub-image is typically between 40 x 40 pixels and 80x80 pixels in size. Image coordinates (xf, yf) of the tip 420 of the image 150' of the finger 150 correspond to the lowest 14 finger pixel in the sub-image. If there is more than one lowest finger pixel in the sub-image then the lowest finger pixel that is closest to the centre of the sub-image is used. Following an arrow 505, in a step 540 the processor 240 determines the brightness (L) of the image 150' of the finger. The finger brightness (L) is defined as the sum of the NUMBRIGHTEST identified brightest pixels among those image pixels corresponding to the image 150' of the finger 150. As described above, for a camera 110 of image resolution 320 x 240 pixels, the parameter NUMBRIGHTEST typically equals 200. Following an arrow 506, in a step 550 the processor 240 determines whether the finger 150 is touching the surface 140. Referring to Fig. 4, as described above, the image 410' of the reflection 410 of the finger 150 is observable under the conditions of sufficient surface reflectivity and high angle of incidence of light rays meeting the surface 140 after being reflected off the finger 150. Fig. 
6 depicts an image 600 of the image 150' of the finger 150 captured by the camera 110 when the finger 150 is touching the surface 140. When the finger 150 touches the surface 140, the image 410' of the reflection 410 of the finger 150 meets the image 150' of the finger 150. Detecting the finger touching the surface is thus equivalent to detecting the collision of the fingertip 150 and its reflection 410 within the camera image. Still in the step 550, after the image 150' of the finger 150 is detected in the image 600, the processor 240 searches for the image 410' of the reflection 410 by scanning a typically rectangular subregion 601 of the image 600 below the detected image 150' of the finger 150. The top of the subregion 601 starts at the bottom of the detected image 150' of the finger 150. The bottom of the subregion 601 is at the bottom of the image 600. The left and right edges of the subregion 601 correspond to the left and right edges of the sub-image returned by the finger detection process (such as the Viola-Jones Object Detection Framework),. The processor 240 in the step 550 then finds the y image coordinates of each of the NUMBRIGHTEST brightest pixels in the subregion 601. If any of these y coordinates are within two pixels of the y image coordinate of the fingertip (yf), then the finger 150 is considered to be touching the surface 140. At the step 550, if the finger 150 is touching the surface, then the process 500 follows a YES arrow 509 and the processor 240 collects calibration data by performing steps 560, 570 and 580. If, however, at the step 550 the finger 150 is not touching the surface, then processing follows a NO arrow 507 and continues at a step 590.
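The brightness measurement of step 540 and the touch test of step 550 can be sketched in a few lines of NumPy. The exact sub-image bounds and array handling below are illustrative simplifications of the description above, not a definitive implementation.

```python
import numpy as np

NUM_BRIGHTEST = 200  # typical value for a 320 x 240 camera image


def finger_brightness(image, finger_mask):
    """Brightness L: sum of the NUM_BRIGHTEST brightest finger pixels.

    `finger_mask` is assumed to be a binary mask, the same shape as
    `image`, with ones at pixels classified as finger by the detector.
    """
    finger_pixels = image[finger_mask > 0]
    brightest = np.sort(finger_pixels)[-NUM_BRIGHTEST:]
    return int(brightest.sum())


def is_touching(image, xf, yf, sub_left, sub_right):
    """Touch test: does the finger's reflection meet the fingertip?

    Scans the subregion below the detected finger, takes the
    NUM_BRIGHTEST brightest pixels, and reports a touch if any of their
    y coordinates lie within two pixels of the fingertip's yf.
    """
    subregion = image[yf:, sub_left:sub_right]
    if subregion.size == 0:
        return False
    flat = subregion.ravel()
    k = min(NUM_BRIGHTEST, flat.size)
    idx = np.argpartition(flat, -k)[-k:]      # indices of the k brightest pixels
    ys = yf + (idx // subregion.shape[1])     # convert back to image y coordinates
    return bool(np.any(np.abs(ys - yf) <= 2))
```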
At the step 560, the processor 240 determines the location of where the touch occurred on the surface 140. The location is calculated using a pinhole camera model and the two dimensional image coordinate (xf, yf) of a tip 602 of the image 150' of the finger 150. A pinhole camera model is a simple and common camera model used in computer vision. It defines the relationship between a point in 3D physical space (X, Y, Z) and its corresponding image pixel coordinates (u, v) as captured by the camera. According to the pinhole camera model, a point in three dimensional space (X, Y, Z) and its corresponding image pixel coordinates (u, v) are related by a linear equation as stated in Equation 1:

        [u]       [r11 r12 r13 t1]   [X]
    s * [v] = A * [r21 r22 r23 t2] * [Y]        (Equation 1)
        [1]       [r31 r32 r33 t3]   [Z]
                                     [1]

where:
* vector (u, v) is a camera image position in pixels;
* vector (X, Y, Z) is a position in the physical coordinate space;
* matrix A is called the camera intrinsic matrix and is of the form

        [sx  0  cx]
    A = [ 0  sy cy]
        [ 0   0  1]

  The camera intrinsic matrix A relates to the focal length and principal point of camera 110. The parameter sx defines the relationship between horizontal physical units (X), distance from the camera (Z) and corresponding horizontal pixel units (x) such that x = sx * X/Z. The parameter sy defines the relationship between vertical physical units (Y), distance from the camera (Z) and corresponding vertical pixel units (y) such that y = sy * Y/Z. The principal point (cx, cy) is the position where the optical axis of the lens of camera 110 meets the image sensor of the camera 110. The principal point is expressed in pixels and is usually located near the centre of the camera's image;
* the matrix

        [r11 r12 r13 t1]
    T = [r21 r22 r23 t2]
        [r31 r32 r33 t3]

  describes the position and orientation of the camera relative to the surface 140. The sub-matrix formed by the first three columns (the rij values) is a rotation matrix describing the 3D rotation of the physical coordinate system relative to the camera's coordinate system, and (t1, t2, t3) is a translation vector describing the offset of the physical coordinate origin position relative to the camera coordinate system origin;
* s is a normalization scalar chosen to ensure the third element in [u v 1]^T equals 1.

To obtain the camera intrinsic matrix and the extrinsic matrix, this DTBC arrangement uses the camera calibration functions provided by the OpenCV (http://opencv.willowgarage.com) computer vision library. The camera calibration process involves printing a chessboard pattern, mounting the chessboard pattern to a rigid flat surface and capturing a number of images of the chessboard pattern using the camera 110. Images of the chessboard pattern are captured when the chessboard is held in 8 or more different orientations within the field-of-view of the camera 110. An image of the chessboard pattern is also captured when the chessboard pattern is placed on the surface 140. Using the functions of the OpenCV library, image coordinates of the chessboard corners are determined and intrinsic and extrinsic matrices are generated.

Referring to the physical coordinate system 170, a coordinate on the surface 140 has a Y value of zero (Y = 0). Thus, using Equation 1, the relationship between image coordinates (u, v) corresponding to a point on the surface (X, Y=0, Z) is defined as follows:

        [u]       [r11 r13 t1]   [X]
    s * [v] = A * [r21 r23 t2] * [Z]            (Equation 2)
        [1]       [r31 r33 t3]   [1]

where all parameters are defined as described above for the pinhole camera model (Equation 1). At the step 530, the image coordinate of the tip 420 of the image 150' of the finger 150 is found to be (xf, yf), as depicted by a dashed arrow 515 in Fig. 5. If the fingertip 150 is touching the surface 140, then following an arrow 509, at the step 560, the processor 240 determines the physical coordinate X and Z values of the fingertip 150 as follows:

         [X]   [r11 r13 t1]^-1          [xf]
    s' * [Z] = [r21 r23 t2]     * A^-1 * [yf]   (Equation 3)
         [1]   [r31 r33 t3]             [ 1]

where:
* the notation M^-1 denotes the inverse of a matrix M;
* (xf, yf) are the camera image coordinates of the tip of the finger;
* the value s' is a scalar chosen to ensure the third element of the resulting vector is 1;
* all other parameters are defined as described above for the pinhole camera model (Equation 1).
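Equations 2 and 3 amount to inverting a 3x3 homography built from the intrinsic matrix and the first and third columns of the extrinsic matrix. A NumPy sketch follows; the function name and argument layout are illustrative, and the matrices themselves would come from the OpenCV chessboard calibration described above (e.g. cv2.calibrateCamera).

```python
import numpy as np


def touch_position_on_surface(xf, yf, A, R, t):
    """Recover the surface point (X, 0, Z) seen at image pixel (xf, yf).

    A is the 3x3 camera intrinsic matrix, R the 3x3 rotation and t the
    translation of the extrinsic matrix [R | t].  With Y = 0 the
    projection reduces to a 3x3 homography H built from the first and
    third columns of R together with t, which can be inverted to map
    pixels back to surface coordinates (Equation 3).
    """
    H = A @ np.column_stack((R[:, 0], R[:, 2], t.reshape(3)))
    X, Z, w = np.linalg.inv(H) @ np.array([xf, yf, 1.0])
    return X / w, Z / w  # normalise so the third element equals 1
```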
Still at the step 560, the physical coordinates (X, Y=0, Z) of the fingertip 150 on the surface 140 are converted to polar coordinate (d, θ) form by the processor 240 and stored in the memory 250. The polar coordinates of the fingertip are defined as follows:

    d = sqrt(X^2 + Z^2)
    θ = atan(X / Z)                               (Equation 4)

where:
* d is the distance of the position below the fingertip (corresponding to Y=0 on the surface 140) from the physical coordinate origin;
* θ is the angular position of the tip of the finger, being the angle between the positive Z axis and a vector from the physical coordinate origin to the position below the fingertip corresponding to Y=0 on the surface 140;
* all other parameters are defined as described above for the pinhole camera model (Equation 1).

The value of θ is zero when the finger is directly in front of the camera, aligned with the Z axis. The possible range of θ is bounded by the horizontal angle of view of the camera. In one described DTBC arrangement, the camera 110 used with the DTBC arrangement has an angle of view such that -25° ≤ θ ≤ 25°. In the described DTBC arrangement, the camera 110 is placed flat on the surface 140 so that the optical axis of the camera is parallel to the surface. It is approximated that t1 = t3 = 0, and that the rotation matrix is the identity matrix. With this particular configuration, one may approximate X/Z as:

    X/Z ≈ (xf - cx) / sx                          (Equation 5)

where:
* xf is defined as described above for Equation 3;
* all other parameters are defined as described above for the pinhole camera model (Equation 1).

The definition of θ is now:

    θ = atan((xf - cx) / sx)                      (Equation 6)

where:
* θ is the angle between the positive Z axis and a vector from the physical coordinate origin to the position below the fingertip corresponding to Y=0 on the surface 140;
* xf is defined as described above for Equation 3;
* all other parameters are defined as described above for the pinhole camera model (Equation 1).

The advantage of using Equation 6 (instead of Equation 4) is that θ can be calculated based on the x image coordinate of the tip 420 of the image 150' of the finger 150 alone.

Following an arrow 512, in a step 570 the processor 240 forms, from the finger brightness (L) which was derived in step 540 along with the corresponding touch position (d, θ) determined in the step 560, a sample (L, d, θ), and the sample is added to a sample buffer 575 residing in memory storage 250. Initially, when the device 100 is powered on, an initialisation procedure is performed (described hereinafter in more detail with reference to Fig. 8) using a number of predefined samples already stored in memory 250. In the current DTBC arrangement, the sample buffer 575 has a predetermined size. Before adding a new sample to the sample buffer 575, if the buffer is full, then an existing sample must be removed from the buffer. In the current DTBC arrangement, the oldest sample in the sample buffer 575 is removed to make space for a new sample. This allows the system to adapt to the current environment. Other strategies for selecting samples to be removed from the sample buffer 575 are possible. For example, an outlier sample (a sample that is inconsistent with other collected samples) can be selected to be removed from the sample buffer 575. Within the calibration space, using the samples contained in the sample buffer 575, it is possible to determine a d value given L and θ, and hence derive the three dimensional physical coordinates of the tip 302 of the finger 150. For example, given values of L and θ, the processor 240 can search the sample buffer 575 for the closest sample, or alternately can calculate a weighted average of the k-closest samples to L and θ. However, in practice this would require a large, dense collection of samples uniformly distributed across the allowable ranges of L and θ.
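A small sketch of steps 560-570 follows, converting a touch position to polar form (Equations 4 and 6) and maintaining the fixed-size sample buffer. The buffer length of 64 and the function name are illustrative assumptions; the specification only requires a predetermined size with oldest-sample eviction.

```python
import math
from collections import deque

MAX_SAMPLES = 64  # illustrative size; the patent only says "predetermined"

# The deque drops the oldest sample automatically once full, which is
# what lets the calibration adapt to the current environment.
sample_buffer = deque(maxlen=MAX_SAMPLES)


def touch_to_sample(X, Z, L, xf=None, cx=None, sx=None):
    """Convert a touch at surface position (X, 0, Z) into a sample (L, d, theta).

    If the camera lies flat on the surface (identity rotation, t1 = t3 = 0),
    theta may instead be taken directly from the pixel column xf using
    theta = atan((xf - cx) / sx), per Equation 6.
    """
    d = math.hypot(X, Z)                                    # Equation 4
    if None not in (xf, cx, sx):
        theta = math.degrees(math.atan((xf - cx) / sx))     # Equation 6
    else:
        theta = math.degrees(math.atan2(X, Z))              # Equation 4
    return (L, d, theta)


# Example values only: record one touch-gesture sample.
sample_buffer.append(touch_to_sample(X=0.03, Z=0.18, L=41200.0))
```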
A more robust method is to model the calibration space using curves that are fitted to the samples in the buffer 575. At the step 570, after a sample has been added to the sample buffer 575, processing follows an arrow 514 and continues to a step 580. At the step 580, the processor 240 generates a number of curve models based on the samples in the sample buffer 575. A curve model is a mathematical curve fitted to a number of samples.
A curve model is represented as a number of parameters that describe a mathematical curve. The parameters of each curve model are stored in a buffer 585 residing in the memory storage 250. In the described DTBC arrangements, each curve model defines a relationship between distance (d) and brightness (L) at a particular value of θ. The value of θ to which a curve corresponds is denoted θcurve. The present DTBC arrangement models the calibration space using three curve models, namely a curve model for θcurve = -25°, a curve model for θcurve = 0° and a curve model for θcurve = 25°. The aforementioned values of θ represent three radial calibration lines depicted as 704, 705 and 706 extending from the light source in Fig. 7. Other types and numbers of curve models can also be used. Accordingly, more than three curves can be used, and the curves need not be radial lines.

A curve model, say the curve model for θcurve = -25°, is modelled by the following equation:

    L = a0 * (1/d) + b0 * d + c0                  (Equation 7)

where:
* L is the brightness of the finger 150' in the camera image 400;
* d is defined as described above for Equation 4;
* values a0, b0, c0 are the parameters (also referred to as the calibration parameters) of the curve model for θcurve = -25°.

The parameters a0, b0, c0 of the curve are chosen, typically using least squares curve fitting, so that the curve most closely fits the stored samples. In this DTBC arrangement, the near edge of projected display 160 is located at Z = 0.1 metres, and the far edge of projected display 160 is located at Z = 0.26 metres. The far corners of projected display 160 are located at X = ±0.055 metres. Thus, applying Equation 4, the far corners of the projected display are located at a distance of d = sqrt(0.055^2 + 0.26^2) ≈ 0.27 metres. As such, this DTBC arrangement needs only to detect fingertips within the projected display 160, which lies in the range 0.1 ≤ d ≤ 0.27 metres. This range is referred to as the expected range of d.

Due to the natural phenomenon that light intensity is weaker at further distance, the finger intensity L is expected to be strictly decreasing within the expected range of d. This means that the derivative of each curve in the curve model must always be negative within the range 0.1 ≤ d ≤ 0.27 metres. This can be expressed as a mathematical constraint on b0 as follows:

    L'(d) = -a0/d^2 + b0 < 0     for all d in [0.1, 0.27]
    b0 < a0/d^2                  for all d in [0.1, 0.27]
    b0 < a0/dmax^2,              where dmax = 0.27           (Equation 8)

where:
* L'(d) is the first derivative of the brightness of the finger 150' in the camera image 400 with respect to distance (d);
* dmax equals 0.27 metres, corresponding to the maximum expected fingertip distance;
* other parameters are defined as described above for Equation 7.

Additionally, the inverse-square law means that the rate of decrease of finger intensity L is expected to be greater when the finger is closer to the light source 120. This means that the second derivative of each curve in the curve model must be positive within the range 0.1 ≤ d ≤ 0.27 metres. This can be expressed as a mathematical constraint on a0 as follows:

    L''(d) = 2*a0/d^3 > 0        for all d in [0.1, 0.27]
    a0 > 0                                                    (Equation 9)

where:
* L''(d) is the second derivative of the brightness of the finger 150' in the camera image 400 with respect to distance (d);
* other parameters are defined as described above for Equation 7.

At the step 580, described hereinafter in more detail with reference to Fig. 8, the processor 240 generates three curve models. The process of Fig. 8 generates one curve model for a specified value of θcurve.
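Equations 7 to 9 translate directly into a pair of small functions. The constant D_MAX and the function names below are illustrative, but the expressions follow the constraints just stated.

```python
D_MAX = 0.27  # metres, maximum expected fingertip distance


def curve_brightness(d, a, b, c):
    """Curve model of Equation 7: expected brightness at distance d."""
    return a / d + b * d + c


def constraints_satisfied(a, b):
    """Check the derivative constraints of Equations 8 and 9.

    Brightness must strictly decrease over the expected range of d
    (first derivative negative, which reduces to b < a / D_MAX**2) and
    must fall off faster nearer the light source (second derivative
    positive, i.e. a > 0).
    """
    return a > 0 and b < a / D_MAX ** 2
```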
So, at step 580, the process of Fig. 8 is executed 3 times, once for curve = -25", once for curve = 0" and once for curve = 25'. At step 580, the value of curve is initialised and stored in memory storage 250 accordingly prior to each of the three executions of the process of Fig. 8. Fig. 8 depicts a process 580 showing one example of how the step 580 in the process 500 in Fig. 5 can be implemented. This process 580 generates a single curve model corresponding to a specific curve value. The process of generating a curve model begins at a step 800 and then following an arrow 821 the process is directed to a step 802 in which the processor 240 21 determines whether the device 100 has just been powered-on. If the device has just been powered-on, then an initialisation procedure must be performed, and thus following a YES arrow 823 the process is directed to a step 805. Otherwise, following a NO arrow 822 the process is directed to a step 810. At the step 805 the processor 240 loads a number of predefined samples from the memory storage 250. Each predefined sample is of the form (L, d, 0). These predefined samples allow the system to be initialised such that the device 100 provides reasonable performance when the system is first powered on before any samples are collected (i.e. before the user touches the surface). In this DTBC arrangement, there are 9 predefined samples. Three predefined samples have 0 = -25", three predefined samples have 0 = 00, and three predefined samples have 0 = 25". At the step 805, only the three predefined samples with = e,, , are loaded from the memory storage 250, where 0 ,erve was initialised and stored in memory storage 250 prior to execution of the process of Fig. 8, as described above.. A NUMSAMPLES value, stored in the memory 250, records the number of loaded samples. At the step 805, NUMSAMPLES is set to 3. After step 805, following an arrow 825 the process is directed to a step 830 in which the processor 240 continues processing of the predefined samples. Returning to the step 810, in that step the processor 240 loads a number of collected samples from the sample buffer 575 residing in the memory storage 250. Each collected sample is of the form (L, d, 0). Each collected sample has been obtained when a user has touched the surface 140 using their finger 150 or another suitable pointer at some point since the device 100 was powered-on. Note that each collected sample can have a 0 value anywhere in the range - 2 5 " < o 25'. The collected samples all lie on a two dimensional plane corresponding to the surface 140. Let the collected samples be denoted by {(d 0
,L
0 ), ..., (dN,LN)}. At the step 810, the processor 240 sets the parameter NUMSAMPLES to the total number of collected samples loaded from the sampled buffer 575. Following an arrow 824 the process is directed to a step 820 in which the processor 240 derives a projected sample from each collected sample, described hereinafter in detail with reference to Fig. 7. As mentioned above, each collected sample can have a 0 value anywhere in the range - 2 5 " < 0 25". However, the curve model being generated corresponds to a particular value of 0, say 0 ,ere = -25". For this reason, it is necessary to calculate a projected sample corresponding to c urve from each collected sample. The process of calculating a projected sample from a collected sample will now be described in detail. Let (Lc,, dco,, Ocol) be a collected sample loaded at the step 810. Let eL-2 5 denote the estimated brightness of the image 150' of the finger 150 at a distance dc,, and at 0 = -25".
Using the existing curve model for θcurve = -25°, we can estimate the value of eL-25 as follows:

    eL-25 = a0 * (1/dcol) + b0 * dcol + c0        (Equation 10)

where:
* eL-25 is the estimated brightness (according to existing curve parameters) of the image of the finger at a distance dcol and at θ = -25°;
* dcol is the distance of the collected sample being used to calculate the projected sample;
* values a0, b0, c0 are the parameters of the curve corresponding to θcurve = -25°.

The estimated brightness values eL0 and eL25 (corresponding to θcurve = 0° and θcurve = 25° respectively) are similarly defined:

    eL0  = a1 * (1/dcol) + b1 * dcol + c1         (Equation 11)
    eL25 = a2 * (1/dcol) + b2 * dcol + c2         (Equation 12)

where:
* eL0 is the estimated brightness (according to existing curve parameters) of the image of the finger at a distance dcol and at θ = 0°;
* eL25 is the estimated brightness (according to existing curve parameters) of the image of the finger at a distance dcol and at θ = 25°;
* dcol is the distance of the collected sample being used to calculate the projected sample;
* values a1, b1, c1 are the parameters of the curve corresponding to θcurve = 0°;
* values a2, b2, c2 are the parameters of the curve corresponding to θcurve = 25°.

Fig. 7 shows a plot 700 of the points (eL-25, -25), (eL0, 0) and (eL25, 25) for a particular distance d = dcol. In Fig. 7, the horizontal axis is θ and the vertical axis is L. The infrared light source 120 used in this DTBC arrangement has highest intensity along a central axis 705 of illumination of the light source, corresponding to θ = 0°. Moreover, at a fixed distance d, the intensity of light emitted from the light source 120 decreases as the distance (i.e. the angular offset) to the central axis of illumination increases. For this reason, it is expected that eL0 > eL25 and eL0 > eL-25. In this DTBC arrangement, the brightness decrease (falloff) of the light source is modelled as a linear function 710 of angle θ, as shown in Fig. 7. Other falloff models can also be used.
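Putting Equations 10 to 12 together with the linear falloff model of Fig. 7, the brightness expected at an arbitrary angle can be sketched as follows. The dictionary-of-curves layout and the function name are illustrative assumptions; the interpolation mirrors the linear-falloff estimate the specification formalises as Equation 13.

```python
def estimate_brightness_at(theta, d_col, curves):
    """Estimate finger brightness at angle theta (degrees) and distance d_col.

    `curves` maps each calibration-line angle (-25, 0, 25) to its
    (a, b, c) parameters.  Brightness at each calibration line follows
    Equation 7, and the falloff between lines is assumed linear in the
    angle, as in Fig. 7.
    """
    eL = {ang: a / d_col + b * d_col + c for ang, (a, b, c) in curves.items()}
    if theta >= 0:
        # Interpolate between the 0-degree and +25-degree calibration lines.
        return eL[0] + (eL[25] - eL[0]) * theta / 25.0
    # Symmetric case for negative angles, towards the -25-degree line.
    return eL[0] + (eL[-25] - eL[0]) * (-theta) / 25.0
```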
A projected sample (pL_0, pd_0, pθ_0) 720 corresponding to θ = 0 of the collected sample (L_col, d_col, θ_col) is determined using a linear projection 730, projecting the collected sample, to thereby adjust the collected sample, as follows:

(pL_0, pd_0, pθ_0) = (eL_0 − ε_col, d_col, 0)    Equation 14

where:
* pL_0 is the brightness of the projected sample corresponding to θ = 0
* pd_0 is the distance of the projected sample corresponding to θ = 0
* pθ_0 is the angular position of the projected sample corresponding to θ = 0
* eL_0 is defined as described above for Equation 11
* ε_col is the difference between the collected sample brightness and the estimated brightness according to the existing curve parameters
* d_col is the distance of the collected sample being used to calculate the projected sample.

The projected samples of the collected sample onto θ = 25 and θ = −25 are similarly defined:

(pL_25, pd_25, pθ_25) = (eL_25 − ε_col, d_col, 25)    Equation 15

(pL_−25, pd_−25, pθ_−25) = (eL_−25 − ε_col, d_col, −25)    Equation 16

where:
* pL_25 is the brightness of the projected sample corresponding to θ = 25
* pd_25 is the distance of the projected sample corresponding to θ = 25
* pθ_25 is the angular position of the projected sample corresponding to θ = 25
* pL_−25 is the brightness of the projected sample corresponding to θ = −25
* pd_−25 is the distance of the projected sample corresponding to θ = −25
* pθ_−25 is the angular position of the projected sample corresponding to θ = −25
* eL_−25 is defined as described above for Equation 10
* eL_25 is defined as described above for Equation 12
* ε_col is the difference between the collected sample brightness and the estimated brightness according to the existing curve parameters of the curve corresponding to the value θ
* d_col is the distance of the collected sample being used to calculate the projected sample.

The projected samples of the collected sample onto θ = 25 and θ = −25 are depicted by respective "X" symbols 702, 701. As noted, at the step 820, projected samples are calculated for each collected sample loaded in the step 810 and for each of the curve models. Each projected sample is stored in the memory storage 250.

Following an arrow 826 the process is directed to a step 830 in which the processor 240 determines optimal curve parameters for a curve model using samples obtained from a previous step (either the step 805 or the step 820). The process of determining optimal curve parameters for the curve model with θ_curve = −25° will now be described in detail. A similar process is applied to find optimal curve parameters for the other two curve models. The curve models map the samples from the steps 810 and/or 805, which lie on a two dimensional plane corresponding to the surface 140 in the three dimensional space defined by the axes 170 in Fig. 1, to three dimensional samples which can be used to determine a finger position in the three dimensional space defined by the axes 170 in Fig. 1 in a step 590, described hereinafter in more detail. Let (L_k, d_k) denote the brightness and the distance of the kth sample. The value k is restricted to the range 1 ≤ k ≤ N, where N = NUMSAMPLES. Using Equation 7 and the N samples, a linear system for the curve model with θ_curve = −25° is formed as follows:

[ 1/d_1   d_1   1 ]             [ L_1 ]
[  ...    ...  ... ]  [ a_0 ]    [ ... ]
[ 1/d_k   d_k   1 ]  [ b_0 ]  = [ L_k ]    Equation 17
[  ...    ...  ... ]  [ c_0 ]    [ ... ]
[ 1/d_N   d_N   1 ]             [ L_N ]

where:
* d_k is the distance of the kth sample
* L_k is the brightness of the kth sample
* the values a_0, b_0, c_0 are the parameters of the curve corresponding to θ_curve = −25°
* N is the number of samples.

At the step 830, the processor 240 determines a least squares solution (a_0, b_0, c_0) by solving the linear algebra normal equations as follows (note: the matrix A used below is different to the previously discussed matrix A used in Equation 1):

[a_0, b_0, c_0]ᵀ = (AᵀA)⁻¹ Aᵀ B    Equation 18

where A is the N×3 matrix whose kth row is [1/d_k, d_k, 1] and B is the N×1 vector whose kth entry is L_k, and where:
* d_k and L_k are defined as described above for Equation 17
* the values a_0, b_0, c_0 are a least squares solution being the parameters of the curve corresponding to θ_curve = −25°
* N is the number of samples.
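The unconstrained fit of Equations 17 and 18 is an ordinary linear least-squares problem. A minimal sketch in Python/NumPy, assuming the samples are held in plain arrays, might look as follows; using numpy.linalg.lstsq rather than explicitly forming (AᵀA)⁻¹AᵀB is an implementation choice for the example, not part of the described arrangement.

```python
# Sketch of Equations 17-18: unconstrained least-squares fit of the
# brightness model L = a/d + b*d + c to N samples (L_k, d_k).
import numpy as np

def fit_curve_unconstrained(d, L):
    """Return (a, b, c) minimising ||A [a b c]^T - B|| for A rows [1/d, d, 1]."""
    d = np.asarray(d, dtype=float)
    L = np.asarray(L, dtype=float)
    A = np.column_stack((1.0 / d, d, np.ones_like(d)))
    # lstsq is numerically safer than forming (A^T A)^-1 A^T B explicitly.
    params, *_ = np.linalg.lstsq(A, L, rcond=None)
    a, b, c = params
    return a, b, c

# Example with synthetic samples drawn from a known curve plus noise.
rng = np.random.default_rng(0)
d = rng.uniform(0.2, 1.0, size=20)
L = 1.2 / d - 0.03 * d + 55.0 + rng.normal(0.0, 0.5, size=20)
a, b, c = fit_curve_unconstrained(d, L)
```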
Following the step 830, as directed by an arrow 827, processing continues at a step 840. At the step 840, the processor 240 determines if the solution (a_0, b_0, c_0) obtained at the step 830 is the optimal solution. If the solution (a_0, b_0, c_0) satisfies the derivative constraints defined by Equation 8 and Equation 9 then the solution is the optimal solution and, following a YES arrow 828, processing continues at a step 860. If the solution (a_0, b_0, c_0) does not satisfy the derivative constraints, then following a NO arrow 829 processing continues at a step 850.

At the step 850, the processor 240 finds an optimal solution (a_0, b_0, c_0) that is on the boundary of a solution space. The solution space boundary is derived from the derivative constraints defined by Equation 8 and Equation 9. There are three boundary conditions to solve. The first boundary is constrained by Equation 8. The second boundary is constrained by Equation 9. The third boundary is constrained by both Equation 8 and Equation 9.

The first boundary condition, defined by the derivative constraint of Equation 8, is b_0 = a_0/d_mx². Substituting this into Equation 17 produces the following linear system:

[ 1/d_1 + d_1/d_mx²   1 ]            [ L_1 ]
[         ...        ... ] [ a_0 ]    [ ... ]
[ 1/d_k + d_k/d_mx²   1 ] [ c_0 ] =  [ L_k ]    Equation 19
[         ...        ... ]            [ ... ]
[ 1/d_N + d_N/d_mx²   1 ]            [ L_N ]

where:
* d_k and L_k are defined as described above for Equation 17
* the values a_0, c_0 are a least squares solution for the curve parameters corresponding to the first boundary condition
* d_mx is defined as described above for Equation 8
* N is the number of samples.

At the step 850, the processor 240 determines a least squares solution for (a_0, c_0) by solving the linear algebra normal equations as follows:

[a_0, c_0]ᵀ = (AᵀA)⁻¹ Aᵀ B    Equation 20

where A is the N×2 matrix whose kth row is [1/d_k + d_k/d_mx², 1] and B is the N×1 vector whose kth entry is L_k, and where:
* d_k and L_k are defined as described above for Equation 17
* the values a_0, c_0 are defined as described above for Equation 19
* d_mx is defined as described above for Equation 8
* N is the number of samples.

Still at the step 850, the processor 240 also determines the error E_1 of the solution (a_0, b_0 = a_0/d_mx², c_0) as follows:

E_1 = ‖ A·[a_0, c_0]ᵀ − B ‖    Equation 21

where:
* E_1 is the error of the solution corresponding to the first boundary condition
* the values a_0, c_0 are defined as described above for Equation 19
* A and B are defined as described above for Equation 20.

If a_0 < 0, the solution violates the constraint defined in Equation 9. Such a solution is not appropriate. If the processor 240 finds such an inappropriate solution then the processor 240 sets E_1 = ∞.

The second boundary condition, defined by the derivative constraint of Equation 9, is a_0 = 0. Substituting this into Equation 17 produces the following linear system:

[ d_1   1 ]             [ L_1 ]
[ ...  ... ]  [ b_0 ]    [ ... ]
[ d_k   1 ]  [ c_0 ]  = [ L_k ]    Equation 22
[ ...  ... ]             [ ... ]
[ d_N   1 ]             [ L_N ]

where:
* d_k and L_k are defined as described above for Equation 17
* the values b_0, c_0 are a least squares solution for the curve parameters corresponding to the second boundary condition
* N is the number of samples.

At the step 850, the processor 240 determines a least squares solution for (b_0, c_0) by solving the linear algebra normal equations as follows:

[b_0, c_0]ᵀ = (AᵀA)⁻¹ Aᵀ B    Equation 23

where A is the N×2 matrix whose kth row is [d_k, 1] and B is the N×1 vector whose kth entry is L_k, and where:
* d_k and L_k are defined as described above for Equation 17
* the values b_0, c_0 are defined as described above for Equation 22
* N is the number of samples.

Still at the step 850, the processor 240 also determines the error E_2 of the solution (a_0 = 0, b_0, c_0) as follows:

E_2 = ‖ A·[b_0, c_0]ᵀ − B ‖    Equation 24

where:
* E_2 is the error of the solution corresponding to the second boundary condition
* the values b_0, c_0 are defined as described above for Equation 22
* A and B are defined as described above for Equation 23.

If b_0 > 0, the solution violates the constraint defined by Equation 8. Such a solution is not appropriate. If the processor 240 finds such an inappropriate solution then the processor 240 sets E_2 = ∞.

The third boundary condition corresponds to the intersection of the first and second boundary conditions defined by Equation 8 and Equation 9 respectively. The third boundary condition is a_0 = 0 AND b_0 = 0. Accordingly, at the step 850, the processor 240 determines a solution for c_0 as follows:

c_0 = (1/N)·Σ_k L_k    Equation 25

where:
* L_k is defined as described above for Equation 17
* c_0 is a solution for the curve parameters corresponding to the third boundary condition
* N is the number of samples.

Still at the step 850, the processor 240 also determines the error E_3 of the solution (a_0 = 0, b_0 = 0, c_0) as follows:

E_3 = ‖ A·c_0 − B ‖    Equation 26

where A is the N×1 vector whose every entry is 1 and B is the N×1 vector whose kth entry is L_k, and where:
* E_3 is the error of the solution corresponding to the third boundary condition
* c_0 is defined as described above for Equation 25
* L_k is defined as described above for Equation 17.

To summarise the above, at the step 850, three boundary solutions are obtained and an error of each boundary solution is calculated. The optimal solution is the one with the smallest error, being min(E_1, E_2, E_3). Following calculation of the optimal solution, as directed by an arrow 831, processing continues at a step 860.

At the step 860, the processor 240 stores the optimal solution of (a_0, b_0, c_0) in a buffer 585 residing in the memory storage 250. After this, the process of generating a curve model ends at a step 833.

The process of generating a curve model for θ_curve = −25° has been described above with reference to Fig. 8. The process of generating a curve model for other values of θ_curve is similar. Specifically, for the curve model for θ_curve = 0°, (a_1, b_1, c_1) is used instead of (a_0, b_0, c_0) and, similarly, for the curve model for θ_curve = 25°, (a_2, b_2, c_2) is used instead of (a_0, b_0, c_0).

Returning to Fig. 5, at the step 580, after the process of Fig. 8 is executed 3 times, once for θ_curve = −25°, once for θ_curve = 0° and once for θ_curve = 25°, the three curve models have been generated, and the three curve models can be expressed as follows:

L = a_0/d + b_0·d + c_0, for θ_curve = −25°
L = a_1/d + b_1·d + c_1, for θ_curve = 0°       Equation 27
L = a_2/d + b_2·d + c_2, for θ_curve = 25°

where:
* L is the estimated brightness of the finger 150' in the camera image 400 according to a curve model
* d is defined as described above for Equation 4
* a_0, b_0, c_0 are the parameters of an optimal solution for the curve model for θ_curve = −25°
* a_1, b_1, c_1 are the parameters of an optimal solution for the curve model for θ_curve = 0°
* a_2, b_2, c_2 are the parameters of an optimal solution for the curve model for θ_curve = 25°
* θ_curve is the angle of a radial calibration line corresponding to a particular curve model.
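The constraint handling of steps 840 and 850 can be sketched in Python as follows. Equations 8 and 9 are not reproduced in this part of the specification; the constraint forms assumed here (b ≤ a/d_mx² and a ≥ 0) are inferred from the boundary conditions described above, and the function and variable names are invented for the example.

```python
# Sketch of steps 840-850: if the unconstrained fit violates the assumed
# derivative constraints (b <= a/d_mx**2 and a >= 0), refit on each boundary
# of the solution space and keep the boundary solution with the smallest
# residual error, min(E1, E2, E3).
import numpy as np

def fit_curve_constrained(d, L, d_mx):
    d = np.asarray(d, float)
    L = np.asarray(L, float)

    # Unconstrained fit (Equations 17-18).
    A = np.column_stack((1.0 / d, d, np.ones_like(d)))
    (a, b, c), *_ = np.linalg.lstsq(A, L, rcond=None)
    if b <= a / d_mx**2 and a >= 0:
        return a, b, c

    candidates = []

    # Boundary 1 (Equations 19-21): b = a / d_mx**2.
    A1 = np.column_stack((1.0 / d + d / d_mx**2, np.ones_like(d)))
    (a1, c1), *_ = np.linalg.lstsq(A1, L, rcond=None)
    E1 = np.linalg.norm(A1 @ np.array([a1, c1]) - L) if a1 >= 0 else np.inf
    candidates.append((E1, (a1, a1 / d_mx**2, c1)))

    # Boundary 2 (Equations 22-24): a = 0.
    A2 = np.column_stack((d, np.ones_like(d)))
    (b2, c2), *_ = np.linalg.lstsq(A2, L, rcond=None)
    E2 = np.linalg.norm(A2 @ np.array([b2, c2]) - L) if b2 <= 0 else np.inf
    candidates.append((E2, (0.0, b2, c2)))

    # Boundary 3 (Equations 25-26): a = 0 and b = 0.
    c3 = float(np.mean(L))
    E3 = np.linalg.norm(c3 - L)
    candidates.append((E3, (0.0, 0.0, c3)))

    # Keep the boundary solution with the smallest error.
    return min(candidates, key=lambda t: t[0])[1]
```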
After the step 580 has been performed, following arrows 511, 508 processing continues at a step 590. At the step 590 the three curve models stored in the buffer 585 are used by the processor 240 to find the physical position of the tip 302 of the finger 150. The fingertip is generally in the three dimensional space defined by the coordinate axes 170 in Fig. 1, and may or may not be touching the surface 140. Let L denote the brightness of the fingertip, and let (f_x, f_y) denote the image coordinate of the fingertip 150, obtained at the steps 540 and 530 respectively. The value θ is obtained using Equation 6. If, coincidentally, θ = −25°, then based on Equation 27, L = a_0/d + b_0·d + c_0, and the distance of the fingertip 150 can be determined by applying the quadratic formula as follows:

0 = d·(a_0/d + b_0·d + c_0 − L)

d = [ −(c_0 − L) − √((c_0 − L)² − 4·b_0·a_0) ] / (2·b_0)    Equation 28

where:
* d is the distance of a position below the fingertip, corresponding to Y = 0 on the surface 140, from the physical coordinate origin
* L is the brightness of the finger 150' in the camera image 400
* the values a_0, b_0, c_0 are defined as described above for Equation 27.

Equation 28 solves for the distance d using one of the roots of a quadratic equation. In this DTBC arrangement, the other root of the quadratic solution does not give a valid result. In most cases, when θ is not one of the values −25°, 0° or 25°, the parameters a, b, c are linearly interpolated. For example, consider a value of θ between 0° and 25°. Using Equation 27 and Equation 28, the solution of d can be found by linearly interpolating the parameters (a_1, b_1, c_1) and (a_2, b_2, c_2) as follows:

d = [ −(c_θ − L) − √((c_θ − L)² − 4·b_θ·a_θ) ] / (2·b_θ)    Equation 29

where:

a_θ = (θ/25)·(a_2 − a_1) + a_1
b_θ = (θ/25)·(b_2 − b_1) + b_1    Equation 30
c_θ = (θ/25)·(c_2 − c_1) + c_1

and where:
* d is the distance of a position below the fingertip, corresponding to Y = 0 on the surface 140, from the physical coordinate origin
* L is the brightness of the finger 150' in the camera image 400
* the values a_1, b_1, c_1 are defined as described above for Equation 27
* the values a_2, b_2, c_2 are defined as described above for Equation 27
* θ is defined as described above for Equation 6.

At the step 590, once d is calculated, the fingertip position in three dimensional physical coordinates (X, Y, Z) is determined by the processor 240 as follows:

Z = d·cos(θ)

       [ X ]   [ r_11  r_12  r_13·Z + t_1 ]⁻¹  [ s_x   0    c_x ]⁻¹  [ x_f ]
  s′ · [ Y ] = [ r_21  r_22  r_23·Z + t_2 ]    [  0   s_y   c_y ]    [ y_f ]    Equation 31
       [ 1 ]   [ r_31  r_32  r_33·Z + t_3 ]    [  0    0     1  ]    [  1  ]

where:
* d is the distance of a position below the fingertip, corresponding to Y = 0 on the surface 140, from the physical coordinate origin;
* θ is defined as described above for Equation 6;
* X, Y, Z are the physical coordinates of the tip 302 of the finger 150;
* the value s′ is a scalar chosen to ensure the third value of the left-hand vector is 1;
* (x_f, y_f) are the image coordinates of the tip 420 of the finger 150' in the camera image;
* the notation M⁻¹ denotes the inverse of a matrix M;
* all other parameters are defined as described above for the pinhole camera model (Equation 1).
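A hedged sketch of the step 590 computation (Equations 29–31) is given below. The extension of the interpolation to negative θ, the guard on the discriminant, and the representation of the rotation R, translation t and intrinsic parameters as NumPy values are assumptions for this example rather than details taken from the specification.

```python
# Sketch of step 590: interpolate curve parameters for the fingertip angle,
# solve the quadratic for the distance d, then recover the physical position.
# R is a 3x3 NumPy rotation matrix, t a length-3 translation, and s_x, s_y,
# c_x, c_y the pinhole intrinsics from the camera calibration.
import math
import numpy as np

def interpolate_params(theta, p_m25, p_0, p_25):
    """Equation 30 (generalised to negative angles as an assumption)."""
    lo, hi = (p_0, p_25) if theta >= 0 else (p_0, p_m25)
    f = abs(theta) / 25.0
    return tuple(f * (h - l) + l for l, h in zip(lo, hi))

def solve_distance(L, a, b, c):
    """Equation 29: the smaller root of b*d^2 + (c - L)*d + a = 0."""
    disc = max((c - L) ** 2 - 4.0 * b * a, 0.0)  # guard against noise (assumption)
    return (-(c - L) - math.sqrt(disc)) / (2.0 * b)

def fingertip_position(L, theta_deg, xf, yf, params, R, t, sx, sy, cx, cy):
    a, b, c = interpolate_params(theta_deg, *params)  # params = (p_-25, p_0, p_25)
    d = solve_distance(L, a, b, c)
    Z = d * math.cos(math.radians(theta_deg))
    # Equation 31: fold the known Z into the extrinsic matrix, then invert.
    M = np.array([[R[0, 0], R[0, 1], R[0, 2] * Z + t[0]],
                  [R[1, 0], R[1, 1], R[1, 2] * Z + t[1]],
                  [R[2, 0], R[2, 1], R[2, 2] * Z + t[2]]])
    K = np.array([[sx, 0.0, cx], [0.0, sy, cy], [0.0, 0.0, 1.0]])
    v = np.linalg.inv(M) @ np.linalg.inv(K) @ np.array([xf, yf, 1.0])
    X, Y = v[0] / v[2], v[1] / v[2]  # scale s' chosen so the third entry is 1
    return X, Y, Z
```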
With this DTBC arrangement, calibration data is collected passively by the processor 240 when the user's finger 150 touches the surface 140 to interact with the projected display 160. Collected calibration data is used by the processor 240 to update the parameters of the curve models stored in the buffer 585. The updated curve models 585 reflect the current external conditions, which enables the system to function adaptively.

Alternative DTBC arrangements

One alternative DTBC arrangement determines a measurement of the confidence of the detection of a finger in the step 530. The Viola-Jones Object Detection Framework can detect multiple sub-images of the finger. The detection has high confidence when there are multiple sub-image regions that overlap with the detected sub-image.

Another alternative DTBC arrangement determines a measurement of the confidence of a detected touch at the step 550. A measure of confidence of the finger touching the surface can be determined based on the motion profile of the fingertip over time as the fingertip approaches the surface 140. For example, if the fingertip touch motion starts clearly above the surface (say Y > 0.03 metres above) and strikes the surface in one motion, then a high measure of confidence can be associated with the detected touch. If, however, the touch motion of the finger commences very close to the surface (say Y < 0.005 metres above) before striking the surface, then a low measure of confidence can be associated with the detected touch.

Another alternative DTBC arrangement processes the sample buffer 575 differently when the sample buffer 575 is full. In this alternative DTBC arrangement, a weighting is assigned to each sample. The weighting depends on when the sample was collected, the confidence of the detection of the fingertip, the confidence of the detection of the fingertip touching the surface, the influence of the sample (defined by the volume of a Voronoi cell), and how well it fits the existing curve models stored in the buffer 585. The sample with the least weighting is removed.

Another alternative DTBC arrangement uses more than 3 curve models stored in the buffer 585 to model the calibration space. The curve models could have angles other than θ = 25° or θ = −25°. For example, curve models within the calibration space could have θ values chosen to correspond to the positions of particular user interface (UI) elements within the projected display 160. However, because of the nature of the infrared light, it is recommended that at least one curve should have θ = 0°, which captures the brightest positions of the fingertip, and that the other curves should be scattered symmetrically on the left and on the right of this centre curve.

Another alternative DTBC arrangement calculates a weighting of the projected sample in the step 820. The weighting depends on the angular distance between the projected sample and its corresponding collected sample. For a collected sample with angle θ = θ_col, projected onto a curve model at θ_curve, θ_curve ∈ {−25°, 0°, 25°}, one possible weighting scheme is:

w_θ = 1 − |θ_curve − θ_col| / 50    Equation 32

where:
* w_θ is a weighting to be applied to a particular sample for a particular curve model
* θ_col is the angular position of the collected sample
* θ_curve is the angle of a radial calibration line corresponding to the particular curve model.

This weighting of a sample influences the curve parameters (a, b, c) such that the curve fits more closely to highly weighted projected samples. Instead of using Equation 18, a weighted least squares solution is used to find (a, b, c) as follows:

[a, b, c]ᵀ = (AᵀA)⁻¹ Aᵀ B    Equation 33

where A is the N×3 matrix whose kth row is [w_θ/pd_k, w_θ·pd_k, w_θ] and B is the N×1 vector whose kth entry is w_θ·pL_k, and where:
* (a, b, c) are the curve parameters being determined for a particular curve model
* w_θ is a weighting to be applied to a particular sample for the particular curve model
* pd_k is the distance of the kth projected sample for the particular curve model
* pL_k is the brightness of the kth projected sample for the particular curve model
* N is the number of projected samples being used to determine curve parameters for the particular curve model.
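A short Python sketch of the weighted least-squares fit of Equation 33, using the angular weighting of Equation 32 as reconstructed above, might look as follows; the helper names are invented for the example.

```python
# Sketch of Equations 32-33: weight each projected sample by its angular
# distance from the curve's calibration line, then solve a weighted
# least-squares fit by scaling the rows of A and B.
import numpy as np

def angular_weight(theta_curve, theta_col):
    """Equation 32 (as reconstructed): 1 at zero offset, 0 at 50 degrees."""
    return 1.0 - abs(theta_curve - theta_col) / 50.0

def fit_curve_weighted(pd, pL, weights):
    pd = np.asarray(pd, float)
    pL = np.asarray(pL, float)
    w = np.asarray(weights, float)
    A = np.column_stack((w / pd, w * pd, w))  # rows [w/pd_k, w*pd_k, w]
    B = w * pL
    params, *_ = np.linalg.lstsq(A, B, rcond=None)
    a, b, c = params
    return a, b, c
```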
Similarly, a weighted least squares solution can also be used at the boundary conditions, replacing Equation 20, Equation 23 and Equation 25.

INDUSTRIAL APPLICABILITY

The arrangements described are applicable to the computer and data processing industries and particularly to the computer vision industries.

The foregoing describes only some embodiments of the present invention, and modifications and/or changes can be made thereto without departing from the scope and spirit of the invention, the embodiments being illustrative and not restrictive.

In the context of this specification, the word "comprising" means "including principally but not necessarily solely" or "having" or "including", and not "consisting only of". Variations of the word "comprising", such as "comprise" and "comprises", have correspondingly varied meanings.

Claims (8)

1. A method of calibrating a user interface system, having a light source and a camera at a predetermined position and orientation relative to a surface for capturing an image of a pointer illuminated by the light source, the method comprising the steps of: measuring a brightness of the image of the pointer and a position of the pointer during a touch gesture, a position of the pointer corresponding to a point of contact between the pointer and the surface; determining a distance between the position of the pointer and a predetermined calibration line extending from the light source, the calibration line being located to one side of the pointer; estimating a brightness of the pointer at the predetermined calibration line by adjusting the measured brightness of the pointer using a predetermined model of light intensity variation of the light source and the determined distance to the calibration line; and updating calibration parameters for the user interface system using the estimated brightness at the predetermined calibration line.
2. A method according to claim 1, wherein the step of measuring a brightness of the image of the pointer comprises the steps of: identifying a predefined number of the brightest pixels in the image of the pointer; and defining the brightness of the image of the pointer to be dependent upon the brightness of the identified pixels.
3. A method according to claim 2, wherein the brightness of the image of the pointer is dependent upon a sum of the brightness of the identified pixels.
4. A method according to claim 1, wherein the step of measuring the position of the pointer during the touch gesture comprises the steps of: determining that the pointer is touching the surface by detecting a collision between an image of a tip of the pointer and an image of the reflection of the pointer; and applying a pinhole camera model to an image coordinate of the tip of the pointer to determine a location of the pointer in three dimensional space.
5. A method according to claim 1, wherein the calibration line is a radial calibration line.
6. A method according to claim 5, wherein the predetermined model of light intensity variation of the light source is a linear brightness decrease dependent upon an angular offset from a central axis of the light source.
7. An apparatus for calibrating a user interface system, the system comprising: a light source and a camera at a predetermined position and orientation relative to a surface for capturing an image of a pointer illuminated by the light source; a processor; and a memory storing a computer executable software program for directing the processor to execute a method comprising the steps of: measuring a brightness of the image of the pointer and a position of the pointer during a touch gesture, a position of the pointer corresponding to a point of contact between the pointer and the surface; determining a distance between the position of the pointer and a predetermined calibration line extending from the light source, the calibration line being located to one side of the pointer; estimating a brightness of the pointer at the predetermined calibration line by adjusting the measured brightness of the pointer using a predetermined model of light intensity variation of the light source and the determined distance to the calibration line; and updating calibration parameters for the user interface system using the estimated brightness at the predetermined calibration line.
8. A computer readable storage medium storing a non-transitory computer executable software program for directing a processor to execute a method of calibrating a user interface system having a light source and a camera at a predetermined position and orientation relative to a surface for capturing an image of a pointer illuminated by the light source, the method comprising the steps of: measuring a brightness of the image of the pointer and a position of the pointer during a touch gesture, a position of the pointer corresponding to a point of contact between the pointer and the surface; determining a distance between the position of the pointer and a predetermined calibration line extending from the light source, the calibration line being located to one side of the pointer; estimating a brightness of the pointer at the predetermined calibration line by adjusting the measured brightness of the pointer using a predetermined model of light intensity variation of the light source and the determined distance to the calibration line; and updating calibration parameters for the user interface system using the estimated brightness at the predetermined calibration line.

CANON KABUSHIKI KAISHA
Patent Attorneys for the Applicant/Nominated Person
SPRUSON & FERGUSON
AU2013206686A 2013-07-04 2013-07-04 Adaptive and passive calibration Abandoned AU2013206686A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU2013206686A AU2013206686A1 (en) 2013-07-04 2013-07-04 Adaptive and passive calibration

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
AU2013206686A AU2013206686A1 (en) 2013-07-04 2013-07-04 Adaptive and passive calibration

Publications (1)

Publication Number Publication Date
AU2013206686A1 true AU2013206686A1 (en) 2015-01-22

Family

ID=52388434

Family Applications (1)

Application Number Title Priority Date Filing Date
AU2013206686A Abandoned AU2013206686A1 (en) 2013-07-04 2013-07-04 Adaptive and passive calibration

Country Status (1)

Country Link
AU (1) AU2013206686A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110211225A (en) * 2019-06-05 2019-09-06 广东工业大学 Three-dimensional rebuilding method and system based on binocular vision

Similar Documents

Publication Publication Date Title
US9288373B2 (en) System and method for human computer interaction
US8854433B1 (en) Method and system enabling natural user interface gestures with an electronic system
JP5802247B2 (en) Information processing device
JP6201379B2 (en) Position calculation system, position calculation program, and position calculation method
KR20220053670A (en) Target-object matching method and apparatus, electronic device and storage medium
US9207779B2 (en) Method of recognizing contactless user interface motion and system there-of
US20190385285A1 (en) Image Processing Method and Device
US20120182396A1 (en) Apparatuses and Methods for Providing a 3D Man-Machine Interface (MMI)
WO2022042566A1 (en) Method and apparatus for recognizing three-dimensional gesture on the basis of markers, and device
CN104081307A (en) Image processing apparatus, image processing method, and program
KR20210069491A (en) Electronic apparatus and Method for controlling the display apparatus thereof
CN108200335A (en) Photographic method, terminal and computer readable storage medium based on dual camera
US20140354784A1 (en) Shooting method for three dimensional modeling and electronic device supporting the same
GB2530150A (en) Information processing apparatus for detecting object from image, method for controlling the apparatus, and storage medium
US11886643B2 (en) Information processing apparatus and information processing method
US8866921B2 (en) Devices and methods involving enhanced resolution image capture
US9041689B1 (en) Estimating fingertip position using image analysis
TWI424343B (en) Optical screen touch system and method thereof
US20170302908A1 (en) Method and apparatus for user interaction for virtual measurement using a depth camera system
US9229586B2 (en) Touch system and method for determining the distance between a pointer and a surface
US10599225B2 (en) Projection-based user interface
AU2013206686A1 (en) Adaptive and passive calibration
EP2975503A2 (en) Touch device and corresponding touch method
AU2013206691A1 (en) Three dimensional estimation using two dimensional surface calibration
Cheng et al. Fingertip-based interactive projector–camera system

Legal Events

Date Code Title Description
MK4 Application lapsed section 142(2)(d) - no continuation fee paid for the application