AU746844B2 - Practical optical range estimation using a programmable spatial light modulator - Google Patents

Practical optical range estimation using a programmable spatial light modulator

Info

Publication number
AU746844B2
AU746844B2
Authority
AU
Australia
Prior art keywords
image
transformed image
range
mask function
mask
Prior art date
Legal status
Ceased
Application number
AU43893/01A
Other versions
AU4389301A (en)
Inventor
Kieran Gerald Larkin
Julian Frank Andrew Magarey
Current Assignee
Canon Inc
Original Assignee
Canon Inc
Priority date
Filing date
Publication date
Priority claimed from AUPQ753300A0
Application filed by Canon Inc
Priority to AU43893/01A
Publication of AU4389301A
Application granted
Publication of AU746844B2
Anticipated expiration
Legal status: Ceased (current)


Landscapes

  • Image Processing (AREA)
  • Studio Devices (AREA)

Description

PRACTICAL OPTICAL RANGE ESTIMATION USING A PROGRAMMABLE SPATIAL LIGHT MODULATOR

FIELD OF INVENTION

The current invention relates to passive optical range estimation, and in particular to the use of programmable spatial light modulators in the optical path for this purpose.
DESCRIPTION OF BACKGROUND ART

In the act of projecting a three-dimensional scene onto a two-dimensional sensor to form an image, the third dimension, that of range or depth (distance from the camera), is lost. The aim of range estimation is to restore the lost third dimension so as to enrich the available information about the captured scene. Passive optical ranging methods attempt to do this by relying solely on the ambient light, without requiring the use of any exploratory signals, such as laser stripes. This considerably reduces the cost of ranging sensors. However, the disadvantage is that their success depends to some extent on the content of the scene, and in particular on the presence of sufficient image texture or detail.
Passive optical ranging methods nearly always rely on the laws of geometric optics, which govern the relationship between scene geometry, camera parameters, and the formed image under incoherent or broad-spectrum light conditions. The basic principle is to capture and digitise one or more images under controlled optical conditions, and analyse the resulting image(s) using the known camera parameters to infer the range of objects or regions in the scene, thus producing a "range map". In the particular case where the range map contains a range estimate for every pixel in the image, the range map is termed "full density". Examples of this technique include depth from focus (DFF), depth from defocus (DFD), and depth from stereo (DFS).
Both DFF and DFD ranging rely on the out-of-focus blurring caused by an optical system with limited depth-of-field. The blurring process may be modelled mathematically by a local averaging operation applied to a putative "sharp image", which is in focus at all points. The "sharp image" is that which would be obtained by a pinhole camera placed at the same location. The shape of the averaging kernel around a pixel, known as the Point Spread Function (PSF), is determined largely by the aperture of the camera, while its extent depends on the distance of the corresponding scene point from the focal plane of the system, that is, its range. DFF methods require a set of images captured from the same viewpoint with multiple focal planes. A "focus operator" is then applied to each image. The focus operator, usually of simple heuristic or ad-hoc design, returns a maximum when the PSF around each pixel has minimum extent. By selecting the image (and knowing the corresponding focus setting) for which a given image region has maximum focus, the range of the region may be inferred. This technique is used by most so-called "autofocus" cameras and ranging binoculars. However, producing a full density range map using DFF methods is extremely laborious and time-consuming.
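As a concrete illustration of the DFF principle described above, the following sketch (not part of the patent; the focus operator, array names and focal distances are illustrative assumptions) selects, for every pixel, the focal setting that maximises a simple local-contrast focus measure over a stack of images captured at different focus settings:

```python
import numpy as np
from scipy import ndimage

def dff_range_map(focal_stack, focal_distances, window=9):
    """Depth-from-focus sketch: pick, per pixel, the focal setting with the
    largest local focus response (here, locally averaged Laplacian energy).

    focal_stack     : array (N, H, W), images captured at N focus settings
    focal_distances : array (N,), focal-plane distance for each image
    """
    focus = np.empty(np.shape(focal_stack), dtype=float)
    for i, img in enumerate(focal_stack):
        lap = ndimage.laplace(np.asarray(img, dtype=float))      # simple focus operator
        focus[i] = ndimage.uniform_filter(lap ** 2, window)      # local focus energy
    best = np.argmax(focus, axis=0)                              # index of sharpest image per pixel
    return np.asarray(focal_distances)[best]                     # per-pixel range estimate
```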
DFD methods, by contrast, require only two images for range estimation, though more may be used to add robustness to the range map. The captured images need not be "in focus" at any point, so no focus operator is required. Instead, a precise, parametrisable form is assumed for the system PSF. Since the underlying "sharp image" is the same for all captured images (aside from known changes in magnification), its effects may be "divided out", leaving a parametrised transformation relationship between the blurred images in which the only unknown parameter is related to range. This parameter may be inferred, and a range map produced, by Maximum Likelihood (ML) or other statistical methods applied to the captured images. Such methods nearly always involve using local regions around each pixel over which range is assumed to be constant. This can be a disadvantage around regions containing rapid changes in range, such as at the edge of a discrete object.
The third common approach, DFS, uses two (or more) images of the same scene captured from different viewpoints. The difference between the captured images may, as in the DFD approach, be modelled by a range-parametrised transformation, so similar statistical methods may be applied to estimate a range map from the captured images.
However, points near the edge of objects are visible in one image of the scene but not in the other (the so-called "occlusion" problem). In this case the image transformation is no longer one-to-one. The extra processing required to handle occlusion adds expense to DFS methods.
Midway between the DFD and DFS approaches is the range-by-optical-differentiation (ROD) approach. It relies on precise control of the system PSF, made possible by the use of a spatial light modulator (SLM) in the optical path of the system.
The space-varying transparency or "mask function" imposed by the SLM sets the PSF of the camera. The SLM is programmable, so the PSF may be rapidly changed between captured images. Using two appropriately designed mask functions, the two captured images may be transformed into what is effectively a differential stereo pair. That is, the 554439.doc second image is the derivative of the first with respect to viewpoint position. Simple arithmetic manipulations of the images then yield a full-density range map.
A feature of the ROD approach is that, in principle, it is inherently localised: it does not require a local region of assumed constant range around a pixel to produce a range estimate at that pixel. It is therefore better suited to handling range discontinuities, which are common in natural scenes, than the other methods previously mentioned.
Unfortunately the ROD approach also has drawbacks. Firstly, the recommended method of achieving an exact differential stereo pair using only two images is theoretically incorrect for any practical PSF shape. Secondly, the algorithm is highly susceptible to sensor noise which always accompanies captured images.
SUMMARY OF THE INVENTION

The present invention, based on the ROD principle, addresses the shortcomings of prior art ROD range estimation methods, while maintaining the advantages of the method. It does this by using three mask functions to produce three captured images which may be manipulated to obtain not one, but two exact differential stereo pairs.
Processing these pairs to obtain a single range map provides a significant improvement over standard ROD in noise robustness.
Therefore, according to a first aspect of the invention, there is provided a range sensor for determining range information of a visual scene, said range sensor comprising: a single optical path; an optical sensor for recording optical intensity of light travelling from said visual scene through said optical path; an aperture controlling means for alternatively providing at least a first mask function, a second mask function and a third mask function for masking said optical path to prevent a portion of said optical intensity of said light from reaching said sensor; and range determination means for determining the range of points in said visual scene from at least three images, captured with said at least three mask functions respectively, said images being transformed to form at least a first and a second transformed image, wherein said second transformed image is substantially a derivative of said first transformed image.
According to a second aspect of the invention, there is provided a method for determining range information of points in a visual scene, said method comprising the steps of: controlling an aperture controlling means with a first mask function; capturing a first image of said visual scene on an optical sensor; controlling said aperture controlling means with a second mask function; capturing a second image of said visual scene on said optical sensor; controlling said aperture controlling means with at least a third mask function, said mask functions alternatively preventing portions of light intensity from reaching said optical sensor; capturing at least a third image of said visual scene on said optical sensor; and determining the range of points in said visual scene from at least three of said images, captured with said mask functions respectively, said images being transformed to form at least a first and a second transformed image, wherein said second transformed image is substantially a derivative of said first transformed image.
BRIEF DESCRIPTION OF THE DRAWINGS

A preferred embodiment of the present invention will now be described with reference to the drawings, in which:

Fig. 1 is a block diagram showing an apparatus according to the preferred embodiment of the invention;

Fig. 2 is a block diagram showing the optical system of the apparatus shown in Fig. 1;

Figs. 3A to 3C contain plots of the proposed mask functions to be applied to the SLM of the apparatus illustrated in Fig. 1;

Fig. 4A illustrates a process for determining a range estimate map from three captured images;

Fig. 4B illustrates a process for determining a range estimate map from five captured images; and

Fig. 5 is a block diagram of the optical system showing the limits of performance of the apparatus shown in Fig. 1.
DETAILED DESCRIPTION OF EMBODIMENTS OF THE INVENTION

Fig. 1 illustrates a single-camera passive stereo range finding system 10. The range finding system 10 includes an optical system 200 and a processor module 100.
Reflected light from an object 300 passes along an optical axis 400, through a programmable spatial light modulator (SLM) 210 and an image forming device(s) 220, such as a lens or compound lens, and is projected onto an image capture mechanism 230.
The programmable SLM 210 is preferably placed in an aperture plane 240 of the optical system 200 in such a way that it forms the smallest aperture of the system 200. In this way a mask function imposed on the system 200 by the programmable SLM 210 sets the aperture function of the system 200. The image capture mechanism 230 consists of a 2D array of photosensitive elements. A processor module 100 controls the programmable SLM 210 and analyses captured images from the image capture mechanism 230.
The processor module 100 typically includes at least one processor unit 114, a memory unit 118, for example formed from semiconductor random access memory (RAM) and read only memory (ROM), input/output interfaces including an image interface 122 for receiving image data from the image capture mechanism 230, and an I/O interface 116 for controlling the programmable SLM 210. A storage device 120 is also provided. The components 114 to 120 of the processor module 100 typically communicate via an interconnected bus 130 and in a manner that results in a conventional mode of operation of the processor module 100 known to those in the relevant art.
Preferably the processor module 100 is implemented together with the optical system 200 in a camera (not illustrated). Alternatively, a conventional computer (not illustrated) may be connected to the optical system 200 for controlling the programmable SLM 210 and analysing captured images from the image capture mechanism 230.
Referring to Fig. 2, the image capture mechanism 230 is positioned d0 units behind the aperture plane 240 in such a way that the focal distance Z0 of the system 200 is in the middle of an expected range Z of the scene. The Lens Law relates distance d0 to focal distance Z0 as follows:

1/f = 1/Z0 + 1/d0     (1)

where f is the focal length of the system 200.
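For orientation, the Lens Law above fixes the sensor position for a chosen focal plane. The following minimal sketch (function names and example values are illustrative assumptions, not taken from the patent) solves Equation (1) for d0 and, for comparison, evaluates the blur dilation factor s0 of Equation (16) introduced later in the description:

```python
def sensor_distance(f, Z0):
    """Lens Law (Equation (1)): 1/f = 1/Z0 + 1/d0, solved for d0."""
    return 1.0 / (1.0 / f - 1.0 / Z0)

def dilation_factor(f, Z0, Z):
    """Blur dilation s0 of a point at range Z (Equation (16), given later)."""
    return f * (Z0 - Z) / (Z * (Z0 - f))

# Example: 25 mm lens focused at 2 m (all lengths in mm)
d0 = sensor_distance(25.0, 2000.0)          # ~25.32 mm behind the aperture plane
s0 = dilation_factor(25.0, 2000.0, 1500.0)  # positive: point nearer than the focal plane
```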
The processor module 100 controls the programmable SLM 210 to impose a first mask M0 on the optical system 200 and, at a first time instant, controls the image capture mechanism 230 to capture a first image r0. At a second sampling instant, a second mask M1 is applied on the programmable SLM 210 and a second image r1 is captured on the image capture mechanism 230. Finally, a third mask M2 is applied and a third image r2 is captured. The three images r0, r1 and r2 are then processed by the processor module 100 as described below and a range map Z(xp) is produced, where xp is a subgrid of an image sampling grid. An additional output may be a confidence map C(xp), which contains for each range estimate Z(xp) an absolute level of confidence in that estimate Z(xp). The confidence is defined as the inverse of the variance σZ² of the range estimate Z(xp), expressed as a multiple of the variance of the (assumed) additive white Gaussian sensor noise. Such confidence measures are useful for robust postprocessing, such as surface fitting, to be carried out on the range map.
Mask definitions.
The desired outcome of the mask definitions is a set of "virtual masks" ei(u), with the following derivative properties:

e0(u) = M(u)
e1(u) = η1 ∂M(u)/∂u
e2(u) = η2 ∂²M(u)/∂u²     (2)

where u = [u v]^T are the aperture plane coordinates, M(u) is a basic mask function and η1 and η2 are (known) constants.
The virtual masks ei(u) are obtained by linear transformation of the set of "physical masks" Mi(u) as applied to the optical system 200 by the programmable SLM 210, as follows:

[e0(u), e1(u), e2(u)]^T = Q [M0(u), M1(u), M2(u)]^T     (3)

The mask value represents the transparency of the programmable SLM 210 to ambient light. Therefore, the physical masks Mi(u), to be physically achievable, must have values in the range [0, 1] over the full dimensions of the SLM 210, and zero outside.
In the preferred embodiment, the basic mask function M(u) is a normalised Gaussian of standard deviation γ, chosen such that the mask function M(u) is negligible at the edges of the SLM 210, as follows:

M(u) = (1 / (2πγ²)) exp(−|u|² / (2γ²))     (4)

Therefore, the actual choice of γ depends on the physical dimensions of the SLM 210 and the quantisation stepsizes of its possible transparency values. For example, for an 8-bit grayscale SLM, the quantisation stepsize is 1/256. If the radius of the SLM 210 is c, then γ should be chosen such that

exp(−c² / (2γ²)) < 1/256     (5)

In the preferred embodiment γ = 0.3c is used.
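A minimal numerical sketch of this construction (the grid size, pixel coordinates and names are assumptions for illustration, not taken from the patent) samples the normalised Gaussian basic mask of Equation (4) on the SLM with γ = 0.3c and checks the edge condition of Equation (5):

```python
import numpy as np

def gaussian_basic_mask(n_pixels=256, radius_c=1.0):
    """Basic mask M(u) of Equation (4) sampled on an n_pixels x n_pixels SLM
    whose half-width (radius) is radius_c, using gamma = 0.3 * c."""
    gamma = 0.3 * radius_c
    coords = np.linspace(-radius_c, radius_c, n_pixels)
    uu, vv = np.meshgrid(coords, coords, indexing="xy")
    M = np.exp(-(uu**2 + vv**2) / (2.0 * gamma**2)) / (2.0 * np.pi * gamma**2)
    # Equation (5): the Gaussian should fall below one grey level of an
    # 8-bit SLM at the aperture edge, relative to its central value.
    edge_ratio = np.exp(-radius_c**2 / (2.0 * gamma**2))
    assert edge_ratio < 1.0 / 256.0
    return M

M = gaussian_basic_mask()
```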
It may be shown that, using the following preferred physical mask definitions (Equations (6) to (8)), of which the first is

M0(u) = β0 exp(−|u|² / (2γ²))     (6)

and M1(u) and M2(u) (Equations (7) and (8)) are similarly Gaussian-based functions scaled by constants β1 and β2 (the βi being chosen such that the physical masks Mi(u) are physically achievable), the required relations of Equations (2) and (3) are satisfied with a matrix Q and constants η1 and η2 as given in Equations (9) and (10). Plots of the three physical masks M0, M1 and M2 of Equations (6) to (8) are shown in Figs. 3A to 3C respectively for a typical value of γ = 3 (thus requiring β0 = 1 and β1 = β2 = 1/10.33).
At this point it is noted that previous work has attempted to define physically achievable masks M0 and M1 via an invertible 2 by 2 matrix P' acting on a Gaussian mask M(u) and its derivative as follows:

[M0(u), M1(u)]^T = P' [M(u), ∂M(u)/∂u]^T     (11)

The virtual masks ei(u), like the physical masks Mi(u), should be identically zero outside the dimensions of the programmable SLM 210. This means that both the basic mask function M(u) and its derivative ∂M(u)/∂u must have this property. However, it may be shown that for any symmetrical, continuous, twice differentiable mask M with this property, it is not possible to achieve the constraint of Equation (11) exactly. In other words, to achieve true optical differentiation, at least three masks must be used.
Many masks other than the Gaussian are possible. The following are further examples of basic mask functions M(u) using sinusoids or polynomials to achieve the exact constraint matching at the aperture edges. They are separable functions and strictly valid over a square region which could correspond to a square region of the programmable SLM 210.
M(u) = cos³(πu / (2c)) cos³(πv / (2c))   for −c ≤ u, v ≤ c;   0 otherwise     (12)

M(u) = (1 − u²/c²)³ (1 − v²/c²)³   for −c ≤ u, v ≤ c;   0 otherwise     (13)

The virtual masks ei(u) follow from differentiation of the basic mask function M(u) as in Equation (2), with η1 = η2 = 1.
It may be shown in each case that an invertible matrix P may be found to transform the virtual masks ei(u) to physically achievable masks Mi(u). The matrix Q satisfying Equation (3) is then the simple inverse of matrix P.
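To make the edge-matching requirement concrete, the short sketch below (an illustration only: the grid size and the finite-difference check are assumptions, and the cos³ form follows the reconstruction of Equation (12) above) builds the separable sinusoidal basic mask and checks numerically that both the mask and its first derivative vanish at the aperture edge, as the virtual masks require:

```python
import numpy as np

def cos_cubed_mask(n=257, c=1.0):
    """Separable basic mask of Equation (12): cos^3 in u times cos^3 in v on [-c, c]^2."""
    coords = np.linspace(-c, c, n)
    uu, vv = np.meshgrid(coords, coords, indexing="xy")
    M = np.cos(np.pi * uu / (2 * c)) ** 3 * np.cos(np.pi * vv / (2 * c)) ** 3
    return M, coords

M, coords = cos_cubed_mask()
dMdu = np.gradient(M, coords, axis=1)   # numerical d/du, stands in for e1 up to a constant
# Both the mask and its derivative are (numerically) negligible at the left edge u = -c:
print(np.abs(M[:, 0]).max(), np.abs(dMdu[:, 0]).max())
```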
In the next section we set out how the three images ri obtained from the required three physical masks Mi(u) may be optimally utilised to achieve range estimation.
Image formation
According to the laws of geometric optics, a point light source on the optical axis 400 at range Z is imaged on the image capture mechanism 230 as a scaled and dilated version of the corresponding mask function. This point image is the PSF for the system 200, and is defined as:

hi(x) = (1 / s0²) Mi(x / s0)     (14)

for the physical PSFs (from the physical masks Mi(u)), and

pi(x) = (1 / s0²) ei(x / s0)     (15)

for the virtual PSFs (from the virtual masks ei(u)), where the image plane coordinates are x = [x y]^T and the dilation factor s0 is given by

s0 = f (Z0 − Z) / (Z (Z0 − f))     (16)

It follows that the same linear relationship exists between the physical and virtual PSFs hi(x) and pi(x) as between the physical and virtual masks Mi(u) and ei(u), as set out in Equation (3). Therefore, the virtual PSFs pi(x) may be written as:

[p0(x), p1(x), p2(x)]^T = Q [h0(x), h1(x), h2(x)]^T     (17)

Furthermore, because of Equation (15), the derivative relationships between the virtual masks ei(u) as set out in Equation (2) are inherited by the virtual PSFs pi(x), with the dilation factor s0 as an extra multiplier, as follows:

p1(x) = s0 η1 p0x(x)
p2(x) = s0 (η2 / η1) p1x(x)     (18)

where an x-subscript indicates partial differentiation in the horizontal direction.
If it is assumed that the captured scene is in a plane at range Z, the image blurring process may be modelled as a convolution of a perfectly focused image f with the physical and virtual PSFs hi(x) and pi(x), as follows:

ri(x) = ∫ f(x − t) hi(t) dt     (19)

for the physical images (from the physical PSFs hi(x)), and

gi(x) = ∫ f(x − t) pi(t) dt     (20)

for the virtual images (from the virtual PSFs pi(x)).

Because convolution is a linear process, the physical and virtual images are related in the same fashion as the physical and virtual PSFs hi(x) and pi(x), as set out in Equation (17):

[g0(x), g1(x), g2(x)]^T = Q [r0(x), r1(x), r2(x)]^T     (21)

Also, it follows from Equation (20) that the virtual images gi(x) inherit the same derivative relationships as the virtual PSFs pi(x), as set out in Equation (18). Therefore, the virtual images gi(x) may be written as:

g1(x) = s0 η1 g0x(x)
g2(x) = s0 (η2 / η1) g1x(x)     (22)

In the general case, where the range Z of the radiating points that make up the scene varies over the captured image ri(x), the parameter s0 also varies as a function of position on the image capture mechanism 230. Furthermore, the convolutions in Equations (19) and (20) must be replaced with more general linear space-variant transformations. It can be shown, however, that the derivative relations set out in Equation (22) between the virtual images gi(x) substantially still hold, provided range Z(x) does not change too rapidly with the image plane coordinates x.
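The transform of Equation (21) is simply a per-pixel 3 x 3 linear map of the captured images. The sketch below applies it to a stack of three images; the matrix shown is a placeholder, not the patent's Q of Equation (9), and the names are assumptions:

```python
import numpy as np

def physical_to_virtual(r_stack, Q):
    """Apply Equation (21): stack of physical images (3, H, W) -> virtual images (3, H, W)."""
    r = np.asarray(r_stack, dtype=float)
    # einsum contracts the image index: g_i(x) = sum_j Q[i, j] * r_j(x) at every pixel
    return np.einsum("ij,jhw->ihw", np.asarray(Q, dtype=float), r)

# Example with a placeholder (invertible) matrix standing in for the patent's Q:
Q_demo = np.array([[1.0, 1.0, 0.0],
                   [0.0, -1.0, 1.0],
                   [-1.0, 2.0, 2.0]])
g0, g1, g2 = physical_to_virtual(np.random.rand(3, 480, 640), Q_demo)
```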
Obtaining the range map

Equation (22) contains two linear constraints at each pixel x on the unknown space-varying range-related parameter s0(x) in terms of the virtual images gi(x). These may be solved jointly in any manner desired. In the preferred implementation, a maximum-likelihood (ML) estimation is used to obtain parameter s0 independently over disjoint tiles of size p by p pixels. The tiles are denoted by R(xp), where positions xp are the tile centres, evenly spaced on a grid with spacing p in each direction.
If independent, equivariant Gaussian noise of variance σ² is assumed on the physical images ri, the noise being introduced by sensor noise, quantisation etc., the ML range estimate is obtained by forming the following error criterion at each position xp:

Ex(xp, s) = Σ over x in R(xp) of [ (g1(x) − s η1 g0x(x))² + λx (η1 g2(x) − s η2 g1x(x))² ]     (23)

where the weighting parameter λx is chosen according to maximum-likelihood estimation theory, to equalise the variances of the two terms of the sum; its value (Equation (24)) depends on the noise variances of the virtual images. In the case of Gaussian noise on the physical images ri as assumed above, the variance of the resulting noise in the virtual images may be calculated by examining Equation (21) as follows:

σ²(gi) = σ² Σ over j of Qij²     (25)

In the preferred (Gaussian mask) embodiment as set out in Equations (6) to (10), the noise variances σ²(g0), σ²(g1) and σ²(g2) take the specific values given in Equations (26) to (28). The error criterion in Equation (23) may be minimised independently with respect to parameter s at each tile centre xp to give an estimate of parameter s0(xp). Because the error criterion Ex is a quadratic function of parameter s, the global minimum ŝ of the error function may be easily located as follows:

ŝ(xp) = − [d/ds Ex(xp, s) at s = 0] / [d²/ds² Ex(xp, s) at s = 0]     (29)

The minimum location ŝ at each tile centre xp may be transformed to a range estimate Z by inverting Equation (16) as follows:

Z(xp) = f Z0 / (ŝ(xp) (Z0 − f) + f)     (30)

In practice, a lookup table derived from a calibration step may be used to obtain a range estimate Z from the corresponding minimum location ŝ.
Additionally, the confidence measure C(xp), derived ultimately from the amount of image texture around each pixel xp, may be computed as the inverse of the variance σZ² of the range estimate, relative to the additive noise variance σ², as follows:

C(xp) = 1 / σZ²,   with   σZ² = (∂Z/∂s)² σŝ²     (31)

where, from Equation (16),

∂Z/∂s = − ((Z0 − f) / (f Z0)) Z²     (32)

and

σŝ² = Σ over x in R(xp), Σ over i = 0..2 of (∂ŝ/∂ri(x))²     (33)

is the variance of the estimate of parameter s0(xp). This may be found by the use of Equation (21) to transform from physical images ri(x) to virtual images gi(x), followed by differentiation of Equation (29) with respect to each contributing pixel x of each virtual image gi(x).
The result is the expression given in Equation (34), which involves sums of the squared derivative images over the tile R(xp) and the second derivative of the error criterion Ex with respect to s.

The choice of the grid spacing p is a trade-off between the resolution of the estimated range map and the robustness to image noise of each estimate. To obtain a full-density range and confidence map, set p = 1.
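Pulling Equations (23), (29) and (30) together, the sketch below estimates ŝ and the range over each p x p tile. It is an illustration under assumptions: the weighting λx is taken as a constant, the derivative and smoothed virtual images are assumed already computed (see Table 1 below), and all function and variable names are placeholders rather than the patent's:

```python
import numpy as np

def tile_range_map(g1, g0x, g2, g1x, eta1, eta2, f, Z0, p=8, lam=1.0):
    """ML estimate of s0 per tile (Equations (23) and (29)) and conversion to
    range via Equation (30). g0x, g1x are horizontal-derivative images."""
    H, W = g1.shape
    Z = np.empty((H // p, W // p))
    for i in range(H // p):
        for j in range(W // p):
            sl = (slice(i * p, (i + 1) * p), slice(j * p, (j + 1) * p))
            # E(s) = sum (g1 - s*eta1*g0x)^2 + lam * sum (eta1*g2 - s*eta2*g1x)^2
            # -dE/ds|0 = 2*eta1*sum(g1*g0x) + 2*lam*eta1*eta2*sum(g2*g1x)
            # d2E/ds2  = 2*eta1^2*sum(g0x^2) + 2*lam*eta2^2*sum(g1x^2)
            num = eta1 * np.sum(g1[sl] * g0x[sl]) + lam * eta1 * eta2 * np.sum(g2[sl] * g1x[sl])
            den = eta1**2 * np.sum(g0x[sl]**2) + lam * eta2**2 * np.sum(g1x[sl]**2)
            s_hat = num / den if den > 0 else 0.0           # Equation (29)
            Z[i, j] = f * Z0 / (s_hat * (Z0 - f) + f)       # Equation (30)
    return Z
```

Setting p = 1 (with the sums then taken over a single pixel) corresponds to the full-density case described above, at the cost of greater noise sensitivity.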
A structure 50 of an algorithm for obtaining the estimated range map Z(xp) from three physical captured images r0(x) to r2(x) is shown in Fig. 4A. In a process 55, the processor module 100 controls the programmable SLM 210 to impose the first mask M0 on the optical system 200 and controls the image capture mechanism 230 to capture a first image r0. This is followed by applying the second mask M1 to the programmable SLM 210 before the second image r1 is captured. Finally, a third mask M2 is applied and a third image r2 is captured. In process 60, the three (physical) images r0, r1 and r2 are then transformed to virtual images gi(x), using Equation (21).
Note that the captured images ri(x), and therefore also the virtual images gi(x), are sampled rather than in the continuous domain, so differentiation can only be carried out approximately by a finite differencing operation on neighbouring pixels. Any finite difference kernel may be used, but in the preferred implementation, a symmetrical five-tap convolution operator in the horizontal direction is used to differentiate the virtual images g0 and g1 with respect to the direction parameter x, and the corresponding smoothing operator is applied horizontally to the virtual images g1 and g2. In process 62, the smoothing operator is applied vertically to the virtual images g0, g1 and g2 to produce vertically smoothed versions of these virtual images. Process 65 differentiates the vertically smoothed virtual images g0 and g1 horizontally and applies the smoothing operator horizontally to the vertically smoothed virtual images g1 and g2. These two convolution kernels are set out in Table 1. The outputs of process 65 are the partially differentiated (in the horizontal direction) virtual images g0x and g1x, as well as the smoothed (in both directions) virtual images g1 and g2.
Tap    Differentiating    Smoothing
-2     -0.09205           0.02475
-1     -0.31838           0.24629
 0      0.00000           0.45789
 1      0.31838           0.24629
 2      0.09205           0.02475
Process 70 uses the outputs from process 65 to form the error criterion Ex(xp,s) by substituting the smoothed virtual images into Equation Next, using Equation (29), the minimum location s at each tile center x is found in process 75. Finally, process transforms the minimum locations 9 to the estimated range map using Equation or a look-up table.
Optionally, a confidence map C(xp) may be obtained in process 85 using Equations and (34).
Robustness of the estimated range map Z(xp) may be increased using two more mask functions M 3 and M 4 respectively to capture images r 3 and r4. The mask functions
M
3 and M 4 implement differentiation in the vertical direction y. These may be obtained by rotating the masks M, and M 2 through 90 degrees. The physical images ro to r 4 may be transformed to virtual images go to g4 using an extended version of Equation (21): go(x) g2W g,(x) g 3 (x) g 4 x) g,(W_ r, (x) rW(x) r4(x) where Q is a 5 by 5 matrix given by: 554439.doc r i 1 26 0 0 0 0 0 0 1 0 0
P
1
P
2 A1 1 (1+7 2 1 1 Q -flo 21 2fl 0 0 (36) 0 0 0 -1 1 F P 2/82 A similar relationship to Equation (22) may be written between the virtual images g3 and g4: g 3 y(x) g4 (x)l =So77 l g'y(X) (37) 5 A second error criterion Ey(xp,s) may be formed over each tile as follows: ooo* and combined with error criterion Ex(xp,s) to form a fully two-dimensional error criterion E(xp,s) as follows: E,(xs) (39) 10 which may be minimised in the above fashion to form a range map Z(xp) with spacing p. This has the additional benefit of removing any bias towards the horizontal direction x that may be present in the process described. The main drawback, along with added processing time, is that the scene must remain static for five image capture intervals rather than three.
Fig. 4B shows a structure 51 of an algorithm for obtaining the estimated range map Z(xp) from five physical captured images ro(x) to r 4 The processor module 100 controls the programmable SLM 210 in process 56 to impose the masks Mo to M 4 and the image capture mechanism 230 to capture each of the images ro to r 4 In process 61, the physical images ro, rl, r 2 r 3 and r 4 are then transformed to virtual images gi(x), using Equation 554439.doc 16- In process 63, the smoothing operator is applied vertically to the virtual images go, gl, g2, g3 and g4 to produce vertically smoothed virtual images go, g 1 g 2 g 3 and g4.
Virtual images go and g3 are also differentiated vertically to form goy and g3y. Process 67 follows by differentiating the vertically smoothed virtual images go and g, horizontally and by applying the smoothing operator horizontally to the vertically smoothed virtual images g 3 and g 4 and vertically differentiated virtual images goy and g3y. The outputs of process 66 are partially differentiated virtual images go0y, go, g and g3y, as well as smoothed (in both directions) virtual images g 2 g 3 and g 4 Process 71 uses the outputs from process 66 to form the error criterion E(xp,s) using the smoothed virtual images in Equations (38) and Next, the minimum location 9 at each tile center xp is found in process 76. Finally, process 81 transforms the minimum locations to the estimated range map Z(xp) using Equation (30) or a look-up table.
S. Optionally, a confidence map C(xp) may be obtained in process 86.
Limits to performance Referring to Fig. 5, the range estimation apparatus 10, shown in Fig. 1, has a finite, conical "field of view" 420. If a bright point 320 at range Z is further than some limit from the optical axis 400, its range Z cannot be estimated. This limit is imposed by the requirement that the blurred image of each bright point is wholly contained within the dimensions of the image capture mechanism 230, having a radius D. This requirement is satisfied for a bright point at a distance H from the optical axis provided Ko(Z)H <D-c.so(Z) where Ko(Z) is the absolute magnification of the optical system, given by fz0 Ko (41) Z(Zo f) Substituting in Equation (40) for Ko(Z) and so(Z) gives the radius Hax of the field of view at range Z as
H.
x f D(Z f)-c (42) ax fzo 554439.doc r:- 17- Note that 0 when Z given by
Z
n fZ (43) cf D(Z o f) This is the absolute lower limit of usefulness for the ranging apparatus The upper limit is less well-defined, but is noticeable as a dropoff in confidence measure C(xp), and a consequent increase in noise sensitivity. As points are further from the focal plane Zo, their images are more blurred and the image spatial gradients are reduced, reducing the confidence C(xp) in the corresponding range estimate Z(xp). At some point, dependent on the signal-to-noise ratio of the optical system 200, this effect makes the noise sensitivity so large that the range estimates Z(x) are effectively useless.
The foregoing describes only some embodiments of the present invention and modifications, obvious to those skilled in the art, can be made thereto without departing *from the scope of the present invention.
The term "comprising" is used herein in the inclusive sense to mean "including" or "having" and not in the exclusive sense meaning "consisting only of'.
554439.doc 9

Claims (4)

1. A range sensor for determining range information of a visual scene, said range sensor comprising: a single optical path; an optical sensor for recording optical intensity of light travelling from said visual scene through said optical path; an aperture controlling means for alternatively providing at least a first mask function, a second mask function and a third mask function for masking said optical path to prevent a portion of said optical intensity of said light from reaching said sensor; and range determination means for determining the range of points in said visual scene from at least three images, captured with said at least three mask functions respectively, said images being transformed to form at least a first and a second transformed image, wherein said second transformed image is substantially a derivative of said first transformed image.
2. A range sensor as claimed in claim 1, wherein said transform results in a third transformed image, wherein said third transformed image is substantially a derivative of said second transformed image.
3. A range sensor for determining range information of a visual scene, said range sensor comprising: a single optical path; an optical sensor for recording optical intensity of light travelling from said visual scene through said optical path; an aperture controlling means for alternatively providing at least a first mask function, a second mask function, a third mask function, a fourth mask function and a fifth mask function for masking said optical path to prevent a portion of said optical intensity of said light from reaching said sensor, said fourth mask function and said fifth mask function being rotated versions of said second mask function and said third mask function respectively; and range determination means for determining the range of points in said visual scene from at least five images, captured with said at least five mask functions respectively, said
images being transformed to form at least a first, a second and a third transformed image, wherein said second transformed image is substantially a derivative in a first direction of said first transformed image and said third transformed image is substantially a derivative of said first transformed image in a rotated direction with respect to said first direction.
4. A range sensor as claimed in claim 3, wherein said transform results in fourth and fifth transformed images, wherein said fourth transformed image is substantially a derivative in a first direction of said second transformed image and said fifth transformed image is substantially a derivative of said third transformed image in a rotated direction with respect to said first direction.

5. A range sensor as claimed in any one of claims 1 to 4, wherein said optical sensor comprises a CCD array.

6. A range sensor as claimed in any one of claims 1 to 5, wherein said aperture controlling means comprises a programmable spatial light modulator having spatially varying transparency which is controllable to form said mask functions.

7. A method for determining range information of points in a visual scene, said method comprising the steps of: controlling an aperture controlling means with a first mask function; capturing a first image of said visual scene on an optical sensor; controlling said aperture controlling means with a second mask function; capturing a second image of said visual scene on said optical sensor; controlling said aperture controlling means with at least a third mask function, said mask functions alternatively preventing portions of light intensity from reaching said optical sensor; capturing at least a third image of said visual scene on said optical sensor; and determining the range of points in said visual scene from at least three of said images, captured with said mask functions respectively, said images being transformed to form at least a first and a second transformed image, wherein said second transformed image is substantially a derivative of said first transformed image.

8. A method as claimed in claim 7, wherein said transform results in a third transformed image, wherein said third transformed image is substantially a derivative of said second transformed image.
9. A method for determining range information of points in a visual scene, said method comprising the steps of: controlling an aperture controlling means with a first mask function; capturing a first image of said visual scene on an optical sensor; controlling said aperture controlling means with a second mask function; capturing a second image of said visual scene on said optical sensor; controlling said aperture controlling means with a third mask function; capturing a third image of said visual scene on said optical sensor; controlling said aperture controlling means with a fourth mask function being a rotated version of said second mask function; capturing a fourth image of said visual scene on said optical sensor; controlling said aperture controlling means with at least a fifth mask function, said fifth mask function being a rotated version of said third mask function, said mask functions alternatively preventing portions of light intensity from reaching said optical sensor; capturing at least a fifth image of said visual scene on said optical sensor; and determining the range of points in said visual scene from at least five of said images, captured with said mask functions respectively, said images being transformed to form at least a first, a second and a third transformed image, wherein said second transformed image is substantially a derivative in a first direction of said first transformed image and said third transformed image is substantially a derivative of said first transformed image in a rotated direction with respect to said first direction.

10. A method as claimed in claim 9, wherein said transform results in a fourth and a fifth transformed image, wherein said fourth transformed image is substantially a derivative in a first direction of said second transformed image and said fifth transformed image is substantially a derivative of said third transformed image in a rotated direction with respect to said first direction.

11. A method as claimed in any one of claims 7 to 10, wherein said optical sensor comprises a CCD array.

12. A method as claimed in any one of claims 7 to 11, wherein said aperture controlling means comprises a programmable spatial light modulator having spatially varying transparency which is controllable to form said mask functions.

13. A range sensor substantially as described herein with reference to the accompanying drawings.

14. A method for determining range information, said method being substantially as described herein with reference to the accompanying drawings.

15. A camera comprising a range sensor according to any one of claims 1 to 6 or 13.

16. A camera as claimed in claim 15 wherein said single optical path comprises the only optical path by which said camera records images.

Dated this Fourteenth day of May 2001
CANON KABUSHIKI KAISHA
Patent Attorneys for the Applicant: Spruson & Ferguson
AU43893/01A 2000-05-16 2001-05-15 Practical optical range estimation using a programmable spatial light modulator Ceased AU746844B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU43893/01A AU746844B2 (en) 2000-05-16 2001-05-15 Practical optical range estimation using a programmable spatial light modulator

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
AUPQ7533 2000-05-16
AUPQ7533A AUPQ753300A0 (en) 2000-05-16 2000-05-16 Practical optical range estimation using a programmable spatial light modulator
AU43893/01A AU746844B2 (en) 2000-05-16 2001-05-15 Practical optical range estimation using a programmable spatial light modulator

Publications (2)

Publication Number Publication Date
AU4389301A AU4389301A (en) 2001-11-22
AU746844B2 true AU746844B2 (en) 2002-05-02

Family

ID=25626547

Family Applications (1)

Application Number Title Priority Date Filing Date
AU43893/01A Ceased AU746844B2 (en) 2000-05-16 2001-05-15 Practical optical range estimation using a programmable spatial light modulator

Country Status (1)

Country Link
AU (1) AU746844B2 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0322356A2 (en) * 1987-12-19 1989-06-28 Baumer Electric Ag Method and device for optically measuring distances
WO1997018494A1 (en) * 1995-11-14 1997-05-22 The Trustees Of The University Of Pennsylvania Single lens range imaging method and apparatus
JPH11351861A (en) * 1998-06-04 1999-12-24 Mitsubishi Heavy Ind Ltd Distance detecting equipment


Also Published As

Publication number Publication date
AU4389301A (en) 2001-11-22

Similar Documents

Publication Publication Date Title
Jain et al. Machine vision
US10699476B2 (en) Generating a merged, fused three-dimensional point cloud based on captured images of a scene
EP3509034B1 (en) Image filtering based on image gradients
KR101194481B1 (en) Adjusting digital image exposure and tone scale
US8405742B2 (en) Processing images having different focus
CN110717942B (en) Image processing method and device, electronic equipment and computer readable storage medium
Bimber et al. Multifocal projection: A multiprojector technique for increasing focal depth
Waddington et al. Modified sinusoidal fringe-pattern projection for variable illuminance in phase-shifting three-dimensional surface-shape metrology
Chiang et al. Local blur estimation and super-resolution
EP1400924B1 (en) Pseudo three dimensional image generating apparatus
Singh et al. Defogging of road images using gain coefficient-based trilateral filter
JP5633058B1 (en) 3D measuring apparatus and 3D measuring method
EP1491038A2 (en) A system and method for increasing space or time resolution in video
JP2010511257A (en) Panchromatic modulation of multispectral images
JP2000358157A (en) Method and device for compensating optical attenuation of digital image
Khan et al. High-density single shot 3D sensing using adaptable speckle projection system with varying preprocessing
JP3938122B2 (en) Pseudo three-dimensional image generation apparatus, generation method, program therefor, and recording medium
Al Ismaeil et al. Enhancement of dynamic depth scenes by upsampling for precise super-resolution (UP-SR)
Reichenzer et al. Improvement in systematic error in background-oriented schlieren results by using dynamic backgrounds
AU746844B2 (en) Practical optical range estimation using a programmable spatial light modulator
Tung et al. Multiple depth layers and all-in-focus image generations by blurring and deblurring operations
US10929962B2 (en) Imposing a priori knowledge to estimate turbulence
Zhao et al. Removal of parasitic image due to metal specularity based on digital micromirror device camera
JP7278080B2 (en) Image processing device, image processing method, and program
US20150163477A1 (en) Systems and methods for calculating epipolar constraints between generalized cameras

Legal Events

Date Code Title Description
FGA Letters patent sealed or granted (standard patent)