GB2346494A - Stereographic image processing - Google Patents

Stereographic image processing

Info

Publication number
GB2346494A
Authority
GB
United Kingdom
Prior art keywords
image
pair
stereo
images
regions
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
GB9902680A
Other versions
GB9902680D0 (en)
Inventor
Ivan Daniel Meir
Jeremy David Norman Wilson
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tricorder Technology PLC
Original Assignee
Tricorder Technology PLC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tricorder Technology PLC filed Critical Tricorder Technology PLC
Priority to GB9902680A priority Critical patent/GB2346494A/en
Publication of GB9902680D0 publication Critical patent/GB9902680D0/en
Priority to PCT/GB2000/000399 priority patent/WO2000046751A2/en
Priority to AU24483/00A priority patent/AU2448300A/en
Publication of GB2346494A publication Critical patent/GB2346494A/en

Links

Classifications

    • GPHYSICS
    • G01MEASURING; TESTING
    • G01CMEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C11/00Photogrammetry or videogrammetry, e.g. stereogrammetry; Photographic surveying
    • G01C11/04Interpretation of pictures
    • G01C11/06Interpretation of pictures by comparison of two or more pictures of the same area
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/20Image signal generators
    • H04N13/204Image signal generators using stereoscopic image cameras
    • H04N13/239Image signal generators using stereoscopic image cameras using two 2D image sensors having a relative position equal to or related to the interocular distance
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/30Determination of transform parameters for the alignment of images, i.e. image registration
    • G06T7/32Determination of transform parameters for the alignment of images, i.e. image registration using correlation-based methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10004Still image; Photographic image
    • G06T2207/10012Stereo images
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N2013/0074Stereoscopic image analysis
    • H04N2013/0081Depth or disparity estimation from stereoscopic image signals

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Theoretical Computer Science (AREA)
  • Image Processing (AREA)
  • Image Analysis (AREA)

Abstract

In a method of processing first and second stereo pairs of images (IL, IR; IL', IR') of an object (3), imaged by a stereo rig, a first ordered distortion map (D-MAP 1) of the local geometric transformations required to map regions of one image (IL) of the first pair onto the corresponding regions of the other image (IR) of that first pair, and a second distortion map (D-MAP 2) of the local geometric transformations required to map regions of one image (IL') of the second pair onto the corresponding regions of the other image (IR') of that second pair, are generated. At least some regions of the first and second distortion maps are correlated to register at least one image (IL) of the first stereo pair with at least one image (IL') of the second stereo pair. Gruen's algorithm can be used to generate the distortion maps and for the correlation.

Description

Image Processing Method and Apparatus

The present invention relates to a method and apparatus for processing images, and relates particularly but not exclusively to methods and apparatus employing a stereoscopic camera arrangement for acquiring overlapping images of an object and deriving the three-dimensional (3D) shape of the object in the region of overlap.
Suitable algorithms for correlating image regions of corresponding images (eg photographs taken during airborne surveys) are already known, eg Gruen's algorithm (see Gruen, A W, "Adaptive least squares correlation: a powerful image matching technique", S. Afr. J. of Photogrammetry, Remote Sensing and Cartography, Vol 14, No 3 (1985), and Gruen, A W and Baltsavias, E P, "High precision image matching for digital terrain model generation", Int. Arch. Photogrammetry, Vol 25, No 3 (1986), p 254) and particularly the "region-growing" modification thereto which is described in Otto and Chau, "Region-growing algorithm for matching terrain images", Image and Vision Computing, Vol 7, No 2 (May 1989), p 83, all of which are incorporated herein by reference.
Essentially, Gruen's algorithm is an adaptive least squares correlation algorithm in which two image patches of typically 15 x 15 to 30 x 30 pixels are correlated (ie selected from larger left and right images in such a manner as to give the most consistent match between patches) by allowing an affine geometric distortion between coordinates in the images (ie stretching or compression in which originally parallel lines remain parallel in the transformation) and allowing an additive radiometric distortion between the grey levels of the pixels in the image patches, generating an over-constrained set of linear equations representing the discrepancies between the correlated pixels and finding a least squares solution which minimises the discrepancies.
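As an illustration of the least squares correlation described above, the following Python sketch performs one linearized (Gauss-Newton) solve for the match parameters. It is a deliberately reduced model, not the patent's code: it estimates only a translation (dx, dy) and the additive radiometric shift dr, whereas the full Gruen algorithm estimates the four affine coefficients a, b, c, d in the same over-constrained linear system.

```python
import numpy as np

def als_step(left_patch, right_image, x0, y0):
    """One linearized least-squares step of a Gruen-style patch match.

    Simplified sketch: solves only for a translation (dx, dy) and an
    additive radiometric shift dr between left_patch and the patch of
    right_image anchored at (x0, y0).
    """
    h, w = left_patch.shape
    right_patch = right_image[y0:y0 + h, x0:x0 + w].astype(float)
    # Image gradients of the right patch form the design matrix columns.
    gy, gx = np.gradient(right_patch)
    # Residuals of the current (identity) alignment.
    r = (left_patch.astype(float) - right_patch).ravel()
    A = np.column_stack([gx.ravel(), gy.ravel(), np.ones(h * w)])
    # Over-constrained linear system: minimise ||A p - r||^2.
    p, *_ = np.linalg.lstsq(A, r, rcond=None)
    dx, dy, dr = p
    return dx, dy, dr
```

In the full algorithm this step is iterated, warping the patch by the current affine estimate each time, until the parameters converge.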
The Gruen algorithm is essentially an iterative algorithm and requires a reasonable approximation for the correlation to be fed in before it will converge to the correct solution. The Otto and Chau region-growing algorithm begins with an approximate match between a point in one image and a point in the other, utilises Gruen's algorithm to produce a more accurate match and to generate the geometric and radiometric distortion parameters, and uses the distortion parameters to predict approximate matches for points in the region of the neighbourhood of the initial matching point. The neighbouring points are selected by choosing the four adjacent points on a grid having a grid spacing of eg 5 or 10 pixels in order to avoid running Gruen's algorithm for every pixel.
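The seed-and-grow strategy just described can be sketched as follows. This is an illustrative reconstruction, not the patent's code: `refine` stands in for a run of Gruen's algorithm (passed in as a function so only the control flow is shown), candidate matches are processed best-first by quality, and each accepted match predicts approximate matches for its four grid neighbours.

```python
import heapq

def region_grow(seed_left, seed_right, refine, grid_step=5,
                bounds=(0, 0, 20, 20)):
    """Otto & Chau-style region growing over a sparse grid (sketch).

    refine(left_pt, predicted_right_pt) stands in for Gruen's algorithm:
    it returns (right_pt, quality) for an accepted match, or None.
    """
    xmin, ymin, xmax, ymax = bounds
    matched = {}
    # Max-heap on match quality (heapq is a min-heap, hence the negation).
    heap = [(-1.0, seed_left, seed_right)]
    while heap:
        _, lp, predicted = heapq.heappop(heap)
        if lp in matched:
            continue
        result = refine(lp, predicted)
        if result is None:
            continue  # no convergence here; the region stops growing this way
        rp, quality = result
        matched[lp] = rp
        # Use this match's disparity to predict its four grid neighbours.
        dx, dy = rp[0] - lp[0], rp[1] - lp[1]
        for nx, ny in ((lp[0] - grid_step, lp[1]), (lp[0] + grid_step, lp[1]),
                       (lp[0], lp[1] - grid_step), (lp[0], lp[1] + grid_step)):
            if xmin <= nx <= xmax and ymin <= ny <= ymax \
                    and (nx, ny) not in matched:
                heapq.heappush(heap, (-quality, (nx, ny), (nx + dx, ny + dy)))
    return matched
```

Running Gruen's algorithm only at these grid points (spacing of eg 5 or 10 pixels) is what avoids a per-pixel correlation.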
Hu et al, "Matching Point Features with Ordered Geometric, Rigidity and Disparity Constraints", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol 16, No 10 (1994), pp 1041-1049 (and references cited therein) discloses further methods for correlating features of overlapping images.
The affine transformation relating the x and y coordinates of a point (xa, ya) in one image to the x and y coordinates of a corresponding point (xb, yb) in the other image, as used in all the above algorithms, can be represented by the matrix expression:

    [xb]   [a b] [xa]   [e]
    [yb] = [c d] [ya] + [f]

wherein a, b, c, d, e and f are constants. If such a transformation is applied to the pixels in a small (assumed flat) rectangular region of one image of the stereo pair then these pixels will map onto a region in the form of a parallelogram in the other image. The skewness (distortion) of the parallelogram relative to the rectangle can be represented by the value of the determinant (ad - bc) of the 2 x 2 matrix of coefficients a, b, c and d.
The values of a, b, c and d will be constant over a small flat area, and the matrix would reduce to the identity (a = d = 1, b = c = 0) if there were no distortion between the images, ie if the camera axes were normal to the plane of the selected small rectangular region. It should be emphasised that in the Gruen and other algorithms noted above, the values of a, b, c and d are found incidentally as a by-product of the correlation process and are subsequently discarded.
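Concretely, the skewness parameter is just the determinant of the 2 x 2 block of the affine transformation. The small sketch below (the function names are illustrative, not from the patent) also shows how the transformation maps the corners of a rectangle onto a parallelogram:

```python
def skewness(a, b, c, d):
    """Determinant ad - bc of the 2x2 affine block: 1.0 for the identity
    (a = d = 1, b = c = 0, no distortion); other values measure the
    stretch/skew of the parallelogram a rectangle is mapped onto."""
    return a * d - b * c

def map_point(pt, a, b, c, d, e, f):
    """Apply the affine transformation (xb, yb) = (ax + by + e, cx + dy + f)."""
    x, y = pt
    return (a * x + b * y + e, c * x + d * y + f)
```

For example, a pure shear (b = 0.5) maps the unit square's corners onto a parallelogram while keeping opposite sides parallel; note its determinant is still 1, since shear alone preserves area.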
It will be appreciated that a single stereo pair of images will enable only a small portion of the object surface to be derived, namely that portion lying in the region of overlap of the fields of view of the cameras. Methods of combining 3D surface portions to generate an overall surface description are described in our patent GB 2,292,605B.
Such combining methods require significant computation if an accurate registration of two 3D surface portions is to be achieved.
An object of the present invention is to provide a further processing method in which the registration can be performed economically and accurately.
The present invention provides a method of processing first and second stereo pairs of images of an object, the method comprising the steps of generating a first ordered data set of the local geometric transformations required to map regions of one image of the first pair onto the corresponding regions of the other image of that first pair, generating a second ordered data set of the local geometric transformations required to map regions of one image of the second pair onto the corresponding regions of the other image of that second pair, and correlating at least some regions of the first and second data sets to register at least one image of the first stereo pair with at least one image of the second stereo pair or the object surface derivable from the first stereo pair with the object surface derivable from the second stereo pair.
Advantageously the method is performed in conjunction with the known 3D image processing (using eg Gruen's algorithm or the like as described above) of the images of each stereo pair to derive the 3D surface portion corresponding to that stereo pair.
Such 3D image processing will provide the above-mentioned first and second ordered data sets of local geometric transformations.
Preferably the correlation of a region of the first ordered data set is performed by seeking a region of the second ordered data set which is related to said region of the first data set by an affine transformation.
Preferably said geometric transformations are also affine transformations and each of said ordered data sets is representative of the distribution of a single parameter of the affine transformation over the image area corresponding to that ordered data set. Preferably said parameter is a function of at least one of i) (a, d) and ii) (b, c), where the affine transformation relating the x and y coordinates of a point (xa, ya) in one image to the x and y coordinates of a corresponding point (xb, yb) in the other image is of the form:

    [xb]   [a b] [xa]   [e]
    [yb] = [c d] [ya] + [f]

wherein a, b, c, d, e and f are constants.
Preferably said parameter is (ad - bc) or mod(ad - bc) or ad/bc or bc/ad.
A preferred embodiment of the invention is described below by way of example only with reference to Figures 1 and 2 of the accompanying drawing, wherein: Figure 1 is a diagrammatic representation of a stereoscopic image acquisition arrangement in accordance with the invention, and Figure 2 is a diagrammatic representation of the processing steps utilised in the arrangement of Figure 1.
Referring to Figure 1, the apparatus comprises a personal computer 4 (eg a Pentium® PC) having a conventional CPU, ROM, RAM and a hard drive, and a connection at an input port to a freely movable stereoscopic rig RG comprising left and right digital cameras L and R rigidly mounted on a common frame. Computer 4 has a video output port connected to a screen 5 and conventional input ports connected to a keyboard and a mouse 6 or other pointing device. The hard drive is loaded with a conventional operating system such as Windows® 95 and software: a) to display images acquired by the cameras L and R; b) to correlate points in overlapping regions of stereo pairs of images input from the cameras.
The rig RG can be freely moved to a variety of positions such as RG', and stereo pairs of images acquired and downloaded at such positions. At least one point P is assumed to be in the field of view of both cameras L and R in the initial position and the position RG' of the stereo rig. Each camera has an image plane I.
The relative orientation between the cameras as well as their separation and other optical parameters such as focal length are known and the surface region of object 3 in the region of overlap of each stereo pair is derived in conventional fashion by correlating the pixels of each image of a stereo pair (using eg Gruen's algorithm or a variant as described above) and then reconstructing the surface from the correlated pairs of pixels and the above parameters.
In accordance with the invention the resulting 3D surface portions are registered to generate an overall surface description by the process illustrated in Figure 2.
Initially (step S1) the pixels of image IL from camera L of rig RG are correlated with the pixels of image IR. This process is repeated for image IL' in relation to image IR', derived from cameras L and R when the rig is moved to viewpoint RG'.
The angular displacement between these two viewpoints is preferably no more than 20 degrees.
In step S2, for each pixel of the left hand image IL or IL', the rectangular group of (say) 15 x 15 surrounding pixels is processed by Gruen's algorithm, which involves a search for a neighbouring group of pixels in the right hand image IR or IR' which is geometrically related to the rectangular group by an affine transformation and radiometrically related to the rectangular group by a linear transformation. In other words, the relationship between the groups is such that for each pixel in the rectangular group in the left-hand image, there is a corresponding pixel in the right-hand image which is displaced from that pixel according to an affine transformation, and for each pair of corresponding pixels in the two groups, the grey level (brightness) of the pixel in the right-hand image is related to the grey level (brightness) in the left-hand image by an additive transformation (ie involving addition or subtraction of a constant brightness level, this constant being unchanged throughout the group).
One related pair of groups of pixels centred on point P is shown in respect of the stereo pair of images acquired from rig RG, the left hand group being square and the right hand group being in the form of a parallelogram abcd. Only four pixels are shown in each group for the sake of clarity.
The skewness of the transformation between the groups of each pair is found by evaluating the determinant of the affine matrix mapping the left-hand group onto the right-hand group (ie mod(ad - bc) in the example shown), and the geometric distortion of the affine transformations is represented as a distortion map D-MAP 1 in respect of rig position RG and D-MAP 2 in respect of RG'. The distortion can be represented in other ways, eg as (ad - bc) or ad/bc or bc/ad.
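Assembling D-MAP 1 or D-MAP 2 from the per-point affine coefficients is then a pointwise computation over the correlation results. A minimal sketch (the array layout is an assumption; the patent does not fix a data structure):

```python
import numpy as np

def distortion_map(affine_params):
    """affine_params: H x W x 4 array holding the (a, b, c, d) coefficients
    found by the patch correlation at each grid point of the image.
    Returns the H x W map of the skewness parameter mod(ad - bc) that the
    distortion maps D-MAP 1 and D-MAP 2 store."""
    a, b, c, d = np.moveaxis(affine_params, -1, 0)
    return np.abs(a * d - b * c)
```

A map of the alternative parameters (ad - bc), ad/bc or bc/ad mentioned in the text would be computed the same way with a different final expression.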
In step S3 a similar process is applied to the distortion maps D-MAP 1 and D-MAP 2, except that the distortion distribution represented by each map is processed in place of the grey level (brightness) map represented by the images acquired by the respective cameras.
The resulting correlation between the distortion maps (ie a mapping of each pixel in one map to a corresponding pixel in the other map) is also a correlation of the 3D coordinates of those pixels and hence a correlation of the 3D surface portions corresponding to the two distortion maps and to their associated pairs of left and right-hand images IL and IR, and IL' and IR'. In this manner the surface portions can be registered.
The overall surface profile of the object can be found from the registered surface portions given the calibration data of the rig, in known manner. Accordingly no further description of this aspect is necessary.
Since only six corresponding points are needed to define the relative orientation of the surface portions, it is not necessary for the correlation process of step S3 to be applied to the entire image; instead it can be applied to eg six widely spaced regions thereof. In some cases (eg if the orientation or one or more coordinates of the rig are maintained constant) fewer than six regions would need to be correlated.
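The registration that such corresponding points make possible can be illustrated with a standard least-squares rigid alignment. The SVD-based (Kabsch) solver below is a common textbook method, not taken from the patent; a rigid transform is in fact determined by as few as three non-collinear 3D correspondences, with additional points (such as the six mentioned above) providing redundancy for a least-squares fit:

```python
import numpy as np

def rigid_register(P, Q):
    """Least-squares rigid transform (R, t) with Q ~= R @ P + t, computed
    via SVD (Kabsch method). P, Q: N x 3 arrays of corresponding 3D points."""
    cp, cq = P.mean(axis=0), Q.mean(axis=0)
    # Cross-covariance of the centred point sets.
    H = (P - cp).T @ (Q - cq)
    U, _, Vt = np.linalg.svd(H)
    # Sign correction guarantees a proper rotation (det R = +1).
    S = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ S @ U.T
    t = cq - R @ cp
    return R, t
```

Given the correlated 3D points from the two stereo pairs, R and t place one surface portion in the coordinate frame of the other.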
The distortion maps are represented as two-dimensional arrays purely for ease of visualisation and can be represented for computational purposes as any ordered data array.
It is not essential for the correlation of the distortion maps to be performed by Gruen's algorithm or a variant thereof as described above. The distortion represented in the distortion maps is closely related to surface slope, and any method of image correlation is expected to be suitable for correlating the distortion maps, particularly bearing in mind that it is not necessary to correlate the distortion maps over their entire area. It is however desirable for the correlation method to recognise that one distortion map can be mapped by an affine transformation onto another distortion map derived by the same stereo rig.
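For instance, a plain normalized cross-correlation search (one of the generic correlation methods the paragraph above contemplates, not something the patent specifies) can locate where a patch of one distortion map best matches the other. The translation-only brute-force search below is a deliberately minimal sketch; as the text notes, a method that also allows an affine mapping between the maps is preferable:

```python
import numpy as np

def best_ncc_offset(map1, patch):
    """Find the offset in map1 where patch (cut from the other distortion
    map) has the highest zero-mean normalized cross-correlation.
    Translation-only, exhaustive search; returns ((y, x), score)."""
    ph, pw = patch.shape
    p = patch - patch.mean()
    pn = np.linalg.norm(p)
    best, best_score = (0, 0), -2.0
    for y in range(map1.shape[0] - ph + 1):
        for x in range(map1.shape[1] - pw + 1):
            w = map1[y:y + ph, x:x + pw]
            w = w - w.mean()
            denom = np.linalg.norm(w) * pn
            score = float((w * p).sum() / denom) if denom > 0 else -1.0
            if score > best_score:
                best_score, best = score, (y, x)
    return best, best_score
```

Because only a handful of widely spaced regions need to be correlated, even this exhaustive search remains cheap in practice.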
In other embodiments the right-hand image of each pair could be correlated with its corresponding left image by searching for groups of pixels in the left-hand image which could be mapped onto the right-hand image by an affine geometric transformation and an additive radiometric transformation. The resulting distortion maps would then relate to the right-hand image of each pair and their correlation would be a correlation of the object surface regions as seen by the right-hand camera of the rig rather than the left-hand camera as in the described embodiment. In other embodiments the distortion map could be derived by selecting a notional image plane eg intermediate the left and right image planes and representing the local distortion as the sum of a) the geometric distortion of the affine transformation required to map a rectangular group of pixels of the notional image plane onto the left-hand image plane and b) the geometric distortion of the affine transformation required to map that group of pixels onto the right-hand image plane.

Claims (9)

  1. A method of processing first and second stereo pairs of images of an object, the method comprising the steps of generating a first ordered data set of the local geometric transformations required to map regions of one image of the first pair onto the corresponding regions of the other image of that first pair, generating a second ordered data set of the local geometric transformations required to map regions of one image of the second pair onto the corresponding regions of the other image of that second pair, and correlating at least some regions of the first and second data sets to register at least one image of the first stereo pair with at least one image of the second stereo pair or the object surface derivable from the first stereo pair with the object surface derivable from the second stereo pair.
  2. A method as claimed in claim 1 wherein the correlation of a region of the first ordered data set is performed by seeking a region of the second ordered data set which is related to said region of the first data set by an affine transformation.
  3. A method as claimed in claim 1 or claim 2 wherein said geometric transformations are affine transformations and each of said ordered data sets is representative of the distribution of a single parameter of the affine transformation over the image area corresponding to that ordered data set.
  4. A method as claimed in claim 3 wherein said parameter is a function of at least one of i) (a, d) and ii) (b, c), where the affine transformation relating the x and y coordinates of a point (xa, ya) in one image to the x and y coordinates of a corresponding point (xb, yb) in the other image is of the form:

        [xb]   [a b] [xa]   [e]
        [yb] = [c d] [ya] + [f]

    wherein a, b, c, d, e and f are constants.
  5. A method as claimed in claim 4 wherein said parameter is mod(ad - bc).
  6. A method as claimed in any preceding claim wherein the ordered data sets are derived from a correlation of corresponding points of the respective images of the first stereo pair.
  7. A method of processing overlapping first and second stereo pairs of images of an object, the method being substantially as described hereinabove with reference to Figures 1 and 2 of the accompanying drawing.
  8. A stereoscopic image acquisition arrangement comprising two cameras arranged to acquire overlapping first and second stereo pairs of images of an object and processing means arranged to process the stereo pairs of images by a method as claimed in any preceding claim.
  9. A stereoscopic image acquisition arrangement substantially as described hereinabove with reference to Figures 1 and 2 of the accompanying drawing.
GB9902680A 1999-02-05 1999-02-05 Stereographic image processing Withdrawn GB2346494A (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
GB9902680A GB2346494A (en) 1999-02-05 1999-02-05 Stereographic image processing
PCT/GB2000/000399 WO2000046751A2 (en) 1999-02-05 2000-02-07 Image processing method and apparatus
AU24483/00A AU2448300A (en) 1999-02-05 2000-02-07 Image processing method and apparatus

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
GB9902680A GB2346494A (en) 1999-02-05 1999-02-05 Stereographic image processing

Publications (2)

Publication Number Publication Date
GB9902680D0 GB9902680D0 (en) 1999-03-31
GB2346494A true GB2346494A (en) 2000-08-09

Family

ID=10847246

Family Applications (1)

Application Number Title Priority Date Filing Date
GB9902680A Withdrawn GB2346494A (en) 1999-02-05 1999-02-05 Stereographic image processing

Country Status (3)

Country Link
AU (1) AU2448300A (en)
GB (1) GB2346494A (en)
WO (1) WO2000046751A2 (en)


Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10342431B2 (en) 2000-07-26 2019-07-09 Melanoscan Llc Method for total immersion photography
GB2473247A (en) * 2009-09-04 2011-03-09 Sony Corp Aligning camera images using transforms based on image characteristics in overlapping fields of view
US8427504B2 (en) 2009-09-04 2013-04-23 Sony Corporation Method and apparatus for image alignment
GB2473247B (en) * 2009-09-04 2015-02-11 Sony Corp A method and apparatus for image alignment

Also Published As

Publication number Publication date
AU2448300A (en) 2000-08-25
WO2000046751A2 (en) 2000-08-10
WO2000046751A3 (en) 2000-11-30
GB9902680D0 (en) 1999-03-31

Similar Documents

Publication Publication Date Title
JP6722323B2 (en) System and method for imaging device modeling and calibration
CN110728671B (en) Dense reconstruction method of texture-free scene based on vision
CN110246221B (en) Method and device for obtaining true shot image
US7742657B2 (en) Method for synthesizing intermediate image using mesh based on multi-view square camera structure and device using the same and computer-readable medium having thereon program performing function embodying the same
CN111028295A (en) 3D imaging method based on coded structured light and dual purposes
CN105005964B (en) Geographic scenes panorama sketch rapid generation based on video sequence image
CN101437170A (en) System and method for generating multi-eye visual image
KR100668073B1 (en) Method for calibrating distortion of multi-view image
JPH07294215A (en) Method and apparatus for processing image
JP2953154B2 (en) Shape synthesis method
US6175648B1 (en) Process for producing cartographic data by stereo vision
Kholil et al. 3D reconstruction using structure from motion (SFM) algorithm and multi view stereo (MVS) based on computer vision
Wenzel et al. High-resolution surface reconstruction from imagery for close range cultural Heritage applications
CN113643436A (en) Depth data splicing and fusing method and device
Nozick Multiple view image rectification
Re et al. Evaluation of area-based image matching applied to DTM generation with Hirise images
JP2004030461A (en) Method and program for edge matching, and computer readable recording medium with the program recorded thereon, as well as method and program for stereo matching, and computer readable recording medium with the program recorded thereon
JP3328478B2 (en) Camera system
GB2346494A (en) Stereographic image processing
CN115719320A (en) Tilt correction dense matching method based on remote sensing image
CN111815693A (en) Depth image generation method and device
KR102107465B1 (en) System and method for generating epipolar images by using direction cosine
Pedersini et al. Calibration and self-calibration of multi-ocular camera systems
Nedevschi et al. Camera calibration method for stereo measurements
Hu et al. Precision 3D surface reconstruction from LRO NAC images using semi-global matching with coupled epipolar rectification

Legal Events

Date Code Title Description
WAP Application withdrawn, taken to be withdrawn or refused ** after publication under section 16(1)