WO1999019837A1

WO1999019837A1 - Image processing system, method and computer program product

Info

Publication number: WO1999019837A1
Application number: PCT/GB1998/003041
Authority: WO
Inventors: Kenneth Philip Appleby
Original assignee: Harlequin Limited
Priority date: 1997-10-10
Filing date: 1998-10-12
Publication date: 1999-04-22
Also published as: TW390091B; AU9634798A; GB9721592D0; GB2330266A

Abstract

The present invention relates to an image processing system, method and computer program product for processing or generating camera image data from at least two given source images by iteratively estimating the parameters constituting the camera image data and using a predeterminable comparison between the first and second images as the basis for determining whether or not to continue said iterative estimating.

Description

Image processing system, method and computer program product

The present invention relates to an image processing system, method, and computer program product.

There exists a significant quantity of recorded image data such as, for example, video, television, films, CD-ROM images and computer generated images. Typically, once the images have been recorded, or generated and then recorded, it is very time consuming and very difficult to give effect to any changes to the images.

It is often the case that a recorded sequence of images may contain artefacts which were not intended to be included within the sequence of images. Alternatively, artefacts may be added to a sequence of images. For example, it may be desirable to add special effects to the images. The removal or addition of such artefacts is referred to generically as post-production processing and involves a significant degree of skill and time.

Producing a mosaic of or modifying images which have already been recorded is a very difficult task. The degree of effort required to modify or mosaic images or a sequence of images can be reduced if information is available which relates to at least one of the orientation or the position of a camera at the time when the images were produced. Deriving image camera data from a sequence of images is disclosed in, for example, "Creating Full View Panoramic Image Mosaics and Environment Maps", Szeliski R., Shum, HY, Computer Graphics Proceedings, Annual Conference Series, pp 251-258, 1997. However, the processing required by the techniques disclosed within the above paper is very complex and requires a significant degree of numerically intensive processing in order to solve equations containing eight unknowns. Often, special effects may be added to an image or other edits may be made either digitally or manually. The placement of such special effects typically relies upon the judgment of the artist or operator giving effect to the edits. Clearly, there is scope for some error in the placement of the special effects as they are made to a clean plate.

It is therefore an object of the present invention to at least mitigate some of the problems of the prior art.

Accordingly the present invention provides a method for determining camera image data for at least one of first and second images of a sequence of images within an image processing system comprising memory for storing said first and second images, said method comprising the steps of storing in said memory the first and second images; and deriving, as a consequence of a comparison between the first and second images, camera image data for at least one of the first and second images.

Advantageously, the camera image data for each image within a sequence thereof can be determined. The determined camera image data can then be used in order to give effect to further image processing such as, for example, producing a mosaic of selected images.

A second aspect of the present invention provides an image processing system for determining camera image data for at least one of first and second images of a sequence of images, the system comprising memory for storing said first and second images; means for storing in said memory the first and second images; and means for deriving, as a consequence of a comparison between the first and second images, camera image data for at least one of the first and second images.

It will be appreciated that the means referred to above and in the claims can be preferably realised using a computer such as that schematically depicted in figure 1 and suitable program code.

Accordingly, a third aspect of the present invention provides a computer program product for determining camera image data for at least one of first and second images of a sequence of images within an image processing system comprising memory for storing said first and second images, the product comprising computer program code means for storing in said memory the first and second images; and computer program code means for deriving, as a consequence of a comparison between the first and second images, camera image data for at least one of the first and second images.

Embodiments of the present invention will now be described, by way of example only, with reference to the accompanying drawings in which: figure 1 illustrates a computer suitable for implementing an image processing system or method according to the embodiments of the present invention; figure 2 depicts a sequence of source images from which at least one generated image can be produced; figure 3 shows a world co-ordinate system having an image i positioned therein according to corresponding image camera data α_i, β_i, θ_i and λ_i; figure 4 illustrates several equations utilised by the present invention; figure 5 shows a flow chart for deriving camera image data α_i, β_i, θ', Λ and μ; and figure 6 illustrates a further equation utilised by the present invention.

Referring to figure 1 there is shown a computer 100 suitable for implementing embodiments of the present invention. The computer 100 comprises at least one microprocessor 102, for executing computer instructions to process data, and a memory 104, such as ROM and RAM, accessible by the microprocessor via a system bus 106. Mass storage devices 108 are also accessible via the system bus 106 and are used to store, off-line, data used by and instructions for execution by the microprocessor 102. Information is output and displayed to a user via a display device 110 which typically comprises a VDU together with an appropriate graphics controller. Data is input to the computer using at least one of the keyboard 112, with its associated controller, and the mass storage devices 108.

Referring to figure 2, there are illustrated schematically several images 1 to N of a sequence of images 200. The images may be derived from a film, video tape, CD-ROM, video disc or other mass storage medium. Typically, the image data varies as between images as a consequence of a combination of motion of an object within an image and, for example, variations in the position or orientation of a camera during image capture. Typically, once the images are recorded, the camera orientation information is lost.

Referring to figure 3 there is shown schematically a world co-ordinate system 300 comprising three axes 302, 306 and 308 which correspond to the x, y, and z axes respectively. The position of one image, image j, with reference to another image, image i, can be described in terms of the orientation of a unit vector, u, within the world co-ordinate system in conjunction with a degree of rotation, θ_i, about that vector. The orientation of the unit vector within the world co-ordinate system 300 can be specified using two angles, α_i and β_i. The combination of α_i, β_i and θ_i defines a rotation which maps image i onto image j or which can be used to identify corresponding portions of images i and j. The distance λ_i represents the distance from the origin to the centre of the image i. Similarly, the distance λ_j represents the distance from the origin to the centre of image j. Conceptually, the camera is positioned such that the optical centre thereof is as close as possible to the origin. The line of sight 308 of the camera is normal to the centre of the image i, which represents in reality the field of view of the camera at a focal length determined by λ_i.

It will be appreciated that the camera image data α_i, β_i, θ_i and λ_i is not available when retrieving images from some mass storage media such as, for example, a video disc or CD-ROM or the like.

The relationship between a point Q=(x,y,z), in terms of world co-ordinates, within an image frame or field of view at a first orientation of the camera and the same point Q=(x',y',z') within the field of view at a second orientation of the camera is given by the equations shown in figure 4 where u_x, u_y and u_z are the components of the unit vector u, θ is the degree of rotation about the unit vector, Q_r = (x_r, y_r, z_r) represents the rotation of the point Q about the unit vector u and x', y' and z' represent the location, in terms of world co-ordinates, of the projection of point Q_r onto the second field of view. It will be appreciated that f is the scaling required after rotation of Q by R to project the rotated point onto the field of view at the second orientation.
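The equations of figure 4 are not reproduced in this text, so the following Python sketch is illustrative only. It assumes that the rotation R about the unit vector u is the standard axis-angle (Rodrigues) rotation and that the second field of view is the plane z = λ, so that the scaling is f = λ / z_r. The function names rotation_matrix and rotate_and_project are hypothetical.

```python
import math


def rotation_matrix(u, theta):
    """3x3 rotation by theta radians about the unit vector u (Rodrigues' formula)."""
    ux, uy, uz = u
    c, s = math.cos(theta), math.sin(theta)
    k = 1.0 - c
    return [
        [c + ux * ux * k, ux * uy * k - uz * s, ux * uz * k + uy * s],
        [uy * ux * k + uz * s, c + uy * uy * k, uy * uz * k - ux * s],
        [uz * ux * k - uy * s, uz * uy * k + ux * s, c + uz * uz * k],
    ]


def rotate_and_project(q, u, theta, lam):
    """Rotate point q about u, then scale it onto the plane z = lam.

    Assumption (not taken from figure 4): the second field of view is the
    plane z = lam, so the scaling factor is f = lam / z_r.
    """
    r = rotation_matrix(u, theta)
    x_r, y_r, z_r = (sum(r[i][k] * q[k] for k in range(3)) for i in range(3))
    if z_r <= 0.0:
        raise ValueError("rotated point does not lie in front of the camera")
    f = lam / z_r
    return (f * x_r, f * y_r, f * z_r)
```

With these assumptions, a rotation of zero leaves the direction of Q unchanged and only rescales it onto the image plane.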

Referring to figure 5 there is shown a flow chart for deriving camera image data, α_i, β_i, θ' and Λ (= ln(λ_i)), describing the orientation within a world co-ordinate system of a camera which captured each of the images in the sequence of images shown in figure 2, wherein θ' decouples θ and λ using the equation shown in figure 6.

Essentially, the determination of the above parameters utilises a function which compares the colour differences between pixels of any two given images. The first image of the two images is addressed using conventional pixel row and column address information. The second image of the two images is addressed using co-ordinates which have been transformed using the matrix T as shown in figure 4. The second image is preferably addressed using sub-pixels so that colour interpolation can be performed. The function which is used to measure the colour difference and hence derive optimal values for α_i, β_i, θ' and Λ is as follows:

ΔC = (1/N) Σ ((δr)² + (δg)² + (δb)²), summed over all N pixels,

where δr = r_i - r_j, δg = g_i - g_j and δb = b_i - b_j represent the differences between the RGB colour components of any given pixels i and j, and

N is the number of pixels for which the transformed co-ordinates fall inside the image frame boundary of the second image.
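As an illustration, the colour difference ΔC defined above can be computed as in the following Python sketch. The function name colour_difference is hypothetical, as is the convention of marking with None any sample whose transformed co-ordinates fall outside the second image; such samples are excluded from N.

```python
def colour_difference(pixels_i, pixels_j):
    """Mean squared RGB difference over the N usable pixel pairs.

    pixels_i: (r, g, b) tuples sampled from the first image.
    pixels_j: matching (r, g, b) tuples from the second image, or None where
              the transformed co-ordinates fell outside the image frame.
    """
    total = 0.0
    count = 0  # N: pairs whose transformed co-ordinates are inside image j
    for p_i, p_j in zip(pixels_i, pixels_j):
        if p_j is None:
            continue
        dr = p_i[0] - p_j[0]
        dg = p_i[1] - p_j[1]
        db = p_i[2] - p_j[2]
        total += dr * dr + dg * dg + db * db
        count += 1
    if count == 0:
        raise ValueError("no sample falls inside the second image")
    return total / count
```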

The minimisation process preferably uses the simplex method as is well known within the art (see, for example, "Numerical Recipes in C: The Art of Scientific Computing", second edition, Press, Teukolsky, Vetterling and Flannery, CUP, ISBN 0 521 43108 5, the entire content of which is incorporated herein by reference).

Referring more particularly to figure 5, two images i and j are selected at step 500 from a sequence of images 200. A portion or sample, P, of pixels from the first image i of the two images is selected at step 502. The sample P preferably comprises 2000 pixels which are expressed in terms of pixel rows and columns.
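The manner in which the sample P is chosen is not prescribed above. The following Python sketch assumes, purely for illustration, a uniform random selection of distinct pixel positions; the function name select_sample and the seed parameter are hypothetical.

```python
import random


def select_sample(width, height, count=2000, seed=0):
    """Pick up to `count` distinct (row, column) positions from an image.

    Assumption: the sample is drawn uniformly at random, which the
    description does not require.
    """
    total = width * height
    rng = random.Random(seed)
    indices = rng.sample(range(total), min(count, total))
    # divmod converts a flat pixel index into (row, column).
    return [divmod(index, width) for index in indices]
```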

The initial estimates of the camera image data parameters are set at step 504. Preferably, at least three of the parameters α_i, β_i, θ', Λ and μ should be initialised. The initialised parameters are used to establish a first estimate of the matrix T illustrated in figure 4.

At step 506, the pixels of the sample P are transformed, using the current estimate of matrix T, into the second image to produce a second set of sample pixels P'. The two images i and j are compared at steps 508 and 510 to determine the degree of similarity or alignment therebetween. Preferably, effect is given to the comparison by extracting image data, such as, for example, RGB colour data, from the pixels at sample points P in image i and at sample points P' in image j and determining the differences between the colour data. Still more preferably, the differences are evaluated using the function given above for calculating ΔC.
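Because the transformed sample points P' generally fall between pixel centres, the colour data in image j has to be interpolated. The description only requires colour interpolation at sub-pixel positions; the Python sketch below assumes bilinear interpolation as one possible choice. The function name sample_bilinear and the image layout (a list of rows of (r, g, b) tuples) are hypothetical.

```python
import math


def sample_bilinear(image, x, y):
    """Interpolated (r, g, b) at sub-pixel column x and row y, or None if outside.

    image is a list of rows, each row a list of (r, g, b) tuples.
    Assumption: bilinear interpolation between the four nearest pixels.
    """
    height, width = len(image), len(image[0])
    if not (0.0 <= x <= width - 1 and 0.0 <= y <= height - 1):
        return None
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, width - 1), min(y0 + 1, height - 1)
    fx, fy = x - x0, y - y0

    def mix(a, b, t):
        # Linear blend of two colours, component by component.
        return tuple((1.0 - t) * p + t * q for p, q in zip(a, b))

    top = mix(image[y0][x0], image[y0][x1], fx)
    bottom = mix(image[y1][x0], image[y1][x1], fx)
    return mix(top, bottom, fy)
```

Returning None for positions outside the frame lets the caller exclude those samples from the count N used in ΔC.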

A determination is made at step 512 as to whether or not a minimum value of ΔC has been reached. If not, processing continues at step 514 in which the estimates or current values of α_i, β_i, θ', Λ and μ are altered in accordance with the simplex method and processing continues again at step 506. If a minimum has been reached, the current estimates for α_i, β_i, θ', Λ and μ are stored for the pair of images i and j at step 516. This defines the relationship between corresponding portions of images i and j.
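The loop of steps 506 to 514 can be sketched as a downhill simplex (Nelder-Mead) minimisation. The Python code below is a simplified, illustrative variant rather than the routine of the cited reference; the function name nelder_mead, the initial step size, the tolerance and the iteration limit are all assumptions. In the present context cost would evaluate ΔC for a given parameter vector.

```python
def nelder_mead(cost, start, step=0.1, tol=1e-10, max_iter=2000):
    """Minimise cost(params) with a simplified downhill simplex method."""
    n = len(start)
    # Initial simplex: the start point plus one point offset along each axis.
    simplex = [list(start)]
    for k in range(n):
        p = list(start)
        p[k] += step
        simplex.append(p)
    values = [cost(p) for p in simplex]

    for _ in range(max_iter):
        order = sorted(range(n + 1), key=lambda k: values[k])
        simplex = [simplex[k] for k in order]
        values = [values[k] for k in order]
        # Stop once best and worst vertices have almost the same cost.
        if abs(values[-1] - values[0]) <= tol:
            break
        # Centroid of every vertex except the worst one.
        centroid = [sum(p[k] for p in simplex[:-1]) / n for k in range(n)]
        worst = simplex[-1]

        def along(t):
            # Point on the line through the centroid and the worst vertex.
            return [c + t * (w - c) for c, w in zip(centroid, worst)]

        reflected = along(-1.0)
        f_r = cost(reflected)
        if f_r < values[0]:
            expanded = along(-2.0)
            f_e = cost(expanded)
            if f_e < f_r:
                simplex[-1], values[-1] = expanded, f_e
            else:
                simplex[-1], values[-1] = reflected, f_r
        elif f_r < values[-2]:
            simplex[-1], values[-1] = reflected, f_r
        else:
            contracted = along(0.5)
            f_c = cost(contracted)
            if f_c < values[-1]:
                simplex[-1], values[-1] = contracted, f_c
            else:
                # Shrink every vertex towards the best one.
                best = simplex[0]
                for k in range(1, n + 1):
                    simplex[k] = [b + 0.5 * (x - b) for b, x in zip(best, simplex[k])]
                    values[k] = cost(simplex[k])

    best_index = min(range(n + 1), key=lambda k: values[k])
    return simplex[best_index], values[best_index]
```

The method needs only cost evaluations and no derivatives, which suits a cost such as ΔC that is computed from sampled pixel data.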

A determination is made at step 518 as to whether or not all images of the sequence of images have been processed. If not, processing continues at step 500 with the selection of another two images from the sequence of images. If all source images have been processed, a subset of images or eligible images is selected from the sequence of images according to predeterminable criteria at step 520. Preferably, eligible source images are those pairs of consecutive images in which the relative motion, that is distance, therebetween is at least 0.2 in terms of frame co-ordinates and for which there are no closer pairs of frames in the sequence. The estimate of the distance between selected images utilises the estimated values of α, β, θ, λ and μ.

The re-iteration of steps 500 to 518 for selected or all pairs of eligible images is commenced at step 530. Once it is determined that all eligible images have been processed, average values for Λ and μ are determined at step 540. Preferably, the averages are weighted according to the reciprocals of the final colour difference ΔC between the pairs of images.
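The weighted averaging of step 540 can be illustrated with the following Python sketch, in which each estimate is weighted by the reciprocal of its final ΔC. The function name weighted_average and the small constant eps, added to guard against a ΔC of exactly zero, are assumptions.

```python
def weighted_average(estimates, colour_differences, eps=1e-12):
    """Average of per-pair estimates, weighted by 1 / final colour difference.

    eps is an assumed safeguard against division by zero for a perfect match.
    """
    if len(estimates) != len(colour_differences) or not estimates:
        raise ValueError("need one colour difference per estimate")
    weights = [1.0 / (dc + eps) for dc in colour_differences]
    return sum(w * e for w, e in zip(weights, estimates)) / sum(weights)
```

Pairs of images that aligned well, and therefore have a small ΔC, contribute most to the averages of Λ and μ.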

Once estimates for Λ and μ have been obtained, these values are fixed and steps 500 to 518 are re-iterated for all or a selected portion of the images in the sequence of images using the fixed values of Λ and μ in order to produce refined estimates of α, β, and θ at step 550.
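One way to realise the refinement of step 550 is to wrap the five-parameter cost so that the minimiser varies only the three angles. The Python sketch below is illustrative; the function name fix_parameters and the parameter ordering (α, β, θ, Λ, μ) are assumptions.

```python
def fix_parameters(cost, fixed_log_lam, fixed_mu):
    """Return a cost over (alpha, beta, theta) with Lambda and mu held fixed.

    cost is assumed to take a 5-tuple (alpha, beta, theta, log_lam, mu).
    """
    def angles_only(angles):
        alpha, beta, theta = angles
        return cost((alpha, beta, theta, fixed_log_lam, fixed_mu))

    return angles_only
```

The wrapped function can then be passed to the same minimisation routine that produced the first estimates.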

After the above iteration of the processing shown in the flow chart, each image pair has a set of camera image data governing the orientation, focal length and lens distortion of the camera at the instant the pair of images were captured.

For each pair of images a (3x3) relative rotation matrix, R_ij, is calculated, using the estimated values of α, β, θ, and λ. A relative rotation matrix governs the relative rotation between the two images. The orientation of the first or a reference image of the sequence of images is selected and the relative rotation matrices are used to produce a set of matrices of each image relative to the first or reference image by combining or compositing the relative rotation matrices.
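The compositing of relative rotation matrices can be sketched in Python as follows. The sketch assumes that relative[k] maps image k onto image k+1 and that rotations act on column vectors, so that each new relative rotation is multiplied on the left; the description does not fix this convention. The function names mat_mul and absolute_rotations are hypothetical.

```python
def mat_mul(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def absolute_rotations(relative):
    """Rotation of every image relative to the first (reference) image.

    Assumption: relative[k] takes image k to image k + 1 and acts on column
    vectors, so later rotations are applied by left multiplication.
    """
    identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    absolute = [identity]
    for r in relative:
        absolute.append(mat_mul(r, absolute[-1]))
    return absolute
```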

The derived rotation matrices or the relative rotation matrices can be used to define the absolute or relative positions in a world co-ordinate system of corresponding images relative to a reference image or relative to an arbitrary common reference.

Preferably, once the relative rotation matrices R_ij have been derived for each pair of images i and j, a set of non-contiguous images, which are preferably substantially evenly distributed throughout the sequence of images, is selected such that there is some overlap between pairs of the selected images. For each consecutive pair of selected images i and j an estimate of the camera image data, α, β, θ, λ and μ, is determined as above to produce rotation matrices S_ij. The rotation matrix R_ij between any two images i and j is calculated from the relative rotation matrices R of the images which are between images i and j in the sequence. The difference between S_ij and R_ij represents an error matrix E_ij such that S_ij = E_ij R_ij. Therefore, S_ij R_ij^-1 = E_ij. The inverse matrix R_ij^-1 is calculated by negating the value of θ in the matrix R_ij. The matrix E_ij, that is the error, is then divided or spread across the relative rotation matrices for all or a selected number of the images which are between the two images i and j. Assuming that the error is to be spread across n images, the error E_ij is expressed in terms of α, β and θ and a rotation defined by α, β and θ/n is applied to each relative rotation matrix for all images between images i and j. The value of n may represent all images which are between the two images i and j or n may represent a selected number of the images between the two images i and j.
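The spreading of the error can be illustrated with the following Python sketch. It forms E = S R^-1, using the transpose as the inverse of a rotation matrix (which is equivalent to negating θ), recovers the axis and angle of E, and builds the rotation through θ/n about the same axis. The axis is held as a unit vector rather than as the angles α and β, and all function names are hypothetical.

```python
import math


def mat_mul(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transpose(m):
    """Transpose, which is also the inverse of a rotation matrix."""
    return [[m[j][i] for j in range(3)] for i in range(3)]


def axis_angle_matrix(u, theta):
    """Rotation by theta radians about the unit vector u (Rodrigues' formula)."""
    ux, uy, uz = u
    c, s = math.cos(theta), math.sin(theta)
    k = 1.0 - c
    return [
        [c + ux * ux * k, ux * uy * k - uz * s, ux * uz * k + uy * s],
        [uy * ux * k + uz * s, c + uy * uy * k, uy * uz * k - ux * s],
        [uz * ux * k - uy * s, uz * uy * k + ux * s, c + uz * uz * k],
    ]


def to_axis_angle(m):
    """Unit axis and angle of a rotation matrix.

    Rotations close to 180 degrees are not handled in this sketch; the
    error rotation is assumed to be small.
    """
    cos_t = max(-1.0, min(1.0, (m[0][0] + m[1][1] + m[2][2] - 1.0) / 2.0))
    theta = math.acos(cos_t)
    s = 2.0 * math.sin(theta)
    if abs(s) < 1e-12:
        return (0.0, 0.0, 1.0), 0.0  # treated as no rotation
    axis = ((m[2][1] - m[1][2]) / s,
            (m[0][2] - m[2][0]) / s,
            (m[1][0] - m[0][1]) / s)
    return axis, theta


def error_share(s_ij, r_ij, n):
    """Rotation carrying 1/n of the error E = S R^-1 about the axis of E."""
    e = mat_mul(s_ij, transpose(r_ij))
    axis, theta = to_axis_angle(e)
    return axis_angle_matrix(axis, theta / n)
```

Each relative rotation matrix between images i and j could then be corrected by multiplying it by the returned share. Because rotations about different axes do not commute, the corrected chain reproduces S_ij exactly only in special cases, so this should be read as an approximation.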

The resulting matrices can then be used for further image processing, such as producing a mosaic of images, clean plate generation, improving the image quality or resolution and the like.

Image capture includes capture of images using a camera comprising, for example, charge coupled devices, or generation of computer generated images within, for example, a virtual reality context or computer animation context using a virtual camera.

Claims

1. A method for determining camera image data for at least one of first and second images of a sequence of images within an image processing system comprising memory for storing said first and second images, said method comprising the steps of storing in said memory the first and second images; and deriving, as a consequence of a comparison between the first and second images, camera image data for at least one of the first and second images.

2. A method as claimed in claim 1, wherein the step of deriving comprises the steps of iteratively comparing selectable portions of the first and second images; and producing, in response to the comparison, a first estimate of the camera image data.

3. A method as claimed in claim 2, wherein the camera image data comprises a plurality of parameters, and the method further comprises the steps of producing a fixed estimate of at least a selectable one, preferably, two, of the parameters; and re-iterating the steps of deriving, producing and comparing in order to produce a refined estimate of the camera image data.

4. A method as claimed in claim 3, wherein the step of producing comprises the steps of repeating the step of deriving for all or a selected portion of the images in the sequence of images, preferably, the selected portion of images which satisfy predeterminable criteria; and calculating an average value, preferably a weighted average value, of said at least one of the parameters.

5. A method as claimed in claim 4, wherein the predeterminable criteria relates to the distance between the centres of selected pairs of all or the selected portion of images.

6. A method as claimed in any preceding claim, further comprising the steps of selecting a subset of images from the sequence of images which satisfy further predeterminable criteria; and re-iterating the step of deriving using the subset of images as the sequence of images from which the first and second images are selected to produce for each pair or for selected pairs of images subset camera image data.

7. A method as claimed in claim 6, further comprising the steps of determining a set of error values from the subset camera image data; and refining the camera image data for the first and second images.

8. A method as claimed in claim 7, wherein the step of refining the camera image data for the first and second images comprises the step of dividing the error between any images which fall between the first and second within the sequence of images.

9. A method substantially as described herein with reference to, and/or as illustrated in, the accompanying drawings.

10. An image processing system for determining camera image data for at least one of first and second images of a sequence of images, the system comprising memory for storing said first and second images; means for storing in said memory the first and second images; and means for deriving, as a consequence of a comparison between the first and second images, camera image data for at least one of the first and second images.

11. A system as claimed in claim 10, wherein the means for deriving comprises means for iteratively comparing selectable portions of the first and second images; and producing, in response to the comparison, a first estimate of the camera image data.

12. A system as claimed in claim 11, wherein the camera image data comprises a plurality of parameters, and the system further comprises means for producing a fixed estimate of at least a selectable one, preferably, two, of the parameters; and means for re-iterating the steps of deriving, producing and comparing in order to produce a refined estimate of the camera image data.

13. A system as claimed in claim 12, wherein the means for producing comprises means for repeating the step of deriving for all or a selected portion of the images in the sequence of images, preferably, the selected portion of images which satisfy predeterminable criteria; and means for calculating an average value, preferably a weighted average value, of said at least one of the parameters.

14. A system as claimed in claim 13, wherein the predeterminable criteria relates to the distance between the centres of selected pairs of all or the selected portion of images.

15. A system as claimed in any of claims 10 to 14, further comprising means for selecting a subset of images from the sequence of images which satisfy further predeterminable criteria; and means for re-iterating the step of deriving using the subset of images as the sequence of images from which the first and second images are selected to produce for each pair or for selected pairs of images subset camera image data.

16. A system as claimed in claim 15, further comprising means for determining a set of error values from the subset camera image data; and means for refining the camera image data for the first and second images.

17. A system as claimed in claim 16, wherein the means for refining the camera image data for the first and second images comprises means for dividing the error between any images which fall between the first and second within the sequence of images.

18. An image processing system substantially as described herein with reference to, and/or as illustrated in, the accompanying drawings.

19. A computer program product for determining camera image data for at least one of first and second images of a sequence of images within an image processing system comprising memory for storing the first and second images, the product comprising computer program code means for storing in said memory the first and second images; and computer program code means for deriving, as a consequence of a comparison between the first and second images, camera image data for at least one of the first and second images.

20. A product as claimed in claim 19, wherein the computer program code means for deriving comprises computer program code means for iteratively comparing selectable portions of the first and second images; and producing, in response to the comparison, a first estimate of the camera image data.

21. A product as claimed in claim 20, wherein the camera image data comprises a plurality of parameters, and the product further comprises computer program code means for producing a fixed estimate of at least a selectable one, preferably, two, of the parameters; and computer program code means for re-iterating the steps of deriving, producing and comparing in order to produce a refined estimate of the camera image data.

22. A product as claimed in claim 21, wherein the computer program code means for producing comprises computer program code means for repeating the step of deriving for all or a selected portion of the images in the sequence of images, preferably, the selected portion of images which satisfy predeterminable criteria; and computer program code means for calculating an average value, preferably a weighted average value, of said at least one of the parameters.

23. A product as claimed in claim 22, wherein the predeterminable criteria relates to the distance between the centres of selected pairs of all or the selected portion of images.

24. A product as claimed in any of claims 19 to 23, further comprising means for selecting a subset of images from the sequence of images which satisfy further predeterminable criteria; and means for re-iterating the step of deriving using the subset of images as the sequence of images from which the first and second images are selected to produce for each pair or for selected pairs of images subset camera image data.

25. A product as claimed in claim 24, further comprising computer program code means for determining a set of error values from the subset camera image data; and computer program code means for refining the camera image data for the first and second images.

26. A product as claimed in claim 25, wherein the computer program code means for refining the camera image data for the first and second images comprises computer program code means for dividing the error between any images which fall between the first and second within the sequence of images.

27. An image processing system substantially as described herein with reference to, and/or as illustrated in, the accompanying drawings.