GB2330974A - Image matte production without blue screen

Image matte production without blue screen

Info

Publication number
GB2330974A
GB2330974A GB9815695A GB9815695A GB2330974A GB 2330974 A GB2330974 A GB 2330974A GB 9815695 A GB9815695 A GB 9815695A GB 9815695 A GB9815695 A GB 9815695A GB 2330974 A GB2330974 A GB 2330974A
Authority
GB
United Kingdom
Prior art keywords
image
image data
action
images
input
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
GB9815695A
Other versions
GB9815695D0 (en)
Inventor
Giovanna Giardino
Kenneth Philip Appleby
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Harlequin Group PLC
Original Assignee
Harlequin Group PLC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from GB9721591A (GB2330265A)
Priority claimed from GB9721592A (GB2330266A)
Application filed by Harlequin Group PLC filed Critical Harlequin Group PLC
Priority to GB9815695A
Publication of GB9815695D0
Publication of GB2330974A


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion
    • G06T7/254Analysis of motion involving subtraction of images

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Processing (AREA)
  • Image Analysis (AREA)

Abstract

An action matte image is produced from an input sequence of images by producing a current clean plate image, representing the background, from the input sequence of images; comparing the current clean plate image and a current input image of the input sequence of images; and constructing the action matte image, representing the foreground, in response to the result of the comparison. This allows the extraction of action sequences, or the separation of the images of actors from an image comprising image data relating to both the actors and the general background, without the requirement for a blue background. Another aspect provides a method and system for automatically identifying and defining regions of an image which correspond to a moving object.

Description

Image processing system, method and computer program product

The present invention relates to an image processing system, method and computer program product, and more particularly to an image processing system, method and computer program product for producing an action matte image or mask for an input image from an input sequence of images.
Within the film industry, techniques have been developed which allow a sequence of images of some action to be composited with a suitable background sequence of images to produce a sequence of images containing both the background and the foreground action sequence. Typically, the actors acting out the action sequence are filmed against a blue background. The captured images are subsequently processed to isolate the actors; that is, the blue portions of the captured images are removed so that the remaining image data relates only to the action sequence. The isolated action sequence can then be used for subsequent processing such as, for example, the production of an image sequence containing both the actors and some other image data, such as special effects, for example explosions and the like, or any other alternative background containing various images. Green is also suitable as a background colour and can be used as an alternative to blue.
It will be appreciated that the above procedure is cumbersome and requires a dedicated set having an exclusively blue background against which filming of actors can take place.
The above procedure cannot isolate or extract image data relating to some form of action, such as, for example, an actor or a moving vehicle, from a general scene which comprises a background that is not uniformly blue or green.
Furthermore, using such a blue or green backdrop typically requires uniform illumination of both the backdrop and the moving object.
It is an object of the present invention to at least mitigate some of the problems of the prior art.
Accordingly, a first aspect of the present invention provides a method for producing an action matte image from input data comprising an input sequence of images, the method comprising the steps of producing a current clean plate from the input sequence of images; comparing the current clean plate image and a current input image of the input sequence of images; and constructing the action matte image in response to the comparison.
The invention advantageously allows the extraction of action sequences or the separation of the images of moving objects such as, for example, actors or vehicles, from an image comprising image data relating to a non-uniform and moving background scenery.
All images of the input image sequence are captured using a steerable camera. The background image data will therefore change between image frames, for example as a moving object is tracked by the camera or as a consequence of camera movement. The term camera includes both a physical camera and a virtual camera in the case of computer generated images.
Preferably, each input image comprises camera geometry data which provides an indication of at least the orientation of the camera at the time a corresponding image was captured or generated. In preferred embodiments, the camera movements are nodal movements about a fixed point and do not include translations of the camera. As an alternative to considering the camera geometry data as data relating to the camera, the same data can be considered to define the location in a reference frame of the position of a corresponding image frame.
A second aspect of the present invention provides a system for producing an action matte image from input data comprising an input sequence of images, the system comprising means for producing a current clean plate from the input sequence of images; means for comparing the current clean plate image and a current input image of the input sequence of images; and means for constructing the action matte image in response to the comparison.
A third aspect of the present invention provides a computer program product for producing an action matte image from input data comprising an input sequence of images, the product comprising a data storage medium having stored thereon computer program code for producing a current clean plate from the input sequence of images; comparing the current clean plate image and a current input image of the input sequence of images; and constructing the action matte image in response to the comparison.
As stated above, conventionally, action is filmed against an appropriately and uniformly coloured and illuminated backdrop. The action can only be extracted when using such a backdrop. The prior art lacks a method for automatically identifying regions of an image which contain image data representing a moving object when the image data constituting the image comprises data relating to a non-uniform background and the moving object. The prior art cannot differentiate between moving foreground image data and background image data which changes as a result of camera movement.
Accordingly, there is provided a method for determining the location of image data within at least one input image of an input image sequence which corresponds to image data representing a moving object, the method comprising the steps of: selecting two current input images from the input image sequence; comparing the two current input images; and identifying the location of image data within at least one of the two current input images in response to the comparison.
There is also provided a system for determining the location of image data within at least one input image of an input image sequence which corresponds to image data representing a moving object, the system comprising means for selecting two current input images from the input image sequence; means for comparing the two current input images; and means for identifying the location of image data within at least one of the two current input images in response to the comparison.
There is further provided a computer program product for determining the location of image data within at least one input image of an input image sequence which corresponds to image data representing a moving object, the product having stored thereon computer program code for selecting two current input images from the input image sequence; comparing the two current input images; and identifying the location of image data within at least one of the two current input images in response to the comparison.
Advantageously, image data corresponding to a moving object can be identified within an image comprising non-uniform and/or moving background image data, which varies in, for example, colour, illumination and/or position as between image frames. The moving object image data can then be extracted for further processing such as, for example, incorporation into a different image frame or for use in producing an image frame, such as, a clean plate image corresponding to the original image within which the moving object image data was identified.
Other features of the present invention are described hereafter and are defined in the claims.
Embodiments of the present inventions will now be described, by way of example only, with reference to the accompanying drawings, in which:
figure 1 illustrates a data processing system upon which an embodiment of the first invention can be realised;
figure 2A shows a flow chart for implementing an embodiment of a first invention;
figure 2B illustrates the intermediate results of processing a clean plate image and an input image in accordance with the flow chart of figure 2A;
figures 3A and 3B illustrate a flow chart of a second invention for automatically identifying and/or extracting regions of an image containing moving objects;
figure 3C shows schematically the results of processing two input images to produce an output action envelope;
figure 4 depicts a flow chart for a clustering technique which is used to identify regions of interest within an input image; and
figure 5 illustrates an equivalence criterion used during processing of regions of interest identified by the process of figure 4.
Referring to figure 1 there is shown a data processing system 100 upon which an embodiment of the present inventions can be realised. The data processing system 100 comprises at least one microprocessor 102, for executing computer instructions to process data, a memory 104, such as ROM and RAM, accessible by the microprocessor via a system bus 106. Mass storage devices 108 are also accessible via the system bus 106 and are used to store, off-line, data used by and instructions for execution by the microprocessor 102.
Information is output and displayed to a user via a display device 110 which typically comprises a VDU together with an appropriate graphics controller. Data is input to the computer using at least one of either of the keyboard 112 and associated controller and the mass storage devices 108.
Referring to figure 2A, there is shown a flow chart 200 for implementing an embodiment of the first invention. At step 202 input data including an input image sequence are retrieved from the mass storage devices 108 or memory 104 of the data processing system. Preferably, the input data also comprises camera geometry information which provides for each image of the input image sequence an indication of the orientation of the camera used to capture the images at the time of capture of the images. The camera geometry information is retrieved or calculated at step 204.
The camera geometry data may have been obtained concurrently with the image capture by using suitable transducers arranged to record camera orientation data or, in the absence of any such transducers, may have been calculated using the camera geometry derivation technique of the Panoptica product available from Harlequin Limited. Further details of the camera geometry derivation technique are available from UK Patent Application number 9721592.5, which is incorporated herein by reference for all purposes and a copy of which is set out in Appendix A.
For each frame of the input image sequence, at least one region is identified and defined at step 206 which surrounds, preferably completely, at least one moving object contained within that frame. It will be appreciated that the region which surrounds the moving object can be identified and defined by either a user of the data processing system, using conventional input devices, or, preferably, can be identified and defined automatically as is described hereafter in relation to figures 3A, 3B, 3C, 4 and 5.
The input image sequence is processed to produce at least one action matte corresponding to at least one input image and preferably an action matte image sequence corresponding to the input image sequence. The action matte image represents a hard or soft mask which can be used to extract data from or isolate data within a corresponding image frame of the input image sequence. An action matte image is constructed by following the remaining steps of figure 2A. The remaining processing of the embodiment depicted in figure 2A will be explained with reference to figure 2B, in which a current clean plate image 230 and a current input image 232 are processed to produce a current action matte image 234.
A current clean plate image 230, corresponding to a current input image 232, is either retrieved from storage or memory or generated from the input image sequence at step 210. The clean plate comprises image data which represents only the background image data of the current input image. The difference between the current clean plate image and the current input image is determined at step 212. Preferably, the difference represents a three-channel difference which comprises one image 236, 238, 240 for each of the rgb colour channels, or subpixels, of the colour current clean plate image and current input image. Each colour channel difference image represents the difference between, preferably, the subpixels of the current clean plate image 230 and the current input image 232. The difference can be considered to be the noise of a colour channel image. A difference between the current clean plate image 230 and the current input image 232 can arise for at least one of two reasons. Firstly, a difference can arise as a consequence of corresponding pixels or subpixels of the two images containing background image data and foreground image data respectively. This may follow from movement of an object within the images. Secondly, as the clean plate can be constructed by mapping background image data from other input image frames into the clean plate frame, slight colour variations due to, for example, changes in illumination, noise or errors in correcting for changes in perspective, may occur between the images, which would also lead to colour differences.
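As an illustration of the differencing at step 212, the following sketch assumes that the clean plate and the current input image are held as H x W x 3 rgb arrays expressed in the same frame of reference; the function and array names are illustrative rather than taken from the embodiment.

```python
import numpy as np

def channel_differences(clean_plate, current_input):
    """Per-channel (r, g, b) differences between a clean plate and an input frame.

    Both arguments are assumed to be H x W x 3 arrays covering the same frame of
    reference, so that corresponding subpixels can be subtracted directly.
    """
    diff = clean_plate.astype(np.float64) - current_input.astype(np.float64)
    # One signed difference image per colour channel (items 236, 238 and 240 of figure 2B).
    return diff[..., 0], diff[..., 1], diff[..., 2]
```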
A first noise threshold is calculated at step 214.
Preferably, the first noise threshold is calculated by determining the sum of the squares of the standard deviations of the noise of the red, green and blue colour channel difference images for only those points or pixels 242 which do not fall within the identified and defined regions 244 established at steps 206 and 208.
In a preferred embodiment, the first noise threshold is calculated as

T1 = (1/N) Σi (ri² + gi² + bi²)

where ri is the red channel value for pixel i, gi is the green channel value for pixel i, bi is the blue channel value for pixel i, and N is the number of pixels 242 used in the calculation.
Typically, N comprises between 100 and 1000 pixels.
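A minimal sketch of this threshold calculation, assuming the three-channel difference image and the region mask are held as numpy arrays and that the channel noise is treated as zero-mean (so that the averaged sum of squares equals the sum of the squared standard deviations):

```python
import numpy as np

def first_noise_threshold(diff, background_mask):
    """First noise threshold of step 214.

    diff            : H x W x 3 signed difference image (clean plate minus input).
    background_mask : H x W boolean array, True for the pixels 242 lying outside
                      the identified and defined regions 244.
    """
    samples = diff[background_mask].astype(np.float64)   # N x 3 background-only differences
    n = samples.shape[0]                                  # typically between 100 and 1000 pixels
    # Mean of the summed squared channel differences; for zero-mean noise this equals
    # the sum of the squares of the per-channel standard deviations.
    return float(np.sum(samples ** 2) / n)
```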
A grey scale or single colour channel image 246 is calculated at step 216. In a preferred embodiment, the value of each pixel of the grey scale image is derived from the rgb values of the three colour channel images 236 to 240. Preferably, the modulus of the sum of the squares of the rgb values for each pixel is calculated to produce the grey scale image 246.
Preferably, a segmentation technique is then applied to the grey scale image at step 218. The segmentation technique is used to identify regions of interest within the grey scale image which contain image data representing a moving object.
The action matte frame is preferably initialised to contain all zeros.
In one embodiment, the segmentation technique involves at least two stages. In the first stage, the action matte frame is constructed by assigning, in a first pass over the grey scale image, a first selectable value, such as binary 1 in the case of a Boolean action matte image, to each pixel within the action matte frame for which the corresponding value in the grey scale image is above the first noise threshold by a first predetermined margin. Preferably, in the first pass, a binary one is assigned to those pixels of the action matte frame for which the corresponding grey scale image pixel value exceeds the first noise threshold by a factor of seven.
It will be appreciated that after the first pass the action matte will comprise several regions of connected, that is, adjacent pixels. Each of these regions can be labelled. In a second stage, during a second pass over the grey scale image, the identified regions are grown iteratively, that is expanded, by identifying further pixels which are adjacent to the above identified regions and which exceed the first noise threshold by a second predetermined margin. Preferably, the above identified regions are grown by assigning a second selectable value, such as binary 1 in the case of a Boolean action matte image, to each pixel within the action matte frame for which the corresponding value in the grey scale image is above the first noise threshold by the second predetermined margin. Preferably, the second predetermined margin is a factor of 1.5 times the first noise threshold.
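A sketch of this two-stage segmentation, assuming the grey scale image 246 is formed as the sum of the squared channel differences and that the region growing of the second stage is realised by iterative binary dilation (one of several possible realisations):

```python
import numpy as np
from scipy import ndimage

def segment_action_matte(diff, t1, seed_factor=7.0, grow_factor=1.5):
    """Two-stage segmentation of the grey scale difference image (steps 216 and 218).

    diff : H x W x 3 difference image;  t1 : first noise threshold.
    The factor-of-seven seed margin and factor-of-1.5 growth margin follow the
    preferred embodiment described above.
    """
    grey = np.sum(diff.astype(np.float64) ** 2, axis=-1)   # grey scale image 246
    matte = grey > seed_factor * t1                         # first pass: seed regions
    candidates = grey > grow_factor * t1                    # pixels eligible for growth
    # Second pass: grow the seed regions into adjacent candidate pixels until stable.
    while True:
        grown = ndimage.binary_dilation(matte) & candidates
        if np.array_equal(grown, matte):
            break
        matte = grown
    return matte.astype(np.uint8)                           # Boolean action matte frame
```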
In an alternative embodiment, the segmentation process involves several stages. Firstly, the difference image is filtered using a low pass filter to produce a low pass filtered image. The low pass filtering decreases pixel differences using a filtering kernel of a predetermined size S. Typically, S has a value of three. The low pass filtering assists in removing isolated small discrepancies from the action matte. Secondly, regions of connected, that is adjacent, pixels are identified in the low pass filtered image having values which exceed the first noise threshold by a predetermined margin, preferably, by a factor of seven.
These regions are assigned a value of 1. Thirdly, the identified regions are expanded iteratively by including adjacent pixels which exceed the first noise threshold by a predetermined margin, preferably by a factor of 1.5.
Fourthly, the edges of the resulting expanded regions are cropped by S pixels, that is, the S outermost pixels of the resulting regions of connected pixels are assigned values of zero. Fifthly, the regions are further grown by identifying all pixels of the difference image which are connected to the regions and which exceed the first noise threshold by a predetermined margin, preferably by a factor of 1.5. The action matte can then be output for further processing, such as softening, or for use in relation to additional image processing. Typical low pass filters may comprise an Average Filter or Median filter as are well known within the art.
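The alternative, filtered segmentation can be sketched in the same style; applying the average filter to the grey scale image rather than to the three channel difference image, and realising the crop as a binary erosion by S pixels, are simplifying assumptions:

```python
import numpy as np
from scipy import ndimage

def _grow(seed, candidates):
    """Iteratively add candidate pixels adjacent to the seed regions until stable."""
    while True:
        expanded = ndimage.binary_dilation(seed) & candidates
        if np.array_equal(expanded, seed):
            return seed
        seed = expanded

def segment_action_matte_filtered(diff, t1, s=3, seed_factor=7.0, grow_factor=1.5):
    """Alternative segmentation: low pass filter, seed, grow, crop by S pixels,
    then grow again on the unfiltered difference."""
    grey = np.sum(diff.astype(np.float64) ** 2, axis=-1)
    low = ndimage.uniform_filter(grey, size=s)               # Average Filter of kernel size S
    matte = _grow(low > seed_factor * t1, low > grow_factor * t1)
    matte = ndimage.binary_erosion(matte, iterations=s)      # crop the S outermost pixels
    return _grow(matte, grey > grow_factor * t1).astype(np.uint8)
```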
Although factors of seven and 1.5 have been selected for the preferred embodiments, the present invention is not limited thereto. An embodiment can equally well be realised in which some other suitable factor or criterion is used. The predetermined margins should be selected to facilitate differentiation between those parts of the current input image which correspond to moving objects and those parts of the current input image which correspond to background scenery.
A technique for region growing can be found in chapter 5 of Computer Vision by Dana H. Ballard and Christopher M. Brown, Prentice-Hall, Inc., which is incorporated herein by reference for all purposes.
Furthermore, a soft action matte image can be constructed which contains other values in addition to the Boolean pixel values of on or off. Typically, the other values would represent, for example, the pixels of a current input image which are at, or which represent, the boundary between the image data representing the moving object and the image data representing the background scenery. In a preferred embodiment, the values assigned to the soft action matte pixels are calculated by replacing each action matte edge pixel with a calculated value. For example, for any given action matte edge pixel the replacement calculated value may be (7-n)/8, where n is the number of adjacent pixels which have a value of zero. In an alternative embodiment, the values assigned to the soft action matte frame pixels are determined, for example, according to whether corresponding grey scale values are above the first noise threshold by a particular margin, below the first noise threshold by a particular margin or within a particular margin of the first noise threshold.
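A sketch of the (7-n)/8 edge-softening rule, assuming an 8-connected neighbourhood and that pixels outside the frame count as zero; neither choice is fixed by the description above:

```python
import numpy as np
from scipy import ndimage

def soften_matte(matte):
    """Replace each edge pixel of a Boolean action matte with (7 - n) / 8, where n is
    the number of its eight neighbours that are zero (assumed connectivity)."""
    matte = matte.astype(np.float64)
    kernel = np.ones((3, 3))
    kernel[1, 1] = 0
    # Count zero-valued neighbours; pixels beyond the frame are treated as zero.
    zero_neighbours = ndimage.convolve(1.0 - matte, kernel, mode='constant', cval=1.0)
    edges = (matte > 0) & (zero_neighbours > 0)               # 'on' pixels touching the background
    soft = matte.copy()
    soft[edges] = (7.0 - zero_neighbours[edges]) / 8.0
    return soft
```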
The current action matte 234 is then output for further processing, such as storage on the mass storage medium, or for use in further image processing at step 220.
A determination is made at step 222 as to whether or not all images of the input image sequence have been processed. If all images have been processed, the production of corresponding action matte images or masks is terminated. If all such images have not been processed, the index used to address the input images is incremented and the next input image of the input image sequence is retrieved for processing.
Once action mattes have been produced for all or selected input images of the input image sequence, the image data, that is either foreground moving objects or moving background scenery, of any given input image can be extracted or isolated and used to produce a further composite image using either the same background image data or some other background image data, and/or including other action data such as, for example, other extracted action data or computer generated image data.
In step 210, the synthesis of a clean plate image involves producing from the input image sequence an image which contains only image data relating to the background scenery of the sequence of input images. The process of producing a current clean plate image for a corresponding current input image involves: distinguishing between those regions of the current input image which contain foreground image data and those regions which contain background image data; distinguishing between those regions of at least one of the other input images of the input image sequence which contain foreground image data and those regions which contain background image data; and mapping relevant background image data derived or retrieved from the other input images of the input sequence which contain only background image data into the region of the current clean plate image which corresponds to the foreground region of the current input image.
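A much simplified sketch of this synthesis is given below; it assumes that a foreground mask for the current input image is available and that background data from another input frame has already been mapped into the current frame of reference. The derivation of the replacement colours in the embodiment uses the techniques of UK patent application 9721591.7, which are not reproduced here.

```python
import numpy as np

def fill_clean_plate(current_input, foreground_mask, other_aligned):
    """Start the clean plate from the current input image and overwrite its foreground
    region with background data taken from another, already aligned, input frame."""
    clean = np.array(current_input, dtype=np.float64, copy=True)
    clean[foreground_mask] = np.asarray(other_aligned, dtype=np.float64)[foreground_mask]
    return clean
```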
Preferably, the relevant background image data is identified using the invention which is the subject of UK patent application number 9721591.7, which is set out in Appendix B of this application, and incorporated herein for all purposes. Similarly, the derivation of the colour data for a particular background pixel to be inserted into the foreground region of the current clean plate preferably uses the techniques disclosed in UK patent application number 9721591.7.
In one embodiment, the step of distinguishing between regions of an input image which contain foreground image data and regions which contain background image data, can be performed manually by displaying an image and allowing the user via a suitable user interface to define regions which contain the respective types of image data, that is action/foreground image data and background image data. Typically, any such user defined regions comprise a rectangle(s) which has been appropriately scaled and positioned relative to an input image currently under consideration. However, a preferred embodiment uses an invention which automatically identifies and defines the regions which contain the respective image data types.
Referring to figures 3A and 3B, there is shown a flow chart 300 of an invention for distinguishing between or identifying regions of an input image which contain foreground or action image data, representing a moving object, and regions of the input image which contain background data. Preferably, this invention automatically identifies and defines the regions of an input image which contain a moving object(s). The invention for automatically identifying and defining the above regions will now be described with reference to figures 3A, 3B and 3C. A preferred embodiment also utilises the method depicted in the flow chart of figure 4.
At steps 302 and 304, image data including two input images 332 and 334 of the input image sequence 330, preferably adjacent input images, together with corresponding camera geometry data, are retrieved from, for example, mass storage. The camera geometry data may comprise either the rotation between the two images 332 and 334 or data from which such a rotation can be derived, such as the position within a notional reference frame of the two images. The two input images 332 and 334 are correctly positioned 336 with respect to each other using the camera geometry data and a three channel difference image 338 is generated for the region of overlap 340 between the two input images. The data for the three channel difference image is derived as follows: for the region of overlap 340 between the two input images, the image data for the three channel difference image 338 is derived from the image data of the region of overlap 340 between the two input images; and for non-overlapping regions 342 to 346, the image data in the three channel difference image 338 is set to zero, for example.
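A much simplified sketch of steps 302 to 306 follows, in which the camera-geometry alignment is assumed to have been reduced to an integer (dy, dx) pixel offset of the second image relative to the first; in the embodiment the two frames are positioned using the full rotation derived from the camera geometry data.

```python
import numpy as np

def overlap_difference(img_a, img_b, offset):
    """Three channel difference 338 over the overlap 340 of two positioned frames.

    offset is the assumed integer (dy, dx) position of img_b within img_a's frame.
    Non-overlapping regions 342 to 346 are left at zero.
    """
    h, w = img_a.shape[:2]
    dy, dx = offset
    diff = np.zeros((h, w, 3), dtype=np.float64)
    # Extent of the overlap, clipped to img_a's frame.
    y0, y1 = max(0, dy), min(h, img_b.shape[0] + dy)
    x0, x1 = max(0, dx), min(w, img_b.shape[1] + dx)
    if y1 > y0 and x1 > x0:
        diff[y0:y1, x0:x1] = (img_a[y0:y1, x0:x1].astype(np.float64)
                              - img_b[y0 - dy:y1 - dy, x0 - dx:x1 - dx].astype(np.float64))
    return diff
```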
The three channel difference image 338 is subjected at step 308 to a low pass filter to produce a low pass filtered three channel difference image 348 comprising image data which represents slowly varying features of the two input images 332 and 334.
A high frequency noise image 350, that is a high pass filtered image, is produced at step 310 by calculating the difference between the three channel difference image 338 and the low pass filtered difference image 348. The high frequency noise image 350 comprises image data which represents quickly varying features of the images, such features comprising noise and changes in the relative positions of edges.
At step 312 the standard deviations of the colour channels of the high frequency noise image 350 are calculated and a second noise threshold, NT, is calculated as

NT = √(SDR² + SDG² + SDB²)

where SDR represents the standard deviation of the red channel image data, SDG represents the standard deviation of the green channel image data and SDB represents the standard deviation of the blue channel image data.
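A sketch of steps 308 to 312, assuming a simple average filter as the low pass filter and the root-sum-of-squares combination of the channel standard deviations shown above:

```python
import numpy as np
from scipy import ndimage

def second_noise_threshold(diff, size=3):
    """Low pass filter the three channel difference image 338, form the high frequency
    noise image 350 as the residual, and combine the channel standard deviations into NT."""
    diff = diff.astype(np.float64)
    low = np.stack([ndimage.uniform_filter(diff[..., c], size=size) for c in range(3)],
                   axis=-1)                                   # low pass filtered image 348
    high = diff - low                                         # high frequency noise image 350
    sd_r, sd_g, sd_b = (np.std(high[..., c]) for c in range(3))
    nt = float(np.sqrt(sd_r ** 2 + sd_g ** 2 + sd_b ** 2))
    return nt, low                                            # low (348) is reused at step 314
```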
A grey scale image 352 having values derived from the modulus of the three channel filtered difference image 348 is calculated at step 314. The modulus of a pixel value is calculated as

modulus = √(r² + g² + b²)

where r represents the red colour channel pixel value, g represents the green colour channel pixel value and b represents the blue colour channel pixel value.
The modulus image (grey scale image 352) of the three channel filtered difference image is segmented, at step 316, into regions having values which are below a threshold level of (4 × NT) and regions having values which are above the threshold value of (4 × NT). Further, rectangular sub-regions 354 to 360 are defined or constructed at step 316 which surround the pixels having values which are above (4 × NT). These sub-regions 354 to 360 represent regions which contain significant movement or errors for the three channel difference image and therefore for at least one, preferably both, of the two current input images 332 and 334.
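A sketch of steps 314 and 316, assuming the modulus shown above and using connected-component labelling to obtain one bounding rectangle per group of above-threshold pixels:

```python
import numpy as np
from scipy import ndimage

def action_sub_regions(low_pass_diff, nt, margin=4.0):
    """Modulus (grey scale) image 352 of the filtered difference 348, thresholded at
    4 x NT, with one rectangular sub-region (items 354 to 360) per connected group
    of above-threshold pixels."""
    modulus = np.sqrt(np.sum(low_pass_diff.astype(np.float64) ** 2, axis=-1))
    labels, count = ndimage.label(modulus > margin * nt)
    # find_objects returns the bounding (row, column) slices of each labelled group,
    # i.e. the rectangular sub-regions containing significant movement or errors.
    return ndimage.find_objects(labels)
```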
The sub-regions 354 to 360 are classified into equivalent regions and the equivalent regions are then merged into a first estimation of the action envelopes at step 318. In the example shown in figure 3C, sub-regions 354 and 356 are deemed to be equivalent and they are therefore merged into a single region 362. The first estimation of the action envelopes 362 and 366, or the image 364 containing them, is stored for further processing or can be output for use in image processing as a first approximation of the location of moving objects within an image.
However, in a preferred embodiment, a determination is made at step 320 as to whether or not there exists a previous input image for which corresponding action envelopes of a corresponding action envelope image 368 have been identified.
If there does exist such a previous image 368 for which corresponding action envelopes have been produced, at step 322 the image 368 containing the action envelopes for the previous input image and the image 364 containing the action envelopes for the current two input images 332 and 334 are superimposed by either merely overlapping the complete images or by correctly aligning the images using the respective camera geometry data with respect to each other. Once the images are overlapped or aligned, only those action envelopes 362 which intersect or overlap with the action envelopes 370 of the previous input image are retained as the action envelopes for the current two input images 332 and 334.
Preferably, once the action envelopes for the current image have been determined after comparison with the action envelopes of a previous image at step 322, or if it was determined at step 320 that the current image is the first image for which action envelopes have been constructed, the action envelopes 372 of the current image are enlarged by a selectable percentage, preferably 20%, at step 324. Still more preferably, after the enlargement, any intersecting action envelopes are merged into a single envelope.
These enlarged and/or merged action envelopes 374 are then stored as being the action envelopes corresponding to the two current input images at step 326.
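A sketch of this envelope bookkeeping, representing each envelope as a (y0, x0, y1, x1) rectangle (an assumed representation): enlargement by a selectable percentage and merging of any intersecting envelopes into their bounding rectangle. The same intersection test can serve for the retention step 322, where only envelopes overlapping an envelope of the previous image are kept.

```python
def enlarge(env, pct=0.2):
    """Grow a rectangular action envelope (y0, x0, y1, x1) by a selectable percentage."""
    y0, x0, y1, x1 = env
    dy, dx = (y1 - y0) * pct / 2.0, (x1 - x0) * pct / 2.0
    return (y0 - dy, x0 - dx, y1 + dy, x1 + dx)

def intersects(a, b):
    """True if two rectangles overlap."""
    return not (a[2] < b[0] or b[2] < a[0] or a[3] < b[1] or b[3] < a[1])

def merge_intersecting(envelopes):
    """Repeatedly merge any pair of intersecting envelopes into their bounding rectangle."""
    envs = list(envelopes)
    merged = True
    while merged:
        merged = False
        for i in range(len(envs)):
            for j in range(i + 1, len(envs)):
                if intersects(envs[i], envs[j]):
                    a, b = envs[i], envs[j]
                    envs[i] = (min(a[0], b[0]), min(a[1], b[1]),
                               max(a[2], b[2]), max(a[3], b[3]))
                    del envs[j]
                    merged = True
                    break
            if merged:
                break
    return envs
```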
In an embodiment, a determination is made between steps 320 and 322 as to whether or not the number of anticipated action envelopes is greater than one or greater than a predetermined constant value. If so, then steps 322 and 324 are omitted and the action matte resulting from step 318 is output as the action matte for the two current input images. This facilitates the processing of images which comprise image data relating to two or more separate moving objects. If not, then processing continues with step 322.
The action envelopes 374 for an input image can be used to produce a corresponding clean plate image. The action envelope 374 defines the region of a current input image 332 or 334 into which background image data is to be mapped from one of the other input images which contains relevant image data, that is background, to remove the action image data of the current input image. The relevant image data is identified by mapping the action envelope onto the other input image using the camera geometry data associated with the current input image and the other input image.
A flow chart 400 for the process of classifying and merging sub-action regions is illustrated in figure 4 and described in detail below. A list of sub-action regions produced at step 316 is passed for processing according to the flow chart 400. At step 402 a group flag is set to false. At step 404 each region in the list is classified into respective equivalence classes according to a predetermined equivalence criterion. Therefore, an equivalence class may comprise a plurality of action envelopes.
A determination is made at step 406 as to whether or not a current equivalence class contains more than one sub-action region. The process of figure 4 is initialised by setting the first region, Region 1, to be a selectable one of the set of sub-action regions currently under consideration. Each sub-action region in a current equivalence class is processed. For a current region, region i, of a current equivalence class the processing is as follows. Region 1 is expanded in order to encompass the current region, region i, at step 408. A determination is made at step 410 as to whether or not Region 1, that is the enlarged region, has a size which represents less than a predeterminable percentage, preferably 30%, of the current image frame. If so, the group flag is set to true at step 412 and the enlarged region 1 is appended to the list of output regions once all sub-action regions have been processed. If not, processing continues at step 414 to append region i to the list of output regions and region 1 reverts to the size it had immediately prior to being enlarged. In summary, region i is appended to the list of output regions if there was no merging, and region 1, that is the expanded region, is appended to the list of output regions if there was merging between sub-action regions.
A determination is then made at step 416 as to whether or not all regions in the current equivalence class have been processed. If not, the region index, i, is incremented by one, a new current region is obtained and processing continues at step 408. If all regions of the current equivalence class have been processed, region 1 is appended to the output list of regions at step 418.
A determination is made at step 420 as to whether or not all equivalence classes for a list of rectangular regions have been processed. If not, the next equivalence class is selected at step 422 and processing continues at step 406.
If so, the output list of regions is stored at step 422. A determination is made at step 424 as to whether or not the group flag has been set to true. If so, the input list of regions is set to equal the current output list of regions at step 426. If not, the current output list of regions is output as representing the action envelopes for the current two input images.
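One reading of flow chart 400 is sketched below. The rectangle representation, the helper names and the classification of each region against the first member of its class are simplifying assumptions; the equivalence test itself is sketched after the description of figure 5 below.

```python
def classify_and_merge(regions, frame_area, equivalent, max_fraction=0.3):
    """Classify rectangular regions (y0, x0, y1, x1) into equivalence classes and merge
    each class, repeating while any merge occurred (group flag true)."""
    def bounding(a, b):
        return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))

    def area(r):
        return max(0.0, r[2] - r[0]) * max(0.0, r[3] - r[1])

    while True:
        group_flag = False                                # step 402
        classes = []                                      # step 404: equivalence classes
        for region in regions:
            for cls in classes:
                if equivalent(cls[0], region):
                    cls.append(region)
                    break
            else:
                classes.append([region])
        output = []
        for cls in classes:
            region1 = cls[0]
            for region_i in cls[1:]:
                candidate = bounding(region1, region_i)   # step 408: expand region 1
                if area(candidate) < max_fraction * frame_area:
                    region1 = candidate                   # step 412: accept the merge
                    group_flag = True
                else:
                    output.append(region_i)               # step 414: keep region i separate
            output.append(region1)                        # step 418
        if not group_flag:                                # step 424
            return output                                 # the action envelopes
        regions = output                                  # step 426: iterate on the merged list
```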
Referring to figure 5, there is shown a portion 500 of an image containing action sub-regions A and B, 502 and 504 respectively. The equivalence criterion referred to above is as follows: action sub-regions A and B are deemed to satisfy the equivalence criterion if either region A intersects region B, or the horizontal distance between the regions is less than the aggregate width of the regions and the vertical distance between the regions is less than the aggregate height of the regions.
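A sketch of this criterion for rectangles held as (y0, x0, y1, x1), suitable for use as the equivalence test assumed in the flow chart sketch above:

```python
def equivalent(a, b):
    """Equivalence criterion of figure 5: the regions intersect, or their horizontal
    separation is less than their aggregate width and their vertical separation is
    less than their aggregate height."""
    ay0, ax0, ay1, ax1 = a
    by0, bx0, by1, bx1 = b
    intersect = not (ay1 < by0 or by1 < ay0 or ax1 < bx0 or bx1 < ax0)
    h_gap = max(0.0, max(ax0, bx0) - min(ax1, bx1))       # horizontal distance between the regions
    v_gap = max(0.0, max(ay0, by0) - min(ay1, by1))       # vertical distance between the regions
    agg_w = (ax1 - ax0) + (bx1 - bx0)                     # aggregate width
    agg_h = (ay1 - ay0) + (by1 - by0)                     # aggregate height
    return intersect or (h_gap < agg_w and v_gap < agg_h)
```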
Although the invention uses the above equivalence criterion, it is not limited thereto. An embodiment can equally well be realised in which a different equivalence criterion is used such as, for example, the distance between selectable corners, preferably the closest corners, of the action envelopes may be compared with the internal diagonals of the action envelopes.
The sub-action regions and action envelopes in the above embodiments are defined as rectangular regions of the corresponding images. However, the present invention is not limited to such rectangular regions. Embodiments can be realised in which the sub-action regions and action envelopes are irregular regions which comprise only the appropriate pixels of the images.
The camera geometry data of the input images is preferably calculated using the invention which is the subject of UK patent application 9721592.5. However, alternative methods may be used to derive the camera geometry data such as the methods described in "Creating Full View Panoramic Image Mosaics and Environment Maps", Computer Graphics Proceedings Annual Conference Series, 1997, by Szeliski, ACM-0-89791-8967/97/008, which is incorporated herein by reference for all purposes.
The above embodiments are described in terms of producing a difference image and then deriving threshold values from the difference image or a processed version thereof. However, it will be appreciated that the threshold value can be calculated on a pixel by pixel basis, that is, without producing a complete difference image. The individual pixels of, for example, the current clean plate image and the current input image may be used directly in producing the threshold value.

Claims (100)

  1. 1. A method for producing an action matte image from input data comprising an input sequence of images, the method comprising the steps of producing a current clean plate from the input sequence of images; comparing the current clean plate image and a current input image of the input sequence of images; and constructing the action matte image in response to the comparison.
  2. 2. A method as claimed in claim 1, wherein the step of comparing comprises the step of producing a difference image, preferably a three-channel difference image, indicating the differences between the current clean plate image and the current input image.
  3. 3. A method as claimed in claim 2, wherein the step of producing the difference image comprises the step of subtracting image data values of corresponding pixels of the current clean plate image and the current input image.
  4. 4. A method as claimed in any of claims 2 to 3, further comprising the steps of selecting at least one portion of the difference image; and calculating a first threshold value from the image data of the at least one portion of the difference image.
  5. 5. A method as claimed in claim 4, wherein the step of selecting the at least one portion of the difference image comprises the steps of distinguishing between those regions of the current input image which contain image data of a first type and those regions of the current input image which contain image data of a second type.
  6. 6. A method as claimed in claim 5, further comprising the step of setting the at least one portion of the difference image to correspond to at least one of those regions of the current input image which contains image data of the second type.
  7. 7. A method as claimed in either of claims 5 or 6, wherein the image data of a first type comprises image data representing a moving object.
  8. 8. A method as claimed in any of claims 5 to 7, wherein the image data of the second type comprises image data representing background scenery.
  9. 9. A method as claimed in any preceding claim, wherein the step of comparing comprises identifying portions of the difference image which represent image data of a moving object.
  10. 10. A method as claimed in claim 9 wherein the step of identifying comprises the step of establishing whether or not image data derived from the image data of the difference image meets a predetermined criterion.
  11. 11. A method as claimed in claim 10 wherein the step of establishing comprises the step of producing a modulus image comprising modulus image values derived from the image data of the difference image; setting the predetermined criterion to be that the image data derived from the image data of the difference image differs from the threshold value by a predetermined margin.
  12. 12. A method as claimed in claim 11, further comprising the step of setting the predetermined margin to be a factor of 7.
  13. 13. A method as claimed in either of claims 11 or 12 further comprising the step of setting the predetermined margin to be a factor of 1.5.
  14. 14. A method as claimed in any of claims 10 to 13, wherein the step of establishing further comprises the steps of concluding that a portion of the difference image represents image data of a moving object if the image data derived from the modulus image exceeds the threshold value by the predetermined margin.
  15. 15. A method as claimed in any of claims 2 to 14, wherein the step of constructing comprises the step of assigning a first selectable value to an action matte frame at sections which correspond to the identified portions of the difference image to produce the action matte image.
  16. 16. A method as claimed in any of claims 2 to 15 wherein the step of constructing comprises assigning a second selectable value to the action matte frame at sections which do not correspond to the identified portions of the difference image.
  17. 17. A method as claimed in either of claims 15 or 16 further comprising the step of initialising the image data of the action matte frame to a third selectable value.
  18. 18. A method as claimed in any preceding claim, wherein the input data comprises, for each or selected input images of the input image sequence, corresponding camera geometry data which includes an indication of the orientation and/or focal length data of the corresponding images at the time of capture or generation of the corresponding images.
  19. 19. A method as claimed in claim 18 further comprising the step of deriving the camera geometry data for each or selectable input images of the input image sequence from the input image sequence.
  20. 20. A method as claimed in any preceding claim, wherein the step of producing a current clean plate image from the input sequence of images comprises the step of constructing the current clean plate image by mapping image data derived from selectable portions of the input sequence of images into the corresponding portions of the current clean plate image.
  21. 21. A method as claimed in claim 20, wherein the step of constructing the current clean plate image comprises the steps of distinguishing between a first region of the current input image which contains image data of a first type, preferably image data representing a moving object, and a second region of the current input image which contains image data of a second type, preferably background image data; mapping image data derived from a region of at least one other input image of the input sequence of images, which region corresponds to the first region of the current input image, into the current clean plate image corresponding to the current input image at a region of the current clean plate image which corresponds to the first region.
  22. 22. A method as claimed in any preceding claim, further comprising the step of using the action matte image to produce an image containing image data in selected regions defined by the action matte image.
  23. 23. A method as claimed in claim 22, wherein the selected regions defined by the action matte image correspond to either background image data or foreground image data.
  24. 24. A method for producing an action matte image from an input sequence of images substantially as described herein with reference to and/or as illustrated in the accompanying drawings.
  25. 25. A method as claimed in any of claims 5 to 24, wherein the step of distinguishing comprises the step of determining the location of image data within the current input image which corresponds to image data of a moving object.
  26. 26. A method as claimed in any of claims 2 to 25, further comprising the step of concluding that a portion of the difference image represents image data of a moving object if the image data derived from the image data of the difference image exceeds the threshold value by the predetermined margin and the image data derived from the difference image is adjacent to image data of the difference image which exceeds the threshold value by the predetermined margin.
  27. 27. A system for producing an action matte image from input data comprising an input sequence of images, the system comprising means for producing a current clean plate from the input sequence of images; means for comparing the current clean plate image and a current input image of the input sequence of images; and means for constructing the action matte image in response to the comparison.
  28. 28. A system as claimed in claim 27, wherein the means for comparing comprises means for producing a difference image, preferably a three-channel difference image, indicating the differences between the current clean plate image and the current input image.
  29. 29. A system as claimed in claim 28, wherein the means for producing the difference image comprises means for subtracting image data values of corresponding pixels of the current clean plate image and the current input image.
  30. 30. A system as claimed in any of claims 28 to 29, further comprising means for selecting at least one portion of the difference image; and means for calculating a first threshold value from the image data of the at least one portion of the difference image.
  31. 31. A system as claimed in claim 30, wherein the means for selecting the at least one portion of the difference image comprises means for distinguishing between those regions of the current input image which contain image data of a first type and those regions of the current input image which contain image data of a second type.
  32. 32. A system as claimed in claim 31, further comprising means for setting the at least one portion of the difference image to correspond to at least one of those regions of the current input image which contains image data of the second type.
  33. 33. A system as claimed in either of claims 31 or 32, wherein the image data of a first type comprises image data representing a moving object.
  34. 34. A system as claimed in any of claims 31 to 33, wherein the image data of the second type comprises image data representing background scenery.
  35. 35. A system as claimed in any of claims 27 to 34, wherein the means for comparing comprises means for identifying portions of the difference image which represent image data of a moving object.
  36. 36. A system as claimed in claim 35 wherein the means for identifying comprises means for establishing whether or not image data derived from the image data of the difference image meets a predetermined criterion.
  37. 37. A system as claimed in claim 36 wherein the means for establishing comprises means for producing a modulus image comprising modulus image values derived from the image data of the difference image; means for setting the predetermined criterion to be that the image data derived from the image data of the difference image differs from the threshold value by a predetermined margin.
  38. 38. A system as claimed in claim 37, further comprising means for setting the predetermined margin to be a factor of 7.
  39. 39. A system as claimed in either of claims 37 or 38, further comprising means for setting the predetermined margin to be a factor of 1.5.
  40. 40. A system as claimed in any of claims 36 to 39, wherein the means for establishing further comprises means for concluding that a portion of the difference image represents image data of a moving object if the image data derived from the modulus image exceeds the threshold value by the predetermined margin.
  41. 41. A system as claimed in any of claims 28 to 40, wherein the means for constructing comprises means for assigning a first selectable value to an action matte frame at sections which correspond to the identified portions of the difference image to produce the action matte image.
  42. 42. A system as claimed in any of claims 28 to 41, wherein the means for constructing comprises means for assigning a second selectable value to the action matte frame at sections which do not correspond to the identified portions of the difference image.
  43. 43. A system as claimed in either of claims 41 or 42, further comprising the means for initialising the image data of the action matte frame to a third selectable value.
  44. 44. A system as claimed in any of claims 27 to 43, wherein the input data comprises, for each or selected input images of the input image sequence, corresponding camera geometry data which includes an indication of the orientation and/or focal length data of the images at the time of capture or generation of the corresponding images.
  45. 45. A system as claimed in claim 44, further comprising means for deriving the camera geometry data for each or selectable input images of the input image sequence from the input image sequence.
  46. 46. A system as claimed in any of claims 27 to 45, wherein the means for producing a current clean plate image from the input sequence of images comprises means for constructing the current clean plate image by mapping image data derived from selectable portions of the input sequence of images into the corresponding portions of the current clean plate image.
  47. 47. A system as claimed in claim 46, wherein the means for constructing the current clean plate image comprises means for distinguishing between a first region of the current input image which contains image data of a first type, preferably image data representing a moving object, and a second region of the current input image which contains image data of a second type, preferably background image data; mapping image data derived from a region of at least one other input image of the input sequence of images, which region corresponds to the first region of the current input image, into the current clean plate image corresponding to the current input image at a region of the current clean plate image which corresponds to the first region.
  48. 48. A system as claimed in any of claims 27 to 47, further comprising means for using the action matte image to produce an image containing image data in selected regions defined by the action matte image.
  49. 49. A system as claimed in claim 48, wherein the selected regions defined by the action matte image correspond to either background image data or foreground image data.
  50. 50. A system as claimed in any of claims 31 to 49, wherein the means for distinguishing comprises means for determining the location of image data within the current input image which corresponds to image data of a moving object.
  51. 51. A system as claimed in any of claims 28 to 50, further comprising means for concluding that a portion of the difference image represents image data of a moving object if the image data derived from the image data of the difference image exceeds the threshold value by the predetermined margin and the image data derived from the difference image is adjacent to image data of the difference image which exceeds the threshold value by the predetermined margin.
  52. 52. A system for producing an action matte image from an input sequence of images substantially as described herein with reference to and/or as illustrated in the accompanying drawings.
  53. 53. A method for determining the location of image data within at least one input image of an input image sequence which corresponds to image data representing a moving object, the method comprising the steps of selecting two current input images from the input image sequence; comparing the two current input images; and identifying the location of image data within at least one of the two current input images in response to the comparison.
  54. 54. A method as claimed in claim 53, further comprising the step of modifying, preferably by re-binning, the two current input images.
  55. 55. A method as claimed in either of claims 53 or 54, wherein the input image data comprises camera geometry data which includes an indication of the orientations and/or the focal length of the camera at the time of capture or creation of the input images of the input image sequence.
  56. 56. A method as claimed in claim 55, wherein the step of comparing comprises the step of aligning the two current input images using the camera geometry data.
  57. 57. A method as claimed in either of claims 55 or 56, further comprising the step of deriving the camera geometry data from the input image sequence.
  58. 58. A method as claimed in either of claims 56 or 57, wherein the step of comparing comprises the step of calculating a threshold value indicative of the differences between the two aligned current input images, the differences being attributable to selectable changes in corresponding image data of the two current input images.
  59. 59. A method as claimed in claim 58, further comprising the step of producing a segmented image which comprises a plurality of sub action regions identifying image data providing an indication of the locations of corresponding portions of the two aligned current input images which differ from the threshold value by a predetermined margin.
  60. 60. A method as claimed in claim 59, further comprising the step of merging selectable ones of the plurality of sub action regions to produce at least one merged action envelope.
  61. 61. A method as claimed in claim 60, further comprising the step of outputting said at least one merged action envelope as the identity of the location of image data within at least one of the two current input images which corresponds to image data of a first type, preferably image data representing a moving object.
  62. 62. A method as claimed in claim 61, further comprising the step of comparing said at least one merged action envelope with at least one other action envelope corresponding to another input image of the input image sequence of images; and identifying any overlapping action envelopes.
  63. 63. A method as claimed in claim 62, further comprising the step of outputting the identified overlapping action envelope as the identity of the location of image data within at least one of the two current input images which corresponds to image data of a first type, preferably image data representing a moving object.
  64. 64. A method as claimed in either of claims 60 or 61, further comprising the step of enlarging the at least one merged action envelope; and outputting the enlarged merged action envelope as the identity of the location of image data within at least one of the two current input images which corresponds to image data of a first type, preferably representing a moving object.
  65. 65. A method as claimed in either of claims 62 or 63, further comprising the step of enlarging the overlapping action envelope; and outputting the enlarged overlapping action envelope as the identity of the location of image data within at least one of the two current input images which corresponds to image data of a first type, preferably image data representing a moving object.
  66. 66. A method as claimed in any of claims 53 to 65, wherein the step of identifying comprises the step of producing an image frame having portions therein which define the identity of the location of the image data within at least one of the two current input images which corresponds to image data of a first type, preferably image data representing a moving object.
  67. 67. A method as claimed in any of claims 59 to 66, wherein the action envelope, overlapping action envelope, sub action regions, enlarged action envelope and/or the enlarged overlapping action envelope are defined as rectangles which encompass the image data representing the difference between the two current input images.
  68. 68. A method as claimed in any of claims 58 to 67, wherein the step of calculating the threshold value comprises the step of producing a difference image, preferably a three-channel difference image, containing an indication of the differences between the two current input images or the two current aligned input images.
  69. 69. A method as claimed in claim 68 wherein the step of producing a difference image comprises the step of subtracting image data values of corresponding portions of the two aligned current input images.
  70. 70. A method as claimed in any preceding claim, further comprising the step of producing a high pass filtered image comprising image data of a first image type.
  71. 71. A method as claimed in claim 70 wherein the step of producing a high pass filtered image comprises the steps of applying a high pass filter to the difference image.
  72. 72. A method as claimed in claim 70 wherein the step of producing a high pass filtered image comprises the step of producing a low pass filtered image of the difference image to remove image data of a first image data type; and subtracting the difference image and the low pass filtered image.
  73. 73. A method as claimed in any of claims 70 to 72, wherein the step of calculating the threshold value comprises the step of calculating the threshold value using the image data of the high pass filtered image.
  74. 74. A method of determining the location of image data within at least one input image of an input image sequence corresponding to image data representing a moving object substantially as described herein with reference to and/or as illustrated in the accompanying drawings.
  75. 75. A method as claimed in claim 25 wherein the step of identifying the location of image data comprises the steps of a method as claimed in any of claims 53 to 74.
  76. A system for determining the location of image data within at least one input image of an input image sequence which corresponds to image data representing a moving object, the system comprising means for selecting two current input images from the input image sequence; means for comparing the two current input images; and means for identifying the location of image data within at least one of the two current input images in response to the comparison.
  77. A system as claimed in claim 76, further comprising means for modifying, preferably by re-binning, the two current input images.
  78. A system as claimed in either of claims 76 or 77, wherein the input image data comprises camera geometry data which includes an indication of the orientations and/or the focal length of the camera at the time of capture or creation of the input images of the input image sequence.
  79. A system as claimed in claim 78, wherein the means for comparing comprises means for aligning the two current input images using the camera geometry data.
  80. A system as claimed in either of claims 78 or 79, further comprising means for deriving the camera geometry data from the input image sequence.
  81. A system as claimed in either of claims 79 or 80, wherein the means for comparing comprises means for calculating a threshold value indicative of the differences between the two aligned current input images, the differences being attributable to selectable changes in corresponding image data of the two current input images.
  82. A system as claimed in claim 81, further comprising means for producing a segmented image which comprises a plurality of sub action regions identifying image data providing an indication of the locations of corresponding portions of the two aligned current input images which differ from the threshold value by a predetermined margin.
  83. A system as claimed in claim 82, further comprising means for merging selectable ones of the plurality of sub action regions to produce at least one merged action envelope.
  84. A system as claimed in claim 83, further comprising means for outputting said at least one merged action envelope as the identity of the location of image data within at least one of the two current input images which corresponds to image data of a first type, preferably image data representing a moving object.
  85. A system as claimed in claim 84, further comprising means for comparing said at least one merged action envelope with at least one other action envelope corresponding to another input image of the input image sequence of images; and means for identifying any overlapping action envelopes.
  86. A system as claimed in claim 85, further comprising means for outputting the identified overlapping action envelope as the identity of the location of image data within at least one of the two current input images which corresponds to image data of a first type, preferably image data representing a moving object.
  87. A system as claimed in either of claims 83 or 84, further comprising means for enlarging the at least one merged action envelope; and means for outputting the enlarged merged action envelope as the identity of the location of image data within at least one of the two current input images which corresponds to image data of a first type, preferably representing a moving object.
  88. A system as claimed in either of claims 85 or 86, further comprising means for enlarging the overlapping action envelope; and means for outputting the enlarged overlapping action envelope as the identity of the location of image data within at least one of the two current input images which corresponds to image data of a first type, preferably image data representing a moving object.
  89. A system as claimed in any of claims 76 to 88, wherein the means for identifying comprises means for producing an image frame having portions therein which define the identity of the location of the image data within at least one of the two current input images which corresponds to image data of a first type, preferably image data representing a moving object.
  90. A system as claimed in any of claims 81 to 89, wherein the action envelope, overlapping action envelope, sub action regions, enlarged action envelope and/or the enlarged overlapping action envelope are defined as rectangles which encompass the image data representing the difference between the two current input images.
  91. A system as claimed in any of claims 81 to 90, wherein the means for calculating the threshold value comprises means for producing a difference image, preferably a three-channel difference image, containing an indication of the differences between the two current input images or the two current aligned input images.
  92. A system as claimed in claim 91, wherein the means for producing a difference image comprises means for subtracting image data values of corresponding portions of the two aligned current input images.
  93. A system as claimed in any of claims 76 to 92, further comprising means for producing a high pass filtered image comprising image data of a first image type.
  94. A system as claimed in claim 93, wherein the means for producing a high pass filtered image comprises means for applying a high pass filter to the difference image.
  95. A system as claimed in claim 93, wherein the means for producing a high pass filtered image comprises means for producing a low pass filtered image of the difference image to remove image data of a first image data type; and means for subtracting the difference image and the low pass filtered image.
  96. A system as claimed in any of claims 93 to 95, wherein the means for calculating the threshold value comprises means for calculating the threshold value using the image data of the high pass filtered image.
  97. A system for determining the location of image data within at least one input image of an input image sequence corresponding to image data representing a moving object substantially as described herein with reference to and/or as illustrated in the accompanying drawings.
  98. A system as claimed in claim 41, wherein the means for identifying the location of image data comprises a system as claimed in any of claims 76 to 97.
  99. A computer program product comprising a storage medium having stored thereon computer program code for implementing a method as claimed in any of claims 1 to 26 or a system as claimed in any of claims 28 to 52.
  100. A computer program product comprising a storage medium having stored thereon computer program code for implementing a method as claimed in any of claims 53 to 75 or a system as claimed in any of claims 76 to 98.
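
As a rough illustration of the difference-image and threshold steps recited in claims 68, 69, 73, 91, 92 and 96, the short Python sketch below computes a three-channel difference image from two aligned frames and derives a threshold from its statistics. The function names, the use of NumPy and the mean-plus-k-standard-deviations rule are illustrative assumptions; the claims do not prescribe any particular statistic or library.

import numpy as np

def difference_image(frame_a, frame_b):
    # Per-channel absolute difference of two aligned H x W x 3 frames,
    # i.e. a three-channel difference image.
    return np.abs(frame_a.astype(np.float32) - frame_b.astype(np.float32))

def noise_threshold(diff, k=3.0):
    # Threshold intended to separate genuine change from residual noise;
    # the mean + k * standard deviation rule is an assumption, not taken
    # from the patent.
    return float(diff.mean() + k * diff.std())
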
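Claims 72 and 95 build the high pass filtered image by low pass filtering the difference image and subtracting the result from the difference image. A minimal sketch of that construction, assuming a simple box filter from scipy.ndimage and a filter size chosen purely for illustration, might look as follows.

from scipy.ndimage import uniform_filter

def high_pass(diff, size=15):
    # Low pass filter each channel of the difference image, then subtract
    # the smoothed result so that only high frequency detail remains.
    low = uniform_filter(diff, size=(size, size, 1))
    return diff - low
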
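The grouping of sub action regions into rectangular action envelopes, and the optional enlargement of an envelope, described in claims 60, 64, 67, 83, 87 and 90 can be pictured with the sketch below. Regions are treated as (x0, y0, x1, y1) rectangles and intersecting rectangles are repeatedly replaced by their common bounding rectangle; the greedy merging order and the fixed enlargement margin are assumptions rather than anything fixed by the claims.

def overlaps(a, b):
    # True if rectangles a and b, each given as (x0, y0, x1, y1), intersect.
    return not (a[2] < b[0] or b[2] < a[0] or a[3] < b[1] or b[3] < a[1])

def merge(a, b):
    # Bounding rectangle of rectangles a and b.
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))

def merge_regions(regions):
    # Greedily merge intersecting sub action regions into action envelopes.
    envelopes = []
    for region in regions:
        r = tuple(region)
        merged = True
        while merged:
            merged = False
            for i, e in enumerate(envelopes):
                if overlaps(r, e):
                    r = merge(r, envelopes.pop(i))
                    merged = True
                    break
        envelopes.append(r)
    return envelopes

def enlarge(envelope, margin):
    # Grow an envelope by a fixed margin on every side.
    x0, y0, x1, y1 = envelope
    return (x0 - margin, y0 - margin, x1 + margin, y1 + margin)
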
GB9815695A 1997-10-10 1998-07-20 Image matte production without blue screen Withdrawn GB2330974A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
GB9815695A GB2330974A (en) 1997-10-10 1998-07-20 Image matte production without blue screen

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
GB9721591A GB2330265A (en) 1997-10-10 1997-10-10 Image compositing using camera data
GB9721592A GB2330266A (en) 1997-10-10 1997-10-10 Generating camera image data for an image processing system
GB9815695A GB2330974A (en) 1997-10-10 1998-07-20 Image matte production without blue screen

Publications (2)

Publication Number Publication Date
GB9815695D0 GB9815695D0 (en) 1998-09-16
GB2330974A true GB2330974A (en) 1999-05-05

Family

ID=27269056

Family Applications (1)

Application Number Title Priority Date Filing Date
GB9815695A Withdrawn GB2330974A (en) 1997-10-10 1998-07-20 Image matte production without blue screen

Country Status (1)

Country Link
GB (1) GB2330974A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE19941644A1 (en) * 1999-08-27 2001-03-01 Deutsche Telekom Ag Method for realtime segmentation of video objects with known stationary image background

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2183878A (en) * 1985-10-11 1987-06-10 Matsushita Electric Works Ltd Abnormality supervising system
GB2215938A (en) * 1988-02-15 1989-09-27 British Aerospace Background differencing operator for target motion detection
EP0502777A1 (en) * 1991-03-05 1992-09-09 Thomson Broadcast Extraction of foreground areas with regard to a background by difference of two images
US5262856A (en) * 1992-06-04 1993-11-16 Massachusetts Institute Of Technology Video image compositing techniques
GB2294171A (en) * 1983-07-15 1996-04-17 Marconi Avionics Signal processing arrangement
EP0771107A1 (en) * 1995-05-12 1997-05-02 Sony Corporation Key signal generating device, picture producing device, key signal generating method, and picture producing method

Also Published As

Publication number Publication date
GB9815695D0 (en) 1998-09-16

Legal Events

Date Code Title Description
732E Amendments to the register in respect of changes of name or changes affecting rights (sect. 32/1977)
WAP Application withdrawn, taken to be withdrawn or refused ** after publication under section 16(1)