GB2352899A - Tracking moving objects - Google Patents

Tracking moving objects

Info

Publication number
GB2352899A
Authority
GB
United Kingdom
Prior art keywords
camera
character
image
space
studio
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
GB0012234A
Other versions
GB0012234D0 (en)
GB2352899B (en)
Inventor
Graham Alexander Thomas
Richard Thomas Russell
Timothy John Sargent
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
British Broadcasting Corp
Original Assignee
British Broadcasting Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by British Broadcasting Corp filed Critical British Broadcasting Corp
Publication of GB0012234D0 publication Critical patent/GB0012234D0/en
Publication of GB2352899A publication Critical patent/GB2352899A/en
Application granted granted Critical
Publication of GB2352899B publication Critical patent/GB2352899B/en
Anticipated expiration legal-status Critical
Expired - Lifetime legal-status Critical Current


Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00Details of television systems
    • H04N5/222Studio circuitry; Studio devices; Studio equipment
    • H04N5/2224Studio circuitry; Studio devices; Studio equipment related to virtual studio applications
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/70Determining position or orientation of objects or cameras
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30244Camera pose

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Studio Devices (AREA)
  • Studio Circuits (AREA)

Abstract

A system which determines the position of an actor in a studio comprises a video camera 30 and a processor. Processor 32 identifies a portion of the actor in an image taken by the camera, typically the feet or head, and combines this with the camera position and orientation to determine the position of the actor. The invention exploits the height of the camera above the floor. More than one camera may be used.

Description

TRACKING OF MOVING OBJECTS
The present invention relates to the tracking of moving objects and is particularly concerned with tracking actors in a television studio for virtual production.
In order to generate a virtual production, in which an actor is mapped onto a simulated background, it is necessary to have some information concerning the position of the actor.
Various known methods for locating actors in a studio exist. However, many of these are expensive and cumbersome to implement, relying on complex or heavy devices worn by an actor.
Surprisingly, we have found that a useful measure of the position of an actor can be obtained far more conveniently than with conventional methods.
In a first aspect the invention provides a method of determining a measure of the position of a character in a space comprising: obtaining an image of the character from a camera; identifying a portion of the character in the image; and determining a measure of the position of the character based on the position of the identified portion in the image and stored information concerning the orientation of the camera with respect to the space.
By appropriate selection of the portion of the character identified, the position of the character can be determined surprisingly accurately. The character may be an actor, or any other moving object such as an animal or an object moved under external control.
Most preferably the portion identified is a foot or the feet of the character. The feet can be readily identified based on a simple assumption that they are the lowest portion of the character in the image and the position can be determined based on the assumption that the feet remain on the floor of the space. It will be appreciated that a simple assumption about the height of the portion enables the two-dimensional image from a single camera to be mapped to three-dimensional space.
Although this method may fail if the character jumps, or if the feet become obscured, this simple set of assumptions can provide remarkably reliable identification suitable for most productions.
In a similar manner, an alternative preferred portion is the head of the character. This has the advantage that characters (usually) have only one head and so a single portion can give an indication of position. The head will usually be the highest portion of the character, so can be readily identified in a similar manner to the feet based on an assumption about the height of the portion. The height of the head must of course be set to a defined value. This may be achieved by using more than one camera, as described elsewhere, or more simply by assuming a particular height for each character.
Preferably the camera is mounted behind the character, viewing the character from a direction approximately opposite a main studio camera. However, the main studio camera may be used. A plurality of cameras may be used, either simultaneously, when the character is visible to more than one, to obtain a more accurate measure, or to enable a larger working space to be used so that the actor is visible to at least one camera over most of the space. Where more than one camera is used, position can be determined by projecting lines from each camera based on the position of the portion in each camera image and determining the point of closest intersection of the lines from each camera. This may enable height to be determined, which is particularly useful if the portion is not assumed to be on the floor (for example if the portion is the head), or may be combined with an assumption about height to enable more accurate measurement or to enable accuracy to be determined. It is important to note that although more than one camera may be used to improve accuracy, the method can function with a single camera.
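By way of illustration, the point of closest intersection of two such lines can be computed in closed form. The sketch below assumes each camera is represented by its position and a unit direction vector towards the detected portion, and it returns the midpoint of the shortest segment joining the two lines; the names and this midpoint rule are illustrative assumptions rather than details given here.

import numpy as np

def closest_point_between_rays(c1, d1, c2, d2):
    # c1, c2: camera positions; d1, d2: unit direction vectors from each
    # camera towards the detected portion of the character.
    w0 = c1 - c2
    a = np.dot(d1, d1)
    b = np.dot(d1, d2)
    c = np.dot(d2, d2)
    d = np.dot(d1, w0)
    e = np.dot(d2, w0)
    denom = a * c - b * b
    if abs(denom) < 1e-9:              # lines are (nearly) parallel
        s, t = 0.0, e / c
    else:
        s = (b * e - c * d) / denom
        t = (a * e - b * d) / denom
    p1 = c1 + s * d1                   # closest point on line from camera 1
    p2 = c2 + t * d2                   # closest point on line from camera 2
    return (p1 + p2) / 2.0             # midpoint of the shortest connecting segment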
The method may include obtaining an indication of whether the character's feet are on the ground, which may be a simple input from an observer (for example by pressing a button when the character jumps or runs) or based on transducers in the actor's shoes or based on analysis of observed motion.
The method preferably further includes providing an output, preferably as a data stream, of a measure of the character position, preferably at regular intervals of at least once every second, and preferably several times a second.
The method may further comprise supplying the position measure to a video effects processor, for example either a 2D or 3D background generator, to generate a background taking into account the measured position. For example, with an updated measure of position, a 2D background generator (which generates a background based on a flat image) and a depth map, an actor can appear to move in and out of virtual objects, both in front and behind, by appropriate masking based on the estimate of the actor's position.
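A minimal sketch of this masking, assuming the actor's estimated depth is simply compared per pixel against a depth map of the virtual scene to decide whether the keyed actor or the background wins; the function and array names are hypothetical.

import numpy as np

def composite(actor_rgb, actor_key, background_rgb, depth_map, actor_depth):
    # actor_key: foreground matte (1 where the actor is visible).
    # depth_map: per-pixel depth of the virtual background.
    # actor_depth: scalar depth estimated by the tracking system.
    # The actor wins only where the key is set AND the estimated actor
    # depth is nearer than the virtual object rendered at that pixel.
    in_front = (actor_key > 0.5) & (actor_depth < depth_map)
    mask = in_front[..., np.newaxis]
    return np.where(mask, actor_rgb, background_rgb)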
Further aspects and preferred features are set out in the claims.
The invention extends to apparatus implementing the method and to virtual production systems incorporating the method or apparatus for character position estimation, as well as to computer programs and computer program products.
An embodiment of the invention will now be described, by way of example, with reference to the accompanying drawings in which:
Fig. 1 depicts a typical set-up;
Fig. 2 shows schematically the image processing required;
Fig. 3 illustrates camera positioning in a studio;
Fig. 4 shows actor positions from a camera viewpoint;
Fig. 5 illustrates how the lowest point in an image seen by a camera is the actor's feet;
Fig. 6 illustrates potential errors when two actors are close together;
Fig. 7 illustrates potential errors when two actors are in line;
Fig. 8 illustrates potential errors when an actor jumps;
Fig. 9 illustrates camera field of view in a typical studio;
Fig. 10 illustrates how studio coverage can be extended;
Fig. 11 shows a test set-up with a camera positioned above a retro-reflective cloth;
Fig. 12 illustrates a partial search before the lowest point of an object is found;
Fig. 13 illustrates a partial search after the lowest point of an object is found;
Fig. 14 depicts a pinhole model of a camera with the pinhole shown in front of the lens;
Fig. 15 shows an example with two points A and B which are close, so the true position of the object is midway between these points;
Fig. 16 shows an example with two points A and B which are not close, so two object positions can be determined;
Fig. 17 shows an example with two points A and B 'crossed', so an actor must be in the air (e.g. jumping), with his/her true position between A and B;
Fig. 18 shows an example with two objects detected, each with points A and B 'crossed', so two actors are detected, both of whom are in the air;
Fig. 19 shows a block diagram of an actor tracking system, based on a PC framegrabber card, in which image processing is performed by the PC;
Fig. 20 shows a block diagram of an actor tracking system, based on a serial port camera, in which image processing is performed by dedicated hardware;
Fig. 21a shows a keyed camera image;
Fig. 21b shows an image thresholded to two levels; and
Figs. 21c-21f show points detected by the search algorithm.
Referring to Figure 1, a main camera 10 obtains an image of actor 20 for use in a virtual production. This camera may include a camera tracking system, for example as described in our International Patent Application No. WO 98/54593, and a chroma keying system, for example as described in our GB-A-2321565, both of whose disclosures are incorporated herein by reference. The output of the camera is supplied to a 2D effects processor 34, preferably including our defocussing and depth mapping to map selected portions of the camera image onto a simulated background. The 2D effects processor normally receives stored information concerning the depth at which the actor is to be placed.
With the improved system, the processor 32 determines the position of the character's feet, using conventional image recognition software and the assumption that the feet are normally the lowest visible part of the actor in the image. Based on stored information concerning the position and orientation of the camera 30, the processor can map image positions to floor positions and hence determine a measure of the position of the actor in the studio space. Where more than one camera is present, the processor can determine the orientation of a line extending from each camera to the portion based on the position of the portion in each camera image, and preferably the line may have a thickness associated with it based on a measure of error in the position determination. The point of closest intersection of the lines may be calculated and the position of the portion calculated from the point of intersection. The processor may determine a full 3D position. By analysing the image, the processor may determine a measure of orientation of the actor, for example to detect whether the actor is seated or crouching, standing or walking (a simple comparison of the approximate height of the character to a threshold may give a measure of orientation). However, a simple 2D position measure of an approximate actor position on the studio floor may be sufficient and in some cases only a simple 1D measure of the actor's distance from the camera or depth may be required. As a further alternative, the processor may simply indicate in which of a number of predefined zones the actor is located, the zones preferably corresponding to regions of a virtual space being simulated.
In the example given, when the actor moves forwards, towards camera 10, the effects processor can register that the actor is now in front of, and so will mask, portions of the image which previously should have masked the actor.
The position of the actor may be supplied to a 3D effects generator and may be employed to control interactive video effects, for example if the actor walks over a virtual trap door.
The information may be supplemented by other information gained from other sources. It will be appreciated that the above described process can be repeated for more than one character in a space independently. Where more than one character is present, it is desirable to provide some means for distinguishing between characters. One way in which this could be done is to derive a measure of the position of portions, for example the feet, of each character and, when those portions are within a pre-determined distance from each other, to proceed on the basis that the portions belong to a single character. When the portions are more than a certain distance apart, the processor should proceed on the basis that the portions belong to different characters. A suitable distance would typically be about 1 metre, although this may have to be adjusted for certain productions.
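As a minimal sketch of this grouping rule (portions within about 1 metre are assumed to belong to the same character), detected foot positions on the studio floor could be clustered as follows; the greedy single-link grouping and the names are illustrative choices, not details given here.

def group_into_characters(foot_positions, max_separation=1.0):
    # foot_positions: list of (X, Y) studio-floor co-ordinates in metres.
    # Positions closer than max_separation are treated as one character.
    characters = []                      # each entry is a list of positions
    for pos in foot_positions:
        for group in characters:
            if any(((pos[0] - p[0]) ** 2 + (pos[1] - p[1]) ** 2) ** 0.5
                   < max_separation for p in group):
                group.append(pos)
                break
        else:
            characters.append([pos])     # start a new character
    return characters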
For certain productions, particularly those involving intimate scenes, this method may fail, and it may be desirable explicitly to identify characters based on their positions within the camera image at given points. For example, when two characters meet in the space, the system may be told to treat the character that enters from the right of the space, provides the upper image at a certain point, and then leaves to the left as character A, and the other as character B. Where explicit character identification is provided, it is desirable to present a simplified image during the rehearsal phase, for example a silhouette image based on chroma-keying, with labels attached to characters and the option to label scenes or actions, the labels and actions being editable during the rehearsal, for use in identification later.
To summarise, the processor may perform the following steps:
(1) Isolate character from camera image or otherwise identify or highlight character.
Where the background is arranged for chroma keying, as will usually be the case in virtual productions, this may be achieved by chroma keying.
(2) Identify chosen portion of character and measure position of chosen portion. In the simplest implementation, the processor may simply determine the x and y co-ordinates in the image of the vertically lowest portion of the shape identified to be a character and use this as an indication of the position of the character's feet. More complex algorithms determining, for example, the position of both feet of a person, and some indication of the approximate posture may also be employed. An indication of posture can be obtained by correlating the image with images of the character, or similar characters, in various positions, for example, sitting, standing, lying, kneeling, walking, running, jumping, etc. Views from more than one camera may be correlated to obtain a more accurate or reliable result. It is important to note that, in contrast with stereogrammetry techniques which require at least two cameras to obtain a position estimate, whilst more than one camera may be employed to give a better position estimate, a position estimate is obtained from a single image based on the assumption about the height of the portion. The image identification process can be refined by recording an image of a character at a given point, or performing a specific action, during a rehearsal (in which an explicit indication of particular points is entered into the recognition apparatus) and then correlating the image to stored images to identify particular events or movements during a live run. This may be used to trigger complex special effects automatically, for example when a character reaches to switch on a virtual light or to knock on a virtual door, this can be signalled to the effects processor automatically based on pattern recognition.
(3) Map measured position to a position within the space. Using the assumption that the lowest portion of the character is on the floor, each pixel position in the camera field of view will correspond to a unique location on the floor of the space. For the majority of the space, assuming a constant floor height, mapping may be achieved simply by means of a linear calculation with coefficients based on the height and angle of the camera. Where the space includes sections of floor at varying heights, different mappings will need to be applied at different positions, so the mapping algorithm will need to incorporate a first step of testing which zone the pixel lies in before applying the appropriate mapping. On highly contoured surfaces, it may be more desirable to store an explicit mapping from each pixel location to an (x,y,z) co-ordinate; this requires more memory, but is faster to implement.
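A minimal sketch of that explicit per-pixel mapping, assuming a lookup table filled offline from the camera calibration and the floor-height model (not shown here); the 720 x 576 image size follows the search algorithm described later, and the names are illustrative.

import numpy as np

WIDTH, HEIGHT = 720, 576
floor_lookup = np.zeros((HEIGHT, WIDTH, 3))   # filled during calibration

def pixel_to_studio(x, y):
    # Return the (X, Y, Z) studio co-ordinate stored for image pixel (x, y).
    return floor_lookup[y, x]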
(4) Output measured position. This is preferably provided as a datastream containing a regularly updated measure of character position. With sophisticated image recognition algorithms, it would also be possible to provide an indication of velocity or other attributes such as posture, and, if "trigger points" have been stored during a rehearsal or otherwise defined, a signal indicating that a "trigger point" has been reached can be output.
Although a number of suitable image recognition algorithms for detecting portions of an image may be employed, a specific example of the processing in a typical implementation will now be described.
To recap, analysing the image obtained from a single video camera looking down to the studio floor provides sufficient information for the positions of actors to be determined, at least in most situations. From knowledge of camera position, rotation (orientation) and focal length, a mapping can be determined to map the object position in the image to object position in the real world. Combining the output of a second camera viewing the same scene from a different viewpoint gives further accuracy and offers other benefits, as described below.
An outline of a system based on this technique is shown in Figure 2.
The positioning of a tracking camera is important to improve the utility of the actor tracking system. A number of possible mounting locations available in a television studio have been considered, including the corners of the studio, the ceiling (on the lighting grid) and at the top of the front, rear or side walls (which term is intended to encompass temporary partitions or similar defining a working space). Mounting cameras on the lighting grid has advantages in that cameras are unobtrusive and likely to have a good 'view' of the studio floor, but certain areas of the studio may be obscured by lights or circular bar-codes. Rather than mounting a camera in the centre of the lighting grid, there are a number of advantages to having cameras at the edge of the studio, where obstruction of the view by other objects is less likely. The or each camera is preferably positioned at a height of at least 2m, preferably at least 2.5m, more preferably at least about 3m. Ideally, the camera is at a height of at least about 4m and typically about 5m so as to look down at the portion to be identified at an angle.
It is useful first to consider a small studio, which requires only a single tracking camera, as schematically depicted in Fig. 3, which shows the position of a single tracking camera as might be used in a very small studio. The camera is mounted on the back wall facing down and towards the front. It is this arrangement which is considered most suitable. The positions of actor A and actor B correspond to points A and B seen from the camera viewpoint in Figure 4.
With the camera mounted on the rear wall, and since nearly all 'action' occurs with actors facing the main studio camera, the lowest point in the camera's viewpoint is likely to be the actor's feet, as can be seen from Fig. 5. This simplifies the image processing necessary to determine actor position.
A number of situations exist where an actor tracking system using cameras may produce an erroneous result. Some of these situations are extremely unlikely to occur, whilst others could be considered as normal 'day to day' occurrences in a television studio. A discussion of how the camera arrangement described above can be expected to perform in each situation follows.
Adjacent Objects
If objects are adjacent (that is, they touch or are close to one another), it is sometimes difficult to resolve the objects as being separate. Consequently, two objects may share an x, y position (white dot) that in fact lies at the lowest point in the image, rather than having two discrete (and correct) x, y positions (black dots). This effect can be seen in Figure 6.
This situation is undesirable, but may not prevent adequate performance of the system. It is reasonable when switching a mask signal, generating a depth map or controlling the position of an acoustic source (three possible applications of co-ordinate data given in section 1) to represent the two objects as a single co-ordinate.
An incorrect result is produced if one object appears directly 'in front' of the other from the viewpoint of the tracking camera. Figure 7 demonstrates this situation, in which the position of the actor closest to the tracking camera (actor A) is interpreted as the position of the only object in the scene. A second, rear-facing tracking camera mounted at the front of the studio could be used to overcome this problem. If the y-position obtained from the front and the rear cameras is sufficiently different, it can be reported that two objects are present.
Feet leave the floor
The feet of the actor (or lowest point of any other object) are used to determine the (x, y) position. Therefore if the feet of the actor leave the floor, an incorrect position is calculated (see Figure 8). In most virtual studio productions, it is unlikely that an actor will be jumping around, but it is likely that an actor may climb onto a covered block in order to stand on a virtual object.
The error that is produced varies according to the distance between the camera and the step. A step up near the front edge of the studio results in a large error, whilst a step up next to the back wall gives little or no error. A second, rear-facing tracking camera would provide sufficient extra information to allow the exact position of the object to be triangulated.
Arms in the air
Accurate positional measurement relies upon the feet of the actor being the lowest point in the image seen by the tracking camera. This is the case in most normal conditions. However, when an actor waves his or her arms around, the viewpoint of the tracking camera may cause the arms to become the 'lowest' point in the image. The position of the tracking camera is chosen to minimise this situation - it only occurs if the actor is standing towards the back of the studio, facing the rear wall, with his/her arms stretched out at 90° - a very unlikely pose! Again, a second camera looking from the front of the studio would provide useful extra information.
Sitting, Lying, Crouching, Bending
Actors sitting on a chair, on the floor, or on some other object are all likely scenarios. When sitting on the floor, bending or crouching, the position reported by the tracking system will be the position of some part of the actor's body, but not necessarily of the feet. When sitting on a chair (assuming the chair was facing forwards), a single rear-mounted tracking camera would report the depth of the actor to be the position of the back legs of the chair. In reality, the error between the position of the back of the chair and the head and body of the actor is likely to be around 12cm, a result acceptable for most situations.
Amount of Retro-reflective/Key-colour cloth
The generation of an accurate 'key signal' from the tracking camera relies on there being background cloth (preferably retro-reflective background cloth as described in our UK patent number GB-B-2321565) covering the entire field of view of the camera. With a tracking camera looking from the back of the studio towards the front, it is conceivable that the upper part of the actor may not be seen against cloth. An assumption can be made that the actor never leaves the retro-reflective background covered floor area in the studio, therefore at all times some part of the actor is viewed against cloth. The algorithm described below searches each object for the lowest point in the image (i.e. the point nearest the back wall of the studio) starting from the bottom of the image. It may be necessary to ignore any positional information past the edge of the retro-reflective background/blue cloth (this is a trivial operation, since we know the coverage area of the cloth).
Multiple Cameras
Above we have considered how in some situations a single tracking camera may produce a false position. Although this may be acceptable in many cases, such errors can be reduced or eliminated by the use of a second camera covering the same area but from a different viewpoint. More accurate positioning of actors is offered by using two cameras in the following situations:
- Actors standing directly in front of and behind one another.
- Actors working at a height.
- Actors waving their hands.
Multiple cameras also offer the advantage of a larger coverage area, and in all but the smallest studios an array of a number of cameras is preferably provided to track actors across the entire floor area.
A PC based graphical user interface can readily allow monitoring and control of multiple cameras, and could potentially offer control of additional features, such as allowing the user to define areas of the studio in which objects are not tracked. These 'garbage zones' are useful in situations where studio cameras or production staff may be moving.
Camera Installation in a Typical Studio
We will discuss some of the issues surrounding installation of tracking cameras in a typical virtual studio. Suitable mounting positions are presented for a studio of approximately 9.5m by 7.5m by 5m.
Mounting Positions
An optimum camera position was discussed earlier in this document, and is shown in Figure 3. Retro-reflective cloth was arranged to surround three sides of the studio, supported by a tubular frame 5m from the floor. The tracking camera(s) were mounted on this frame.
Lens angle of view Wide viewing angles offer a large area of coverage with minimal cameras, but introduce significant amounts of lens distortion. It is possible to correct the effects of this distortion if a camera system is properly calibrated.
Calculations of angle of view for given focal lengths are based on a camera with 1/2" (6.4mm by 4.8mm) CCD, such as the Pulnix TMC-6EX.
Three lenses were considered as follows:
Focal length    Horizontal angle of view    Vertical angle of view
4.8mm           71°36'                      54°33'
6.5mm           52°25'                      40°31'
8.5mm           41°15'
The coverage area that each lens offers depends on the orientation of the camera ('landscape' or 'portrait'), as shown in Figure 9. A lens with 4.8mm focal length (the widest angle of view) is shown by way of example. Area calculations are based on a camera mounted at a height of 5m; it can be seen that a coverage area of about 5m by 15m was obtained with a portrait orientation and 7m by 7m with a landscape orientation.
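For comparison, an idealised pinhole model gives a horizontal angle of view of 2.arctan(w / 2f) for a sensor of width w. With the 6.4mm wide CCD and the 6.5mm lens this is 2.arctan(3.2 / 6.5), approximately 52.4°, in close agreement with the 52°25' quoted above; manufacturers' figures for real lenses, particularly the wide-angle 4.8mm lens, can be expected to depart somewhat from this idealised value.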
Number of cameras required
The studio dimensions (i.e. the area covered in retro-reflective material) are approximately 9.5m by 7.5m by 5m. For coverage of all or most of the studio floor, the most suitable camera arrangements (for single camera operation) are shown in Figure 10, which shows the floor coverage offered by various camera arrangements.
For a simple system consisting of a single camera, a 4.8mm focal length lens should be used. The camera should be mounted landscape (i.e. its usual orientation). Although this arrangement would not provide total coverage of the studio, a significant area would be seen by the tracking camera.
Algorithm Development
Calibration
Camera calibration involves determining the relationship between any point in the CCD image and the real-world location of that point. It is also necessary to obtain information about the relative positions of several cameras in the studio. Software is available that can be adapted to perform the necessary functions.
It can be assumed that the intrinsic parameters (including focal length and lens distortion) of the camera system are known. There are six degrees of freedom for the extrinsic camera parameters - three for the camera translation and three for the camera orientation.
Identifying Objects in an Image
This section describes a technique developed to identify objects in the image, and find the lowest co-ordinates of each object.
Test image sequences - experimental results
Test sequences of humans walking and running around a retro-reflective covered studio floor were captured using a camera equipped with a ring of blue LEDs to illuminate the retro-reflective cloth. A total of around 12 minutes of footage was shot, consisting of views of calibration markers and of humans walking, running and sitting. Short sequences were captured, most of which are chroma-keyed versions of the camera output.
Of these 12 minutes on Betacam SP, around 5 minutes of image data was transferred to a digital image disk store (the Accom 2XTREME) to act as test data for the algorithm described below. A number of individual frames, taken at various points during image analysis, can be seen in Figure 21.
Data was captured demonstrating workability even with less than optimal camera positioning. For gathering these sequences, the camera was pointing directly down at the floor, as shown in Figure 11, which shows the position of the camera for the test data.
Image Analysis
The image from each camera is analysed to identify the important features in the keyed silhouette. Since the input is a keyed image (it has been through an external chroma-keyer), this is a simple threshold operation. Thresholding can be performed using relatively simple algorithms which may be available from many graphics toolkit libraries.
Images are thresholded at a value of -80 (-112 is black, +107 is white) to yield a binary image. This operation separates the background (CSO cloth) from any objects, and effectively sharpens the object edges.
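In NumPy terms this threshold step reduces to a single comparison; the grey-level convention follows the figures quoted above, and the function name is an illustrative assumption.

import numpy as np

def threshold_key(keyed_image, level=-80):
    # keyed_image: grey image from the chroma-keyer (-112 black, +107 white).
    # Returns a boolean silhouette: True where a pixel belongs to an object.
    return keyed_image < level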
A software algorithm was written which expects a foreground object to be black (less than -1) and the image background to be non-black (any other grey value).
Two options are available for image analysis and object position calculation. Firstly, a PC-based video capture card may be used to grab the image from a camera. Subsequent analysis and object position calculation can be performed by a PC-based actor tracking application. Secondly, a modified version of the hardware developed for the free-d camera tracking system could carry out image analysis and object detection, transmitting any (image) object co-ordinates to the actor tracking application via an RS232 or RS422 serial link.
Finding Object Position in an Image
This section describes the algorithm developed to detect the lowest point of each object in the processed camera image. It will be appreciated of course that the algorithm can be readily modified to find the highest point in the image (if the head is being used as a reference point).
In order to determine the position of the lowest point in any object, an area below and to either side of each pixel lying on an object is searched. If the area contains part of an object, the bottom is not at the current (x, y) position. If the area does not contain any further object points, then the bottom of the object is at the current (x, y) position.
Figures 12 and 13 demonstrate this: Figure 12 shows the case where the current (x, y) position is not the bottom of the object and Figure 13 shows the case where the current (x, y) position is the bottom of the object.
In the following algorithm, xsearch and ysearch are user-determined search parameters describing the width and height of the search area. These need to be adjusted manually. A search area which is too small may result in a single object being interpreted as many objects, i.e. each foot is seen as a new object. Too large a search area may not be able to resolve two objects as being separate. The choice of xsearch and ysearch is a compromise between these two considerations. Default values are xsearch = 100 and ysearch = 40. These values work well for most images.
Search algorithm:
Get xsearch and ysearch
FOR (x = 1, x++, x = 720)        ; for all x
  FOR (y = 1, y++, y = 576)      ; for all y
    IF (x, y) is not part of object, do nothing
    ELSE search the area:
      x - (xsearch / 2) to x + (xsearch / 2), y + 1 to y + ysearch
      IF area contains no black pixels, object FOUND.
        Bottom of object is at (x, y). Store position.
      ELSE, bottom of object NOT FOUND.
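A direct transcription of this search into Python/NumPy might look as follows; the 720 x 576 image size, the default parameter values and the convention that True marks object pixels follow the text above, while the function name and the half-width reading of xsearch are assumptions of the sketch.

import numpy as np

def find_lowest_points(binary, xsearch=100, ysearch=40):
    # binary: 2D boolean array (rows x columns), True where a pixel belongs
    # to an object in the thresholded key image; row 0 is the top of the image.
    # A pixel is reported as a 'bottom' if the area below it and to either
    # side contains no further object pixels.  Neighbouring bottom pixels of
    # the same object will all be reported; grouping them is not shown here.
    height, width = binary.shape
    bottoms = []
    for y in range(height):
        for x in range(width):
            if not binary[y, x]:
                continue
            x0 = max(0, x - xsearch // 2)
            x1 = min(width, x + xsearch // 2 + 1)
            y0 = y + 1
            y1 = min(height, y + ysearch + 1)
            if not binary[y0:y1, x0:x1].any():
                bottoms.append((x, y))    # bottom of an object at (x, y)
    return bottoms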
Results
Figure 21 contains a number of single frame images from the test data before, after, and at various stages throughout the image analysis and object finding process. It can be seen, particularly from Fig. 21c, that the algorithm for finding the "lowest" point in the image can pick points which are not the most appropriate when the camera is directly above the studio. For this reason, the positioning shown in Fig. 3 is preferable. Preferably the camera is mounted on or adjacent a wall, preferably adjacent the rear wall of the studio (with respect to the main camera) and preferably at a height above the portion of interest, preferably in the range 2m to 10m, typically about 5m. If a camera is positioned looking down, it may be preferable to adapt the algorithm, for example to determine a measure of the position of the centre of the character, for example by determining the centroid based on an averaging algorithm.
Conversion from Image to Real-world position
Given the co-ordinate positions of an object in the image plane, we need to know the corresponding position of the object on the studio floor. In other words, given image co-ordinates (x, y), it is necessary to perform the mapping which gives studio co-ordinates (X, Y, Z).
It is useful to consider the camera as a pinhole model. In this case, the pinhole will be in front of the lens to avoid reversal of the y-axis. Fig. 14 depicts a pinhole model of a camera with pinhole shown in front of the lens.
The vector from the pinhole to the point on the image sensor where an object appears is:

    P = (x, y, -1)^T   (in units of focal length)

and the camera rotation is described by a 3 x 3 rotation matrix

        r11 r12 r13
    M = r21 r22 r23
        r31 r32 r33

C is the position of the camera, and M is the rotation of the camera. These are determined by the calibration routine. L describes the position on the studio floor of the object seen in the image sensor. Since we are only interested in objects on the floor, the z-co-ordinate component of L is always zero.
So, for a point (x, y) in our image (in units of focal length), it is necessary to find the position of a point on the floor (L) of the studio that corresponds to this.
Firstly, the mapping from world point to camera, C - L, is:

    C - L = μ . M . P

so that

    (Cx - X, Cy - Y, Cz - 0)^T = μ . M . (x, y, -1)^T

(X and Y are the x and y co-ordinates of L, and the 'unknowns' that we are trying to find.)
Now:

    Cx - X = μ . [M.P]x    (where [M.P]x means the x component of M.P)
    Cy - Y = μ . [M.P]y
    Cz     = μ . [M.P]z

Multiplying the top two equations by the bottom equation gives:

    (Cx - X) . [M.P]z = Cz . [M.P]x    (Eq. 1)
    (Cy - Y) . [M.P]z = Cz . [M.P]y    (Eq. 2)

Rearranging Eq. 1 (X unknown):

    X = Cx - Cz . [M.P]x / [M.P]z

Similarly for Eq. 2 (Y unknown):

    Y = Cy - Cz . [M.P]y / [M.P]z

Since

    M.P = (x.r11 + y.r12 - r13,  x.r21 + y.r22 - r23,  x.r31 + y.r32 - r33)^T

this gives

    X = Cx - Cz . (x.r11 + y.r12 - r13) / (x.r31 + y.r32 - r33)
    Y = Cy - Cz . (x.r21 + y.r22 - r23) / (x.r31 + y.r32 - r33)
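A minimal implementation of this mapping, assuming C and M are available from the calibration routine and that image co-ordinates are expressed in units of focal length as above:

import numpy as np

def image_to_floor(x, y, C, M):
    # x, y: image co-ordinates in units of focal length.
    # C: camera position (Cx, Cy, Cz) as a length-3 array.
    # M: 3x3 camera rotation matrix.
    # Returns the studio floor position (X, Y, 0) of the object.
    P = np.array([x, y, -1.0])     # pinhole in front of the lens (Fig. 14)
    MP = M @ P
    X = C[0] - C[2] * MP[0] / MP[2]
    Y = C[1] - C[2] * MP[1] / MP[2]
    return X, Y, 0.0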
Integration of data from multiple cameras
Above we discussed the idea of allowing multiple cameras to be used in an actor tracking system; some further considerations are discussed below.
Zoned cameras
To allow for coverage of a large studio floor area, it is necessary to have a multiple camera system whereby the area is split into 'zones', each zone being viewed by a different camera. In combining data from multiple cameras it is important to consider that zones may overlap; therefore the same object may be viewed simultaneously by cameras from adjacent zones. The actor tracking system must know the location and orientation of each of the cameras with respect to one another and with respect to the studio origin.
Front and rear looking cameras
For a system with more than one camera viewing the same area of studio (to offer increased accuracy), it must be possible to distinguish between front and rear camera information. Information from two cameras can be used to obtain accurate positions of objects when they are off the ground.
Figures 15 to 18 demonstrate four situations that may be encountered when using front and rear looking cameras. These diagrams help to visualise exactly which points of each object will be detected. The location of these points can tell us about the number of objects in a scene, and if the objects have left the floor.
In Fig. 15, points A and B are close, so the true position of the object is midway between these points.
In Fig. 16, points A and B are not close - two object positions can be determined.
In Fig. 17, points A and B are 'crossed' - actor must be in the air (e.g. jumping), with his/her true position between A and B. In Fig. 18, two objects are detected, each with points A and B 'crossed' - two actors are detected, both of whom are in the air.
A user determined parameter may be used to set the distance threshold between objects being interpreted as close, and objects being interpreted as far apart, the parameter being set empirically for a particular production or studio.
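A sketch of how this threshold might be applied to the depth estimates obtained from a rear-looking and a front-looking camera, following the four situations of Figures 15 to 18; the convention that depths are measured from the back wall, and the default value of the threshold, are assumptions of the sketch rather than values given here.

def fuse_front_rear(rear_depth, front_depth, close_threshold=1.0):
    # rear_depth: floor position reported by the rear camera (its point A),
    # front_depth: floor position reported by the front camera (its point B),
    # both measured as distance from the back wall of the studio.
    if abs(rear_depth - front_depth) < close_threshold:
        # Points close: one object on the floor, midway between A and B.
        return [("one object", (rear_depth + front_depth) / 2.0)]
    if rear_depth > front_depth:
        # Points 'crossed': one object off the floor, true position between A and B.
        return [("one object in the air", (rear_depth + front_depth) / 2.0)]
    # Points far apart and not crossed: two separate objects.
    return [("object", rear_depth), ("object", front_depth)]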
Design of user interface
An actor tracking system can readily be implemented as a piece of PC software that provides an intuitive graphical user interface (GUI). A suitable GUI allows the user to control acquisition of co-ordinate data from specifically designed hardware through the serial port, or from a video capture card in the PC. Camera calibration and control of the destination of actor location data via a network link may be invoked and monitored. The software may also store data associated with an individual studio set-up (for example studio size, tracking camera locations and other calibration data).
Two suggested block diagram designs are shown in Figures 19 and 20. Figure 19 shows a block diagram of an actor tracking system, based on a PC framegrabber card, in which image processing is performed by the PC.
Figure 20 shows a block diagram of an actor tracking system, based on a serial port camera, in which image processing is performed by dedicated hardware. A number of modifications will be apparent. For example, it would be possible to replace the cameras described above with cameras sensitive only to infrared light. Any 'illumination' of the studio would be invisible both to the human eye and to studio cameras, which have infrared filters attached. Structured light may be used to enhance measurement of position; it is important to note that this is not essential.
All preferable and optional features mentioned above may be independently provided or applied to other methods, unless otherwise stated. The appended abstract is incorporated herein by reference.

Claims (34)

1. A method of determining a measure of the position of a character in a space comprising:
obtaining an image of the character with camera means; identifying a portion of the character; and determining a measure of the position of the character within the space based on the position of said portion within the image and the position and orientation of the camera means with respect to the space.
2. A method according to Claim 1, wherein the image is mapped to a position within the three dimensional space based on a predetermined constraint.
3. A method according to Claim 2, wherein the constraint comprises an assumption about the height of the portion.
4. A method according to any preceding claim, wherein the portion comprises at least one of the character's feet.
5. A method according to Claim 4 as dependent on Claim 3, wherein the height is assumed to be substantially ground level.
6. A method according to any of Claims 1 to 3, wherein the portion comprises the head of the character.
7. A method according to Claim 6 as dependent on Claim 3, wherein the height is assumed to be a predetermined stored height.
8. A method according to Claim 7, further comprising inputting a value for the height.
9. A method according to Claim 7, further comprising determining a measure of the height based on an image obtained from the camera means.
10. A method according to any preceding claim wherein the camera means includes a camera mounted behind the character.
11. A method according to any preceding claim, wherein the camera means comprises a plurality of cameras arranged to view the character from different points.
12. A method according to Claim 11, wherein an estimate of position is obtained based on images from more than one camera if available, but an estimate of position is obtained from a single camera if only a single image is available.
13. A method according to any preceding claim wherein the camera means includes a studio camera.
14. A method according to any preceding claim further comprising supplying a regularly updated measure of position to an effects processor for simulating a virtual scene.
15. A method according to any preceding claim further comprising outputting an indication of the character substantially attaining a pre-determined position.
16. A method according to Claim 15, wherein said position or configuration is identified based on a configuration or position stored during a rehearsal or training phase.
17. A method according to Claim 7 or 8, further comprising triggering an effect based on said indication.
18. A method according to any preceding claim, wherein more than one character is identified.
19. A method according to Claim 18, further comprising distinguishing between characters.
20. A method according to Claim 19, wherein said distinguishing is based on separation within the image.
21. A method according to Claim 19 or 20, wherein distinguishing is based on movement of the characters, preferably based on predetermined movement or position criteria.
22. A method according to any preceding claim wherein identifying the character comprises isolating the character from other portions of the camera image.
23. A method according to Claim 22, wherein isolating is based on chroma keying.
24. A method according to any preceding claim wherein the camera is positioned at an edge of the space.
25. A method according to any preceding claim wherein the camera is positioned above the portion to be identified.
26. A method according to any preceding claim wherein the camera is positioned above the portion to be identified and offset from the space in which the character moves.
27. A method according to any preceding claim wherein the space comprises a studio and the camera is positioned at a height of at least 2.5m adjacent a studio wall.
28. Apparatus adapted and arranged to implement the method of any preceding claim.
29. A method or apparatus substantially as any one herein described.
30. Apparatus for determining a measure of the position of a character in a space comprising:
image input means for obtaining an image of the character from camera means; processing means for identifying a portion of the character; means for storing a measure of the camera position; and position determining means for determining a measure of the position of the character within the space based on the position of said portion within the image and the position and orientation of the camera means with respect to the space.
31. Apparatus according to Claim 30, wherein the position determining means is arranged to determine the position based on an assumption about the height of the portion.
32. Apparatus according to Claim 30 or 31 wherein the portion comprises a foot of the character.
33. Apparatus according to any of Claims 30 to 32 arranged to perform a method according to any of Claims 1 to 27.
34. A computer program or computer program product containing instructions for performing a method according to any of Claims 1 to 27.
GB0012234A 1999-05-21 2000-05-19 Tracking of moving objects Expired - Lifetime GB2352899B (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
GBGB9911935.6A GB9911935D0 (en) 1999-05-21 1999-05-21 Tracking of moving objects

Publications (3)

Publication Number Publication Date
GB0012234D0 GB0012234D0 (en) 2000-07-12
GB2352899A true GB2352899A (en) 2001-02-07
GB2352899B GB2352899B (en) 2004-01-07

Family

ID=10853965

Family Applications (2)

Application Number Title Priority Date Filing Date
GBGB9911935.6A Ceased GB9911935D0 (en) 1999-05-21 1999-05-21 Tracking of moving objects
GB0012234A Expired - Lifetime GB2352899B (en) 1999-05-21 2000-05-19 Tracking of moving objects

Family Applications Before (1)

Application Number Title Priority Date Filing Date
GBGB9911935.6A Ceased GB9911935D0 (en) 1999-05-21 1999-05-21 Tracking of moving objects

Country Status (3)

Country Link
AU (1) AU5084100A (en)
GB (2) GB9911935D0 (en)
WO (1) WO2000072264A1 (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
ITRM20010045A1 (en) 2001-01-29 2002-07-29 Consiglio Nazionale Ricerche SYSTEM AND METHOD FOR DETECTING THE RELATIVE POSITION OF AN OBJECT COMPARED TO A REFERENCE POINT.
GB2386489B (en) * 2002-03-15 2006-10-04 British Broadcasting Corp Virtual studio system
ATE548987T1 (en) * 2006-11-28 2012-03-15 Koninkl Philips Electronics Nv DEVICE FOR DETERMINING A POSITION OF A FIRST OBJECT WITHIN A SECOND OBJECT

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
SE9700066L (en) * 1997-01-13 1998-07-06 Qualisys Ab Method and apparatus for determining the position of an object

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2166919A (en) * 1984-10-12 1986-05-14 Gec Avionics Position indicating apparatus
GB2215092A (en) * 1988-01-30 1989-09-13 Toshiba Kk Control of microphone position to receive voice input
EP0530049A1 (en) * 1991-08-30 1993-03-03 Texas Instruments Incorporated Method and apparatus for tracking an aimpoint with arbitrary subimages
EP0556476A1 (en) * 1992-01-21 1993-08-25 Yozan Inc. Method for detecting the location of human beings in three-dimensions
GB2270435A (en) * 1992-09-05 1994-03-09 Ibm Stereogrammetry
WO1997015900A1 (en) * 1995-10-27 1997-05-01 Alfa Laval Agri Ab Teat location for milking
WO1998047348A1 (en) * 1997-04-23 1998-10-29 Alfa Laval Agri Ab Apparatus and method for recognising and determining the position of a part of an animal

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7468778B2 (en) 2002-03-15 2008-12-23 British Broadcasting Corp Virtual studio system
US10207193B2 (en) 2014-05-21 2019-02-19 Universal City Studios Llc Optical tracking system for automation of amusement park elements
US10467481B2 (en) 2014-05-21 2019-11-05 Universal City Studios Llc System and method for tracking vehicles in parking structures and intersections
US10661184B2 (en) 2014-05-21 2020-05-26 Universal City Studios Llc Amusement park element tracking system
US10729985B2 (en) 2014-05-21 2020-08-04 Universal City Studios Llc Retro-reflective optical system for controlling amusement park devices based on a size of a person
US10788603B2 (en) 2014-05-21 2020-09-29 Universal City Studios Llc Tracking system and method for use in surveying amusement park equipment

Also Published As

Publication number Publication date
GB9911935D0 (en) 1999-07-21
AU5084100A (en) 2000-12-12
GB0012234D0 (en) 2000-07-12
WO2000072264A1 (en) 2000-11-30
GB2352899B (en) 2004-01-07

Similar Documents

Publication Publication Date Title
EP1211520B1 (en) Position determination
KR100940142B1 (en) Method and apparatus for providing immersive surveillance
EP1594322A2 (en) Position determination
US8594425B2 (en) Analysis of three-dimensional scenes
US7623250B2 (en) Enhanced shape characterization device and method
US20070098250A1 (en) Man-machine interface based on 3-D positions of the human body
KR20040055310A (en) Apparatus and method for high-speed marker-free motion capture
TWI715903B (en) Motion tracking system and method thereof
JP3637226B2 (en) Motion detection method, motion detection device, and recording medium
GB2475104A (en) Detecting movement of 3D objects using a TOF camera
JP3577875B2 (en) Moving object extraction device
GB2352899A (en) Tracking moving objects
KR101021015B1 (en) Three dimension user interface method
CN117751393A (en) Method and system for detecting obstacle elements using vision assistance device
GB2325807A (en) Position determination
CN113841180A (en) Method for capturing movement of an object and movement capturing system
KR101913179B1 (en) Apparatus for Infrared sensing footing device, Method for TWO-DIMENSIONAL image detecting and program using the same
JP3616355B2 (en) Image processing method and image processing apparatus by computer
KR20170132612A (en) Image generating apparatus including distance information

Legal Events

Date Code Title Description
PE20 Patent expired after termination of 20 years

Expiry date: 20200518