WO2009109804A1 - Method and apparatus for image processing

Info

Publication number: WO2009109804A1
Authority: WO
Grant status: Application
Prior art keywords: meshes, image, layers, layer, images
Application number: PCT/IB2008/001375
Other languages: French (fr)
Inventor: Stephane Jean Louis Jacob
Original Assignee: Dooworks Fz Co
Priority date: 2008-03-07
Filing date: 2008-03-07

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00 Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/20 Image signal generators
    • H04N13/261 Image signal generators with monoscopic-to-stereoscopic image conversion
    • H04N13/275 Image signal generators from 3D object models, e.g. computer-generated stereoscopic image signals

Abstract

A stream of images is acquired from a computer generated source or elsewhere. Elements of the images are assigned to different layers on a frame by frame basis and layering data is held in an associated alpha channel. The layers are then rendered and texture mapped onto a respective one of a plurality of meshes. A portion of the mapped data is selected by a virtual camera which is controlled by a user to move around and through the meshes. The selected data comprises data from one or more of the meshes and is displayed to a user.

Description

METHOD AND APPARATUS FOR IMAGE PROCESSING

The invention relates to image processing and, in particular, to the processing of two dimensional images to provide three dimensional or stereoscopic effects.

There have been many attempts made to produce three dimensional movies which, although projected onto a two dimensional screen, give an illusion of depth. Historically the source image is filmed simultaneously by two cameras, positioned either side by side or facing each other at a 90° angle and filming via mirrors. The cameras must be perfectly synchronised and have identical technical characteristics. Both images are exposed on the film slightly spaced apart and are viewed in a way that causes the viewer's brain to interpret the pair of images stereoscopically as a three dimensional image. In one known method anaglyph images are produced by forming the pair of images as different colours. The viewer sees the movie through a pair of glasses which have a different colour filter in each eye; usually red in the left and cyan in the right. Another viewing technique uses polarised glasses. The two images are projected onto the screen using orthogonal polarising filters and the viewer watches the images through a pair of glasses that have similarly orthogonal polarising filters in each eye. The effect is that each eye sees one of the two images and the brain forms these into a single stereoscopic image.

There have also been a number of autostereoscopic systems proposed which do not require special viewing aids. These systems tend to use lenticular lenses or parallax barriers over a flat panel display and require the viewer to be positioned correctly to perceive a different image with each eye. Systems are known that use eye tracking to adjust the displayed images to follow the viewer's eyes as his head moves. Autostereoscopic systems are more suitable for television or other home use, rather than for movie theatres.

We have appreciated that existing stereoscopic and 3-D systems all suffer from significant drawbacks. For example, the movie theatre systems have the considerable disadvantage of requiring twin projectors and also, whether anaglyphic or polarised, require the user to wear special glasses to appreciate the three dimensional effects. Autostereoscopic systems only produce a limited three dimensional effect, as the effect is generated by the manner in which the surface of the display is formed and so requires correct positioning of the viewer, which is neither practical nor desirable.

The present invention aims to provide a method and apparatus which improves on the prior art systems described above.

In its broadest form, the invention processes image data to divide the image into different layers, each of which is processed independently. The user can navigate through the layers giving an illusion of three dimensions.

More specifically, the invention is defined in the independent claims to which reference is directed.

In a preferred embodiment of the invention a sequence of images such as a movie is separated into a number of layers which are then texture mapped onto meshes. To play out the video, an area of the meshes must be selected. Preferably this is achieved using a virtual camera whose position is moveable around the meshes. To the observer, as the camera passes through the foreground mesh towards the next mesh, the viewing experience will be one of passing into the depth of the film, behind objects or image areas that are in the innermost, foreground layer. Thus, the illusion of three dimensions is created.

Preferably objects or image areas are selected and assigned to a given layer. For n layers, n - 1 masks are created, each corresponding to the image areas or objects selected for its layer. Preferably, areas or objects assigned to a layer can be tracked from frame to frame, for example by using motion detection techniques.

The meshes may be concentric nested meshes, for example spherical or hemispherical. The meshes may be flat. Preferably, the meshes are parallel and may be equi-spaced.

Embodiments of the invention will now be described, by way of example only, and with reference to the accompanying drawings, in which:

Figure 1 is a top view of an image to be divided into layers and showing the positioning of a pre-rendering virtual camera;

Figure 2 is a side view of the image of Figure 1;

Figure 3 is an equirectangular view of three layers of the image produced from Figures 1 and 2;

Figure 4 is an equirectangular view of a background layer of the image produced from Figures 1 and 2;

Figure 5 is an equirectangular view of a mid-ground layer of the image of Figures 1 and 2 together with a mask defined in an alpha channel for forming the layer;

Figure 6 is an equirectangular view of a foreground layer of the image of Figures 1 and 2 together with a mask defined in an alpha channel for forming the layer;

Figure 7 is a schematic view of a system and process embodying the invention;

Figure 8 shows a games controller which may be used to control the position of a virtual camera to determine the output video;

Figures 9a, 9b and 9c are side, front and internal views of nested spherical meshes used in mapping rendered layers from which a portion can be selected for output according to the position of a virtual camera;

Figures 10 and 11 show similar meshes suitable where the input images are flat; and

Figure 12 is a schematic view of the main components of a system embodying the invention.

The embodiment to be described allows stereoscopic illusion and interactivity to be achieved in a 3D computer graphic environment or with a video file, which may be flat video or immersive, for example with a 360° field of view. The source content, which may be computer generated or video, is converted into multiple layers which are synchronised and which are accompanied by an alpha channel which carries data regarding the layers and which may comprise masks for some or all of the layers. The layered content is played in a specific player, at which point each layer is played out at the same time. The final displayed output is generated in real time from the output data using an end user input that modifies the position, angle and zoom function of a virtual camera to give an illusion of stereoscopy, a sense of movement by the user through the layers of the content, and interactivity with the content.

The concept of layering and the generation of layers will first be described with reference to Figures 1 to 9. Figures 1 to 6 show how layering data may be generated in a 360° immersive environment. The embodiments allow a conventional image source to be processed to produce a stereoscopic or three dimensional effect. Where the input images are video, it is convenient to convert them first into CG format and then process them in the same manner as CG images. The image source is not relevant to the invention. Embodiments of the invention could be used with movie film, in which case the film is first converted to video using a telecine or similar device.

As a preliminary step the data on an image frame is divided into a number of layers n. The number of layers is greater than 1 and may be any defined number. However, the larger the number of layers, the higher the processing overhead. In the following example, n = 3. As a preprocessing step, the system operator assigns objects from the image to a layer. In Figures 1 to 6, the first layer is a foreground layer, the second or middle layer is a mid-ground layer and the third layer is a background layer. In practice, objects need only be assigned to n-1 layers as any remaining unassigned objects or image areas must then belong to the remaining layer. In this example, objects are assigned to the first and second layers. Object assignment is performed on a frame-by-frame basis, but objects may be tracked from frame to frame so that, for example, once a particular object is assigned to the foreground layer, it will be tagged as foreground and by using known tracking techniques, such as edge detection and motion vectors, the object may be detected automatically in the next frame.
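
As a non-authoritative sketch of the layer-assignment step just described (the array layout, function and variable names below are assumptions for illustration, not taken from the patent), the following Python fragment stores an operator's per-pixel layer assignment for one frame and derives the n - 1 explicit masks, leaving unassigned pixels in the background layer:

```python
import numpy as np

def build_layer_masks(assignment, n_layers):
    """assignment: HxW integer array holding the operator-selected layer index
    for every pixel of one frame (unassigned pixels keep the value n_layers - 1,
    the implicit background layer)."""
    # Only n - 1 explicit masks are needed; the background is whatever is left.
    return [assignment == layer for layer in range(n_layers - 1)]

# Example frame with n = 3 layers: foreground (0), mid-ground (1), background (2).
frame_assignment = np.full((4, 6), 2)      # everything starts as background
frame_assignment[1:3, 1:3] = 0             # an operator-selected foreground object
frame_assignment[0:2, 4:6] = 1             # an operator-selected mid-ground object
foreground_mask, midground_mask = build_layer_masks(frame_assignment, 3)
```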

Assignment of objects in a frame, or areas of an image, to a layer will be performed by an operator working on a display of the image and identifying areas or objects using a pointer or other device to select an image portion.

Thus, Figures 1 to 6 show, by way of example, the generation of three layers in a 360° environment. Figures 1 and 2 show top and side views, respectively, of the image that is being sub-divided into layers. It can be seen that the image includes a number of objects: two trees, a car, a flower, a house, an aeroplane and the sun. The layers are assembled by selecting the objects that form a given layer and generating a mask that makes the remainder of the image opaque. The first layer can then be generated by placing a virtual camera at the centre of the image and sweeping the camera through 360° to render the frame. It follows that objects only need to be assigned to (n - 1) layers as any unassigned objects will form part of the nth layer, in this case the background layer.

The layers are produced as equirectangular images and Figure 3 shows the equirectangular view of the three layers placed on top of each other. Figure 4 shows the equirectangular view of the background layer, showing those objects that have not been tagged as belonging to the first or second layers.

Figures 5 and 6 show, respectively, the equirectangular views of the mid-ground and foreground layers together with their masks which ensure that only the selected objects are seen by the virtual camera. Thus, in Figure 5 only the house has been assigned to the mid-ground. In Figure 6 the car and the flower have been assigned to the foreground layer. In both cases, the mask is the inverse of the image objects or areas.

In each of the layers shown, specific objects are identified and assigned to a layer. However, the operator could select an area of the image which does not contain a specific object or contains a portion of an object and assign that area to a layer.

The masks are defined in an alpha channel which, as will be described, is used to control rendering of the layers. The use of such a channel, or alpha compositing, is well known and combines an image with a background to create the appearance of partial transparency. It is usual to render image elements in separate passes and then combine the resulting multiple 2D images into a single, final image in a process called compositing. Compositing is used widely when combining computer rendered image elements with live footage.

To combine image elements correctly, an associated matte must be kept for each element. The matte contains the shape of the geometry being drawn, that is the shape of the element, making it possible to distinguish between parts of the image where the geometry was actually drawn, and parts of the image which are empty.

It is presently preferred that each layer has its own alpha channel, however a single alpha channel could be provided for all the layers.

In a 2D image element which stores a colour for each pixel, an additional value is stored in the alpha channel containing a value from 0 to 1. A value of 0 means that the pixel does not have any coverage information and is fully transparent, i.e. there was no colour contribution from any geometry because the geometry did not overlap this pixel. A value of 1 means that the pixel is fully opaque because the geometry completely overlapped the pixel.

As can be seen from Figures 5 and 6, the masks can be defined by setting the alpha channel value to 1 for all pixels not included in a layer. Thus, it can be seen that the layer comprises image areas which are pre-selected. These image areas may equate to objects, such as a tree, house etc, but are defined in the alpha channel as pixel references.
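
The following minimal Python sketch (illustrative only; the patent itself gives no code, and the names used are assumptions) writes a layer's mask into an alpha channel exactly as described above, with a value of 1 for every pixel not included in the layer:

```python
import numpy as np

def layer_mask_to_alpha(layer_selection):
    """layer_selection: HxW boolean array, True where the layer's selected
    objects or areas are. Returns the mask described in the text: an alpha
    channel set to 1 for every pixel NOT included in the layer (the inverse
    of the selection, as in Figures 5 and 6)."""
    return np.where(layer_selection, 0.0, 1.0).astype(np.float32)

selection = np.zeros((4, 6), dtype=bool)
selection[1:3, 1:3] = True                   # pixels covered by, say, the car
alpha_mask = layer_mask_to_alpha(selection)  # 0 over the car, 1 everywhere else
```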

The three layers are produced as three separate video streams that can be processed in parallel and then reassembled to provide a single layered image as shown in Figure 12 and as explained below. As part of the process the equirectangular images are mapped onto a mesh. Figures 9a to 9c show an example of the meshes that may be used. For a CG image, the preferred mesh is a hemisphere or sphere but it will be noted that the diameter of the mesh is different for each of the layers. It will be seen that the foreground layer is mapped onto the smallest diameter mesh followed by the mid-ground layer and then the background layer. Thus, the meshes and the mapped images are nested. The nested meshes are preferably concentric but this is not essential. They are also preferably equi-spaced, but this is also not essential. Various special effects may be created by varying the relative spacing and positioning of the meshes. By placing a virtual camera at the centre of the nested meshes, and moving that camera in the X, Y and Z directions, the effect of passing through the layers can be achieved. Thus, with the camera located at the centre, as shown in the figure, the three layers will be seen, but as the camera moves along the X or Y axis, the foreground layer appears to zoom until the virtual camera intersects the first mesh, whereupon the foreground layer is no longer in view and the camera sees "behind" the foreground layer to the mid-ground and the background. Similarly the camera can further pass through the mid-ground layer so that only the background layer is visible. As the camera passes through each layer, objects in other layers become visible, giving the viewer the impression of moving through the depth of the image.
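
A simplified illustration of the nested-mesh behaviour just described, under the assumption that each layer's mesh is a concentric sphere of known radius: once the virtual camera's distance from the common centre exceeds a mesh's radius, that layer is behind the camera and drops out of view. This is a sketch of the geometry only, not the patent's renderer:

```python
import numpy as np

def visible_layers(camera_pos, mesh_radii):
    """camera_pos: (x, y, z) of the virtual camera relative to the common centre.
    mesh_radii: radii of the nested meshes, ordered foreground (smallest) to
    background (largest). Returns indices of the layers still in front of the camera."""
    distance = np.linalg.norm(camera_pos)
    return [i for i, radius in enumerate(mesh_radii) if radius > distance]

# At the centre all three layers are seen ...
print(visible_layers((0.0, 0.0, 0.0), [1.0, 2.0, 3.0]))   # [0, 1, 2]
# ... after passing through the foreground mesh only mid-ground and background remain.
print(visible_layers((1.5, 0.0, 0.0), [1.0, 2.0, 3.0]))   # [1, 2]
```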

The effect of zooming can be achieved by manipulation of the relative mesh sizes between layers. Thus, if the mesh size for the foreground layer were reduced, the effect would be that objects in that layer would appear larger relative to objects in other layers.

In the playout of video, the user has control of the positioning of the virtual camera which enables movement of the camera around the meshes and between meshes, resulting in a sequence of images that give the illusion of a viewing point that moves around the image and into the depth of the image behind the n-1 layers in front of the background layer.

Figure 7 shows an embodiment of the invention which incorporates the layer information into a video processing process. The embodiment is based on the system disclosed in our earlier applications GB 0718015 and GB 0723538, the content of which is hereby incorporated by reference. The embodiment described is based on the starting point being computer generated data. However, as mentioned above, it could be flat video or immersive video or have originated on another medium, such as film. As shown in Figure 7, in an embodiment using computer-generated images, a 3D world is created in software at 10 and pre-rendered at 20 using particular parameters. A selected portion of the 3D world is displayed under the control of user defined pan, tilt and zoom parameters input into the system.

The process is able to match the final frame rate displayed at 50 to the frame rate created at 10.

The 3D world 10 may be created using any known 3D graphic design software. The 3D world 10 is composed of multiple elements to create the illusion of volume and reality. Each object is composed of a plurality of meshes, textures, lights and movements. Then, a virtual camera is located in that environment.

The virtual camera parameters (focal length, aperture, ratio, projection, movement) are then set up and sent to the renderer 20. The renderer 20 has parameters set up, such as resolution, frame rate and number of frames, and the type of rendering algorithm. Any suitable algorithm may be used. Known algorithms include Phong, Gouraud, Ray Tracing and Radiosity. For the avoidance of doubt, the term rendering applies to the processing of computer generated images. The renderer operates on either CG images or images acquired from other sources such as a video camera. In this embodiment a radiosity renderer is preferred.
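
Purely as an illustration of the kind of camera and renderer parameters listed above, the values and field names below are assumptions rather than a defined format:

```python
# Hypothetical set-up values; the text names the parameters but not their values.
camera_params = {
    "focal_length_mm": 20,
    "aperture": 2.8,
    "ratio": 2.0,                    # equirectangular output is 2:1
    "projection": "equirectangular",
    "movement": "static",
}
renderer_params = {
    "resolution": (4096, 2048),
    "frame_rate": 25,
    "num_frames": 250,
    "algorithm": "radiosity",        # the embodiment's preferred renderer
}
```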

In our earlier application, the renderer 20 rendered a single video file. In this embodiment, the renderer 20 renders the elements of the 3D world separately under the control of the alpha channel to obtain the different layers. Once the layers are generated, they are imported into converter 30 where they are combined to form a single synchronised file. This file embeds metadata such as the number of layers, distances between layers, virtual camera position, distance between camera and layers, the area that is displayed, camera aspect ratio and other parameters. The multi-layered data that is output from the converter is played out by the player 40 under the control of a user.
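
The exact container format of the synchronised file is not specified, but the embedded metadata might be represented along the following lines (the field names and values are illustrative assumptions only):

```python
import json

# Hypothetical field names for the metadata listed in the description.
layered_file_metadata = {
    "num_layers": 3,
    "layer_spacing": [1.0, 1.0],            # distances between successive layers
    "camera_position": [0.0, 0.0, 0.0],     # initial virtual camera position
    "camera_to_layer_distance": [1.0, 2.0, 3.0],
    "display_area": {"width": 1920, "height": 1080},
    "camera_aspect_ratio": 16 / 9,
}
print(json.dumps(layered_file_metadata, indent=2))
```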

The main parameter common to the camera and the renderer is the view projection, which is set to equirectangular with an image ratio of 2:1. This parameter may be changed according to user preference.
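
For context, a standard equirectangular mapping is sketched below; it shows why the 2:1 ratio is used, with 360° of longitude across the image width and 180° of latitude down its height. This is textbook projection geometry rather than code from the patent, and the function name is an assumption:

```python
import math

def equirectangular_to_direction(u, v, width, height):
    """(u, v): pixel coordinates in a width x height (2:1) equirectangular image.
    Returns the unit viewing direction on the sphere for that pixel."""
    lon = (u / width) * 2.0 * math.pi - math.pi        # -180 .. +180 degrees
    lat = math.pi / 2.0 - (v / height) * math.pi       # +90 .. -90 degrees
    x = math.cos(lat) * math.sin(lon)
    y = math.sin(lat)
    z = math.cos(lat) * math.cos(lon)
    return (x, y, z)

# The centre of a 2048 x 1024 image looks straight along +z.
print(equirectangular_to_direction(1024, 512, 2048, 1024))   # (0.0, 0.0, 1.0)
```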

The image sequence is then rendered and saved as an image file sequence or a video file for each layer. The resulting file or files are then converted into a texture file or files. This is a pre-rendering step.

The texture file is mapped onto a mesh in the form of a 3D sphere (normal orientation = inside / minimum number of vertices = 100) using texture mapping parameters (global illumination = 100%). A virtual camera is then located in the centre of that sphere. The mapping onto the mesh is the key step that saves computation time. This is a rendering step. The virtual camera parameters can be modified in real time. For example, the focal length can be very short, providing a wide angle view. The mesh, whether flat or spherical, the texture and the virtual camera are combined in software, giving the end user control of the pan, tilt and zoom functions of the virtual camera.
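
A minimal sketch of such a sphere mesh, with inward-facing normals and equirectangular texture coordinates, is given below. Triangle index generation is omitted, and the function and parameter names are assumptions rather than the patent's tooling:

```python
import math

def make_inward_sphere(radius, rings=10, segments=20):
    """Generates vertices, equirectangular UVs and inward-pointing normals for a
    UV sphere; (rings + 1) * (segments + 1) vertices, comfortably above the
    minimum of 100 mentioned in the text. Face/index generation is omitted."""
    vertices, uvs, normals = [], [], []
    for r in range(rings + 1):
        lat = math.pi / 2.0 - math.pi * r / rings
        for s in range(segments + 1):
            lon = 2.0 * math.pi * s / segments - math.pi
            x = radius * math.cos(lat) * math.sin(lon)
            y = radius * math.sin(lat)
            z = radius * math.cos(lat) * math.cos(lon)
            vertices.append((x, y, z))
            uvs.append((s / segments, r / rings))                      # equirectangular UV
            normals.append((-x / radius, -y / radius, -z / radius))    # normals point inward
    return vertices, uvs, normals

verts, uvs, norms = make_inward_sphere(1.0)
print(len(verts))   # 231 vertices for the default tessellation
```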

This combination has a 3D frame rate (rendered in real-time). The texture has a 2D frame rate (pre-rendered).

Figure 8 shows a suitable control device, in this case a games controller for a Sony (RTM) PS3 (RTM) games console. The controller may be configured such that, for example, the left joystick controls pan and tilt of the virtual camera, the right joystick controls Y and X positioning of the virtual camera and the top two buttons control zoom and Z axis position in the 3D real-time world. The zoom control may also control focus modification applied to the virtual camera while it is moving. The 2D frame rate continues throughout the sequence. The 3D frame rate continues until the end user stops the application. The software application can embed normal movie player functions such as pause, rewind, forward and stop. These functions act on the 2D frame rate.
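
An illustrative mapping from controller input to virtual camera parameters, in the spirit of the configuration described above; the function and the camera representation are assumptions, and no particular console API is implied:

```python
def apply_controller(camera, left_stick, right_stick, buttons, dt):
    """camera: dict of virtual camera parameters; left_stick: (pan, tilt) rates;
    right_stick: (x, y) translation rates; buttons: (zoom, z) rates; dt: frame time."""
    camera["pan"]  += left_stick[0] * dt
    camera["tilt"] += left_stick[1] * dt
    camera["x"]    += right_stick[0] * dt
    camera["y"]    += right_stick[1] * dt
    camera["zoom"] += buttons[0] * dt
    camera["z"]    += buttons[1] * dt
    return camera

camera = {"pan": 0.0, "tilt": 0.0, "x": 0.0, "y": 0.0, "z": 0.0, "zoom": 1.0}
camera = apply_controller(camera, (0.2, 0.0), (0.0, 0.1), (0.05, 0.0), dt=1 / 60)
```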

Thus a 3D view of the world is first created by any known 3D graphic techniques and rendered into a 2D equirectangular view comprising several layers. The equirectangular views are part of a sequence of 2D representations with a 2D frame rate, the sequence representing a view as a camera tracks through a scene. The camera is, of course, a virtual camera as the image is computer-generated. The key step in software is then taking the equirectangular image frames of the layers and mapping these onto a mesh, in this case a spherical mesh, in such a way that a virtual camera located at the centre of that mesh sees a non-distorted view of the image in any direction. Consequently, as shown in Figure 2, a camera with given pan, tilt and zoom parameters will view a portion of the scene represented on the mesh substantially without distortion. The user can alter the pan, tilt and zoom parameters of the virtual camera, in any direction, to select a portion of the scene to view. As the user also has control of the whole sequence of frames, the user is able to step forwards or backwards, in the manner of re-winding or playing, as a virtual video player.

Production of meshes is well known and also described in our earlier application GB 0723538 referred to above. Texture mapping techniques are used to map the equirectangular layer images onto the mesh. A portion of the spherical image may be selected for viewing substantially without distortion.

It will be appreciated that the virtual camera is a tool for selecting a defined portion of the image rendered onto the 3D mesh for viewing. Figures 9a - 9c show an example of a suitable mesh for a 360° immersive image. In this example, the arrangement comprises three nested, concentric meshes 200, 210, 220, each of which is spherical. The virtual camera 230 is shown at the centre but, as explained above, can be moved around by the user to select which parts of the images mapped onto the meshes will be displayed. The virtual camera may be moved to any point between the centre and the outermost mesh and may be zoomed, that is, the number of pixels viewed by the camera is reduced. It will be appreciated that as the camera moves radially outwards from the centre, it will pass first through the innermost mesh and then through the middle mesh as it travels towards the outermost mesh. To the viewer, who is controlling the virtual camera, the displayed video will appear to have depth as the point of view travels through the layers. The user can position the camera between layers and then view layers from behind. This control enables the illusion of stereoscopy and 3D to be created and provides interactivity.

In the examples of Figures 9a - 9c, the 3D world onto which the equirectangular image is mapped is a sphere. This is a convenient shape and particularly suited to mapping computer generated images. However, the choice of 3D shape is defined by the system provider. Where the input is immersive video, acquired using a camera and a fish eye lens, it is appropriate to use a mesh which approximates the fish eye lens. Thus, a 180° fish eye will use a hemispherical mesh. In practice, the mesh may be adjusted to compensate for optical distortions in the lens. The present invention is not limited to any particular mesh shape, although for a given input, the correct choice of mesh is key to outputting good distortion free images. Where the image source is video, the equirectangular images of the computer graphics example are replaced by mapped texture images.

Figures 10 and 11 show the use of flat, rather than spherical or hemispherical, meshes. Such an arrangement may be appropriate where the source is flat or covers less than 180 degrees. An example is where the images have been acquired from movie film. Thus, in Figure 11, the meshes are shown arranged parallel to one another and again comprise a foreground layer 320, a mid-ground layer 310 and a background layer 300. The virtual camera is positioned in front of the meshes and may move in the X, Y and Z directions under the user's control. The field of view of the camera is indicated by the chain dotted line 330 in Figures 10 and 11 and is a cone; thus, as the camera moves towards the meshes, the area displayed is reduced. Again, the camera position can move through the foreground and mid-ground meshes to give the impression of depth.
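
A small geometric sketch of the conical field of view against flat, parallel layers follows. It simply shows that the visible extent of each layer shrinks as the camera approaches, and that a layer the camera has passed is no longer displayed; it is an illustration under assumed units, not the patent's implementation:

```python
import math

def visible_extent(camera_z, layer_z, fov_deg=60.0):
    """Width of the region of a flat layer at depth layer_z that falls inside the
    view cone of a camera at depth camera_z looking along +z. A layer the camera
    has already passed through contributes nothing."""
    distance = layer_z - camera_z
    if distance <= 0:
        return 0.0
    return 2.0 * distance * math.tan(math.radians(fov_deg) / 2.0)

for cam_z in (0.0, 1.5, 2.5):     # camera moving towards layers at z = 1, 2, 3
    print([round(visible_extent(cam_z, layer_z), 2) for layer_z in (1.0, 2.0, 3.0)])
```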

In this example, the meshes are all the same size, are equi-spaced and parallel. However, none of these properties is essential.

Figure 12 shows, schematically, the manner in which the data may be processed. The image source is shown at 400. This may be computer generated, flat or immersive video. It may be captured by a camera and fish eye lens system. The layers are assigned and the data for the alpha channel is derived at 410, and the streams of image data are transmitted to a converter which operates on data from the source, which may be in a variety of formats. The converter may include any suitable digital video codec, but it is presently preferred to use the systems described in our co-pending applications GB 0709711 and GB 0718015.

Once the multiple streams have been processed by the Codec 420 they are converted into texture mapping data by converter 430.

The texture files are mapped onto a 3D mesh, for example with normal orientation = inside / minimum number of vertices = 100, and using texture mapping parameters, for example global illumination = 100%. For playback, a virtual camera is then located in the centre of that mesh, where the mesh is spherical or hemispherical, or in front of the mesh where it is flat, as shown in Figures 9 and 10. The process of playing a sequence of video frames, as provided by the pre-rendering process, is computationally simple and may be undertaken by a standard graphics card on a PC. As described above, to play each frame of a sequence of images pre-rendered according to the pre-rendering arrangement, a virtual camera is defined and arranged in software to view the pre-rendered scene on the image mesh. User definable parameters, such as pan, tilt and zoom, are received from, for example, a games console controller, at an input and applied to the virtual camera, so that an appropriate selection of the pixels from the image mesh is made and represented on a 2D screen. The selection of the pixels does not require computationally intensive processes such as interpolation because the pre-rendering process has ensured that the pixel arrangement, as transformed onto the mesh, is such that a simple selection of a portion of the pixels in any direction (pan or tilt) or of any size (zoom) is already appropriate for display on a 2D display. The selection merely involves selecting the appropriate part of the image mapped onto the mesh. This process can be repeated for each frame of an image sequence, thereby creating a video player.

Thus, the image player takes data mapped on the image mesh and transforms this to a 2D image. This step is computationally simple as it only involves taking the pixels mapped on the mesh, and the pan, tilt and zoom parameters input by the user, to select the pixels to be presented in a 2D image.
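
As a very simplified illustration of this playback step (assumed names, with a flat equirectangular texture standing in for the mapped mesh), pan and tilt pick the centre of a window in the pre-rendered image, zoom sets its size, and the output frame is produced by plain pixel selection with no interpolation:

```python
import numpy as np

def select_view(texture, pan_deg, tilt_deg, zoom, out_w=320, out_h=180):
    """texture: HxWx3 equirectangular frame; pan in [-180, 180], tilt in [-90, 90];
    larger zoom selects fewer source pixels. Returns the selected window."""
    h, w = texture.shape[:2]
    cx = int((pan_deg + 180.0) / 360.0 * w)          # pan picks the column centre
    cy = int((90.0 - tilt_deg) / 180.0 * h)          # tilt picks the row centre
    half_w = int(out_w / (2 * zoom))
    half_h = int(out_h / (2 * zoom))
    cols = np.arange(cx - half_w, cx + half_w) % w   # wrap around in pan
    rows = np.clip(np.arange(cy - half_h, cy + half_h), 0, h - 1)
    return texture[np.ix_(rows, cols)]               # plain selection, no interpolation

frame = select_view(np.zeros((1024, 2048, 3), dtype=np.uint8),
                    pan_deg=30.0, tilt_deg=10.0, zoom=2.0)
print(frame.shape)   # (90, 160, 3)
```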

The virtual camera parameters can be modified in real time. For example, the focal length can be very short, providing a wide angle view. The sphere or other mesh shape, the texture and the virtual camera are combined in software which the end user controls by adjusting the pan, tilt and zoom functions of the virtual camera.

Thus, the embodiment described enables a computer generated movie, for example, or a conventionally filmed movie converted to CG format, or video acquired from another source, such as an immersive camera, to be divided into a number of layers and for a user to be able to navigate through those layers, giving an impression of stereoscopy, depth and three dimensions. This effect may be utilised in a wide range of environments, for example in the computer games industry, which uses the technique to enable users to explore actual movie scenes which have been processed using an embodiment of the invention to form the images into a number of layers. In that case, a conventional games controller may be used to navigate around the images and through the layers. It will be appreciated that although the generation of layers, including the assignment of objects to layers, is performed frame-by-frame on a non-real-time basis, the play out of the layered video, and the input of user information, may be performed in real time with a frame rate equal to the frame rate of the original image source.

Claims

1. A method of processing a stream of images for display, comprising:
acquiring a stream of images from an image source;
dividing the images into a plurality of layers of image data to form a plurality of streams of image layer data;
mapping the images of each image layer data stream onto a respective mesh to provide a representation of each layer, the meshes being spaced apart from each other;
selecting a portion of the representations of the layers mapped onto the meshes, the selected portions comprising image data from at least one layer; and
outputting the selected portion as a 2-dimensional stream of images.
2. A method according to claim 1, wherein the selection of a portion of the representations is performed by a virtual camera moveable with respect to the meshes.
3. A method according to claim 2, wherein the virtual camera is moveable through at least some of the plurality of meshes.
4. A method according to any preceding claim, wherein the meshes are parallel to one another.
5. A method according to any preceding claim, wherein the meshes are flat.
6. A method according to any of claims 1 to 4, wherein the meshes are three dimensional and nested.
7. A method according to claim 1 or 6, wherein the meshes are spheres.
8. A method according to claim 1 or 6, wherein the meshes are hemispheres.
9. A method according to any of claims 6, 7 or 8, wherein the plurality of nested meshes are concentric.
10. A method according to any of claims 1 to 9, wherein the dividing of image data into a plurality of layers comprises assigning objects in an image or areas of an image to a layer on a frame-by-frame basis.
11. A method according to claim 10, wherein the dividing of image data into a plurality of layers comprises tracking the movement of assigned objects or image areas from one frame to the next.
12. A method according to claim 10 or 11, comprising forming a mask for each of (n - 1) layers where n is the number of layers, the mask for a given layer corresponding to the objects or areas of an image assigned to the layer.
13. A method according to any preceding claim, wherein the acquisition of a stream of image data comprises converting the stream of image data into a format suitable for processing.
14. A method according to claim 12, wherein layer data is held in a data channel associated with the layers.
15. A method according to claim 14, wherein the associated data carries further data relating to the layers.
16. Apparatus for processing a stream of images for display, comprising: means for acquiring a stream of image data from an image source;
means for dividing the images into a plurality of layers of image data to form a plurality of streams of image layer data;
mapping means for mapping the images of each stream of image layer data onto a respective mesh to provide a representation of each layer, the meshes for the plurality of layers being spaced apart from each other;
selecting means for selecting a portion of the representations of the layers mapped onto the meshes, the selected portions comprising image data from at least one layer; and
means for outputting the selected portion as a 2-dimensional stream of images.
17. Apparatus according to claim 16, wherein the selecting means comprises a virtual camera movable with respect to the meshes.
18. Apparatus according to claim 16 or 17, wherein the virtual camera is moveable through at least some of the plurality of meshes.
19. Apparatus according to claim 16, 17 or 18, wherein the meshes are arranged parallel to one another.
20. Apparatus according to any of claims 16 to 19, wherein the meshes are flat.
21. Apparatus according to any of claims 16 to 19, wherein the meshes are three dimensional and nested.
22. Apparatus according to claim 21, wherein the meshes are spherical.
23. Apparatus according to claim 21, wherein the meshes are hemispherical.
24. Apparatus according to any of claims 21 to 23, wherein the nested meshes are concentric.
25. Apparatus according to any of claims 16 to 24, wherein the means for dividing image data into a plurality of layers comprises means for assigning objects in an image, or areas of an image, to a layer on a frame-by-frame basis.
26. Apparatus according to claim 25, wherein the means for dividing of image data into a plurality of layers comprises means for tracking the movement of objects or image areas from frame to frame.
27. Apparatus according to claim 25 or 26, wherein the means for dividing image data into a plurality of layers comprises means for forming a mask for each of (n - 1) layers, where n is the number of layers, the mask for a given layer corresponding to the objects or areas of an image comprising the layer.
28. Apparatus according to any of claims 16 to 27, wherein the means for acquiring a stream of image data comprises means for converting the stream of image data into a format suitable for processing.
29. Apparatus according to claim 25, wherein the means for dividing into layers comprises means for forming a data channel holding layer related data, the data channel being associated with the layers.
30. A video player for playing the two-dimensional images selected from layer image data mapped onto a plurality of spaced apart meshes by the method of any of claims 1 to 15, comprising an input device for real-time input of a control to determine the portion of the three-dimensional representations to be selected, wherein the control selects image data from one or more of the plurality of layers; and
means for outputting a stream of selected image data for display.
31. A video player according to claim 30, wherein the control controls the position of a virtual camera moveable with respect to the meshes.
32. A video player according to claim 31, wherein the control further controls other degrees of freedom of the virtual camera and zoom.
33. A method of processing a stream of images for display, substantially as herein described with reference to Figures 1 to 12 of the accompanying drawings.
34. Apparatus for processing a stream of images for display, substantially as herein described with reference to Figures 1 to 12 of the accompanying drawings.
35. A video player, substantially as herein described with reference to the accompanying drawings.
PCT/IB2008/001375, priority date 2008-03-07, filing date 2008-03-07: Method and apparatus for image processing, WO2009109804A1 (en)

Priority Applications (1)

Application Number: PCT/IB2008/001375 (WO2009109804A1)
Priority Date: 2008-03-07
Filing Date: 2008-03-07
Title: Method and apparatus for image processing

Publications (1)

Publication Number: WO2009109804A1 (en)

Family ID: 40091889

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IB2008/001375 WO2009109804A1 (en) 2008-03-07 2008-03-07 Method and apparatus for image processing

Country Status (1)

Country Link
WO (1) WO2009109804A1 (en)

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6266068B1 (en) * 1998-03-13 2001-07-24 Compaq Computer Corporation Multi-layer image-based rendering for video synthesis
EP1347656A1 (en) * 2000-12-15 2003-09-24 Sony Corporation Image processor, image signal generating method, information recording medium, and image processing program

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
HORRY Y ET AL: "TOUR INTO THE PICTURE: USING A SPIDERY MESH INTERFACE TO MAKE ANIMATION FROM A SINGLE IMAGE", COMPUTER GRAPHICS PROCEEDINGS. SIGGRAPH 97. LOS ANGELES, AUG. 3 - 8, 1997; [COMPUTER GRAPHICS PROCEEDINGS. SIGGRAPH], READING, ADDISON WESLEY, US, 3 August 1997 (1997-08-03), pages 225 - 232, XP000765820, ISBN: 978-0-201-32220-0 *
KANG S B: "A Survey of Image-based Rendering Techniques", INTERNET CITATION, August 1997 (1997-08-01), XP002508149, Retrieved from the Internet <URL:http://www.hpl.hp.com/techreports/Compaq-DEC/CRL-97-4.html> [retrieved on 20081211] *
POSE R ED - CALDER P ET AL: "Steerable interactive television: virtual reality technology changes user interfaces of viewers and of program producers", USER INTERFACE CONFERENCE, 2001. AUIC 2001. PROCEEDINGS. SECOND AUSTRALASIAN, GOLD COAST, QLD., AUSTRALIA 29 JAN.-1 FEB. 2001, LOS ALAMITOS, CA, USA, IEEE COMPUT. SOC, US, 29 January 2001 (2001-01-29), pages 77 - 84, XP010534525, ISBN: 978-0-7695-0969-3 *
REHG J M ET AL: "Video Editing Using Figure Tracking and Image-Based Rendering", INTERNET CITATION, December 1999 (1999-12-01), XP002261649, Retrieved from the Internet <URL:http://www.hpl.hp.com/techreports/Compaq-DEC/CRL-99-8.pdf> [retrieved on 20031114] *
SHADE J ET AL: "Layered depth images", COMPUTER GRAPHICS. SIGGRAPH 98 CONFERENCE PROCEEDINGS. ORLANDO, FL, JULY 19 - 24, 1998; [COMPUTER GRAPHICS PROCEEDINGS. SIGGRAPH], NEW YORK, NY : ACM, US, 19 July 1998 (1998-07-19), pages 231 - 242, XP002270434, ISBN: 978-0-89791-999-9 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2413104A1 (en) * 2010-07-30 2012-02-01 Pantech Co., Ltd. Apparatus and method for providing road view
CN102420936A (en) * 2010-07-30 2012-04-18 株式会社泛泰 Apparatus and method for providing road view
CN102420936B (en) 2010-07-30 2014-10-22 株式会社泛泰 Apparatus and method for providing a view of the road

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 08751068

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase in:

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 08751068

Country of ref document: EP

Kind code of ref document: A1