WO2009034519A1 - Generation of a signal - Google Patents

Generation of a signal

Info

Publication number
WO2009034519A1
Authority
WO
WIPO (PCT)
Prior art keywords
depth
related information
signal
image data
metadata
Prior art date
Application number
PCT/IB2008/053629
Other languages
French (fr)
Inventor
Bart G. B. Barenbrug
Original Assignee
Koninklijke Philips Electronics N.V.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Koninklijke Philips Electronics N.V. filed Critical Koninklijke Philips Electronics N.V.
Publication of WO2009034519A1 publication Critical patent/WO2009034519A1/en

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00 Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/20 Image signal generators
    • H04N13/261 Image signal generators with monoscopic-to-stereoscopic image conversion
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00 Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/10 Processing, recording or transmission of stereoscopic or multi-view image signals
    • H04N13/106 Processing image signals
    • H04N13/128 Adjusting depth or disparity
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00 Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/10 Processing, recording or transmission of stereoscopic or multi-view image signals
    • H04N13/194 Transmission of image signals

Abstract

A method of generating a signal comprises receiving depth-related information for image data, receiving metadata relating to at least one mapping function used in the generation of the depth-related information, and multiplexing the depth-related information and metadata into a signal. The signal may also include image data which may define one or more image frames, which may be a pair of image frames. The depth-related information for the image data could comprise a depth map or a disparity map. The metadata could comprise the original mapping function itself used in the generation of the depth-related information, or information derived from that function such as an inverse of the mapping function or coefficients used in that mapping function.

Description

Generation of a signal
FIELD OF THE INVENTION
This invention relates to a method of and system for generating a signal, and to the signal itself, and to a method of and device for receiving the signal. In one embodiment, the invention provides the inclusion of metadata to link stereo images and depth.
BACKGROUND OF THE INVENTION
When a display device such as a television screen or computer monitor displays visual material, image data is received which is rendered by the display device. There are many different formats for the image data. The specific format being used depends upon the communication chain and capabilities of the display device. A very common format used, for example, in many current digital television broadcast systems is MPEG-2, which defines the structure of the image data transmitted and the structure of any additional data carried in the signal that includes the image data.
Three-dimensional (3D) display devices add a third dimension (depth) to the viewing experience by providing each of the viewer's eyes with a different view of the scene that is being watched. Many 3D display devices use stereo input, meaning that two different but related views are provided. This is used, for example, in standard 3D cinema (where glasses are used to separate the left and right views for the viewer's eyes). Instead of, for example, 50 frames of image data per second, a stereo system provides 100 frames per second: 50 for the left eye and 50 for the right. Each frame of a pair comprises a slightly different view of the same scene, which the brain combines to create a three-dimensional image. As a result of the adoption of this technology in 3D cinemas, a lot of stereo content is available. Home cinema enthusiasts may also want to replicate the cinema experience at home and build or install stereo projection systems.
However, the use of glasses that are associated with stereo 3D systems is cumbersome for many applications, such as 3D signage and also more casual home 3DTV viewing. Glasses-free systems (also called auto-stereoscopic systems) often provide more than two views of the scene to provide freedom of movement of the viewer, and since the number of views varies, the representation that is often used in these applications is the image + depth format, where one image and its depth map provide the information required for rendering as many views as needed.
A problem with systems that provide depth-related information is that the structure of the depth-related information (which is additional to the image data) will be optimized for a particular end rendering system or device. For example, if a depth map is provided, then it may be designed with the average end system in mind: it may be assumed in the creation of the map that the end system provides 6 different views (the user will only ever see two of the six views, depending upon their position). The choice of 6 views may be based upon what is perceived to be the most likely (or average) configuration of the end system. However, the depth-related information supplied with the image data may not be appropriate for the rendering that will actually occur at the display device.
SUMMARY OF THE INVENTION
It is therefore an object of the invention to improve upon the known art. According to a first aspect of the invention, there is provided a method of generating a signal, comprising receiving depth-related information for image data, receiving metadata relating to at least one mapping function used in the generation of the depth-related information, and multiplexing the depth-related information and metadata into a signal.
According to a second aspect of the invention, there is provided a system for generating a signal, comprising a receiver arranged to receive depth-related information for image data, and metadata relating to at least one mapping function used in the generation of the depth-related information, and a multiplexer arranged to multiplex the depth-related information and metadata into a signal.
According to a third aspect of the invention, there is provided a signal comprising depth-related information for image data, and metadata relating to at least one mapping function used in the generation of the depth-related information.
According to a fourth aspect of the invention, there is provided a method of receiving a signal, comprising receiving a signal comprising depth-related information for image data, and metadata relating to at least one mapping function used in the generation of the depth-related information, de-multiplexing the depth-related information and metadata from the signal, and processing the depth-related information for the image data according to the metadata.
According to a fifth aspect of the invention, there is provided a device for receiving a signal, comprising a receiver arranged to receive a signal comprising depth-related information for image data, and metadata relating to at least one mapping function used in the generation of the depth-related information, a de-multiplexer arranged to de-multiplex the depth-related information and metadata from the signal, and a processor arranged to process the depth-related information for the image data according to the metadata.
By virtue of the invention, it is possible to provide, in addition to the depth-related information, metadata that relates to the generation of the depth-related information. In the simplest embodiment, the metadata may be the function that was used to create the depth-related information. The very significant advantage of the invention is that the provision of the metadata in the signal allows the end receiver to work back from the depth-related information, using the metadata, to the underlying data that was used to create the depth-related information. This allows the end receiver to adjust the depth-related information or to derive different information, and thereby access data that is more appropriate to the end solution provided by the specific end receiver.
The principle of the invention is that the provision of the metadata allows the receiver to work backwards from the depth-related information to obtain data that could not be obtained otherwise from the signal without the metadata. In some cases this will allow an exact recreation of the data used to generate the depth-related information, and in other cases it will allow a close approximation of that data. The depth-related information could be, for example, a depth map, or a disparity map (from which depth can be calculated).
The signal may also include the image data, or this may be delivered by a different channel. In one embodiment of the invention, the signal comprises a single frame, a depth map and the metadata, which might be the original function (such as an appropriate look-up table) that was used to derive the depth map, or coefficients used in a function. Alternatives to a look-up table that could be used include a piecewise linear model, a spline-based curve with variable control points, and a transfer function based upon polynomial approximation. Alternatively, the inverse of the original function could be provided as metadata sent in the signal, or again the coefficients used in the inverse function could be included.
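By way of illustration, one hypothetical way such metadata could be structured; the field names below are invented for this sketch and are not defined by the signal format:

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical container for the mapping metadata described above; the
# signal format does not prescribe one, so all names here are illustrative.
@dataclass
class MappingMetadata:
    kind: str                                    # "lut", "linear", "polynomial", "spline", ...
    is_inverse: bool                             # True if this describes the depth -> disparity direction
    coefficients: Optional[List[float]] = None   # e.g. coefficients of a (inverse) function
    lut: Optional[List[float]] = None            # e.g. 256 disparity values, one per depth code

# A linear mapping could be carried as two coefficients, a look-up table as
# 256 entries, a spline as its control points, and so on.
linear_meta = MappingMetadata(kind="linear", is_inverse=False,
                              coefficients=[20.0, 255.0 / 40.0])
```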
The depth map may have been derived from a disparity map between two images, or may have been generated from raw depth data, for example as acquired by a laser range-finder. The metadata allows the end receiver to work back from the depth map to, for example, the disparity map. This then means that the receiver has acquired further data about the image data that was not derivable from the image data and the depth-related information alone. This could then be used, for example, to calculate a new depth map that is more appropriate to the end solution being delivered at the receiver location. It could also be used to actually adjust the image data.
The invention also provides a suitable solution for supplying a 3D signal that will work irrespective of whether the end system is a stereoscopic system (which requires special glasses) or an auto-stereoscopic system (which does not require the user to wear special glasses). This makes it possible to share and re-use content in all application domains, since both stereo and multi-view systems will be options for display at the end of the various video chains. In this respect, the signal structure of stereo+depth will become a prominent transmission/storage format. In stereo+depth, both a stereo pair and a depth map are coded. The stereo pair can directly be used for stereo displays, and the depth map along with one of the images from the stereo pair can be used as the image+depth information required by auto-stereoscopic displays.
In this context, it is highly advantageous to encode, along with the stereo+depth representation, metadata that encodes how the depth signal relates to the stereo pair. The depth signal can be thought of as encoding the disparity between the left and right images in the stereo pair, but often the exact link between the actual disparity between this pair of left and right images and the values in the depth map is lost. This is valuable information, since, if it is known how the left and right images relate (a relation given by the disparity between them), this can be used in rendering to interpolate between left and right, rather than extrapolate from one of the images only. Such a rendering technique is described in [Ralph Braspenning and Marc Op de Beeck, Efficient View Synthesis from Uncalibrated Stereo, in Proceedings of the SPIE, 6055, 2006].
For example, a depth-from-stereo estimator computes the disparity between the two images of a stereo pair. Such disparities can, for example, lie in a range of 40 pixels centered around zero disparity (this corresponds to part of the scene being imaged in front of the display plane, and part behind it). Since a depth map usually holds grey values between 0 and 255, such a disparity value is then often translated to a depth value. In this example such a mapping function could be given by m = (d+20)*255/40 (for d the actual disparity and m the value in the depth map). Given only the depth map, the relation between the original left and right images is lost, but if the above-mentioned formula is known (for example by encoding, as the metadata, that there is a linear relationship with coefficients 20 and 255/40), the relation between d and m can be inverted, allowing the actual disparity values to be computed from the depth map values. Non-linear functions can also be covered by the metadata.
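A minimal sketch of this linear mapping and its inversion at the receiver (the coefficient names are chosen here for clarity, not taken from the disclosure):

```python
# The linear mapping from the example above: m = (d + 20) * 255/40, carried
# in the metadata as the offset 20 and the scale 255/40.
OFFSET, SCALE = 20.0, 255.0 / 40.0

def disparity_to_depth(d: float) -> float:
    """Map a disparity d in [-20, +20] pixels to a depth-map value m in [0, 255]."""
    return (d + OFFSET) * SCALE

def depth_to_disparity(m: float) -> float:
    """Invert the mapping at the receiver: recover d from a depth-map value m."""
    return m / SCALE - OFFSET

assert depth_to_disparity(disparity_to_depth(-20.0)) == -20.0
assert disparity_to_depth(0.0) == 127.5   # zero disparity maps to mid-grey
```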
The relation between the depth map values and the actual disparities may not always be as simple as a linear relationship. For example, instead of encoding a scaled disparity, the distance from the camera could also be encoded in a depth map. The relation then takes the form of a 1/x relation, in which, amongst other factors, the baseline distance of the cameras plays a role. Even if disparity is encoded, it might be warped non-linearly to focus on a region of interest (see for example [Nick Holliman, Mapping Perceived Depth to Regions of Interest in Stereoscopic Images, in Stereoscopic Displays and Applications XV, 2004, available as http://www.comp.leeds.ac.uk/edemand/publications/hol04a.pdf]). In such a case, this warp should also be taken into account. A general representation for the relation is a table that lists the disparity corresponding to each of the 256 possible depth values, i.e. in the case of an 8-bit depth representation. As will be clear to the skilled person, more compact representations can be used if the shape of the function is known.
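For illustration, a sketch of this general table representation; the 1/x form and the constants k and conv below are hypothetical stand-ins for camera-dependent factors, not values from the disclosure:

```python
import numpy as np

# General-purpose representation of the mapping: one disparity value per
# possible 8-bit depth value. Here, purely as an illustration, the depth code
# stands for distance from the camera, so disparity follows a 1/x-style
# relation ('k' lumping together baseline and focal length, 'conv' a
# hypothetical convergence distance).
k, conv = 2000.0, 100.0
distance = 50.0 + np.arange(256)   # hypothetical distance per depth code
lut = k / distance - k / conv      # disparity per depth code

def depth_to_disparity_lut(m: int) -> float:
    """At the receiver, inverting the mapping is a single table lookup."""
    return float(lut[m])
```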
It is also possible that the exact relationship between the two images cannot be retrieved from the depth map and a small amount of metadata alone. This can then also be indicated in the metadata. An example is the application of a local contrast enhancement to the depth map, which modifies the depth map values differently in one part of the image than in other parts: here the mapping between depth value and disparity becomes dependent on the location within the image(s), and encoding this could potentially cost as much as encoding the disparity map directly in addition to the modified depth map. A similar example is the use of a depth expander on the depth data.
Stereo is a common source of image data (shooting live-action scenes in 3D is almost always done in stereo, since cameras that record depth maps directly do not yield the required quality), so estimation of depth from stereo will be a common source for the depth maps, and the mapping information is available in such a depth-from-stereo estimator. Another common source is computer-generated imagery, where the link between the depth maps (generated, for example, by reading the z-buffer of the graphics subsystem) and the images (recorded with two virtual cameras) is also known. So, in many cases the relationship is known at the authoring side, and storing/transmitting the metadata embodying this relationship, along with the images and depth-related information, enriches the representation and allows for better processing further down the chain. The mapping information represented in the metadata can change over time, for example with different mappings used for different shots: when in one shot disparity is mapped onto depth in a first manner, whereas in a second shot the disparity is mapped in a second, different manner, both mappings can be encoded in the metadata associated with the respective shots. In such a scenario, the metadata may be provided with a (presentation) timestamp in order to enable processing of the depth map at the receiver prior to presentation. Alternatively, the interval for which the metadata applies may be coded, and/or the location of the next relevant metadata entry in the signal may be indicated.
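As an illustration of the timestamped variant, a small sketch assuming a hypothetical per-shot entry format (timestamps and coefficients are invented):

```python
import bisect

# Hypothetical: one metadata entry per shot, keyed by presentation timestamp,
# so the receiver can pick the mapping in force for each frame it processes.
shot_metadata = [
    (0.0,  {"kind": "linear", "coefficients": [20.0, 255.0 / 40.0]}),
    (12.4, {"kind": "linear", "coefficients": [10.0, 255.0 / 20.0]}),
]

def mapping_for(pts: float) -> dict:
    """Return the metadata entry applicable at presentation timestamp pts."""
    i = bisect.bisect_right([t for t, _ in shot_metadata], pts) - 1
    return shot_metadata[max(i, 0)][1]

assert mapping_for(5.0)["coefficients"][0] == 20.0   # first shot
assert mapping_for(13.0)["coefficients"][0] == 10.0  # second shot
```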
BRIEF DESCRIPTION OF THE DRAWINGS
Embodiments of the present invention will now be described, by way of example only, with reference to the accompanying drawings, in which:
Fig. 1 is a schematic diagram of a broadcasting system,
Fig. 2 is a schematic diagram of a broadcasting station for use in the system of Fig. 1,
Fig. 3 is a diagram showing the relationship between different types of depth-related information,
Fig. 4 is a schematic diagram of a receiver for use in the system of Fig. 1,
Fig. 5 is a diagram showing the components of a broadcast signal, and
Fig. 6 is a diagram similar to Fig. 5, showing the components of a different broadcast signal.
DETAILED DESCRIPTION OF EMBODIMENTS
A broadcasting system 10 is shown in Fig. 1, which comprises a broadcasting station 12, a receiver 14 and a display device 16. A signal 18 is broadcast to multiple recipients, who are each provided with a suitable receiver 14. The signal 18 includes image data 20, shown schematically in the Figure as a sequence of frames. At the transmitting end of the broadcast chain, i.e. the station 12, the signal 18 is created. The image data 20 is formatted according to the scheme being used in the system and multiplexed into the signal 18 for broadcast.
The system 10 shown in Fig. 1 is one example of how a signal may be provided to an end user for rendering on their display device 16. Rather than a wireless system, as shown here, the communication between the station 12 and the receiver 14 may be via a wired connection, for example through a wide area network such as the Internet. The signal 18 may be packetized, for use in a protocol such as TCP/IP, or may be a continuous stream of data broadcast over a conventional broadcast channel. It is common in many transmission schemes, such as DVB-T (a scheme for providing digital television pictures terrestrially, i.e. rather than via satellite or cable), that space is provided in the signal for additional information at the broadcaster's discretion. The system of the present invention includes further data above and beyond the image data in the broadcast signal 18. The signal 18 can also be stored on a suitable storage medium. This could be a high-definition media storage format such as a Blu-ray Disc, which is capable of storing a transport stream and metadata. Fig. 2 shows an embodiment of the broadcasting station 12 in more detail.
This is an example of a system that would create the broadcast signal 18, here using stereo estimation (other embodiments, using e.g. depth cameras, are also possible, as long as there is a disparity/depth manipulation block, like the component 101, in the signal path).
A stereo estimator 100 receives image data in the form of a left image 110 and a right image 111. Typically these are generated from a scene by two cameras that are spaced a short distance apart to mimic the positioning of a user's eyes. The two images 110 and 111 could also have been generated in a software 3D modeling package, which is able to define a 3D object or scene and then, by use of "virtual" cameras, calculate a pair of 2D images that are again based upon slightly spaced viewpoints. This pair of images is used by a stereoscopic display device to generate the output that is perceived to be 3D. However, for an auto-stereoscopic display, it is typical to include some form of depth-related information.
In the embodiment of Fig. 2, the stereo estimator 100 distills disparity data 112 from the left image 110 and right image 111. A software or hardware process compares the two images 110 and 111 to detect parts of the images that match. Once a specific component in one image has been located in the other image, the shift of the detected component is calculated, which is the disparity. Various algorithms are known in the art for calculating disparity from a pair of related images. This disparity data 112 can be represented in a disparity map, which can be on a pixel-by-pixel basis for a specific image of the image pair. The output of the stereo estimator 100 is the original image data (images 110 and 111) and the disparity data 112. This is then fed to a disparity to depth conversion component 101. This component 101 generates depth-related information 114 from the images and the disparity data 112. This process possibly uses user input 113, which may, for example, set parameter ranges for the depth information. The output of the disparity to depth conversion unit 101 is the image data (the left image 110 and right image 111, simply relayed), plus both a depth signal 114 and a mapping signal 115, which have been created by the component 101. The depth signal 114 defines depth-related information for the image data, which may be a depth map relative to a single one of the two images 110 and 111. The mapping signal 115 is metadata relating to at least one mapping function used in the generation of the depth-related information 114. This mapping data 115 may simply be the actual function used to create the information 114, or may be the inverse of that function. The metadata 115 is provided for the purpose of allowing the ultimate end receiving device to work back to the data that was used to create the depth-related information 114. The metadata 115 could be something as simple as a look-up table that defines a mapping of disparity to depth. This allows the end receiver to obtain the disparity data 112 from the depth-related information 114, without any need to approximate or estimate any of the data.
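A minimal sketch of what the conversion component 101 might compute, assuming a linear mapping; the names d_min and d_max (the disparity range chosen, e.g. via the user input 113) and the metadata field names are hypothetical:

```python
import numpy as np

def disparity_to_depth_map(disparity: np.ndarray, d_min: float, d_max: float):
    """Map a per-pixel disparity map to an 8-bit depth map, and emit the
    mapping coefficients as metadata so the receiver can invert the mapping."""
    scale = 255.0 / (d_max - d_min)
    depth_map = np.clip((disparity - d_min) * scale, 0, 255).astype(np.uint8)
    metadata = {"kind": "linear", "offset": -d_min, "scale": scale}
    return depth_map, metadata
```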
The multiplexer/encoder 102 creates the signal 18 which can be broadcast. The multiplexer 102 provides a method of generating a signal, which comprises receiving the image data 110 and 111, receiving the depth-related information 114 for the image data, receiving the metadata 115 (which relates to at least one mapping function used in the generation of the depth-related information 114), and finally multiplexing the image data 110 and 111, depth-related information 114 and metadata 115 into a signal 18. The multiplexer can also operate by just multiplexing the depth-related information 114 and metadata 115 into the signal 18 without the image data being present. This supports a system where the image data reaches the end receiver via a different channel from the depth-related information 114 and metadata 115.
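For illustration only, a toy multiplexer, assuming a simple length-prefixed container rather than any standardized transport stream:

```python
import json
from typing import List, Optional
import numpy as np

def multiplex(depth_map: np.ndarray, metadata: dict,
              images: Optional[List[np.ndarray]] = None) -> bytes:
    """Pack the depth-related information and the mapping metadata (and,
    optionally, the image data) into one byte stream. A real broadcast
    system would use an MPEG-style transport stream instead."""
    header = json.dumps({"metadata": metadata,
                         "shape": list(depth_map.shape),
                         "has_images": images is not None}).encode("utf-8")
    payload = depth_map.tobytes()
    if images is not None:
        payload = payload + b"".join(im.tobytes() for im in images)
    return len(header).to_bytes(4, "big") + header + payload
```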
The example of Fig. 2 is a system that uses two images (left 110 and right 111) and generates a depth map 114 and associated metadata 115 for transmission to end-users. This system supports both stereoscopic and auto-stereoscopic display devices at the end receiver, as two images are present (for the stereoscopic display devices) and a depth map is present (needed by the auto-stereoscopic display systems). The invention is equally applicable in situations where only a single image (rather than a pair) is present in addition to the depth-related information and the metadata. There will always be situations where the receiving device wishes to render the image data in a way that is sub-optimal for the structure of the specific depth-related information, and the presence of the metadata allows the receiver to backtrack through the processing that led to the creation of the received depth-related information. This allows the depth-related information to be regenerated, perhaps using a modified function. For example, the original image(s) received may have depth-related information that defines depths from -20 to +20, when in fact the end receiver and attached display device have sufficient functionality to support a wider range of depths, say -40 to +40. Simply scaling the received depth map could distort the image and create artifacts in the display. In this case, without the metadata there is no way to work back, for example to the original disparity map or even the raw disparity data. However, with the metadata, it is possible for the device at the end of the transmission chain to correctly recalculate a depth map that accurately works with the range of depths supported by the end display.
It should be understood that the metadata, which is supplied with the image data and the depth-related information, could be structured in many different ways. How it is best structured depends primarily upon the type of mapping that has occurred to create the depth-related information. One simple possibility for the metadata is that it defines the inverse function of the mapping function that created the depth-related information, whether that mapping function was linear or non-linear. In this case the receiver at the end of the chain has (through the metadata) a simple way of processing the values in the depth-related information to obtain the data that was used to create that information. As has been discussed above, if a known and defined function is being used (perhaps a polynomial) then it is sufficient that the coefficients of the polynomial are transmitted as the metadata, as this again allows the end receiving device to effectively "undo" the calculations that led to the depth-related information.
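For the range example above, a sketch of the receiver-side step, assuming the linear metadata format from the earlier sketch ("offset" and "scale" are the hypothetical field names used there):

```python
import numpy as np

def remap_depth(depth_map: np.ndarray, metadata: dict,
                new_min: float, new_max: float) -> np.ndarray:
    """Work back from the received 8-bit depth map to the underlying
    disparities using the metadata, then apply a new mapping matched to
    the depth range the attached display actually supports."""
    # Invert the original linear mapping: d = m / scale - offset.
    d = depth_map.astype(np.float64) / metadata["scale"] - metadata["offset"]
    # Re-map the recovered disparities onto the display's wider range.
    new_scale = 255.0 / (new_max - new_min)
    return np.clip((d - new_min) * new_scale, 0, 255).astype(np.uint8)
```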
The term "depth-related information" also defines a wide range of possible structures, for example, a depth map or a disparity map (since it is known in the art how to calculate depth from disparity), or even raw disparity data. Fig. 3 illustrates the relationships between these different types of depth-related information. On the left hand side of the
Figure, a representation is shown that is applicable to the situation where image data is being captured by a camera that also has a range finding function. In this case, raw depth data is acquired and this is converted to a depth map that relates to the image captured. Each single image frame has a respective depth map, and this is used by an auto-stereoscopic display system to generate a 3D output. An auto-stereoscopic display system works by generating multiple images from the image data and the depth map, each slightly shifting the viewpoint according to a predetermined scheme.
On the right hand side of Fig. 3, a representation is shown that is applicable to the situation where the image data is made up of a pair of stereo images. This pair of images (or more likely a sequence of image pairs) will be sufficient to drive a conventional stereoscopic display that requires the user to wear special glasses to ensure that each eye only sees one image, and that the image seen by each eye is different from that seen by the other. In this case, in order to provide sufficient data that will allow an auto-stereoscopic display system to output a 3D image, further data needs to be generated. The two images are compared (in a known manner) to derive raw disparity data. This is then processed (again as is known to the skilled reader) to produce the depth map, which relates to one of the images.
Each of the arrows in Fig. 3 can be considered to be a mapping function, which may be linear or non-linear. In the broadcasting station 12 of Figs. 1 and 2, the metadata that is created for inclusion in the broadcast signal 18 is derived from these mapping functions. This means that the depth-related information in the signal 18 (such as the depth map of Fig. 3) can be inverted to travel back to the previous state of the data. In the case of the right hand side of Fig. 3, the metadata could comprise data about both of the functions that were used to derive the depth map.
The device for receiving the signal, the receiver 14, is shown in more detail in Fig. 4. This shows an example of a system that would receive the signal 18, again using stereo+depth by way of example, as the counterpart of Fig. 2. The components that make up the signal 18 need not be received from the same source. For example, the image data could be from a storage device such as a DVD, with the depth-related information and metadata being downloaded from the Internet.
The receiver 14 includes a de-multiplexer/decoder 400 receiving the transmitted (or stored) signal 18, and producing as output the embedded left image signal 411, right image signal 412, depth-related information 413, and metadata 414, which is the (inverse) mapping information. The output of the de-multiplexer is passed to an inverse mapping block 401, which is a processor for processing the depth-related information 413 for the image data according to the metadata 414. The processor 401 is arranged to re-derive, from the depth-related information 413 and the (inverse) mapping information 414, a new depth-related signal 415 (which may be substantially equal to the original disparity between the left and right images 411 and 412). The inverse mapping block 401 uses the metadata 414 to work back from the depth-related information 413 to the source data. As discussed above, this may be the original depth data or disparity data or a disparity map for the two images 411 and 412. In the embodiment of Fig. 4, the block 401 processes the depth-related information 413 for the image data 411 and 412, according to the metadata 414, to generate the new component 415. The output of the inverse mapping block, the processor 401, is passed to the display controller 402, which uses the images 411 and 412 and the new depth-related information 415 (and possibly the original data 413), for example to render views for a multi-view (auto-stereoscopic) display. One way in which the display controller 402 might operate is to use the left image 411 and depth map 413 for view generation, and the disparity map 415 along with image 412 to retrieve extra occlusion information. Since, in an auto-stereoscopic device, the normal procedure is to use a single image and a depth map to generate the multiple images that are displayed, the invention, in this embodiment, provides the advantage that both images 411 and 412 can be used to generate the multiple views of the auto-stereoscopic system.
The existence of the metadata 414 in the original signal 18 means that the receiver 14, at the inverse mapping block 401, can calculate the original disparity map between the two images 411 and 412. The display controller 402 can therefore acquire information from one of the images that is not present in the other image. The two images 411 and 412 are images of the same scene, but with the viewpoint slightly shifted, and each will have image data not present in the other. Particularly in relation to objects that are near to the viewpoint, the shifting of the viewpoint will provide image data that is effectively hidden from the other viewpoint. The disparity map between the two images (derived from the depth-related information and the metadata) can be used to locate objects in each image and therefore derive additional image data to interpolate a new viewpoint.
In one embodiment, the processor 401 is arranged to generate disparity data from the depth-related information, and to generate further image data from the received image data and the disparity data. The received image data comprises a pair of images, and the generated further image data comprises an interpolated image between the pair of images, as sketched below.
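A minimal sketch of such interpolation, assuming a rectified image pair and a per-pixel disparity map; the hole-filling and occlusion handling that a real renderer needs (e.g. using the second image of the pair) are omitted:

```python
import numpy as np

def interpolate_view(left: np.ndarray, disparity: np.ndarray,
                     alpha: float) -> np.ndarray:
    """Shift each pixel of the left image by a fraction alpha of its
    disparity: alpha = 0 reproduces the left view, alpha = 1 approximates
    the right view, and intermediate values interpolate between the pair."""
    h, w = disparity.shape
    out = np.zeros_like(left)
    xs = np.arange(w)
    for y in range(h):
        x_new = np.clip(np.round(xs + alpha * disparity[y]).astype(int), 0, w - 1)
        out[y, x_new] = left[y]
    return out
```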
Fig. 5 shows the components that are present in an embodiment of the signal 18, as received by the receiver 14. The signal 18 comprises image data, depth-related information for the image data, and metadata relating to at least one mapping function used in the generation of the depth-related information. In this embodiment, the components of the signal are such that the image data is defined as a pair of related images, with the left image at top-left and the right image at bottom-left. The top-right image is a grey-scale reproduction of the depth map for the left image after a disparity to depth mapping has taken place.
At the bottom-right of the Figure, the disparity to depth mapping carried out in the disparity to depth conversion unit 101 of Fig. 2 is shown. The histogram of the disparity between left and right is shown, and, using the curve, the user can indicate the mapping to depth, resulting in the top-right depth map. The mapping information to be encoded as the metadata, in this case, consists of the three control points of the curve, from which the inverse mapping can be derived to determine the original disparity values from the depth map values.
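For illustration, one way such a monotone control-point curve and its inverse could be evaluated; the control-point values below are invented, not those of the Figure:

```python
import numpy as np

# Illustrative three-control-point curve: disparity -> depth. With a
# monotonic curve, np.interp can evaluate both the mapping and its inverse
# directly from the same control points.
ctrl_d = np.array([-20.0, 5.0, 20.0])    # disparity control points (hypothetical)
ctrl_m = np.array([0.0, 180.0, 255.0])   # corresponding depth values (hypothetical)

def curve_disparity_to_depth(d):
    return np.interp(d, ctrl_d, ctrl_m)

def curve_depth_to_disparity(m):
    # Swap the roles of the axes to invert the monotone curve.
    return np.interp(m, ctrl_m, ctrl_d)
```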
Fig. 6 shows an alternative embodiment of the signal 18, which comprises the image data (top left, a single image frame), depth-related information (bottom left, a depth map), and the metadata that relates to the function that generated the depth-related information (top right, a look-up table of values). The image data and depth map can be used to drive an auto-stereoscopic display device, as multiple views can be generated from the single image and depth map. The look-up table defines a matching of depth values in the depth map (ranging in grey scale from 0 to 255) to disparity data (ranging from -20 to +20). The values in the right-hand column of the metadata increase non-linearly. In some schemes it will be possible to store or transmit only the right-hand column, as the values in the left-hand column can be implicit.
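A sketch of that implicit-index representation; the table values here are made up but, like the Figure's, run non-linearly from -20 to +20:

```python
import numpy as np

# Only the right-hand column is stored; the left-hand column 0..255 is
# implicit in the index. The quadratic shape below is hypothetical.
lut = -20.0 + 40.0 * (np.arange(256) / 255.0) ** 2

def lut_depth_to_disparity(depth_map: np.ndarray) -> np.ndarray:
    # Indexing the stored column by the 8-bit depth values recovers disparity.
    return lut[depth_map]
```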

Claims

CLAIMS:
1. A method of generating a signal comprising:
receiving depth-related information for image data,
receiving metadata relating to at least one mapping function used in the generation of the depth-related information, and
multiplexing the depth-related information and metadata into a signal.
2. A method according to claim 1, further comprising receiving image data and multiplexing the image data into the signal.
3. A method according to claim 2, wherein the image data comprises at least one of: one or more image frames and one or more pairs of image frames.
4. A method according to any preceding claim, wherein the depth-related information for the image data comprises at least one of: a depth map and a disparity map.
5. A method according to claim 1, wherein the mapping function comprises a look-up table.
6. A method according to claim 1, wherein the metadata comprises at least one of:
at least one mapping function used in the generation of the depth-related information, and
at least one inverse of a mapping function used in the generation of the depth-related information.
7. A method according to claim 1, wherein the metadata comprises at least one coefficient for a mapping function used in the generation of the depth-related information.
8. A system for generating a signal comprising:
a receiver arranged to receive depth-related information for image data, and metadata relating to at least one mapping function used in the generation of the depth-related information, and
a multiplexer arranged to multiplex the depth-related information and metadata into a signal.
9. A system according to claim 8, wherein the receiver is further arranged to receive image data, and the multiplexer is further arranged to multiplex the image data into the signal.
10. A system according to claim 9, wherein the image data comprises at least one of: one or more image frames and one or more pairs of image frames.
11. A signal comprising: depth-related information for image data and metadata relating to at least one mapping function used in the generation of the depth-related information.
12. A signal according to claim 11, the signal further comprising image data.
13. A signal according to claim 12, wherein the image data comprises at least one of: one or more image frames and one or more pairs of image frames.
14. A method of receiving a signal comprising:
receiving a signal comprising depth-related information for image data, and metadata relating to at least one mapping function used in the generation of the depth-related information,
de-multiplexing the depth-related information and metadata from the signal, and
processing the depth-related information for the image data according to the metadata.
15. A method according to claim 14, wherein the signal further comprises image data and the de-multiplexing step further comprises de-multiplexing the image data from the signal.
16. A method according to claim 14 or 15, wherein the processing step comprises generating disparity data from the depth-related information.
17. A method according to claim 16, further comprising generating further image data from the received image data and the disparity data.
18. A method according to claim 17, wherein the received image data comprises a pair of images, and the generated further image data comprises an interpolated image between the pair of images.
19. A device for receiving a signal comprising:
a receiver arranged to receive a signal comprising depth-related information for image data, and metadata relating to at least one mapping function used in the generation of the depth-related information,
a de-multiplexer arranged to de-multiplex the depth-related information and metadata from the signal, and
a processor arranged to process the depth-related information for the image data according to the metadata.
20. A device according to claim 19, wherein the signal further comprises image data and the de-multiplexer is further arranged to de-multiplex the image data from the signal.
21. A device according to claim 19 or 20, wherein the processor is arranged to generate disparity data from the depth-related information.
PCT/IB2008/053629 2007-09-13 2008-09-09 Generation of a signal WO2009034519A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
EP07116278.8 2007-09-13
EP07116278 2007-09-13

Publications (1)

Publication Number Publication Date
WO2009034519A1 true WO2009034519A1 (en) 2009-03-19

Family

ID=40032564

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IB2008/053629 WO2009034519A1 (en) 2007-09-13 2008-09-09 Generation of a signal

Country Status (2)

Country Link
TW (1) TW200931948A (en)
WO (1) WO2009034519A1 (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2197217A1 (en) * 2008-12-15 2010-06-16 Koninklijke Philips Electronics N.V. Image based 3D video format
EP2309765A1 (en) * 2009-09-11 2011-04-13 Disney Enterprises, Inc. System and method for three-dimensional video capture workflow for dynamic rendering
WO2011055950A2 (en) 2009-11-03 2011-05-12 Lg Electronics Inc. Image display apparatus, method for controlling the image display apparatus, and image display system
EP2590418A1 (en) * 2011-11-01 2013-05-08 Acer Incorporated Dynamic depth image adjusting device and method thereof
US8913108B2 (en) 2008-10-10 2014-12-16 Koninklijke Philips N.V. Method of processing parallax information comprised in a signal
WO2015055607A2 (en) 2013-10-14 2015-04-23 Koninklijke Philips N.V. Remapping a depth map for 3d viewing
US9426441B2 (en) 2010-03-08 2016-08-23 Dolby Laboratories Licensing Corporation Methods for carrying and transmitting 3D z-norm attributes in digital TV closed captioning
US9519994B2 (en) 2011-04-15 2016-12-13 Dolby Laboratories Licensing Corporation Systems and methods for rendering 3D image independent of display size and viewing distance

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1997023097A2 (en) * 1995-12-19 1997-06-26 Philips Electronics N.V. Parallactic depth-dependent pixel shifts
EP1587329A1 (en) * 2003-01-20 2005-10-19 Sanyo Electric Co., Ltd. Three-dimensional video providing method and three-dimensional video display device

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1997023097A2 (en) * 1995-12-19 1997-06-26 Philips Electronics N.V. Parallactic depth-dependent pixel shifts
EP1587329A1 (en) * 2003-01-20 2005-10-19 Sanyo Electric Co., Ltd. Three-dimensional video providing method and three-dimensional video display device

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
FEHN C: "Depth-image-based rendering (DIBR), compression, and transmission for a new approach on 3D-TV", PROCEEDINGS OF THE SPIE - THE INTERNATIONAL SOCIETY FOR OPTICAL ENGINEERING, SPIE, BELLINGHAM, VA; US, vol. 5291, 31 May 2004 (2004-05-31), pages 93 - 104, XP002444222 *
KAUFF ET AL: "Depth map creation and image-based rendering for advanced 3DTV services providing interoperability and scalability", SIGNAL PROCESSING. IMAGE COMMUNICATION, ELSEVIER SCIENCE PUBLISHERS, AMSTERDAM, NL, vol. 22, no. 2, 16 March 2007 (2007-03-16), pages 217 - 234, XP005938670, ISSN: 0923-5965 *

Cited By (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8913108B2 (en) 2008-10-10 2014-12-16 Koninklijke Philips N.V. Method of processing parallax information comprised in a signal
US8767046B2 (en) 2008-12-15 2014-07-01 Koninklijke Philips N.V. Image based 3D video format
EP2197217A1 (en) * 2008-12-15 2010-06-16 Koninklijke Philips Electronics N.V. Image based 3D video format
KR101651442B1 (en) 2008-12-15 2016-08-26 코닌클리케 필립스 엔.브이. Image based 3d video format
KR20110106367A (en) * 2008-12-15 2011-09-28 코닌클리케 필립스 일렉트로닉스 엔.브이. Image based 3d video format
WO2010070545A1 (en) * 2008-12-15 2010-06-24 Koninklijke Philips Electronics N.V. Image based 3d video format
US20110242279A1 (en) * 2008-12-15 2011-10-06 Koninklijke Philips Electronics N.V. Image based 3d video format
US8614737B2 (en) 2009-09-11 2013-12-24 Disney Enterprises, Inc. System and method for three-dimensional video capture workflow for dynamic rendering
EP2309765A1 (en) * 2009-09-11 2011-04-13 Disney Enterprises, Inc. System and method for three-dimensional video capture workflow for dynamic rendering
EP2497275A4 (en) * 2009-11-03 2016-01-27 Lg Electronics Inc Image display apparatus, method for controlling the image display apparatus, and image display system
WO2011055950A2 (en) 2009-11-03 2011-05-12 Lg Electronics Inc. Image display apparatus, method for controlling the image display apparatus, and image display system
US9426441B2 (en) 2010-03-08 2016-08-23 Dolby Laboratories Licensing Corporation Methods for carrying and transmitting 3D z-norm attributes in digital TV closed captioning
US9519994B2 (en) 2011-04-15 2016-12-13 Dolby Laboratories Licensing Corporation Systems and methods for rendering 3D image independent of display size and viewing distance
EP2590418A1 (en) * 2011-11-01 2013-05-08 Acer Incorporated Dynamic depth image adjusting device and method thereof
WO2015055607A2 (en) 2013-10-14 2015-04-23 Koninklijke Philips N.V. Remapping a depth map for 3d viewing
WO2015055607A3 (en) * 2013-10-14 2015-06-11 Koninklijke Philips N.V. Remapping a depth map for 3d viewing
CN105612742A (en) * 2013-10-14 2016-05-25 皇家飞利浦有限公司 Remapping a depth map for 3D viewing

Also Published As

Publication number Publication date
TW200931948A (en) 2009-07-16

Similar Documents

Publication Publication Date Title
US8780173B2 (en) Method and apparatus for reducing fatigue resulting from viewing three-dimensional image display, and method and apparatus for generating data stream of low visual fatigue three-dimensional image
US9036006B2 (en) Method and system for processing an input three dimensional video signal
KR101716636B1 (en) Combining 3d video and auxiliary data
US9219911B2 (en) Image processing apparatus, image processing method, and program
KR101185870B1 (en) Apparatus and method for processing 3 dimensional picture
JP4952657B2 (en) Pseudo stereoscopic image generation apparatus, image encoding apparatus, image encoding method, image transmission method, image decoding apparatus, and image decoding method
US10158838B2 (en) Methods and arrangements for supporting view synthesis
WO2009034519A1 (en) Generation of a signal
US20100309287A1 (en) 3D Data Representation, Conveyance, and Use
KR20130053452A (en) Calculating disparity for three-dimensional images
KR20090007384A (en) Efficient encoding of multiple views
KR101652186B1 (en) Method and apparatus for providing a display position of a display object and for displaying a display object in a three-dimensional scene
EP2282550A1 (en) Combining 3D video and auxiliary data
KR20130135278A (en) Transferring of 3d image data
EP2837183A2 (en) Depth signaling data
Millin et al. Three dimensions via the Internet
Onural Progress in European 3DTV research
Didier et al. Use of a Dense Disparity Map to Enhance Quality of Experience Watching Stereoscopic 3D Content at Home on a Large TV Screen

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 08807577

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 08807577

Country of ref document: EP

Kind code of ref document: A1