GB2470402A - Transmitting three-dimensional (3D) video via conventional monoscopic (2D) channels as a multiplexed, interleaved data stream

Info

Publication number
GB2470402A
GB2470402A (Application GB0908819A)
Authority
GB
United Kingdom
Prior art keywords
video
streams
picture
pictures
stream
Prior art date
Legal status
Withdrawn
Application number
GB0908819A
Other versions
GB0908819D0 (en)
Inventor
Timothy Borer
Stuart Sommerville
Current Assignee
NUMEDIA TECHNOLOGY Ltd
British Broadcasting Corp
Original Assignee
NUMEDIA TECHNOLOGY Ltd
British Broadcasting Corp
Application filed by NUMEDIA TECHNOLOGY Ltd, British Broadcasting Corp filed Critical NUMEDIA TECHNOLOGY Ltd
Priority to GB0908819A
Publication of GB0908819D0
Priority to PCT/GB2010/001023
Publication of GB2470402A

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00: Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/10: Processing, recording or transmission of stereoscopic or multi-view image signals
    • H04N13/106: Processing image signals
    • H04N13/139: Format conversion, e.g. of frame-rate or size
    • H04N13/161: Encoding, multiplexing or demultiplexing different image signal components
    • H04N13/194: Transmission of image signals

Abstract

A method of transmitting three dimensional (3D) video via a two dimensional (2D) monoscopic channel 47 comprises receiving two or more streams 42, 43 of video pictures, each representing a different view of a scene for assembly into a 3D picture at a receiver 48, and time multiplexing 46 the streams into a single interleaved stream for transmission. Thus, conventional, existing broadcast/transmission channels may be used for 3D image data. Streams may be left (L) and right (R) picture streams for stereoscopic video, e.g. multi-view 3D video. Rates may be downsampled (Figure 5) to be within conventional, current channel capabilities; downsampling phases may be offset for motion fluidity. The single stream may include ancillary information indicating interleaving phase. GOP compression, preferably from pictures in the same received streams, may be used, e.g. schemes as shown in Figures 8 and 9. The interleaved (L, R) stream may include interlaced fields in various left/right odd/even formats. Left and right picture sequences may be resampled to half height progressive sequences, then interleaved as full height interlaced frames. Streams can be upsampled. Synchronized shutter glasses or anaglyph methods may be used (Figures 6, 7).

Description

AN APPARATUS AND METHOD OF TRANSMITTING THREE-
DIMENSIONAL VIDEO PICTURES VIA A TWO DIMENSIONAL
MONOSCOPIC VIDEO CHANNEL
The invention relates to an apparatus and a method of transmitting signals for stereoscopic or multiview 3D video through existing broadcast channels tailored for 2D video.
Stereoscopy or sometimes Stereoscopic 3D refers to the technique in which slightly different images of the same scene are presented to each eye of a viewer in order to generate an illusion of depth in the presented image. The final image has a 3D or three dimensional quality that can be easily distinguished from monoscopic 2D images, which regardless of the nature of the image they present, are perceived by the viewer as essentially flat.
Multiview refers to techniques where images of a scene are captured by cameras with slightly different viewpoints and are subsequently presented to a viewer according to the position of the viewer relative to the screen.
In recent years, there has been a resurgence of interest in stereoscopic 3D movies and video. Equipment is available for 3D video and movie production and for display in cinemas, but, hitherto, there has been no satisfactory way for delivery of 3D video and movies to the end user, including to the home.
Ideally a delivery system would require minimal, if any, increase in bandwidth compared to monoscopic 2D movies or video. Additionally, 3D delivery would ideally use existing consumer equipment, such as Set Top Boxes, that are already installed. If this is not possible, a delivery system would ideally use existing technology and be compatible with conventional monoscopic delivery.
Summary of the Invention
The invention is defined in the claims to which reference should now be made. Advantageous features are set out in the dependent claims.
The invention provides a way of delivering 3D stereoscopic or multiview TV and movies to viewers via broadcast, the internet, or other media, using existing video compression technology, requiring less than twice the bandwidth of monoscopic delivery, in a way that is compatible with monoscopic 2D delivery. This is achieved by time multiplexing pictures from the left and right channels to produce a single composite picture sequence that can be compressed using existing video compression techniques. In one implementation the composite picture sequence comprises alternate pictures intended for the left and right eye. Corresponding method, apparatus and computer program implementations are contemplated.
Brief Description of the Drawings
Preferred embodiments of the invention will now be described by way of example and with reference to the drawings in which: Figure 1 illustrates a first known technique for transmitting 3D video through a channel designed for monoscopic 2D video; Figure 2 illustrates a second known technique for transmitting 3D video through a channel designed for monoscopic 2D video; Figure 3 illustrates a third known technique for transmitting 3D video through a channel designed for monoscopic 2D video; Figure 4 illustrates a first example implementation of an example apparatus embodying the invention; Figure 5 illustrates an alternative implementation of example apparatus embodying the invention; Figure 6 illustrates an apparatus for demultiplexing a 3D video stream according to a first example implementation of the invention;
Figure 7 illustrates an alternative apparatus for demultiplexing a 3D video stream according to a second example implementation of the invention; Figure 8 illustrates, by way of example, sub-optimal compression in a modern interframe coder; and Figure 9 illustrates a preferred order of images in a GOP structure.
Detailed Description of the Preferred Embodiments
In 3D Stereoscopic TV two separate pictures must be provided, each presented to a single eye to be combined in the viewer's brain. The subjective effect is of viewing a three dimensional scene rather than a two dimensional projection of a scene. In stereoscopic 3D the two pictures are often described as left and right pictures, meaning that the left image is intended to be seen by the viewer's left eye only and vice versa for the right image. For convenience in this description, "3D stereoscopic" shall be abbreviated to "3D", and monoscopic 2D video will be abbreviated to "2D".
3D movies have been around for nearly a century with the first presentation of 3D films before a paying audience at the Astor Theatre, New York, on June 10, 1915. There was a short-lived fad for 3D films in the early 1950s, with films such as Bwana Devil, House of Wax and Creature from the Black Lagoon.
More recently Walt Disney Pictures released the film BOLT in 3D in late 2008.
Early 3D movies used the anaglyph technique for presentation. With this technique a pair of glasses is worn in which each lens has a different colour filter. As discussed below, the composite anaglyph image is made up of a combination of left and right images in which the colour components are modified. The filters in the glasses are used to separate the left and right images on the basis of the colour. Although this gives a striking impression of depth, colour fidelity is seriously compromised.
To create an anaglyph picture the red, green and blue components from the left and right pictures are combined to form new red, green and blue components for the anaglyph picture. One way this can be done is according to the matrix formula:

    [ red_anaglyph   ]   [ 0  0.7  0.3 ] [ red_left   ]   [ 0  0  0 ] [ red_right   ]
    [ green_anaglyph ] = [ 0  0    0   ] [ green_left ] + [ 0  1  0 ] [ green_right ]
    [ blue_anaglyph  ]   [ 0  0    0   ] [ blue_left  ]   [ 0  0  1 ] [ blue_right  ]

Here the red component of the anaglyph picture is generated from the green and blue components of the left picture and the green and blue components of the anaglyph picture are taken directly from the right picture. Curiously, in this anaglyph technique, no red component is taken from either the left or the right picture.
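As an illustration, the matrix formula above reduces to a simple per-pixel computation. The following Python sketch (the function name and the 0-1 float component convention are illustrative, not from the patent) applies it to one pixel pair:

```python
def make_anaglyph_pixel(left_rgb, right_rgb):
    """Apply the anaglyph matrix formula to one (R, G, B) pixel pair.

    The anaglyph red channel mixes the left picture's green (0.7) and
    blue (0.3) components; the anaglyph green and blue come directly
    from the right picture. No red is taken from either input.
    """
    r_left, g_left, b_left = left_rgb
    r_right, g_right, b_right = right_rgb
    return (0.7 * g_left + 0.3 * b_left, g_right, b_right)
```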
More recently polarisation techniques have been used for presentation. In these techniques, left and right images are displayed and separated for each eye using a pair of glasses in which each lens has orthogonal polarisation.
Typically, circular polarisation is used to allow the viewer's head to move, whilst preserving separation of the left and right images.
An alternative 3D presentation technique is the use of shuttered glasses. In this technique alternate left and right images are displayed and separated for each eye by active glasses that are synchronised to the display. Separation is achieved by each lens rapidly switching between opaque and transparent so that each eye only sees the pictures intended.
Stereoscopic 3D may also include multiview techniques. In multiview 3D multiple views of a scene are captured from slightly different positions and, therefore, with varying perspectives. Multiview enables a range of views to be synthesised from the views that are captured, or the views may be used directly. This allows a range of perspectives to be presented, rather than the single perspective afforded by stereoscopic 3D. Multiview display devices are
available, such as those using prismatic gratings or arrays of microlenses, which allow the viewer's head to move and perceive a 3D picture from varying perspectives. The invention presented here, although presented primarily with respect to stereoscopic 3D, may be extended to encompass multiview 3D.
Like any other video, the distribution of 3D video usually requires video compression. As yet there is no widely accepted standard for 3D video distribution, so 3D video has to be transmitted via a channel designed for 2D monoscopic video. Several techniques have been suggested or tried for transmission of 3D video through a 2D channel. Some examples are illustrated in Figures 1, 2 and 3.
In figure 1, the left and right pictures are each resampled, using well known techniques, to half their original width. The two anamorphic pictures are then horizontally juxtaposed to create a composite image of normal dimensions as illustrated. This composite picture sequence can then be distributed via a conventional, unmodified, video compression system.
Alternatively, as illustrated in figure 2, the left and right pictures may be resampled vertically rather than horizontally and one placed at the top of a composite picture and the other at the bottom.
In figure 3, a further option is shown in which the left and right pictures are resampled on different phases of a quincunx sampling pattern. The expanded view shows individual pixels where black pixels are from the (resampled) left picture and white pixels are from the (resampled) right picture. The composite pictures illustrated in figures 2 and 3 may also be transmitted via a conventional, unmodified, video compression system.
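For illustration, the side-by-side and top-bottom packings of figures 1 and 2 can be sketched as follows, treating a picture as a list of pixel rows and assuming the half-resolution resampling has already been done (the function names are illustrative):

```python
def side_by_side(left_half_width, right_half_width):
    # Figure 1 style: juxtapose two half-width pictures horizontally,
    # row by row, to form one composite picture of normal width.
    return [l_row + r_row
            for l_row, r_row in zip(left_half_width, right_half_width)]

def top_bottom(left_half_height, right_half_height):
    # Figure 2 style: stack two half-height pictures vertically,
    # left at the top, right at the bottom.
    return left_half_height + right_half_height
```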
The disadvantage of these techniques is that they halve the spatial resolution.
In the case of figure 3, it is the diagonal resolution which is reduced, which may be perceived as subjectively less objectionable. On the other hand, quincunx spatial multiplexing is likely to significantly degrade the compression performance of a conventional video codec. Maximising the compression of
figure 3, in a manner compatible with existing 2D compressed streams, requires a modified coder, which acts as an obstacle to the uptake of the technology.
By contrast the technique described by way of an example of the invention is compatible with a conventional unmodified video coder.
Other techniques for distributing 3D video use two channels. For example, it is possible to send the left and right channels independently. The disadvantage of this is that it doubles the bandwidth and may result in problems with synchronisation of the left and right channels, or in other instrumental difficulties.
Another option is to analyse the left and right images and produce a single central picture (or use either the left or right picture directly) plus a depth map.
A depth map is an image in which the brightness indicates the depth of the objects in the 3D scene. Although compression efficiency is improved by the image plus depth map technique, it uses two channels. It also suffers from distortions in parts of the picture that are occluded (or revealed) in either the left or the right picture. Other techniques try to take advantage of the redundancy between the left and right images, and so reduce the necessary bandwidth, but, again, require two channels. For example, one of the views (left or right) may be transmitted unchanged whilst a second channel is used to transmit only the difference between the left and right channels. None of the two channel distribution techniques is directly compatible with existing monoscopic 2D distribution.
A monoscopic 2D video channel will be understood to comprise all or part of the transmitting and receiving hardware and/or software necessary to transmit video pictures from a broadcaster to an end user. Such a channel is identifiable as a monoscopic 2D video channel because, at every point, each hardware and/or software element in the channel must conform to a monoscopic 2D video transmission protocol specifying the predefined synchronisation, encoding and decoding operations that make transmission of monoscopic 2D video possible.
Such a channel, by definition, therefore excludes channels that are configured by means of a suitable transmission protocol solely to transmit 3D video stereoscopic or multiview video sequences. The predefined synchronisation, encoding and decoding operations necessary for 3D video stereoscopic or multiview video transmission to be possible would be quite different to those for monoscopic 2D video transmission, and the signal formats would therefore be quite different.
However, the term could include channels that are arranged to transmit both 2D monoscopic and 3D stereoscopic or multiview video sequences in a logically separate manner utilising hardware and/or software that supports all three formats. In this case, the 2D monoscopic video channel can be thought of as a logically separate channel or functionality within the software and/or hardware that can be separately utilised to carry 3D video in the manner described in this application, irrespective of the presence of similarly enabled signals (albeit with a different format) in other channels reserved for 3D viewing.
An example of the invention for movies or other progressively scanned 3D video will now be described with reference to Figure 4.
An encoder 41 comprises inputs 42 and 43 that receive respectively the left and right eye picture sequences for creating the resulting 3D image. The left and right picture sequences are interleaved at the encoder 41 to form a composite picture sequence in which pictures are alternately from the left and right input sequences. Clearly this composite sequence has twice the data rate of each of the input sequences. Therefore respective stores 44 and 45 coupled to inputs 42 and 43 are needed both to double the data rate for each sequence and to delay one input sequence so that it can be inserted in the composite sequence in the correct order. The composite picture sequence may then be transmitted through any transmission channel 47 that can
transmit a double frame rate sequence. The transmission channel 47 may include unmodified 2D video compression.
A corresponding decoder 48, comprises a demultiplexer 49 to separate the received sequence into respective left and right video sequences, stores 50 and 51 to preserve timing, and respective left and right outputs 52 and 53 for connection to an appropriate display device. In comparison to the encoder 41, at the decoder the process is therefore reversed to yield the left and right picture sequences at the output.
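The multiplexing and demultiplexing described for figure 4 amount to simple alternation of pictures. A minimal Python sketch, treating pictures as opaque objects and ignoring the rate-doubling stores (function names are illustrative):

```python
def interleave(left_frames, right_frames):
    # Time-multiplex equal-length left/right picture sequences into one
    # composite sequence of alternating pictures (left first, as one
    # possible convention).
    composite = []
    for left, right in zip(left_frames, right_frames):
        composite.append(left)
        composite.append(right)
    return composite

def deinterleave(composite):
    # Reverse the multiplex: even positions -> left, odd -> right.
    return composite[0::2], composite[1::2]
```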
The invention may be extended to multiview 3D video by interleaving the pictures from multiple views into a single interleaved picture sequence. In this case, the different views in the multiview image take the place of the left and right images mentioned above. If the multiview image requires transmission of more than two views, the transmission channel would need to support transmission of higher than double data rates. As this may not be possible, alternative techniques for transmission are described below.
There are two types of High Definition Television (HDTV) with 1080 lines. 1080i, or interlaced, supports either 25 or 30 frames of video per second. 1080P, or progressive, supports twice as many frames, either 50 or 60 per second, and consequently doesn't use interlace. 1080P transmission may therefore be used to transmit 25 or 30 frame per second stereoscopic 3D video. Therefore, using this invention, the same channel may be used either for 1080P monoscopic 2D or for stereoscopic 3D at half the frame rate. The use of interlace is discussed below.
It may be that a transmission channel can only carry a sequence at the input picture rate (the picture rate of the left or right channel separately). In this case the apparatus of figure 5 may be used.
Figure 5 shows an encoder 55 that differs slightly from that illustrated in Figure 4. To the extent that it is the same the same reference numbers are used. Here the left and right channels both include samplers 54 and 56 respectively for downsampling the input signals in time by a factor of two, to produce two sequences at half the original picture rate. This can be done using well known techniques, and may be as simple as discarding every other frame, or may be more complex. The composite interleaved stream produced is then at the original frame rate and will pass through the channel. The downsampling phases may be offset as described below to produce a more fluid sensation of motion to the final viewer. That is, for example, even picture numbers may be taken from the left stream and interleaved with odd picture numbers from the right stream.
Phase offsetting is particularly relevant when the two, left and right, picture sequences are subsampled temporally. If the left and right picture sequences were originally sampled at 25 frames per second, for example, they might be down sampled to 12.5 frames per second each to achieve the same total number of frames per second as a conventional 2D video signal. However 2D video at 12.5 frames per second would look "jerky" or be perceived to "judder" by viewers.
To minimise this effect for the 3D video, and produce smoother motion portrayal, the temporal sampling phases may be offset. If the original left sequence at 25Hz was originally sampled at 0, 40, 80, 120ms etc for example, then this might be subsampled, by discarding every other frame, to form a 12.5Hz sequence sampled at 0, 80, 160, 240ms etc. The question then is when to sample the right picture sequence. It could also be sampled at 0, 80, 160, 240ms etc, i.e. the same as the left sequence, but this would result in the "judder" mentioned above. The solution then is to sample the right picture sequence at 40, 120, 200, 280ms etc instead. That is, the sampling phase of the right hand sequence would be offset by 40ms in time from the sampling of the left hand signal. In this way the brain is at least fed with new information every 40ms, rather than every 80ms had the sampling phase not been offset.
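The offset subsampling described above can be sketched as follows, taking even-numbered pictures from the left stream and odd-numbered pictures from the right stream so that the composite carries new information every input frame period (a sketch, not the patent's implementation):

```python
def subsample_with_offset(left_frames, right_frames):
    # Halve each sequence's picture rate by discarding alternate frames,
    # with offset phases: even-indexed pictures from the left stream
    # (sampled at 0, 80, 160 ms ... at 25 Hz input) and odd-indexed
    # pictures from the right stream (40, 120, 200 ms ...), then
    # interleave them into a composite at the original frame rate.
    left_half = left_frames[0::2]
    right_half = right_frames[1::2]
    composite = []
    for left, right in zip(left_half, right_half):
        composite += [left, right]
    return composite
```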
This is particularly important if the 3D video is to be viewed using shuttered glasses or with a time interleaved anaglyph. In these cases the temporal phase offsetting of the left and right channel will match the timing with which
the left and right pictures are presented to each eye. Without phase offsetting motion portrayal would be degraded further in these scenarios.
Subsampling the input sequences is not the only way phase offsetting can be achieved. It can also be achieved when the 3D movies or video is first captured by a camera. The left and right cameras have to be synchronised so that they produce pictures at the same rate, and a choice of temporal sampling phase is therefore available. As described above, it may therefore be advantageous to capture right hand pictures at instants mid-way between capturing consecutive left hand pictures, that is with a temporal offset of half a picture interval. Of course, other temporal offsets or delays could be chosen as desired.
A corresponding receiver would also be required (not shown in Figure 5) for upsampling the left and right streams that are received at half the original picture rate. The receiver may upsample the left and right sequences, using well known techniques, in a manner complementary to downsampling in the encoder, to improve the quality of motion portrayal.
One way to decode the interleaved picture sequence for each eye is to use shuttered glasses 61, as illustrated in figure 6. An unmodified 2D video receiver/decoder 62 (e.g. a conventional 2D set top box) outputs the interleaved left and right picture sequences directly to the display 63 (e.g. a conventional television display). A timing extractor 64 generates a synchronisation signal 65 from the output of the receiver 62 which is used to synchronise the shuttering action of the shuttered glasses 61. Although the timing extractor is shown as a separate entity in figure 6, it may also be located in the receiver 62 or in the display 63.
It may be necessary to provide a means, either for the timing extractor 64 or the glasses, to swap the image to the left and right eye. This is because the 2D receiver 62 may not know the phasing of the interleaved left and right pictures. Alternatively it may be possible to provide an ancillary signal that indicates the phasing information to the video receiver 62 so that it can
directly generate a timing signal with no left-right ambiguity. Clearly when the interleaved picture sequence is decoded in this way, directly for the viewer, there is no requirement for the stores in the decoder of figure 4.
The method of transmitting ancillary phase information would vary with implementation. One common situation might be that the interleaved 3D sequences are sent as compressed 2D video within an MPEG-2 transport stream (as used by DVB broadcasting). In an MPEG transport stream timing information relating to specific frames is sent as "Presentation Time Stamps", which are an essential part of the transport stream protocol. All that is required, therefore, by way of ancillary information is a message saying that the frame with presentation timestamp value t corresponds to a left picture, which is a very small amount of information.
Ancillary phase information would need to be resent periodically to enable someone who had just tuned in to determine which were the left and right channel of the stereo pair. In an MPEG transport stream this ancillary phase information would probably have to be sent as a separate elementary stream (each of which is identified by a 13 bit "PID" or programme ID). Although, in some embodiments this might require a second channel, the amount of ancillary data required is trivial and it would be simple for a broadcaster to broadcast this additional information. The ancillary data would simply be ignored by older equipment that did not understand it.
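To make the mechanism concrete, the following sketch shows how a receiver might infer left/right phase from such a message. The 90 kHz PTS clock (3600 ticks per frame at 25 frames per second) is standard in MPEG transport streams, but the function and the single message field assumed here (the PTS of a frame declared to be left) are illustrative only:

```python
def eye_for_pts(ref_left_pts, pts, frame_duration):
    # Given the PTS of a frame that the ancillary message declared to be
    # a left picture, infer which eye any other frame belongs to by
    # counting elapsed frame periods: an even count means the same eye
    # as the reference frame.
    frames_since = (pts - ref_left_pts) // frame_duration
    return "left" if frames_since % 2 == 0 else "right"
```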
Another way to decode the interleaved picture sequence is to generate a time interleaved anaglyph in the 2D receiver or display, as illustrated in figure 7.
Assume, for example, the anaglyph method described above were to be used.
For one picture in the interleaved sequence (assumed to be a left picture) the receiver would output a picture in which the output red component was formed by combining the green and blue component of the decoded picture, the blue and green components of the output picture being zero. For the next picture in the interleaved sequence (assumed to be the right picture) the output picture would have its red component set to zero and green and blue components would be as the decoded picture. In this way, the effect for a
viewer wearing anaglyph glasses (i.e. with coloured filters) would be substantially the same as if the anaglyph signal had been sent directly. The difference is that the anaglyph would be generated in a simple way in the receiver and effectively combined in the eye of the viewer. Such an embodiment could be provided as an alternative to wearing shuttered viewing glasses.
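The receiver-side generation of a time interleaved anaglyph can be sketched per pixel as follows, reusing the matrix coefficients given earlier (a hypothetical helper, not from the patent):

```python
def time_interleaved_anaglyph_pixel(pixel, is_left):
    # For a picture from the left channel, output red = 0.7*green +
    # 0.3*blue with green and blue zeroed; for a right picture, zero the
    # red and pass green and blue through. Alternating these per frame
    # lets the anaglyph combine in the eye of the viewer.
    r, g, b = pixel
    if is_left:
        return (0.7 * g + 0.3 * b, 0.0, 0.0)
    return (0.0, g, b)
```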
In another alternative, an anaglyph signal combining the information from the demultiplexed left and right picture sequences could be generated in the receiver. In this case, the signal output would be a single stream of pictures that chromatically combine the left and right views.
As with shuttered glasses, unless an auxiliary signal is sent to indicate the phase of left and right pictures in the interleaved sequence, there should be a mechanism for reversing the pictures seen by the left and right eyes. One way in which this could be achieved using, for example, paper glasses, is simply to fold the arms of the glasses back in the opposite direction before they are worn.
Alternatively the viewer could signal the receiver to reverse the left and right pictures.
As described above, the interleaved picture sequence will pass through any video channel suitable for 2D video. However, if there is video compression in the channel, compression efficiency may not be optimal; that is, the quality of the compressed video may be lower than might optimally be attained for equivalent 2D video. This is because in a video coder pictures may be predicted from other pictures in the sequence, using a well known technique called interframe coding. Since the left channel has a different perspective to the right channel it will be less well predicted from a picture in the right channel than from a co-timed picture in the left channel (and vice versa). This may occur as illustrated in figure 8, which shows a sequence of pictures in an interleaved sequence.
The pictures are numbered in the order in which they would be displayed.
Even numbered pictures are assumed to be from the left channel, and odd pictures from the right channel. Picture type I indicates an intra picture, which is coded without reference to any other picture. Picture type P represents a picture predicted from one other reference picture. The arrows indicate that a picture has been predicted based on another picture, with the arrow head showing the picture from which the picture at the base of the arrow was predicted, in this case picture zero.
Picture type B represents bidirectionally predicted pictures from two references. Again the arrow heads show the reference pictures from which they were predicted. Consider picture number 3, which is a right picture. It is predicted from picture zero, which is a left picture, and hence the prediction will not be optimal. Similarly pictures 1 and 2 are predicted from both left and right pictures and the prediction would probably be better were they predicted solely from pictures corresponding to the same type of view.
Figure 8 is illustrative of suboptimal compression in any modern interframe coder. It is not intended to represent a specific compression system and could apply equally well to MPEG-2, VC-1, H264/MPEG-4 Part 10/AVC, Dirac or any other interframe coder.
One aspect of the invention is that the interleaved video sequence can be optimally compressed by predicting pictures substantially from views of the same type. So, for example, left pictures would be predicted substantially from other left pictures. However, it may sometimes be the case that part of a left picture may, by chance and in a particular instance, be better predicted from a picture corresponding to a different view. Although a coder might usually decide to code views wholly from views of the same type (i.e. left from left and right from right), in alternative embodiments a small improvement in compression efficiency may be achieved by permitting the possibility of predicting parts of the picture from a different view. Nevertheless overall compression will be most efficient when pictures are predicted substantially from views of the same type.
One way in which predicting pictures from the same type of view can be facilitated is by choosing an appropriate Group of Pictures (GOP) structure of the coder. A GOP structure is the order and number of I, B and P frames in a compressed video sequence. For example, figure 9 shows a more appropriate GOP structure for interleaved stereoscopic video than figure 8. Here P picture number 4, which is a left picture, is now predicted from another left picture.
Similarly pictures 2 and 3 are predicted from their same view. Note that pictures 3 and 4 are still denoted as B pictures although they are shown as being predicted from a single reference frame. This is because, for some coders, these pictures would still need to be classified as B pictures to comply with the codec specification, even if they are predicted wholly from one reference frame. Note also that picture 1 is predicted from a different view.
This is because the only reference frame available is of a different view, but it is more efficient to predict from a different view than to code as an intra frame.
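The reference-selection rule implied by figure 9 can be sketched as a simple function over interleaved picture numbers (even = left, odd = right). This is an illustration of the GOP idea only, not a real codec's reference list logic:

```python
def same_view_reference(n):
    # Choose a reference picture for interleaved picture n, preferring the
    # previous picture of the same view (n - 2 has the same parity, hence
    # the same view). Picture 0 is intra-coded with no reference; picture 1
    # has only picture 0 available, so it must predict across views.
    if n == 0:
        return None
    if n == 1:
        return 0
    return n - 2
```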
By predicting pictures from substantially the same view, maximum advantage of the redundancy between the left and right sequences can be exploited.
Improved compression efficiency happens naturally due to the operation of all modern video coders. In this way 3D video can be transmitted with substantially less than twice the bit rate that would be required from naively compressing the left and right views independently. The actual improvement in compression efficiency will depend on the video coding standard used (e.g. MPEG-2, VC-1 or H264) and the way in which the encoder is implemented.
For some programme genres, e.g. sport, the use of interlaced scanning may be preferable to progressive scanning, to provide subjectively more fluid motion portrayal. In interlaced scanning, the even lines of a frame are sampled at one instant and the odd lines are sampled half a frame period later. In the preceding discussion, 'pictures' may be taken to mean either frames or fields.
If interlaced scanning is used, and pictures are taken as fields, the fields may be interleaved in different ways. For example the interleaved sequence left even, right even, left odd, right odd, etc., where 'left even' denotes a field from a left picture containing even lines, and so on, corresponds to a double frame rate interlaced sequence and is the simplest interleaving to implement. It requires minimal delay and therefore a minimum amount of storage at the encoder and, more importantly, at the decoder. But the image may appear to jitter up and down.
Equivalent alternative interleavings are possible where left and right and/or odd and even are reversed. This is a simple way of interleaving the fields and is suitable, for example, when the interleaved sequence will be decoded back to two separate left and right sequences. However, if the interleaved sequence is decoded directly, as described above using glasses, both the left and the right image will jitter vertically as respective fields are received. In this case a different field ordering may be more appropriate.
For direct decoding, therefore, an interleaved sequence of left even, right odd, right even, left odd, etc. is likely to be more appropriate. This field ordering has the correct order of fields (i.e. even, odd, even, odd, etc.) but the order of the right-hand fields is now reversed, which will result in jerky motion for some genres or types of video material. This ordering may be suitable for slowly moving pictures with a lot of detail, such as dramas. If this alternative interleaving is used for fields, then the synchronisation for shuttered glasses or time-interleaved anaglyph must be adjusted accordingly.
Alternatively, a field ordering of left even, right odd, left odd, right even, etc. would provide freedom from vertical jitter and acceptable motion perception, if the temporal phases of the left and right sequences are offset. The right could be offset from the left by one field period, so that left even and right odd are sampled at the same instant, or, preferably, the right could be sampled half a field period before the left.
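The three field orderings discussed above can be made concrete with a short, purely illustrative sketch. The 'Le'/'Ro' field labels (view letter followed by field parity) and the `interleave` helper are invented here to enumerate the orderings; they do not appear in the original text.

```python
# Sketch: the three candidate field-interleaving orders, expanded over a
# few stereo frame pairs. Each field label is view ('L'/'R') + parity
# ('e' = even lines, 'o' = odd lines).

def interleave(order, frames=2):
    """Repeat a per-frame field order over `frames` stereo frame pairs."""
    out = []
    for _ in range(frames):
        out.extend(order)
    return out

simple = interleave(['Le', 'Re', 'Lo', 'Ro'])  # minimal delay, may jitter
direct = interleave(['Le', 'Ro', 'Re', 'Lo'])  # correct even/odd cadence
offset = interleave(['Le', 'Ro', 'Lo', 'Re'])  # jitter-free with L/R offset

# The 'direct' order preserves the even, odd, even, odd field cadence:
print([f[1] for f in direct])  # ['e', 'o', 'e', 'o', 'e', 'o', 'e', 'o']
```

The parity check at the end illustrates why the second ordering suits direct decoding: the displayed field cadence matches a conventional interlaced sequence even though right-hand fields arrive in reversed order.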
Flat panel displays are available in which alternate lines are polarised with orthogonal polarisations. They may be used to view either 2D video without glasses or, with an appropriate input and polarised glasses, stereoscopic 3D. When viewing 3D video, the displayed video has half the vertical resolution. Using this invention in conjunction with such a display allows either 2D or 3D video to be transmitted and displayed with no changes to the channel, decoder or display. In this way 3D is completely compatible with 2D transmission. Whether the video is 2D or 3D depends on what is transmitted and, if 3D is transmitted, the viewer needs to wear polarised glasses.
To transmit 3D video compatible with such a display, the video needs to be appropriately formatted. Such 3D displays effectively show progressively scanned video with half the number of lines for each view. Therefore each input video sequence, left and right, should be resampled to the appropriate format. For example, if a display has 1080 lines, each input sequence should be resampled onto 540 lines. For 1080p video this is a simple spatial resampling, which could, for example, simply generate new lines as the average of adjacent lines in the input picture. More complex and higher quality methods of resampling are well known. For interlaced inputs the resampling process is more complicated, but still well known, and would generate 540-line progressive video at either field rate or frame rate. Once the left and right inputs have been resampled, the resulting pictures would be interleaved to form, for example, 1080-line interlaced frames, which would be transmitted through a conventional 2D transmission channel. The 3D video would, effectively, be decoded directly by the display.
As an optimisation to the scheme described in the previous paragraph, one of the input sequences should be offset by one input line when it is resampled, to align the picture positions on the display.
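The simple adjacent-line averaging described above, together with the one-line offset optimisation, might be sketched as follows. The `halve_lines` function and the toy 8-line picture are illustrative assumptions only; real resampling would, as the text notes, use higher quality filters.

```python
# Sketch: halving the line count of one view (e.g. 1080 -> 540 lines) by
# averaging adjacent line pairs. An optional one-line offset on one eye's
# input shifts which pairs are averaged, aligning the two views on screen.

def halve_lines(picture, offset=0):
    """picture: list of lines, each line a list of pixel values.
    Returns a picture with half the lines, each the average of a pair."""
    lines = picture[offset:]
    return [
        [(a + b) / 2 for a, b in zip(lines[i], lines[i + 1])]
        for i in range(0, len(lines) - 1, 2)
    ]

left = [[i] * 4 for i in range(8)]      # toy 8-line, 4-pixel-wide picture
half = halve_lines(left)                # 4 output lines
print([row[0] for row in half])         # [0.5, 2.5, 4.5, 6.5]
```

Calling `halve_lines(right, offset=1)` for the other view averages the pairs (1,2), (3,4), (5,6) instead, which is the one-line shift suggested in the optimisation above.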
The examples above describe ways in which two picture sequences corresponding to the left and right eye views of a stereoscopic pair may be interleaved together in such a way that they may be transmitted in a compatible fashion via a transmission channel designed for monoscopic 2D video. Methods have been described by which the interleaved sequence could be decoded back to two independent left and right channels. Alternatively, the interleaved signal could be directly decoded using shuttered glasses or anaglyph colour filter glasses. Also described was a method by which a suitably formatted and interleaved picture sequence could be directly decoded by a suitable flat panel display. Methods were described by which the compression efficiency of a compressed transmission channel could be improved for interleaved sequences, allowing 3D stereoscopic video to be transmitted in substantially less bandwidth than that required for two independent picture sequences. It will be appreciated that in many cases where hardware or software are mentioned these could be interchanged or replaced with a combination of both.
The examples discussed above are intended to be illustrative and not to limit in any way the scope of protection defined by the attached claims.
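The basic time multiplexing and demultiplexing summarised above can be sketched as a simple round trip. The function names and the restriction to exactly two equal-length views are assumptions of this illustration, not limitations of the disclosure, which contemplates two or more streams.

```python
# Sketch: time multiplexing two view streams into one interleaved stream
# (for transmission via a 2D channel) and demultiplexing at the receiver.

def time_multiplex(left, right):
    """Interleave two equal-length picture streams: L0 R0 L1 R1 ..."""
    out = []
    for l, r in zip(left, right):
        out.extend([l, r])
    return out

def time_demultiplex(stream):
    """Recover the two views from an interleaved stream."""
    return stream[0::2], stream[1::2]

left, right = ['L0', 'L1'], ['R0', 'R1']
mux = time_multiplex(left, right)
print(mux)  # ['L0', 'R0', 'L1', 'R1']
assert time_demultiplex(mux) == (left, right)
```

In practice the receiver would need the ancillary phasing information mentioned earlier to know which pictures in the interleaved stream belong to which view.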

Claims (26)

1. A method of transmitting three-dimensional video pictures via a 2D monoscopic video channel, comprising: receiving two or more streams of video pictures, each stream representing a different view of a scene, for assembly into a three-dimensional video picture at a receiver; time multiplexing the two or more streams into a single interleaved stream; and transmitting the single stream via a 2D monoscopic video channel.
2. The method of claim 1, comprising downsampling the picture rates of each of the streams of video pictures such that the single stream has a data rate within the capability of the 2D monoscopic video channel.
3. The method of claim 2, comprising offsetting the downsampling phases to provide a more fluid sensation of motion.
4. The method of any previous claim, wherein ancillary information is transmitted with the single interleaved stream to indicate the phasing of the streams in the interleaving.
5. The method of claim 1, wherein the two or more streams represent multiview 3D video.
6. The method of claim 1, wherein the two streams are left and right picture streams in stereoscopic 3D video.
7. The method of any preceding claim, comprising applying video compression to the single interleaved stream based on video image prediction, such that in the compression of the single interleaved stream, pictures from one of the two or more received streams of video pictures are predicted substantially from pictures from the same stream.
8. The method of claim 7, wherein the two streams are left and right picture streams in stereoscopic 3D video and the video compression uses a GOP structure comprising Intra (I), Predicted (P), and three Bi-directionally predicted (B) frames in sequence between the I and P frames.
9. The method of claim 7 or 8, comprising, if the only available reference frame for predicting a picture corresponds to a different view, predictively coding the picture using that reference frame rather than coding the picture as an intra frame.
10. The method of claim 1, wherein the pictures of the single interleaved stream comprise fields for interlaced transmission formats.
11. The method of claim 10, comprising, in an interlaced format, interleaving fields in the order left even, right even, left odd, right odd, or with left/right and/or odd/even reversed, where even and odd refer to the alternating types of field in a double frame rate interlaced sequence.
12. The method of claim 10, comprising interleaving fields in the order left even, right odd, right even, left odd, or with left/right and/or odd/even reversed, where even and odd refer to the alternating types of field in a double frame rate interlaced sequence.
13. The method of claim 10, comprising interleaving fields in the order left even, right odd, left odd, right even, or with left/right and/or odd/even reversed, where even and odd refer to the alternating types of field in a double frame rate interlaced sequence.
14. The method of claim 10, wherein the pictures are frames in a progressive transmission format, the method comprising: resampling the left and right picture sequences to half height progressive sequences; and interleaving the resulting half height progressive sequences together as full height interlaced frames.
15. A method of displaying three-dimensional video pictures received via a 2D monoscopic video channel according to the method of any preceding claim, the method comprising: time demultiplexing the received stream of video pictures into two or more streams of video pictures, each stream representing a different view of a scene, for assembly into a three-dimensional video picture.
16. The method of claim 15, comprising upsampling the two or more streams of video pictures to improve motion portrayal.
17. The method of claim 15 or 16, wherein the two or more streams comprise left and right picture streams in stereoscopic 3D video; and the method comprises: generating a synchronisation signal from the time demultiplexed left and right picture streams or from the received stream of video pictures; outputting each of the two or more streams to a display device for interleaved sequential viewing; and outputting the synchronisation signal to a pair of shuttered glasses, for synchronising the display of the left and right picture streams with views through the left and right lenses of the shuttered glasses by a wearer.
18. The method of claim 16 or 17, wherein the two or more streams comprise left and right picture streams in stereoscopic 3D video; and the method comprises: decoding the time demultiplexed streams by generating an anaglyph signal; and outputting the anaglyph signal to a display device for viewing with anaglyph glasses.
19. The method of claim 18, wherein the anaglyph signal comprises time interleaved anaglyph images separately generated for each of the left and right picture streams respectively.
20. The method of claim 18, wherein the anaglyph signal includes a stream of pictures comprising information from both the left and right picture streams.
21. The method of claim 16 or 17, comprising swapping the phases of the left and right picture streams in response to a user input.
22. A method substantially as herein described and with reference to Figures 4 to 7 and 9 of the drawings.
23. An apparatus for transmitting three-dimensional video pictures via a 2D monoscopic video channel, comprising: an input for receiving two or more streams of video pictures, each representing a different view of a scene for assembly into a three-dimensional video picture at a receiver; a multiplexer for time multiplexing the two or more streams into a single interleaved stream; and a transmitter for transmitting the single stream via a 2D monoscopic video channel.
24. An apparatus for displaying three-dimensional video pictures received via a 2D monoscopic video channel according to the method of any of claims 1 to 22, comprising: a demultiplexer arranged to time demultiplex the received stream of video pictures into two or more streams of video pictures, each stream representing a different view of a scene, for assembly into a three-dimensional video picture.
25. An apparatus substantially as herein described and with reference to Figures 4 to 7 and 9 of the drawings.
26. A computer program product having a computer readable medium on which code is stored which, when executed on a computer, causes the computer to perform any of the methods defined in claims 1 to 21.
GB0908819A 2009-05-21 2009-05-21 Transmitting three-dimensional (3D) video via conventional monoscopic (2D) channels as a multiplexed, interleaved data stream Withdrawn GB2470402A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
GB0908819A GB2470402A (en) 2009-05-21 2009-05-21 Transmitting three-dimensional (3D) video via conventional monoscopic (2D) channels as a multiplexed, interleaved data stream
PCT/GB2010/001023 WO2010133852A2 (en) 2009-05-21 2010-05-21 An apparatus and method of transmitting three- dimensional video pictures via a two dimensional monoscopic video channel

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
GB0908819A GB2470402A (en) 2009-05-21 2009-05-21 Transmitting three-dimensional (3D) video via conventional monoscopic (2D) channels as a multiplexed, interleaved data stream

Publications (2)

Publication Number Publication Date
GB0908819D0 GB0908819D0 (en) 2009-07-01
GB2470402A true GB2470402A (en) 2010-11-24

Family

ID=40862813

Family Applications (1)

Application Number Title Priority Date Filing Date
GB0908819A Withdrawn GB2470402A (en) 2009-05-21 2009-05-21 Transmitting three-dimensional (3D) video via conventional monoscopic (2D) channels as a multiplexed, interleaved data stream

Country Status (2)

Country Link
GB (1) GB2470402A (en)
WO (1) WO2010133852A2 (en)


Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114422769A (en) * 2022-01-18 2022-04-29 深圳市洲明科技股份有限公司 Transmitting card, receiving card, display control method and storage medium of display system

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5619256A (en) * 1995-05-26 1997-04-08 Lucent Technologies Inc. Digital 3D/stereoscopic video compression technique utilizing disparity and motion compensated predictions
WO1997043681A1 (en) * 1996-05-15 1997-11-20 Vrex, Inc. Stereoscopic 3-d viewing system with portable electro-optical viewing glasses
US20020009137A1 (en) * 2000-02-01 2002-01-24 Nelson John E. Three-dimensional video broadcasting system
WO2009077969A2 (en) * 2007-12-18 2009-06-25 Koninklijke Philips Electronics N.V. Transport of stereoscopic image data over a display interface

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6101276A (en) * 1996-06-21 2000-08-08 Compaq Computer Corporation Method and apparatus for performing two pass quality video compression through pipelining and buffer management
JP3802653B2 (en) * 1997-05-21 2006-07-26 オリンパス株式会社 Stereoscopic image display device
US20010043266A1 (en) * 2000-02-02 2001-11-22 Kerry Robinson Method and apparatus for viewing stereoscopic three- dimensional images
DE10334674A1 (en) * 2003-07-30 2005-02-17 David Bullock Signal image device for eliminating non-signal image contents from a video input signal with image change frequency has a processing unit, a synchronizing unit and an output unit
KR100962696B1 (en) * 2007-06-07 2010-06-11 주식회사 이시티 Format for encoded stereoscopic image data file


Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105786423A (en) * 2015-01-14 2016-07-20 联想(新加坡)私人有限公司 Actuation of device for viewing of first content frames presented on a display between second content frames
GB2535014A (en) * 2015-01-14 2016-08-10 Lenovo Singapore Pte Ltd Actuation of device for viewing of first content frames presented on a display between second content frames
US9672791B2 (en) 2015-01-14 2017-06-06 Lenovo (Singapore) Pte. Ltd. Actuation of device for viewing of first content frames presented on a display between second content frames
GB2535014B (en) * 2015-01-14 2019-06-26 Lenovo Singapore Pte Ltd Actuation of device for viewing of first content frames presented on a display between second content frames
CN105786423B (en) * 2015-01-14 2021-09-07 联想(新加坡)私人有限公司 Actuating a device to view first frames of content presented on a display between second frames of content

Also Published As

Publication number Publication date
GB0908819D0 (en) 2009-07-01
WO2010133852A2 (en) 2010-11-25
WO2010133852A3 (en) 2011-01-27

Similar Documents

Publication Publication Date Title
JP5482254B2 (en) Reception device, transmission device, communication system, display control method, program, and data structure
US9736452B2 (en) Broadcast receiver and video data processing method thereof
Smolic et al. An overview of available and emerging 3D video formats and depth enhanced stereo as efficient generic solution
Vetro et al. 3D-TV content storage and transmission
US9756380B2 (en) Broadcast receiver and 3D video data processing method thereof
US9712803B2 (en) Receiving system and method of processing data
US6055012A (en) Digital multi-view video compression with complexity and compatibility constraints
EP2327226B1 (en) Encoding and decoding architecture of checkerboard multiplexed image data
Fehn et al. Asymmetric coding of stereoscopic video for transmission over T-DMB
US20020009137A1 (en) Three-dimensional video broadcasting system
EP2337361A2 (en) Method and system for synchronizing 3D glasses with 3D video displays
WO2011089982A1 (en) Reception device, transmission device, communication system, method for controlling reception device, and program
Minoli 3DTV content capture, encoding and transmission: building the transport infrastructure for commercial services
KR101977260B1 (en) Digital broadcasting reception method capable of displaying stereoscopic image, and digital broadcasting reception apparatus using same
Coll et al. 3D TV at home: Status, challenges and solutions for delivering a high quality experience
US20140078256A1 (en) Playback device, transmission device, playback method and transmission method
KR20110060763A (en) Added information insertion apparatus and method in broadcasting system
US20150130897A1 (en) Method for generating, transporting and reconstructing a stereoscopic video stream
Vetro Representation and coding formats for stereo and multiview video
GB2470402A (en) Transmitting three-dimensional (3D) video via conventional monoscopic (2D) channels as a multiplexed, interleaved data stream
Fliegel Advances in 3D imaging systems: Are you ready to buy a new 3D TV set?
Cubero et al. Providing 3D video services: The challenge from 2D to 3DTV quality of experience
KR101556149B1 (en) Receiving system and method of processing data
JP2013021683A (en) Image signal processing device, image signal processing method, image display device, image display method, and image processing system
Cagnazzo et al. 3D video representation and formats

Legal Events

Date Code Title Description
WAP Application withdrawn, taken to be withdrawn or refused ** after publication under section 16(1)