WO2013069608A1 - Transmission device, transmission method, reception device, and reception method - Google Patents
Transmission device, transmission method, reception device, and reception method
- Publication number
- WO2013069608A1 (PCT/JP2012/078637)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- view
- image data
- data
- video stream
- views
- Prior art date
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/23—Processing of content or additional data; Elementary server operations; Server middleware
- H04N21/236—Assembling of a multiplex stream, e.g. transport stream, by combining a video stream with other content or additional data, e.g. inserting a URL [Uniform Resource Locator] into a video stream, multiplexing software data into a video stream; Remultiplexing of multiplex streams; Insertion of stuffing bits into the multiplex stream, e.g. to obtain a constant bit-rate; Assembling of a packetised elementary stream
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N13/00—Stereoscopic video systems; Multi-view video systems; Details thereof
- H04N13/10—Processing, recording or transmission of stereoscopic or multi-view image signals
- H04N13/194—Transmission of image signals
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N13/00—Stereoscopic video systems; Multi-view video systems; Details thereof
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N13/00—Stereoscopic video systems; Multi-view video systems; Details thereof
- H04N13/10—Processing, recording or transmission of stereoscopic or multi-view image signals
- H04N13/106—Processing image signals
- H04N13/156—Mixing image signals
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N13/00—Stereoscopic video systems; Multi-view video systems; Details thereof
- H04N13/10—Processing, recording or transmission of stereoscopic or multi-view image signals
- H04N13/106—Processing image signals
- H04N13/161—Encoding, multiplexing or demultiplexing different image signal components
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N13/00—Stereoscopic video systems; Multi-view video systems; Details thereof
- H04N13/10—Processing, recording or transmission of stereoscopic or multi-view image signals
- H04N13/106—Processing image signals
- H04N13/172—Processing image signals image signals comprising non-image signal components, e.g. headers or format information
- H04N13/178—Metadata, e.g. disparity information
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/597—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/23—Processing of content or additional data; Elementary server operations; Server middleware
- H04N21/236—Assembling of a multiplex stream, e.g. transport stream, by combining a video stream with other content or additional data, e.g. inserting a URL [Uniform Resource Locator] into a video stream, multiplexing software data into a video stream; Remultiplexing of multiplex streams; Insertion of stuffing bits into the multiplex stream, e.g. to obtain a constant bit-rate; Assembling of a packetised elementary stream
- H04N21/2362—Generation or processing of Service Information [SI]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/23—Processing of content or additional data; Elementary server operations; Server middleware
- H04N21/236—Assembling of a multiplex stream, e.g. transport stream, by combining a video stream with other content or additional data, e.g. inserting a URL [Uniform Resource Locator] into a video stream, multiplexing software data into a video stream; Remultiplexing of multiplex streams; Insertion of stuffing bits into the multiplex stream, e.g. to obtain a constant bit-rate; Assembling of a packetised elementary stream
- H04N21/2365—Multiplexing of several video streams
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/434—Disassembling of a multiplex stream, e.g. demultiplexing audio and video streams, extraction of additional data from a video stream; Remultiplexing of multiplex streams; Extraction or processing of SI; Disassembling of packetised elementary stream
- H04N21/4345—Extraction or processing of SI, e.g. extracting service information from an MPEG stream
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/434—Disassembling of a multiplex stream, e.g. demultiplexing audio and video streams, extraction of additional data from a video stream; Remultiplexing of multiplex streams; Extraction or processing of SI; Disassembling of packetised elementary stream
- H04N21/4347—Demultiplexing of several video streams
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/80—Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
- H04N21/81—Monomedia components thereof
- H04N21/816—Monomedia components thereof involving special video data, e.g. 3D video
Definitions
- the present technology relates to a transmission device, a transmission method, a reception device, and a reception method, and more particularly to a transmission device and the like that enable satisfactory viewing of a stereoscopic image (three-dimensional image) on the reception side.
- a method is known in which a left-eye image and a right-eye image having parallax are alternately displayed on a display at a predetermined cycle, and are observed with shutter glasses whose liquid crystal shutters are driven in synchronization with the display.
- a method using a multi-view configuration having N views is conceivable for enabling naked-eye viewing of a three-dimensional image (stereoscopic image).
- if the image data of all views is transmitted, there is a concern that the transmission band increases. Therefore, instead of transmitting the image data of all views, it is also possible to transmit image data of one or more views, for example two views, and to generate the image data of the views that are not transmitted by interpolation processing on the receiving side.
- FIG. 31 shows a configuration example of the image transmission / reception system 50 in that case.
- the view selector 52 selects image data of two views from the image data of the N views (View 1 ... View N) obtained by imaging with the N cameras 51-1 to 51-N.
- two video streams (1st video, 2nd video), obtained by encoding the image data of these two views with the encoder 53, are transmitted to the receiving side.
- in the case of transmission method (1), when the number of multi-view views increases, the relative parallax between the two views at both ends of transmission increases. For this reason, interpolation around occlusions, which involves processing of fine image parts, becomes difficult when interpolating the image data of views that are not transmitted, and the quality of the reproduced image may suffer.
- FIG. 32 schematically shows a display unit on the receiving side when the number of views is 5 in this transmission method.
- “View_0” is the center view
- “View_1” is the view one right from the center
- “View_2” is the view one left from the center
- “View_3” is the view two right from the center, that is, the rightmost view
- “View_4” is the view two left from the center, that is, the leftmost view.
- FIG. 32 shows a lenticular lens, but a parallax barrier or the like may be used instead. The same applies to FIG. 33 below.
- image data of a so-called conventional stereo view is transmitted, and image data of a view that is not transmitted is interpolated on the receiving side.
- the image data of views located inside the two views constituting the stereo view can be synthesized by interpolation processing.
- the image data of views outside the stereo view, on the other hand, must be synthesized by extrapolation processing. In synthesis by extrapolation, it is difficult to maintain high image quality in end-point processing such as occlusion, which causes deterioration in image quality.
- FIG. 33 schematically shows a display unit on the receiving side when the number of views is 5 in this transmission method.
- “View_0” is the center view
- “View_1” is the view one right from the center
- “View_2” is the view one left from the center
- “View_3” is the view two right from the center, that is, the rightmost view
- “View_4” is the view two left from the center, that is, the leftmost view.
- the purpose of this technique is to effectively transmit image data for performing naked-eye viewing of a stereoscopic image with a multi-view configuration.
- the concept of this technology is a transmission apparatus including: an image data acquisition unit that acquires image data of at least the left end view and the right end view among a plurality of views for stereoscopic image display, and image data of an intermediate view located between the left end and the right end; and an image data transmission unit that transmits a container of a predetermined format including a video stream obtained by encoding the acquired image data.
- the image data acquisition unit acquires image data of at least the left end view and the right end view among the plurality of views for stereoscopic image display, and image data of an intermediate view positioned between the left end and the right end, for example, the center view.
- the image data in this case is, for example, data captured by a camera or data read from a storage medium.
- the image data of the left end view and the right end view may each be encoded as data of one picture.
- the image data of the left end view and the right end view may be interleaved and encoded as one picture data.
- the video stream included in the container may include data of one or a plurality of pictures.
- information indicating a boundary may be arranged between the encoded data of each picture.
- in this technology, image data of at least the left end view and the right end view and image data of an intermediate view positioned between the left end and the right end are transmitted. Therefore, image data transmission for naked-eye viewing of a stereoscopic image with a multi-view configuration can be performed effectively.
- a view configuration information insertion unit that inserts, into the layer of the video stream, view configuration information related to the image data in the video stream may be further included.
- the receiving side can perform appropriate and efficient processing for performing naked-eye viewing of a three-dimensional image (stereoscopic image) based on image data of a plurality of views.
- an identification information insertion unit that inserts, into the layer of the container, identification information for identifying whether or not view configuration information is inserted in the layer of the video stream may be further included.
- when image data of a predetermined view is encoded as data of one picture in a video stream included in the container, the view configuration information inserted in the layer of the video stream may include information indicating the position of the predetermined view.
- when image data of two views is interleaved and encoded as data of one picture, the view configuration information inserted in the layer of the video stream may include information indicating the positions of these two views.
- the view configuration information may further include information indicating the type of interleaving performed on the image data of the two views.
- the view configuration information inserted in the video stream layer may include information indicating whether or not data of a plurality of pictures is encoded in one access unit of the video stream.
- the view configuration information inserted in the layer of the video stream may include information indicating whether or not the video stream is one in which image data of a view essential for image display is encoded.
- the view configuration information inserted in the layer of the video stream may include pixel ratio information for a predetermined horizontal and / or vertical resolution.
- a parallax data acquisition unit that acquires parallax data between the views may be further provided, and the image data transmission unit may transmit a container of a predetermined format including, in addition to the video stream obtained by encoding the acquired image data, a disparity stream obtained by encoding the acquired parallax data.
- in this case, the receiving side can easily interpolate and synthesize the image data of each view that is not transmitted based on the received parallax data, without performing a process of generating parallax data from the received image data of each view.
- another concept of this technology is a reception apparatus including: an image data receiving unit that receives a container of a predetermined format including a video stream; an image data acquisition unit that decodes the video stream included in the container to obtain image data of each view; and an interpolation processing unit that acquires, by interpolation processing based on parallax data between the views, image data of a predetermined number of views positioned between the views.
- the image data receiving unit receives a container of a predetermined format including a video stream obtained by encoding image data of at least the left end view and the right end view among the plurality of views for stereoscopic image display, and image data of the intermediate view positioned between the left end and the right end.
- the image data acquisition unit decodes the video stream included in the container, and image data of each view is obtained.
- the interpolation processing unit acquires image data of a predetermined number of views located between the views by the interpolation processing.
- the container may include a disparity stream obtained by encoding parallax data, and a parallax data acquisition unit that decodes the disparity stream included in the container to obtain the parallax data may be further provided.
- a parallax data generation unit that generates parallax data based on the image data of each view obtained by the image data acquisition unit may be further provided.
- in this technology, the image data of the left end view and the right end view and the image data of the intermediate view positioned between the left end and the right end are received, and the other views are obtained by interpolation processing based on parallax data. Therefore, naked-eye viewing of a stereoscopic image with a multi-view configuration can be performed satisfactorily.
- FIG. 6 schematically shows a display unit of a receiver when the number of views is 5 in a method of transmitting image data of a left end view and a right end view among N views and a central view located between them.
- FIG. 7 is a block diagram showing a configuration example of the transmission data generation unit that generates the transport stream TS.
- Other figures show a structure example of “user_data()”, structure examples for the cases where three, two, or one video stream(s) are included in the transport stream TS, a configuration example of the receiver constituting the image transmission / reception system, and an example of calculation of the scaling ratio.
- FIG. 1 shows a configuration example of an image transmission / reception system 10 as an embodiment.
- the image transmission / reception system 10 includes a broadcasting station 100 and a receiver 200.
- the broadcasting station 100 transmits a transport stream TS as a container on a broadcast wave.
- the transport stream TS includes a video stream obtained by encoding image data of at least the center view, the left end view, and the right end view among a plurality of views for displaying a stereoscopic image.
- the central view constitutes an intermediate view located between the left end view and the right end view.
- the image data of the center view, the left end view, and the right end view are each encoded as data of one picture.
- the data of each picture has a 1920 * 1080 full HD size.
- alternatively, the image data of the center (Center) view is encoded as data of one picture, and the image data of the left end (Left) view and the right end (Right) view are interleaved and encoded as data of one picture.
- the data of each picture has a 1920 * 1080 full HD size.
- when the image data of the left end view and the right end view are interleaved and encoded as data of one picture, the image data of each view is thinned out by half in the horizontal or vertical direction.
- the type of interleaving is side-by-side, and the size of each view is 960 * 1080.
- a top-and-bottom may be considered as an interleave type. In this case, the size of each view is 1920 * 540.
- the receiving side performs scaling processing as shown in FIG.
- by this scaling, the size of the image data of the left end view and the right end view is returned to the 1920 * 1080 full HD size.
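- The de-interleaving and scaling on the receiving side described above can be sketched as follows. This is a minimal illustration using NumPy; the function names and the nearest-neighbor upscaling are illustrative assumptions, not taken from this disclosure (an actual receiver would use a proper interpolation filter):

```python
import numpy as np

def split_side_by_side(frame):
    """Split a side-by-side interleaved frame (e.g. 1920 x 1080)
    into two half-width views (e.g. 960 x 1080 each)."""
    w = frame.shape[1]
    return frame[:, : w // 2], frame[:, w // 2 :]

def upscale_width(view, target_w=1920):
    """Return the view to full HD width by simple column repetition
    (nearest neighbor); real receivers use better scaling filters."""
    ratio = target_w // view.shape[1]
    return np.repeat(view, ratio, axis=1)

# Example: a dummy 1080-line side-by-side frame
frame = np.arange(1080 * 1920, dtype=np.uint32).reshape(1080, 1920)
left_half, right_half = split_side_by_side(frame)
left_full = upscale_width(left_half)
```

Splitting a 1920 * 1080 side-by-side frame yields two 960 * 1080 half views, and the upscaling restores each to the 1920 * 1080 full HD size.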
- the video stream included in the transport stream TS includes one or more picture data.
- in one case, the transport stream TS includes three video streams (video elementary streams), obtained by encoding the image data of the center view, the left end view, and the right end view each as data of one picture.
- in another case, the transport stream TS includes two video streams (video elementary streams): a video stream obtained by encoding the image data of the center view as data of one picture, and a video stream in which the image data of the left end view and the right end view are interleaved and encoded as data of one picture.
- in yet another case, the transport stream TS includes one video stream (video elementary stream), which includes data obtained by encoding the image data of the center view, the left end view, and the right end view each as data of one picture.
- FIGS. 4(a) and 4(b) show examples of video streams including encoded data of a plurality of pictures.
- the encoded data of each picture is sequentially arranged in each access unit.
- the encoded data of the first picture is composed of “SPS + Coded Slice”, and the encoded data of the second and subsequent pictures is composed of “Subset SPS + Coded Slice”.
- this example is an example in which MPEG4-AVC encoding is performed, but other encoding schemes are also applicable.
- the hexadecimal numbers in the figure indicate the “NAL unit type”.
- FIGS. 5A and 5B show an example in which encoded data of three pictures coexist in one video stream.
- the encoded data of each picture is shown as a substream.
- FIG. 5A shows the top access unit of the GOP (Group Of Pictures), and
- FIG. 5B shows the access unit other than the top of the GOP.
- View configuration information related to image data in the video stream is inserted into a layer (picture layer, sequence layer, etc.) of the video stream.
- this view configuration information includes information indicating which view's image data is included in the video stream, information indicating whether or not data of a plurality of pictures is encoded in one access unit of the video stream, and the like.
- This view configuration information is inserted into, for example, a user data area of a picture header or a sequence header of a video stream.
- the receiving side can perform appropriate and efficient processing for performing naked-eye viewing of a three-dimensional image (stereoscopic image) based on image data of a plurality of views. Details of this view configuration information will be described later.
- identification information for identifying whether or not view configuration information is inserted in the video stream layer is inserted in the transport stream TS layer.
- this identification information is inserted, for example, under the video elementary loop (Video ES loop) of the program map table (PMT: Program Map Table) included in the transport stream TS, or under the event information table (EIT: Event Information Table).
- the receiver 200 receives the transport stream TS transmitted from the broadcasting station 100 on a broadcast wave. Further, the receiver 200 decodes the video stream included in the transport stream TS, and acquires, for example, image data of the center view, the left end view, and the right end view. At this time, the receiver 200 can know which view position the image data included in each video stream is based on the view configuration information included in the layer of the video stream.
- the receiver 200 acquires, by interpolation processing, image data of a predetermined number of views located between the center view and the left end view and between the center view and the right end view, based on the disparity data between these views.
- the receiver 200 can know the number of views based on the view configuration information included in the layer of the video stream, and can easily grasp which position view has not been transmitted.
- the receiver 200 decodes the parallax data stream sent together with the video stream from the broadcast station 100, and acquires the above-described parallax data. Alternatively, the receiver 200 generates the above-described parallax data based on the acquired image data of the center view, the left end view, and the right end view.
- the receiver 200 synthesizes the images of the respective views based on the image data of the center, left end, and right end views sent from the broadcast station 100 and the image data of the views acquired by the above-described interpolation processing, and displays them on the display unit for naked-eye viewing of the three-dimensional image (stereoscopic image).
- FIG. 6 schematically shows a display unit of the receiver 200 when the number of views is five.
- “View_0” is the center view
- “View_1” is the view one right from the center
- “View_2” is the view one left from the center
- “View_3” is the view two right from the center, that is, the rightmost view
- “View_4” is the view two left from the center, that is, the leftmost view.
- the receiver 200 receives the image data of the views “View_0”, “View_3”, and “View_4”.
- the image data of the other views “View_1” and “View_2” are obtained by interpolation processing. Then, in the receiver 200, the images of these five views are synthesized and displayed on the display unit for viewing the three-dimensional image (stereoscopic image) with the naked eye.
- although FIG. 6 shows a lenticular lens, a parallax barrier or the like may be used instead.
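- As a rough sketch of how a receiver such as the receiver 200 might synthesize an intermediate view from a transmitted view and per-pixel disparity data, the following illustration forward-warps pixels and naively fills occluded areas from the other view; both choices are assumptions for illustration, since no specific interpolation algorithm is prescribed here:

```python
import numpy as np

def interpolate_view(left, right, disp, alpha):
    """Synthesize a view at fractional position alpha (0 = left view,
    1 = right view) by shifting left-view pixels by alpha times the
    per-pixel horizontal disparity. Pixels left unfilled (occlusions)
    are naively copied from the right view."""
    h, w = left.shape
    out = np.zeros_like(left)
    filled = np.zeros((h, w), dtype=bool)
    for y in range(h):
        for x in range(w):
            tx = int(round(x + alpha * disp[y, x]))
            if 0 <= tx < w:
                out[y, tx] = left[y, x]
                filled[y, tx] = True
    out[~filled] = right[~filled]  # crude occlusion handling
    return out
```

A production receiver would additionally blend contributions from both end views and apply more careful hole filling around occlusion boundaries.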
- FIG. 7 illustrates a configuration example of the transmission data generation unit 110 that generates the above-described transport stream TS in the broadcast station 100.
- the transmission data generation unit 110 includes N image data output units 111-1 to 111-N, a view selector 112, scalers 113-1, 113-2, and 113-3, video encoders 114-1, 114-2, and 114-3, and a multiplexer 115.
- the transmission data generation unit 110 includes a parallax data generation unit 116, a parallax encoder 117, a graphics data output unit 118, a graphics encoder 119, an audio data output unit 120, and an audio encoder 121.
- Image data output units 111-1 to 111-N output image data of N views (View 1 ... View N) for stereoscopic image display.
- the image data output unit includes, for example, a camera that images a subject and outputs image data, or an image data reading unit that reads and outputs image data from a storage medium. Note that the image data of the view that is not transmitted may not actually be present.
- the view selector 112 selectively extracts, from the image data of the N views (View 1 ... View N), the image data of at least the left end view and the right end view and the image data of one or more intermediate views (between the left end and the right end).
- the view selector 112 extracts the image data VL of the left end view and the image data VR of the right end view, and extracts the image data VC of the center view.
- FIG. 8 shows a view selection state in the view selector 112.
- the scalers 113-1, 113-2, and 113-3 perform scaling processing on the image data VC, VL, and VR, respectively, to obtain image data VC', VL', and VR' of, for example, the 1920 * 1080 full HD size.
- if the image data VC, VL, and VR are already of the 1920 * 1080 full HD size, they are output as they are; if they are larger than 1920 * 1080, they are scaled down and output.
- the video encoder 114-1 performs encoding such as MPEG4-AVC (MVC) or MPEG2 video on the image data VC ′ of the central view to obtain encoded video data. Then, the video encoder 114-1 generates a video stream including the encoded data as a substream (sub stream 1) by a stream formatter (not shown) provided in the subsequent stage.
- the video encoder 114-2 performs encoding such as MPEG4-AVC (MVC), MPEG2 video, etc. on the image data VL ′ of the left end view to obtain encoded video data. Then, the video encoder 114-2 generates a video stream including the encoded data as a substream (sub stream 2) by a stream formatter (not shown) provided in the subsequent stage.
- the video encoder 114-3 performs encoding such as MPEG4-AVC (MVC) or MPEG2 video on the image data VR' of the rightmost view to obtain encoded video data. Then, the video encoder 114-3 generates a video stream including the encoded data as a substream (sub stream 3) by a stream formatter (not shown) provided in the subsequent stage.
- the video encoders 114-1, 114-2, and 114-3 insert the above-described view configuration information into the layer of the video stream.
- this view configuration information includes information indicating which view's image data is included in the video stream, information indicating whether or not data of a plurality of pictures is encoded in one access unit of the video stream, and the like. This view configuration information is inserted into, for example, a user data area of a picture header or a sequence header of the video stream.
- the disparity data generation unit 116 generates disparity data (disparity data) based on the image data of the center, left end, and right end views output from the view selector 112.
- the disparity data includes, for example, disparity data between the center view and the left end view and disparity data between the center view and the right end view.
- parallax data is generated in pixel units or block units.
- FIG. 9 illustrates an example of disparity data (disparity vector) for each block (Block).
- FIG. 10 shows an example of a method for generating disparity data in units of blocks.
- when parallax data of the j-th view as seen from the i-th view is obtained, pixel blocks (parallax detection blocks) of, for example, 4 * 4, 8 * 8, or 16 * 16 pixels are set in the picture of the i-th view. The picture of the i-th view is taken as the detected image and the picture of the j-th view as the reference image, and for each block of the i-th view picture a block search of the j-th view picture is performed so that the sum of absolute differences between pixels is minimized, whereby the disparity data is obtained.
- the parallax data DPn of the N-th block is obtained by a block search that minimizes the sum of absolute differences in the N-th block, for example, as shown in the following equation (1).
- Dj represents the pixel value in the picture of the j-th view, and Di represents the pixel value in the picture of the i-th view.
- DPn = min(Σ abs(differ(Dj − Di))) ... (1)
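- The block search of equation (1) can be sketched as follows. This is an illustrative NumPy implementation under assumed parameters (a horizontal-only search, 8 * 8 blocks, and a fixed search range), none of which are fixed by the description above:

```python
import numpy as np

def block_disparity(detected, reference, block=8, max_disp=8):
    """Per-block horizontal disparity minimizing the sum of absolute
    differences (SAD) of equation (1): the detected (i-th view) picture
    is divided into blocks, and for each block the reference (j-th view)
    picture is searched horizontally for the best-matching block."""
    h, w = detected.shape
    disp = np.zeros((h // block, w // block), dtype=int)
    for by in range(h // block):
        for bx in range(w // block):
            y, x = by * block, bx * block
            blk = detected[y:y + block, x:x + block].astype(int)
            best_sad, best_d = None, 0
            for d in range(-max_disp, max_disp + 1):
                if x + d < 0 or x + d + block > w:
                    continue  # candidate block falls outside the picture
                cand = reference[y:y + block, x + d:x + d + block].astype(int)
                sad = int(np.abs(blk - cand).sum())
                if best_sad is None or sad < best_sad:
                    best_sad, best_d = sad, d
            disp[by, bx] = best_d
    return disp
```

Shifting a test image horizontally and running the search recovers the shift as the per-block disparity for interior blocks.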
- the disparity data of each region obtained by dividing the “X” block into four is obtained by the following equation (2).
- the parallax data X(A, B) of the divided area adjacent to “A” and “B” is the median value of the parallax data of the blocks “A”, “B”, and “X”. The parallax data of the other divided areas is obtained similarly:
- X(A, B) = median(X, A, B)
- X(A, C) = median(X, A, C)
- X(B, D) = median(X, B, D)
- X(C, D) = median(X, C, D) ... (2)
- a single application of the above-described conversion reduces the area covered by each piece of parallax data to half the original size in the vertical and horizontal directions. By repeating this conversion a predetermined number of times depending on the block size, parallax data in units of pixels is obtained. If a texture contains edges and the complexity of objects in the screen is higher than in other parts, the initial block size may be reduced as appropriate so that the block-unit disparity data itself follows the texture more closely.
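- One conversion step of equation (2) can be sketched as follows. The neighbor layout (A above, B left, C right, D below) is an assumption inferred from the equations, and border blocks are handled by replicating the edge, which is not specified above:

```python
import numpy as np

def refine_once(disp):
    """One conversion step of equation (2): each block X is split into
    four sub-regions, each taking the median of X and its two nearest
    neighbors, halving the area covered by each parallax value."""
    h, w = disp.shape
    p = np.pad(disp, 1, mode='edge')  # replicate borders (assumption)
    out = np.zeros((2 * h, 2 * w))
    for y in range(h):
        for x in range(w):
            X = p[y + 1, x + 1]
            A = p[y, x + 1]      # neighbor above (assumed layout)
            B = p[y + 1, x]      # neighbor to the left
            C = p[y + 1, x + 2]  # neighbor to the right
            D = p[y + 2, x + 1]  # neighbor below
            out[2 * y, 2 * x] = np.median([X, A, B])          # X(A, B)
            out[2 * y, 2 * x + 1] = np.median([X, A, C])      # X(A, C)
            out[2 * y + 1, 2 * x] = np.median([X, B, D])      # X(B, D)
            out[2 * y + 1, 2 * x + 1] = np.median([X, C, D])  # X(C, D)
    return out
```

Applying this repeatedly, as described above, converts block-unit parallax data toward pixel-unit parallax data.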
- the parallax encoder 117 encodes the parallax data generated by the parallax data generation unit 116 to generate a parallax stream (parallax data elementary stream).
- the disparity stream includes disparity data in units of pixels or blocks. When the parallax data is in units of pixels, it can be compressed and transmitted in the same manner as the pixel data.
- when the parallax data is in units of blocks, the reception side can convert it into units of pixels by performing the above-described conversion processing. Further, when no such parallax stream is transmitted, the receiving side can itself obtain parallax data in units of blocks between views as described above, and further convert it into units of pixels.
- the graphics data output unit 118 outputs data of graphics (including subtitles as subtitles) to be superimposed on the image.
- the graphics encoder 119 generates a graphics stream (graphics elementary stream) including the graphics data output from the graphics data output unit 118.
- the graphics constitute superimposition information, and are, for example, a logo, subtitles, and the like.
- the graphics data output from the graphics data output unit 118 is, for example, graphics data to be superimposed on the central view image.
- the graphics encoder 119 may generate graphics data to be superimposed on the left end and right end views based on the disparity data generated by the disparity data generation unit 116, and generate a graphics stream including these graphics data. In this case, it is not necessary to create graphics data to be superimposed on the left end and right end views on the receiving side.
- the graphics data is mainly bitmap data. Offset information indicating the superimposed position on the image is added to the graphics data. This offset information indicates, for example, offset values in the vertical and horizontal directions from the upper left origin of the image to the upper left pixel of the graphics superimposed position.
- the standard for transmitting caption data as bitmap data is standardized and operated as “DVB_Subtitling” in DVB, which is a European digital broadcasting standard, for example.
- the audio data output unit 120 outputs audio data corresponding to the image data.
- the audio data output unit 120 includes, for example, a microphone or an audio data reading unit that reads out and outputs audio data from a storage medium.
- the audio encoder 121 performs encoding such as MPEG-2Audio or AAC on the audio data output from the audio data output unit 120 to generate an audio stream (audio elementary stream).
- the multiplexer 115 packetizes and multiplexes the elementary streams generated by the video encoders 114-1, 114-2, and 114-3, the parallax encoder 117, the graphics encoder 119, and the audio encoder 121 to generate a transport stream TS.
- at this time, a PTS (Presentation Time Stamp) is inserted into the header of each PES (Packetized Elementary Stream) packet for synchronized reproduction on the receiving side.
- the multiplexer 115 inserts the identification information described above into the layer of the transport stream TS.
- This identification information is information for identifying whether or not view configuration information is inserted in the layer of the video stream.
- This identification information is inserted, for example, under the video elementary loop (Video ES loop) of the program map table (PMT: Program Map Table) included in the transport stream TS, or under the event information table (EIT: Event Information Table).
- Image data of N views (View 1... View N) for stereoscopic image display output from the N image data output units 111-1 to 111-N is supplied to the view selector 112.
- the view selector 112 extracts the image data VC of the center view, the image data VL of the left end view, and the image data VR of the right end view from the image data of N views.
- the image data VC of the central view taken out by the view selector 112 is supplied to the scaler 113-1 and scaled to, for example, a full HD size of 1920 * 1080.
- the image data VC ′ after the scaling process is supplied to the video encoder 114-1.
- the image data VC ′ is encoded to obtain encoded video data, and a video stream including the encoded data as a substream (sub stream 1) is generated. The video encoder 114-1 also inserts view configuration information, having information indicating which view's image data the image data included in the video stream is, into, for example, the user data area of the picture header or sequence header of the video stream. This video stream is supplied to the multiplexer 115.
- similarly, the image data VL of the left end view extracted by the view selector 112 is supplied to the scaler 113-2 and scaled to, for example, a full HD size of 1920 * 1080, and the image data VL ′ after the scaling process is supplied to the video encoder 114-2. There, the image data VL ′ is encoded to obtain encoded video data, and a video stream including the encoded data as a substream (sub stream 2) is generated. The video encoder 114-2 also inserts view configuration information, having information indicating which view's image data the image data included in the video stream is, into the user data area of the picture header or sequence header of the video stream. This video stream is supplied to the multiplexer 115.
- the image data VR of the right end view extracted by the view selector 112 is supplied to the scaler 113-3 and is scaled to, for example, a full HD size of 1920 * 1080.
- the image data VR ′ after the scaling processing is supplied to the video encoder 114-3.
- the image data VR ′ is encoded to obtain encoded video data, and a video stream including the encoded data as a substream (sub stream 3) is generated. The video encoder 114-3 also inserts view configuration information, having information indicating which view's image data the image data included in the video stream is, into, for example, the user data area of the picture header or sequence header of the video stream. This video stream is supplied to the multiplexer 115.
- the image data of each of the center, left end, and right end views output from the view selector 112 is supplied to the parallax data generation unit 116.
- the disparity data generation unit 116 generates disparity data (disparity data) based on the image data of each view.
- the disparity data includes disparity data between the center view and the left end view, and disparity data between the center view and the right end view.
- parallax data is generated in pixel units or block units.
- the parallax data generated by the parallax data generation unit 116 is supplied to the parallax encoder 117.
- the parallax data is encoded, and a parallax stream is generated. This parallax stream is supplied to the multiplexer 115.
- graphics data (including subtitle data) output from the graphics data output unit 118 is supplied to the graphics encoder 119.
- the graphics encoder 119 generates a graphics stream including graphics data. This graphics stream is supplied to the multiplexer 115.
- the audio data output from the audio data output unit 120 is supplied to the audio encoder 121.
- the audio data is encoded by MPEG-2Audio, AAC or the like, and an audio stream is generated. This audio stream is supplied to the multiplexer 115.
- in the multiplexer 115, the elementary streams supplied from each encoder are packetized and multiplexed to generate a transport stream TS.
- a PTS is inserted into each PES header for synchronous reproduction on the receiving side.
- identification information for identifying whether or not view configuration information is inserted in the layer of the video stream is inserted under the PMT or the EIT.
- the transmission data generation unit 110 illustrated in FIG. 7 has been described for a case where the transport stream TS includes three video streams. That is, the transport stream TS includes three video streams obtained by encoding the image data of the center, left end, and right end views each as one picture.
- the same configuration can also be used when the transport stream TS includes two video streams or one video stream.
- when two video streams are included in the transport stream TS, for example, the following video streams are included: a video stream obtained by encoding the image data of the central view as one picture, and a video stream obtained by interleaving the image data of the left end view and the right end view and encoding them as one picture.
- when one video stream is included in the transport stream TS, for example, the following video stream is included: a video stream containing data in which the image data of the center, left end, and right end views is each encoded as the data of one picture.
- FIG. 12 shows a structural example (Syntax) of the multiview stream configuration descriptor (multiview_stream_configuration_descriptor) as the identification information.
- FIG. 13 shows the contents (Semantics) of main information in the structural example shown in FIG.
- the 1-bit field of “multiview_stream_check_flag” indicates whether or not view configuration information is inserted in the video stream layer. “1” indicates that view configuration information is inserted in the layer of the video stream, and “0” indicates that it is not. When it is “1”, the receiving side (decoder) checks the view configuration information present in the user data area.
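As an illustration only, a receiver-side check of this flag against a raw descriptor body might look like the sketch below. The descriptor_tag value and the exact bit layout of FIG. 12 are not reproduced here; the assumption that the flag is the most significant bit of the first byte after the 2-byte tag/length header, and the function name, are hypothetical.

```python
def multiview_stream_check_flag(descriptor: bytes) -> bool:
    """Read the assumed 1-bit multiview_stream_check_flag from a descriptor."""
    length = descriptor[1]  # byte 0: descriptor_tag, byte 1: descriptor_length
    if length < 1 or len(descriptor) < 2 + length:
        raise ValueError("truncated descriptor")
    return bool((descriptor[2] >> 7) & 1)  # MSB of the first payload byte
```

When the function returns `True`, the decoder goes on to check the user data area for view configuration information, as described above.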
- FIG. 14 shows a structural example (Syntax) of multi-view stream configuration information (multiview_stream_configuration_info ()) as the view configuration information.
- FIG. 16 and FIG. 17 show the contents (Semantics) of main information in the structural example shown in FIG. 14.
- the 1-bit field of “3D_flag” indicates whether or not the image data included in the video stream to be encoded is image data of a part of views constituting 3D. “1” indicates that the image data is for some views, and “0” indicates that the image data is not for some views.
- a 4-bit field of “view_count” indicates the number of views constituting the 3D service. The minimum value is 1 and the maximum value is 15.
- a 1-bit field of “single_view_es_flag” indicates whether or not data of a plurality of pictures is encoded in one access unit of the video stream. “1” indicates that only data of one picture is encoded, and “0” indicates that data of two or more pictures is encoded.
- the 1-bit field of “view_interleaving_flag” indicates whether or not image data of two views is interleaved and encoded as data of one picture in the video stream. “1” indicates that the interleave process is performed and the screen is split, and “0” indicates that the interleave process is not performed.
- the 4-bit field of “view_allocation” exists when “view_interleaving_flag” is “0”, and indicates which view's image data the image data included in the video stream is, that is, the view allocation. For example, “0000” indicates the center view. Further, for example, “0001” indicates the view adjacent to the center on the left side (1st left view next to center). In addition, for example, “0010” indicates the view adjacent to the center on the right side (1st right view next to center).
- a 3-bit field of “view_pair_position_id” indicates the relative position of two views among all the views. In this case, for example, the earlier position in the scan order is left, and the later position is right. For example, “000” indicates the view pair at both ends. Further, for example, “001” indicates the view pair located one view inside from both ends, and “010” indicates the view pair located two views inside from both ends.
- the 1-bit field of “view_interleaving_type” indicates the type of interleaving (type). “1” indicates that the type of interleaving is side-by-side (Side-by-Side), and “0” indicates that the type of interleaving is top-and-bottom.
- for each view, the fields “display_flag”, “indication_of_picture_size_scaling_horizontal”, and “indication_of_picture_size_scaling_vertical” are present.
- the 1-bit field of “display_flag” indicates whether or not the view is indispensable when image display is performed. “1” indicates that display is mandatory. On the other hand, “0” indicates that display is not essential.
- the 4-bit field of “indication_of_picture_size_scaling_horizontal” indicates the horizontal pixel ratio of the decoded image with respect to full HD (1920). “0000” indicates 100%, “0001” 80%, “0010” 75%, “0011” 66%, “0100” 50%, “0101” 33%, “0110” 25%, and “0111” 20%.
- the 4-bit field of “indication_of_picture_size_scaling_vertical” indicates the vertical pixel ratio of the decoded image with respect to full HD (1080), with the same code assignment: “0000” indicates 100%, “0001” 80%, “0010” 75%, “0011” 66%, “0100” 50%, “0101” 33%, “0110” 25%, and “0111” 20%.
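Since the two 4-bit scaling fields share one code assignment, they can be captured as a small lookup table; the helper below is an illustrative sketch, and its names (`SCALING_RATIO_PERCENT`, `decoded_pixels`) are not from this document.

```python
# Pixel ratio (percent of full HD) encoded by the 4-bit fields
# "indication_of_picture_size_scaling_horizontal" / "..._vertical".
SCALING_RATIO_PERCENT = {
    0b0000: 100, 0b0001: 80, 0b0010: 75, 0b0011: 66,
    0b0100: 50, 0b0101: 33, 0b0110: 25, 0b0111: 20,
}

def decoded_pixels(code: int, full_hd: int) -> int:
    """Approximate pixel count implied by a code (full_hd: 1920 or 1080)."""
    return full_hd * SCALING_RATIO_PERCENT[code] // 100
```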
- FIG. 18 shows an example of the relationship between the number of views indicated by “view_count” and the positions of two views indicated by “view_pair_position_id” (here, “View 1” and “View 2”).
- when sufficient image quality cannot be obtained from the two views at both ends alone, view pairs located inside both ends can be added to the view pair at both ends in order to improve the performance of interpolation synthesis.
- the encoded video data of an additionally transmitted view pair may be encoded so as to share an access unit (Access Unit) with the stream of the view pair at both ends, or may be encoded as a separate stream.
- FIG. 19 shows an example of generating disparity data (disparity data) on the transmission side or the reception side when the image data of a view pair inside both ends is transmitted together with the image data of the view pair at both ends as described above. In the illustrated example, the number of views indicated by “view_count” is nine. There is a substream (substream 1) containing the image data of the two views at both ends (View 1, View 2), and a substream (substream 2) containing the image data of the two views inside them (View 3, View 4).
- parallax data is calculated for “View 1” and “View 3”.
- parallax data is calculated for “View 2” and “View 4”.
- parallax data is calculated for “View 3” and “View 4”. If the view resolution differs between substreams, the parallax data is calculated after matching one resolution to the other.
- multi-view stream configuration information (multiview_stream_configuration_info ()) as the view configuration information is inserted into a user data area of the video stream (video elementary stream).
- the multi-view stream configuration information is inserted, for example, in picture units or GOP units using the user data area.
- the multi-view stream configuration information is inserted as a “Multiview stream configuration SEI message” in the “SEIs” portion of the access unit.
- FIG. 21A shows the top access unit of the GOP (Group Of Pictures), and
- FIG. 21B shows the access unit other than the top of the GOP.
- when inserted in GOP units, the “Multiview stream configuration SEI message” is inserted only in the first access unit of the GOP.
- FIG. 22A shows a structure example (Syntax) of “Multiview stream configuration SEI message”. “Uuid_iso_iec_11578” has a UUID value indicated by “ISO / IEC 11578: 1996 Annex A.”. “Userdata_for_multiview_stream_configuration ()” is inserted into the “user_data_payload_byte” field.
- FIG. 22B shows a structural example (Syntax) of “userdata_for_multiview_stream_configuration ()”. In this, the multiview stream configuration information (multiview_stream_configuration_info ()) is inserted (see FIG. 14). “Userdata_id” is an identifier of the multi-view stream configuration information, indicated by unsigned 16 bits.
- the multi-view stream configuration information is inserted as user data “user_data ()” in the user data area of the picture header portion.
- FIG. 23A shows a structural example (Syntax) of “user_data ()”.
- a 32-bit field of “user_data_start_code” is a start code of user data (user_data), and is a fixed value of “0x000001B2”.
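A minimal sketch of locating such user_data() payloads in an MPEG-2 video elementary stream: user data is introduced by the fixed 32-bit start code 0x000001B2 and, in this simplified illustration, is taken to extend to the next start-code prefix (0x000001). The function name and the payload-termination rule are assumptions for illustration.

```python
def find_user_data(es: bytes):
    """Yield the payload bytes following each user_data start code."""
    start = b"\x00\x00\x01\xB2"  # user_data_start_code
    pos = es.find(start)
    while pos != -1:
        end = es.find(b"\x00\x00\x01", pos + 4)  # next start-code prefix
        yield es[pos + 4: end if end != -1 else len(es)]
        pos = es.find(start, pos + 4)
```

Each yielded payload would then be checked for the multi-view stream configuration user data described above.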
- the multi-view stream configuration descriptor (multiview_stream_configuration_descriptor) as the identification information shown in FIG. 12 is inserted in the layer of the transport stream TS, for example, under the PMT or under the EIT. That is, this descriptor is arranged at an optimum position for a use case that is static or dynamic in time, such as in units of events.
- FIG. 24 shows a configuration example of the transport stream TS.
- illustration of parallax data, audio, graphics, and the like is omitted for simplification of the drawing.
- This configuration example shows a case where three video streams are included in the transport stream TS. That is, the transport stream TS includes three video streams obtained by encoding the image data of each view at the center, the left end, and the right end as one picture. Further, this configuration example shows a case where the number of views is five.
- a PES packet “video PES1” of a video stream in which the image data VC ′ of the central view is encoded as one picture is included.
- in the multi-view stream configuration information inserted in the user data area of this video stream, it is indicated that the number of views indicated by “view_count” is five.
- the PES packet “video PES2” of the video stream in which the image data VL ′ of the left end view is encoded as one picture is included. In the multi-view stream configuration information inserted in the user data area of this video stream, it is indicated that the number of views indicated by “view_count” is five.
- the PES packet “video PES3” of the video stream in which the image data VR ′ of the right end view is encoded as one picture is included. In the multi-view stream configuration information inserted in the user data area of this video stream, it is indicated that the number of views indicated by “view_count” is five.
- the transport stream TS includes a PMT (Program Map Table) as PSI (Program Specific Information).
- This PSI is information describing to which program each elementary stream included in the transport stream belongs.
- the transport stream includes an EIT (Event Information Table) as SI (Service Information) for managing each event.
- in the PMT, there is an elementary loop having information related to each elementary stream. In this configuration example, there is a video elementary loop (Video ES loop). In this elementary loop, information such as a packet identifier (PID) is arranged for each stream, and a descriptor describing information related to the elementary stream is also arranged.
- a multiview stream configuration descriptor (multiview_stream_configuration_descriptor) is inserted under the video elementary loop (Video ES loop) of the PMT in association with each video stream.
- in this descriptor, “multiview_stream_check_flag = 1” is set, indicating the presence of the multi-view stream configuration information as view configuration information in the user data area of the video stream. It is also conceivable to insert this descriptor under the EIT, as shown by the broken line.
- FIG. 25 also shows a configuration example of the transport stream TS. Also in this configuration example, illustration of parallax data, audio, graphics, and the like is omitted to simplify the drawing.
- This configuration example shows a case where two video streams are included in the transport stream TS. That is, the transport stream TS includes a video stream obtained by encoding the image data of the central view as one picture. In addition, the transport stream TS includes a video stream obtained by interleaving the image data of the left end view and the right end view and encoding it as one picture. This configuration example also shows a case where the number of views is five.
- the PES packet “video PES1” of the video stream in which the image data VC ′ of the central view is encoded as one picture is included.
- the number of views indicated by “View_count” is five.
- the PES packet “video PES2” of the video stream in which the image data VL ′ of the left end view and the image data VR ′ of the right end view are encoded as one picture is included.
- the number of views indicated by “View_count” is five.
- a multiview stream configuration descriptor (multiview_stream_configuration_descriptor) is inserted in association with each video stream under the video elementary loop (Video ES loop) of the PMT.
- in this descriptor, “multiview_stream_check_flag = 1” is set, indicating the presence of the multi-view stream configuration information as view configuration information in the user data area of the video stream. It is also conceivable to insert this descriptor under the EIT, as shown by the broken line.
- when one video stream is included in the transport stream TS, the PES packet “video PES1” of that one video stream is included.
- This video stream includes, in one access unit, data in which the image data of the center, left end, and right end views is each encoded as the data of one picture, and a user data area exists corresponding to each picture data.
- multi-view stream configuration information is inserted in each user data area.
- in the information corresponding to the picture data in which the image data of the central view is encoded, it is indicated that the number of views indicated by “view_count” is five. Likewise, in the information corresponding to the picture data in which the image data of the left end view is encoded, and in the information corresponding to the picture data in which the image data of the right end view is encoded, it is indicated that the number of views indicated by “view_count” is five.
- a multiview stream configuration descriptor (multiview_stream_configuration_descriptor) is inserted in association with one video stream under the video elementary loop (Video ES loop) of the PMT.
- in this descriptor, “multiview_stream_check_flag = 1” is set, indicating the presence of the multi-view stream configuration information as view configuration information in the user data area of the video stream. It is also conceivable to insert this descriptor under the EIT, as shown by the broken line.
- As described above, the transmission data generation unit 110 illustrated in FIG. 7 generates a transport stream TS including video streams obtained by encoding the image data of at least the left end view and the right end view among the plurality of views for stereoscopic image display, together with the image data of an intermediate view positioned between the left end and the right end. Therefore, image data transmission for naked-eye viewing of a stereoscopic image with a multi-view configuration can be performed effectively.
- multi-view stream configuration information (multiview_stream_configuration_info ()) as the view configuration information is inserted into the video stream layer. Therefore, on the receiving side, appropriate and efficient processing for performing naked-eye viewing of a three-dimensional image (stereoscopic image) using image data of a plurality of views can be performed based on the view configuration information.
- a multiview stream configuration descriptor (multiview_stream_configuration_descriptor) is inserted into the layer of the transport stream TS.
- This descriptor constitutes identification information for identifying whether or not view configuration information is inserted in the layer of the video stream. With this identification information, the reception side can easily identify whether or not view configuration information is inserted in the layer of the video stream. Therefore, it is possible to efficiently extract view configuration information from the user data area of the video stream.
- FIG. 27 illustrates a configuration example of the receiver 200.
- the receiver 200 includes a CPU 201, a flash ROM 202, a DRAM 203, an internal bus 204, a remote control receiver (RC receiver) 205, and a remote control transmitter (RC transmitter) 206.
- the receiver 200 also includes an antenna terminal 211, a digital tuner 212, a transport stream buffer (TS buffer) 213, and a demultiplexer 214.
- the receiver 200 includes coded buffers 215-1, 215-2, and 215-3, video decoders 216-1, 216-2, and 216-3, decoded buffers 217-1, 217-2, and 217-3, and scalers 218-1, 218-2, and 218-3.
- the receiver 200 also includes a view interpolation unit 219 and a pixel interleave / superimposition unit 220.
- the receiver 200 includes a coded buffer 221, a parallax decoder 222, a parallax buffer 223, and a parallax data conversion unit 224.
- the receiver 200 also includes a coded buffer 225, a graphics decoder 226, a pixel buffer 227, a scaler 228, and a graphics shifter 229. Further, the receiver 200 includes a coded buffer 230, an audio decoder 231, and a channel mixing unit 232.
- the CPU 201 controls the operation of each unit of receiver 200.
- the flash ROM 202 stores control software and data.
- the DRAM 203 constitutes a work area for the CPU 201.
- the CPU 201 develops software and data read from the flash ROM 202 on the DRAM 203 and activates the software to control each unit of the receiver 200.
- the RC receiving unit 205 receives a remote control signal (remote control code) transmitted from the RC transmitter 206 and supplies it to the CPU 201.
- the CPU 201 controls each unit of the receiver 200 based on this remote control code.
- the CPU 201, flash ROM 202, and DRAM 203 are connected to the internal bus 204.
- the antenna terminal 211 is a terminal for inputting a television broadcast signal received by a receiving antenna (not shown).
- the digital tuner 212 processes the television broadcast signal input to the antenna terminal 211 and outputs a predetermined transport stream (bit stream data) TS corresponding to the user's selected channel.
- the transport stream buffer (TS buffer) 213 temporarily accumulates the transport stream TS output from the digital tuner 212.
- the transport stream TS includes video streams obtained by encoding the image data of at least the left end view and the right end view among a plurality of views for stereoscopic image display, and the image data of the central view as an intermediate view positioned between the left end and the right end.
- the transport stream TS may include three, two, or one video stream (see FIGS. 24, 25, and 26).
- here, the description is given assuming that the transport stream TS includes three video streams obtained by encoding the image data of the center, left end, and right end views each as one picture.
- a multiview stream configuration descriptor (multiview_stream_configuration_descriptor) is inserted in the transport stream TS under the PMT or the EIT.
- This descriptor is identification information for identifying whether or not view configuration information, that is, multiview stream configuration information (multiview_stream_configuration_info ()) is inserted in the layer of the video stream.
- the demultiplexer 214 extracts video, parallax, graphics, and audio elementary streams from the transport stream TS temporarily stored in the TS buffer 213. Further, the demultiplexer 214 extracts the multi-view stream configuration descriptor described above from the transport stream TS and sends it to the CPU 201. From the 1-bit field of “multiview_stream_check_flag” of the descriptor, the CPU 201 can easily determine whether or not view configuration information is inserted in the layer of the video stream.
- the coded buffers 215-1, 215-2, and 215-3 temporarily store the video streams, extracted by the demultiplexer 214, in which the image data of the center, left end, and right end views is each encoded as one picture. Under the control of the CPU 201, the video decoders 216-1, 216-2, and 216-3 decode the video streams stored in the coded buffers 215-1, 215-2, and 215-3, respectively, and acquire the image data of the center, left end, and right end views.
- the video decoder 216-1 acquires image data of a center view.
- the video decoder 216-2 acquires image data of the left end view (left view).
- the video decoder 216-3 acquires image data of the right end view (right view).
- Each video decoder extracts multi-view stream configuration information (multiview_stream_configuration_info ()) as view configuration information inserted in a picture data header or a sequence header user data area or the like, and sends it to the CPU 201.
- the CPU 201 performs appropriate and efficient processing for naked-eye viewing of a three-dimensional image (stereoscopic image) using the image data of a plurality of views, based on the view configuration information.
- specifically, the CPU 201 controls the operations of the demultiplexer 214, the video decoders 216-1, 216-2, and 216-3, the scalers 218-1, 218-2, and 218-3, the view interpolation unit 219, and the like in units of pictures or GOPs.
- the CPU 201 can identify whether or not the image data included in the video stream is the image data of a part of the views constituting 3D by using the 1-bit field of “3D_flag”. Further, for example, the CPU 201 can recognize the number of views constituting the 3D service by a 4-bit field of “view_count”.
- the CPU 201 can identify whether or not data of a plurality of pictures is encoded in one access unit of the video stream by using a 1-bit field of “single_view_es_flag”. Further, for example, the CPU 201 can identify whether or not the image data of two views is interleaved and encoded as data of one picture in the video stream by the 1-bit field of “view_interleaving_flag”.
- the CPU 201 can recognize, from the 4-bit field of “view_allocation”, which view's image data the image data included in the video stream is.
- the CPU 201 can recognize, from the 3-bit field of “view_pair_position_id”, the relative position of two views among all the views. Further, at this time, the CPU 201 can know the type of interleaving from the 1-bit field of “view_interleaving_type”.
- the CPU 201 can recognize the horizontal pixel ratio and the vertical pixel ratio of the decoded image with respect to full HD from the 4-bit fields of “indication_of_picture_size_scaling_horizontal” and “indication_of_picture_size_scaling_vertical”.
- the decoded buffers 217-1, 217-2, and 217-3 temporarily store the image data of each view acquired by the video decoders 216-1, 216-2, and 216-3, respectively.
- the scalers 218-1, 218-2, and 218-3 adjust the output resolution of the image data of each view output from the decoded buffers 217-1, 217-2, and 217-3 to a predetermined resolution.
- the multi-view stream configuration information includes the 4-bit field “indication_of_picture_size_scaling_horizontal” indicating the horizontal pixel ratio of the decoded image and the 4-bit field “indication_of_picture_size_scaling_vertical” indicating the vertical pixel ratio of the decoded image. Based on this pixel ratio information, the CPU 201 controls the scaling ratios in the scalers 218-1, 218-2, and 218-3 so that a predetermined resolution is obtained.
- the CPU 201 calculates a scaling ratio for the image data stored in the decoded buffer based on the resolution of the decoded image data, the resolution of the monitor, and the number of views, and scalers 218-1 and 218. Instruct -2 and 218-3.
- FIG. 28 shows examples of calculating the scaling ratio; in one of the illustrated cases the scaling ratio is 1/2.
- for example, when the resolution of the decoded image data is 1920 * 1080, the monitor resolution is 1920 * 1080, and the number of views to be displayed is 4, the scaling ratio is 1/4.
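One plausible rule consistent with the 1/4 example above is to divide the monitor's horizontal resolution equally among the displayed views; the sketch below is a hedged assumption, not the exact calculation of FIG. 28, and the function name is illustrative.

```python
from fractions import Fraction

def scaling_ratio(decoded_w: int, monitor_w: int, n_views: int) -> Fraction:
    # Each displayed view receives an equal share of the monitor width;
    # this reproduces the 1/4 example (decoded 1920, monitor 1920, 4 views).
    return Fraction(monitor_w, decoded_w * n_views)
```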
- the coded buffer 221 temporarily stores the parallax stream extracted by the demultiplexer 214.
- the disparity decoder 222 performs processing opposite to that of the disparity encoder 117 (see FIG. 7) of the transmission data generation unit 110 described above. That is, the parallax decoder 222 performs a decoding process on the parallax stream stored in the coded buffer 221 to obtain parallax data.
- the disparity data includes disparity data between the center view and the left end view and disparity data between the center view and the right end view.
- the parallax data is parallax data in units of pixels or blocks.
- the parallax buffer 223 temporarily stores the parallax data acquired by the parallax decoder 222.
- the parallax data conversion unit 224 generates parallax data in units of pixels suitable for the size of the scaled image data based on the parallax data stored in the parallax buffer 223. For example, when the transmitted parallax data is in units of blocks, it is converted into parallax data in units of pixels (see FIG. 11). Also, for example, when the transmitted parallax data is in units of pixels, but does not match the size of the scaled image data, the data is scaled appropriately.
- the view interpolation unit 219 interpolates and generates, from the scaled image data of the center, left end, and right end views, the image data of a predetermined number of views that have not been transmitted, based on the inter-view parallax data obtained by the parallax data conversion unit 224. That is, the view interpolation unit 219 interpolates and outputs the image data of each view positioned between the center view and the left end view, and the image data of each view positioned between the center view and the right end view.
- FIG. 29 schematically shows an example of interpolation synthesis processing in the view interpolation unit 219.
- the current view corresponds to the above-mentioned center view, the target view 1 corresponds to the above-mentioned left-end view, and the target view 2 corresponds to the above-mentioned right-end view.
- the pixels of the view to be interpolated and located between the current view and the target view 1 are assigned as follows.
- disparity data in two directions is used: disparity data pointing from the current view to the target view 1 and, conversely, disparity data pointing from the target view 1 to the current view.
- the current view pixel is allocated by shifting the parallax data as a vector (see solid line arrows and broken line arrows from the current view to the target view 1 and black circles).
- a pixel of the view to be interpolated and synthesized can take its value from whichever source view can be regarded as the background. An occlusion area that cannot be handled from either direction is assigned a value by a post process.
- the overlapped target portion, where the tips of the arrows shown in the figure meet, is the portion of the target view 1 where shifts due to disparity overlap.
- which of the two disparities corresponds to the foreground of the current view is determined from the values of the disparity data, and that one is selected; mainly, the smaller value is selected.
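The shift-and-select process of FIG. 29 can be sketched in one dimension as follows. This is a simplification under stated assumptions, not the patent's algorithm: disparity is given per pixel of the current view, the interpolation position is a fraction `alpha` of the way to the target view, and collisions are resolved by keeping the smaller disparity value, per the text (real interpolation is bidirectional).

```python
def interpolate_row(pixels, disparity, alpha):
    """Shift a 1-D row of current-view pixels by alpha * disparity to
    synthesize an intermediate view.  Where two shifted pixels collide,
    keep the one with the smaller disparity value, following the text;
    positions no pixel reaches are left as None (occlusion holes for a
    post process to fill)."""
    width = len(pixels)
    out = [None] * width
    out_disp = [None] * width
    for x, (p, d) in enumerate(zip(pixels, disparity)):
        tx = x + round(alpha * d)
        if 0 <= tx < width:
            if out_disp[tx] is None or d < out_disp[tx]:
                out[tx] = p
                out_disp[tx] = d
    return out

# "b" (disparity 2) and "c" (disparity 0) collide at position 2;
# the smaller disparity wins, leaving a hole at position 1.
row = interpolate_row(["a", "b", "c", "d"], [0, 2, 0, 0], 0.5)
print(row)
```

The `None` entries correspond to the occlusion areas that, in the text, are filled by a post process.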
- the coded buffer 225 temporarily stores the graphics stream extracted by the demultiplexer 214.
- the graphics decoder 226 performs processing opposite to that of the graphics encoder 119 (see FIG. 7) of the transmission data generation unit 110 described above. That is, the graphics decoder 226 performs a decoding process on the graphics stream stored in the coded buffer 225 to obtain decoded graphics data (including subtitle data).
- the graphics decoder 226 generates graphics bitmap data to be superimposed on the view (image) based on the graphics data.
- the pixel buffer 227 temporarily stores graphics bitmap data generated by the graphics decoder 226.
- the scaler 228 adjusts the size of the graphics bitmap data stored in the pixel buffer 227 so as to correspond to the size of the scaled image data.
- the graphics shifter 229 performs shift processing on the bitmap data of the size-adjusted graphics based on the parallax data obtained by the parallax data conversion unit 224. Then, the graphics shifter 229 generates N graphics bitmap data to be superimposed on the image data of N views (View1, View2,..., ViewN) output from the view interpolation unit 219, respectively.
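The per-view graphics shifting above can be sketched as a horizontal displacement of the bitmap, one shifted copy per view. The mapping from disparity to shift amount below is hypothetical; the patent only states that the shifts are based on the parallax data.

```python
def shift_bitmap_rows(bitmap, shift, transparent=0):
    """Horizontally shift each row of a graphics bitmap by `shift` pixels,
    padding the vacated pixels with a transparent value."""
    out = []
    for row in bitmap:
        if shift >= 0:
            out.append([transparent] * shift + row[: len(row) - shift])
        else:
            out.append(row[-shift:] + [transparent] * -shift)
    return out

def per_view_bitmaps(bitmap, view_disparities, scale=1.0):
    """One shifted copy of the graphics bitmap per view, with the shift
    proportional to that view's disparity (illustrative assumption)."""
    return [shift_bitmap_rows(bitmap, round(scale * d)) for d in view_disparities]

views = per_view_bitmaps([[1, 2, 3, 4]], [-1, 0, 1])
print(views[0])  # [[2, 3, 4, 0]]
```

Shifting the graphics per view is what places the overlay at a consistent depth across the N synthesized views.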
- the pixel interleaving / superimposing unit 220 superimposes graphics bitmap data corresponding to the image data of N views (View1, View2,..., ViewN) output from the view interpolation unit 219, respectively. Further, the pixel interleaving / superimposing unit 220 performs pixel interleaving processing on the image data of N views (View1, View2,..., ViewN) to view the 3D image (stereoscopic image) with the naked eye. Display image data is generated.
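Pixel interleaving maps the N views onto one display raster. A column-interleave pattern is shown below purely as an illustration; actual autostereoscopic panels (lenticular or parallax-barrier) use device-specific subpixel mappings.

```python
def column_interleave(views):
    """Interleave N same-size views column-by-column: display column x
    takes its pixels from view (x mod N)."""
    n = len(views)
    height = len(views[0])
    width = len(views[0][0])
    return [
        [views[x % n][y][x] for x in range(width)]
        for y in range(height)
    ]

v1 = [[11, 12, 13, 14]]
v2 = [[21, 22, 23, 24]]
print(column_interleave([v1, v2]))  # [[11, 22, 13, 24]]
```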
- the coded buffer 230 temporarily stores the audio stream extracted by the demultiplexer 214.
- the audio decoder 231 performs a process reverse to that of the audio encoder 121 (see FIG. 7) of the transmission data generation unit 110 described above. That is, the audio decoder 231 performs a decoding process on the audio stream stored in the coded buffer 230 to obtain decoded audio data.
- the channel mixing unit 232 generates and outputs audio data of each channel for realizing, for example, 5.1ch surround with respect to the audio data obtained by the audio decoder 231.
- reading of the image data of each view from the decoded buffers 217-1, 217-2, and 217-3, of the parallax data from the parallax buffer 223, and of the graphics bitmap data from the pixel buffer 227 is performed based on the PTS, so that transfer synchronization is achieved.
- a television broadcast signal input to the antenna terminal 211 is supplied to the digital tuner 212.
- in the digital tuner 212, the television broadcast signal is processed, and a predetermined transport stream TS corresponding to the user's selected channel is output.
- This transport stream TS is temporarily stored in the TS buffer 213.
- the demultiplexer 214 extracts elementary streams of video, parallax, graphics, and audio from the transport stream TS temporarily stored in the TS buffer 213. Further, the demultiplexer 214 extracts a multi-view stream configuration descriptor as identification information from the transport stream TS and sends it to the CPU 201.
- the CPU 201 can easily determine whether or not view configuration information is inserted in the layer of the video stream from the 1-bit field of “multiview_stream_check flag” of the descriptor.
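The flag check described above amounts to reading one bit from the descriptor payload. The bit position assumed below (most significant bit of the first payload byte) is hypothetical; the actual position is defined by the descriptor syntax shown in FIG. 12.

```python
def multiview_stream_checkflag(descriptor_payload: bytes) -> bool:
    """Read the 1-bit multiview_stream_checkflag, assuming (hypothetically)
    it occupies the most significant bit of the first payload byte."""
    return bool(descriptor_payload[0] & 0x80)

print(multiview_stream_checkflag(bytes([0x80])))  # True
print(multiview_stream_checkflag(bytes([0x7F])))  # False
```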
- the image data of the center, left end, and right end views extracted by the demultiplexer 214 are supplied to the coded buffers 215-1, 215-2, and 215-3, respectively, and temporarily accumulated.
- under the control of the CPU 201, the video decoders 216-1, 216-2, and 216-3 decode the video streams stored in the coded buffers 215-1, 215-2, and 215-3, respectively, and acquire the image data of the center, left-end, and right-end views.
- each video decoder extracts the multi-view stream configuration information (multiview_stream_configuration_info ()) inserted as view configuration information in the user data area of the picture header or sequence header of the video stream, and sends it to the CPU 201. Based on this view configuration information, the CPU 201 controls, in picture units or GOP units, the operation of the demultiplexer 214, the video decoders 216-1, 216-2, and 216-3, the scalers 218-1, 218-2, and 218-3, the view interpolation unit 219, and the like.
- the parallax stream extracted by the demultiplexer 214 is supplied to the coded buffer 221 and temporarily accumulated.
- the parallax decoder 222 decodes the parallax stream stored in the coded buffer 221 to obtain parallax data.
- the disparity data includes disparity data between the center view and the left end view and disparity data between the center view and the right end view.
- the parallax data is parallax data in units of pixels or blocks.
- the parallax data acquired by the parallax decoder 222 is supplied to the parallax buffer 223 and temporarily accumulated. Based on the parallax data stored in the parallax buffer 223, the parallax data conversion unit 224 generates parallax data in pixel units that matches the size of the scaled image data. In this case, when the transmitted parallax data is in units of blocks, it is converted into parallax data in units of pixels. Also, in this case, the transmitted parallax data is in units of pixels, but if it does not match the size of the image data after scaling, it is appropriately scaled.
- the view interpolation unit 219 interpolates and synthesizes the image data of a predetermined number of views that were not transmitted, from the scaled image data of the center, left-end, and right-end views, based on the inter-view parallax data obtained by the parallax data conversion unit 224. From the view interpolation unit 219, image data of N views (View1, View2, ..., ViewN) for viewing a three-dimensional image (stereoscopic image) with the naked eye is obtained. Note that this also includes the image data of the center, left-end, and right-end views themselves.
- the graphics stream extracted by the demultiplexer 214 is supplied to the coded buffer 225 and temporarily accumulated.
- the graphics decoder 226 performs a decoding process on the graphics stream stored in the coded buffer 225 to obtain decoded graphics data (including subtitle data). Also, the graphics decoder 226 generates graphics bitmap data to be superimposed on the view (image) based on the graphics data.
- the graphics bitmap data generated by the graphics decoder 226 is supplied to the pixel buffer 227 and temporarily accumulated.
- the size of the graphics bitmap data stored in the pixel buffer 227 is adjusted to correspond to the size of the scaled image data.
- the pixel interleaving / superimposing unit 220 superimposes graphics bitmap data respectively corresponding to the image data of N views (View1, View2, ..., ViewN).
- the pixel interleaving / superimposing unit 220 performs pixel interleaving processing on the image data of the N views (View1, View2, ..., ViewN) to generate display image data for viewing a three-dimensional image (stereoscopic image) with the naked eye. By supplying this display image data to the display, an image display for naked-eye viewing of the three-dimensional image (stereoscopic image) is performed.
- the audio stream extracted by the demultiplexer 214 is supplied to the coded buffer 230 and temporarily accumulated.
- in the audio decoder 231, the audio stream stored in the coded buffer 230 is decoded, and decoded audio data is obtained.
- This audio data is supplied to the channel mixing unit 232.
- the channel mixing unit 232 generates audio data of each channel for realizing, for example, 5.1ch surround with respect to the audio data.
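The channel mixing step can be sketched as a simple upmix of decoded stereo audio to 5.1 channels. The mixing coefficients below are a naive illustration, not the patent's algorithm, which is unspecified beyond "realizing 5.1ch surround".

```python
def upmix_stereo_to_5_1(left, right):
    """Naive stereo -> 5.1 upmix sketch: centre carries the mid signal,
    LFE an attenuated copy of it, and the surrounds the side signal."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return {
        "FL": left, "FR": right,
        "C": mid, "LFE": [0.5 * m for m in mid],
        "SL": side, "SR": [-s for s in side],
    }

ch = upmix_stereo_to_5_1([1.0, 0.0], [0.0, 1.0])
print(ch["C"])  # [0.5, 0.5]
```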
- This audio data is supplied to, for example, a speaker, and audio output is performed in accordance with image display.
- in the receiver 200 shown in FIG. 27, among the plurality of views for displaying a stereoscopic image, at least the image data of the left-end view and the right-end view and the image data of an intermediate view positioned between the left end and the right end are received.
- the other views are obtained by interpolation processing based on the parallax data. Therefore, it is possible to satisfactorily perform autostereoscopic viewing of a stereoscopic image with a multiview configuration.
- the receiver 200 illustrated in FIG. 27 is a configuration example for the case where a disparity stream obtained by encoding disparity data is included in the transport stream TS. When no disparity stream is transmitted, the parallax data can instead be generated from the received image data of each view and used.
- FIG. 30 shows a configuration example of the receiver 200A in that case.
- This receiver 200A has a parallax data generation unit 233.
- the parallax data generation unit 233 generates parallax data based on the image data of the center, left end, and right end views that have been subjected to the scaling process.
- the disparity data generation unit 233 generates and outputs disparity data similar to the disparity data in units of pixels generated by the disparity data conversion unit 224 of the receiver 200 illustrated in FIG.
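The patent does not specify how the parallax data generation unit 233 derives disparity from the decoded views; a textbook sum-of-absolute-differences block-matching sketch is shown here purely as one possible approach, in one dimension for brevity.

```python
def block_match_disparity(left_row, right_row, block, max_disp):
    """Per-position disparity for a 1-D row by SAD block matching:
    for each window of the left row, find the horizontal offset d
    (0..max_disp) whose right-row window matches best."""
    disp = []
    for x in range(len(left_row) - block + 1):
        ref = left_row[x : x + block]
        best, best_cost = 0, None
        for d in range(min(max_disp, x) + 1):
            cand = right_row[x - d : x - d + block]
            cost = sum(abs(a - b) for a, b in zip(ref, cand))
            if best_cost is None or cost < best_cost:
                best, best_cost = d, cost
        disp.append(best)
    return disp

left = [0, 0, 5, 5, 0, 0]
right = [0, 5, 5, 0, 0, 0]   # scene shifted one pixel to the left
print(block_match_disparity(left, right, 2, 2))
```

Production implementations add sub-pixel refinement, consistency checks, and smoothing before the map is usable for view interpolation.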
- the disparity data generated by the disparity data generation unit 233 is supplied to the view interpolation unit 219 and is also supplied to the graphics shifter 229 for use.
- in the receiver 200A shown in FIG. 30, the coded buffer 221, the parallax decoder 222, the parallax buffer 223, and the parallax data conversion unit 224 of the receiver 200 shown in FIG. 27 are omitted.
- the other configuration of receiver 200A shown in FIG. 30 is the same as that of receiver 200 shown in FIG.
- the image transmission / reception system 10 including the broadcast station 100 and the receiver 200 is shown.
- the configuration of the image transmission / reception system to which the present technology can be applied is not limited to this.
- the receiver 200 may instead be configured as a set-top box and a monitor connected by a digital interface such as HDMI (High-Definition Multimedia Interface).
- in the embodiment described above, the container is a transport stream (MPEG-2 TS).
- the present technology can be similarly applied to a system configured to be distributed to receiving terminals using a network such as the Internet.
- in Internet distribution, content is often delivered in MP4 or other container formats.
- containers of various formats, such as the transport stream (MPEG-2 TS) adopted in digital broadcasting standards and the MP4 used in Internet distribution, correspond to this container.
- (8) In the video stream included in the container, when image data of a predetermined view is encoded as data of one picture, the view configuration information inserted in the layer of the video stream includes information indicating the position of the predetermined view. The transmission device according to (2).
- (9) In the video stream included in the container, when image data of two views is interleaved and encoded as data of one picture, the view configuration information inserted in the layer of the video stream includes information indicating the positions of the two views. The transmission device according to (2) or (8).
- (11) The view configuration information inserted in the layer of the video stream includes information indicating whether or not data of a plurality of pictures is encoded in one access unit of the video stream. The transmission device according to any one of (2) and (8) to (10).
- (12) The view configuration information inserted in the layer of the video stream includes information indicating whether or not the video stream encodes image data of a view essential for image display. The transmission device according to any one of (2) and (8) to (11).
- (13) The view configuration information inserted in the layer of the video stream includes pixel ratio information for a predetermined horizontal and / or vertical resolution. The transmission device according to any one of (2) and (8) to (12).
- (14) A parallax data acquisition unit that acquires parallax data between the views is further provided, and the image data transmission unit transmits a container of a predetermined format including, in addition to the video stream obtained by encoding the acquired image data, a parallax stream obtained by encoding the acquired parallax data. The transmission device according to any one of (1) to (13).
- (15) The container is a transport stream. The transmission device according to any one of (1) to (14).
- (16) A transmission method comprising: an image data acquisition step of acquiring, among a plurality of views for stereoscopic image display, image data of at least the left-end view and the right-end view and image data of an intermediate view positioned between the left end and the right end; and an image data transmission step of transmitting a container of a predetermined format including a video stream obtained by encoding the acquired image data.
- the container includes a disparity stream obtained by encoding the disparity data.
- among a plurality of views for stereoscopic image display, image data of at least the left-end view and the right-end view and image data of an intermediate view positioned between the left end and the right end are obtained in encoded form as a video stream.
- the main feature of the present technology is the transmission of image data of at least the leftmost view and the rightmost view among a plurality of views for autostereoscopic viewing of a three-dimensional image (stereoscopic image), together with image data of an intermediate view positioned between the left end and the right end.
- DESCRIPTION OF SYMBOLS
- 10: Image transmission / reception system
- 100: Broadcasting station
- 110: Transmission data generation unit
- 111-1 to 111-N: Image data output units
- 112: View selector
- 113-1, 113-2, 113-3: Scalers
- 114-1, 114-2, 114-3: Video encoders
- 115: Multiplexer
- 116: Disparity data generation unit
- 117: Disparity encoder
- 118: Graphics data output unit
- 119: Graphics encoder
- 120: Audio data output unit
- 121: Audio encoder
- 200, 200A: Receivers
- 201: CPU
- 211: Antenna terminal
- 212: Digital tuner
- 213: Transport stream buffer (TS buffer)
- 214: Demultiplexer
- 215-1, 215-2, 215-3, 221, 225, 230: Coded buffers
- 216-1, 216-2, 216-3: Video decoders
- 217-1, 217-2, 217-3: View buffers
- 218-1, 218-2, 218-3, 228: Scalers
- 219: View interpolation unit
- 220: Pixel interleaving / superimposing unit
- 222: Disparity decoder
- 223: Disparity buffer
- 224: Disparity data conversion unit
- 226: Graphics decoder
- 227: Pixel buffer
- 229: Graphics shifter
- 231: Audio decoder
- 232: Channel mixing unit
- 233: Disparity data generation unit
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Library & Information Science (AREA)
- Testing, Inspecting, Measuring Of Stereoscopic Televisions And Televisions (AREA)
Abstract
Description
The present technology resides in a transmission device comprising: an image data acquisition unit that acquires, among a plurality of views for stereoscopic image display, image data of at least the left-end view and the right-end view and image data of an intermediate view positioned between the left end and the right end; and
an image data transmission unit that transmits a container of a predetermined format including a video stream obtained by encoding the acquired image data.
The present technology also resides in a reception device comprising: an image data reception unit that receives a container of a predetermined format including a video stream obtained by encoding, among a plurality of views for stereoscopic image display, image data of at least the left-end view and the right-end view and image data of an intermediate view positioned between the left end and the right end;
an image data acquisition unit that decodes the video stream included in the container to obtain the image data of each view; and
an interpolation processing unit that obtains, by interpolation processing based on the disparity data of the views, image data of a predetermined number of views positioned between the views.
1. Embodiment
2. Modified examples
[Image transmission / reception system]
FIG. 1 shows a configuration example of an image transmission / reception system 10 as an embodiment. This image transmission / reception system 10 includes a broadcasting station 100 and a receiver 200. The broadcasting station 100 transmits a transport stream TS as a container on a broadcast wave.
FIG. 7 shows a configuration example of the transmission data generation unit 110 that generates the above-described transport stream TS in the broadcasting station 100. This transmission data generation unit 110 includes N image data output units 111-1 to 111-N, a view selector 112, scalers 113-1, 113-2, and 113-3, video encoders 114-1, 114-2, and 114-3, and a multiplexer 115. The transmission data generation unit 110 also includes a disparity data generation unit 116, a disparity encoder 117, a graphics data output unit 118, a graphics encoder 119, an audio data output unit 120, and an audio encoder 121.
DPn = min ( Σabs( differ (Dj - Di))) ・・・(1)
X(A,C)=median(X,A,C)
X(B,D)=median(X,B,D)
X(C,D)=median(X,C,D)
・・・(2)
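Equation (2) above corrects a value X by taking the median of X and two neighbouring values, for each of the neighbour pairs (A, C), (B, D), and (C, D). A minimal check with Python's `statistics.median`, using hypothetical numbers for the labels in the text:

```python
from statistics import median

# Equation (2): candidate corrections for a value X from neighbour
# pairs (A, C), (B, D) and (C, D) -- labels as in the text.
X, A, B, C, D = 7, 2, 3, 4, 5
print(median([X, A, C]))  # 4
print(median([X, B, D]))  # 5
print(median([X, C, D]))  # 5
```

The median pulls an outlying X toward its neighbourhood, which is the usual reason such a filter is applied to disparity maps.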
As described above, identification information for identifying whether or not view configuration information is inserted in the layer of the video stream is inserted in the layer of the transport stream TS. FIG. 12 shows a structure example (Syntax) of the multiview stream configuration descriptor (multiview_stream_configuration_descriptor) serving as this identification information. FIG. 13 shows the content (Semantics) of the main information in the structure example shown in FIG. 12.
FIG. 27 shows a configuration example of the receiver 200. This receiver 200 includes a CPU 201, a flash ROM 202, a DRAM 203, an internal bus 204, a remote control reception unit (RC reception unit) 205, and a remote control transmitter (RC transmitter) 206. The receiver 200 also includes an antenna terminal 211, a digital tuner 212, a transport stream buffer (TS buffer) 213, and a demultiplexer 214.
In the embodiment described above, the image transmission / reception system 10 including the broadcasting station 100 and the receiver 200 has been shown, but the configuration of an image transmission / reception system to which the present technology can be applied is not limited to this. For example, the receiver 200 portion may be configured as a set-top box and a monitor connected by a digital interface such as HDMI (High-Definition Multimedia Interface).
(1) A transmission device comprising: an image data acquisition unit that acquires, among a plurality of views for stereoscopic image display, image data of at least the left-end view and the right-end view and image data of an intermediate view positioned between the left end and the right end; and an image data transmission unit that transmits a container of a predetermined format including a video stream obtained by encoding the acquired image data.
(2) The transmission device according to (1), further comprising a view configuration information insertion unit that inserts, into the layer of the video stream, view configuration information relating to the image data in the video stream.
(3) The transmission device according to (2), further comprising an identification information insertion unit that inserts, into the layer of the container, identification information for identifying whether or not the view configuration information is inserted in the layer of the video stream.
(4) The transmission device according to any one of (1) to (3), wherein, in the video stream included in the container, the image data of the left-end view and the image data of the right-end view are each encoded as data of one picture.
(5) The transmission device according to any one of (1) to (3), wherein, in the video stream included in the container, the image data of the left-end view and the image data of the right-end view are interleaved and encoded as data of one picture.
(6) The transmission device according to any one of (1) to (5), wherein the video stream included in the container contains data of one or more pictures.
(7) The transmission device according to any one of (1) to (6), wherein, when the video stream included in the container contains encoded data of a plurality of pictures, information indicating a boundary is arranged between the encoded data of the respective pictures.
(8) The transmission device according to (2), wherein, when image data of a predetermined view is encoded as data of one picture in the video stream included in the container, the view configuration information inserted in the layer of the video stream includes information indicating the position of the predetermined view.
(9) The transmission device according to (2) or (8), wherein, when image data of two views is interleaved and encoded as data of one picture in the video stream included in the container, the view configuration information inserted in the layer of the video stream includes information indicating the positions of the two views.
(10) The transmission device according to (9), wherein the view configuration information further includes information indicating the type of interleaving performed on the image data of the two views.
(11) The transmission device according to any one of (2) and (8) to (10), wherein the view configuration information inserted in the layer of the video stream includes information indicating whether or not data of a plurality of pictures is encoded in one access unit of the video stream.
(12) The transmission device according to any one of (2) and (8) to (11), wherein the view configuration information inserted in the layer of the video stream includes information indicating whether or not the video stream encodes image data of a view essential for image display.
(13) The transmission device according to any one of (2) and (8) to (12), wherein the view configuration information inserted in the layer of the video stream includes pixel ratio information with respect to a predetermined horizontal and/or vertical resolution.
(14) The transmission device according to any one of (1) to (13), further comprising a disparity data acquisition unit that acquires disparity data between the views, wherein the image data transmission unit transmits a container of a predetermined format including, in addition to the video stream obtained by encoding the acquired image data, a disparity stream obtained by encoding the acquired disparity data.
(15) The transmission device according to any one of (1) to (14), wherein the container is a transport stream.
(16) A transmission method comprising: an image data acquisition step of acquiring, among a plurality of views for stereoscopic image display, image data of at least the left-end view and the right-end view and image data of an intermediate view positioned between the left end and the right end; and an image data transmission step of transmitting a container of a predetermined format including a video stream obtained by encoding the acquired image data.
(17) A reception device comprising: an image data reception unit that receives a container of a predetermined format including a video stream obtained by encoding, among a plurality of views for stereoscopic image display, image data of at least the left-end view and the right-end view and image data of an intermediate view positioned between the left end and the right end; an image data acquisition unit that decodes the video stream included in the container to obtain the image data of each view; and an interpolation processing unit that obtains, by interpolation processing based on disparity data between the views, image data of a predetermined number of views positioned between the views.
(18) The reception device according to (17), wherein the container includes a disparity stream obtained by encoding the disparity data, the reception device further comprising a disparity data acquisition unit that decodes the disparity stream included in the container to obtain the disparity data.
(19) The reception device according to (17), further comprising a disparity data generation unit that generates the disparity data based on the image data of each view obtained by the image data acquisition unit.
(20) A reception method comprising: an image data reception step of receiving a container of a predetermined format including a video stream obtained by encoding, among a plurality of views for stereoscopic image display, image data of at least the left-end view and the right-end view and image data of an intermediate view positioned between the left end and the right end; an image data acquisition step of decoding the video stream included in the container to obtain the image data of each view; and an interpolation processing step of obtaining, by interpolation processing based on disparity data between the views, image data of a predetermined number of views positioned between the views.
100: Broadcasting station
110: Transmission data generation unit
111-1 to 111-N: Image data output units
112: View selector
113-1, 113-2, 113-3: Scalers
114-1, 114-2, 114-3: Video encoders
115: Multiplexer
116: Disparity data generation unit
117: Disparity encoder
118: Graphics data output unit
119: Graphics encoder
120: Audio data output unit
121: Audio encoder
200, 200A: Receivers
201: CPU
211: Antenna terminal
212: Digital tuner
213: Transport stream buffer (TS buffer)
214: Demultiplexer
215-1, 215-2, 215-3, 221, 225, 230: Coded buffers
216-1, 216-2, 216-3: Video decoders
217-1, 217-2, 217-3: View buffers
218-1, 218-2, 218-3, 228: Scalers
219: View interpolation unit
220: Pixel interleaving / superimposing unit
222: Disparity decoder
223: Disparity buffer
224: Disparity data conversion unit
226: Graphics decoder
227: Pixel buffer
229: Graphics shifter
231: Audio decoder
232: Channel mixing unit
233: Disparity data generation unit
Claims (20)
- 1. A transmission device comprising: an image data acquisition unit that acquires, among a plurality of views for stereoscopic image display, image data of at least the left-end view and the right-end view and image data of an intermediate view positioned between the left end and the right end; and an image data transmission unit that transmits a container of a predetermined format including a video stream obtained by encoding the acquired image data.
- 2. The transmission device according to claim 1, further comprising a view configuration information insertion unit that inserts, into the layer of the video stream, view configuration information relating to the image data in the video stream.
- 3. The transmission device according to claim 2, further comprising an identification information insertion unit that inserts, into the layer of the container, identification information for identifying whether or not the view configuration information is inserted in the layer of the video stream.
- 4. The transmission device according to claim 1, wherein, in the video stream included in the container, the image data of the left-end view and the image data of the right-end view are each encoded as data of one picture.
- 5. The transmission device according to claim 1, wherein, in the video stream included in the container, the image data of the left-end view and the image data of the right-end view are interleaved and encoded as data of one picture.
- 6. The transmission device according to claim 1, wherein the video stream included in the container contains data of one or more pictures.
- 7. The transmission device according to claim 1, wherein, when the video stream included in the container contains encoded data of a plurality of pictures, information indicating a boundary is arranged between the encoded data of the respective pictures.
- 8. The transmission device according to claim 2, wherein, when image data of a predetermined view is encoded as data of one picture in the video stream included in the container, the view configuration information inserted in the layer of the video stream includes information indicating the position of the predetermined view.
- 9. The transmission device according to claim 2, wherein, when image data of two views is interleaved and encoded as data of one picture in the video stream included in the container, the view configuration information inserted in the layer of the video stream includes information indicating the positions of the two views.
- 10. The transmission device according to claim 9, wherein the view configuration information further includes information indicating the type of interleaving performed on the image data of the two views.
- 11. The transmission device according to claim 2, wherein the view configuration information inserted in the layer of the video stream includes information indicating whether or not data of a plurality of pictures is encoded in one access unit of the video stream.
- 12. The transmission device according to claim 2, wherein the view configuration information inserted in the layer of the video stream includes information indicating whether or not the video stream encodes image data of a view essential for image display.
- 13. The transmission device according to claim 2, wherein the view configuration information inserted in the layer of the video stream includes pixel ratio information with respect to a predetermined horizontal and/or vertical resolution.
- 14. The transmission device according to claim 1, further comprising a disparity data acquisition unit that acquires disparity data between the views, wherein the image data transmission unit transmits a container of a predetermined format including, in addition to the video stream obtained by encoding the acquired image data, a disparity stream obtained by encoding the acquired disparity data.
- 15. The transmission device according to claim 1, wherein the container is a transport stream.
- 16. A transmission method comprising: an image data acquisition step of acquiring, among a plurality of views for stereoscopic image display, image data of at least the left-end view and the right-end view and image data of an intermediate view positioned between the left end and the right end; and an image data transmission step of transmitting a container of a predetermined format including a video stream obtained by encoding the acquired image data.
- 17. A reception device comprising: an image data reception unit that receives a container of a predetermined format including a video stream obtained by encoding, among a plurality of views for stereoscopic image display, image data of at least the left-end view and the right-end view and image data of an intermediate view positioned between the left end and the right end; an image data acquisition unit that decodes the video stream included in the container to obtain the image data of each view; and an interpolation processing unit that obtains, by interpolation processing based on disparity data between the views, image data of a predetermined number of views positioned between the views.
- 18. The reception device according to claim 17, wherein the container includes a disparity stream obtained by encoding the disparity data, the reception device further comprising a disparity data acquisition unit that decodes the disparity stream included in the container to obtain the disparity data.
- 19. The reception device according to claim 17, further comprising a disparity data generation unit that generates the disparity data based on the image data of each view obtained by the image data acquisition unit.
- 20. A reception method comprising: an image data reception step of receiving a container of a predetermined format including a video stream obtained by encoding, among a plurality of views for stereoscopic image display, image data of at least the left-end view and the right-end view and image data of an intermediate view positioned between the left end and the right end; an image data acquisition step of decoding the video stream included in the container to obtain the image data of each view; and an interpolation processing step of obtaining, by interpolation processing based on disparity data between the views, image data of a predetermined number of views positioned between the views.
Priority Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US13/997,448 US9693033B2 (en) | 2011-11-11 | 2012-11-05 | Transmitting apparatus, transmitting method, receiving apparatus and receiving method for transmission and reception of image data for stereoscopic display using multiview configuration and container with predetermined format |
BR112013017322A BR112013017322A2 (pt) | 2011-11-11 | 2012-11-05 | dispositivo e método de transmissão, e, método de recepção |
KR1020137016903A KR102009049B1 (ko) | 2011-11-11 | 2012-11-05 | 송신 장치, 송신 방법, 수신 장치 및 수신 방법 |
EP12847913.6A EP2645724A4 (en) | 2011-11-11 | 2012-11-05 | SENDING DEVICE, TRANSMISSION PROCEDURE, RECEPTION DEVICE AND RECEPTION PROCEDURE |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2011-248114 | 2011-11-11 | ||
JP2011248114 | 2011-11-11 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2013069608A1 true WO2013069608A1 (ja) | 2013-05-16 |
Family
ID=48289982
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2012/078637 WO2013069608A1 (ja) | 2011-11-11 | 2012-11-05 | 送信装置、送信方法、受信装置および受信方法 |
Country Status (5)
Country | Link |
---|---|
US (1) | US9693033B2 (ja) |
EP (1) | EP2645724A4 (ja) |
KR (1) | KR102009049B1 (ja) |
BR (1) | BR112013017322A2 (ja) |
WO (1) | WO2013069608A1 (ja) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103391448A (zh) * | 2013-07-12 | 2013-11-13 | 上海环鼎影视科技有限公司 | 多视区裸眼3d播放系统及其播放方法 |
JP2019513320A (ja) * | 2016-03-24 | 2019-05-23 | ノキア テクノロジーズ オーユー | ビデオの符号化・復号装置、方法、およびコンピュータプログラム |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9860515B2 (en) * | 2012-12-11 | 2018-01-02 | Electronics And Telecommunications Research Institute | Apparatus and method for 3D content broadcasting with boundary information |
US20150253974A1 (en) | 2014-03-07 | 2015-09-10 | Sony Corporation | Control of large screen display using wireless portable computer interfacing with display controller |
CN112925109A (zh) * | 2019-12-05 | 2021-06-08 | 北京芯海视界三维科技有限公司 | 多视点裸眼3d显示屏、裸眼3d显示终端 |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH089421A (ja) * | 1994-06-20 | 1996-01-12 | Sanyo Electric Co Ltd | 立体映像装置 |
JPH09138384A (ja) | 1995-11-15 | 1997-05-27 | Sanyo Electric Co Ltd | 立体画像観察用眼鏡の制御方法 |
JP2009004940A (ja) * | 2007-06-20 | 2009-01-08 | Victor Co Of Japan Ltd | 多視点画像符号化方法、多視点画像符号化装置及び多視点画像符号化プログラム |
WO2010116955A1 (ja) * | 2009-04-07 | 2010-10-14 | ソニー株式会社 | 情報処理装置、情報処理方法、再生装置、再生方法、およびプログラム |
JP2011519209A (ja) * | 2008-04-10 | 2011-06-30 | ポステック・アカデミー‐インダストリー・ファウンデーション | 眼鏡なし3次元立体テレビのための高速多視点3次元立体映像の合成装置及び方法 |
Family Cites Families (35)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP3733358B2 (ja) * | 1996-04-05 | 2006-01-11 | 松下電器産業株式会社 | 画像伝送装置、送信装置、受信装置、送信方法および受信方法 |
KR100397511B1 (ko) | 2001-11-21 | 2003-09-13 | 한국전자통신연구원 | 양안식/다시점 3차원 동영상 처리 시스템 및 그 방법 |
EP1501316A4 (en) | 2002-04-25 | 2009-01-21 | Sharp Kk | METHOD FOR GENERATING MULTIMEDIA INFORMATION, AND DEVICE FOR REPRODUCING MULTIMEDIA INFORMATION |
JP4251907B2 (ja) | 2003-04-17 | 2009-04-08 | シャープ株式会社 | 画像データ作成装置 |
KR100585966B1 (ko) | 2004-05-21 | 2006-06-01 | 한국전자통신연구원 | 3차원 입체 영상 부가 데이터를 이용한 3차원 입체 디지털방송 송/수신 장치 및 그 방법 |
KR100636785B1 (ko) * | 2005-05-31 | 2006-10-20 | 삼성전자주식회사 | 다시점 입체 영상 시스템 및 이에 적용되는 압축 및 복원방법 |
KR100667830B1 (ko) * | 2005-11-05 | 2007-01-11 | 삼성전자주식회사 | 다시점 동영상을 부호화하는 방법 및 장치 |
KR100747550B1 (ko) | 2005-12-09 | 2007-08-08 | 한국전자통신연구원 | Dmb 기반의 3차원 입체영상 서비스 제공 방법과, dmb기반의 3차원 입체영상 서비스를 위한 복호화 장치 및 그방법 |
US20080043832A1 (en) | 2006-08-16 | 2008-02-21 | Microsoft Corporation | Techniques for variable resolution encoding and decoding of digital video |
US8274551B2 (en) | 2007-06-11 | 2012-09-25 | Samsung Electronics Co., Ltd. | Method and apparatus for generating header information of stereoscopic image data |
CN101803394A (zh) | 2007-06-19 | 2010-08-11 | 韩国电子通信研究院 | 存储和播放立体数据的元数据结构以及使用该元数据存储立体内容文件的方法 |
KR100993428B1 (ko) | 2007-12-12 | 2010-11-09 | 한국전자통신연구원 | Dmb 연동형 스테레오스코픽 데이터 처리방법 및스테레오스코픽 데이터 처리장치 |
US8660402B2 (en) | 2007-12-14 | 2014-02-25 | Koninklijke Philips N.V. | 3D mode selection mechanism for video playback |
KR20090100189A (ko) * | 2008-03-19 | 2009-09-23 | 삼성전자주식회사 | 스케일링 정보를 포함하는 스테레오스코픽 영상데이터스트림 부호화 방법 및 장치, 그리고 그 복호화 방법및 그 장치 |
KR101506219B1 (ko) | 2008-03-25 | 2015-03-27 | 삼성전자주식회사 | 3차원 영상 컨텐츠 제공 방법, 재생 방법, 그 장치 및 그기록매체 |
KR101468267B1 (ko) * | 2008-10-02 | 2014-12-15 | 프라운호퍼-게젤샤프트 추르 푀르데룽 데어 안제반텐 포르슝 에 파우 | 중간 뷰 합성 및 멀티-뷰 데이터 신호 추출 |
AU2008362821A1 (en) * | 2008-10-07 | 2010-04-15 | Telefonaktiebolaget Lm Ericsson (Publ) | Multi-view media data |
WO2010048632A1 (en) * | 2008-10-24 | 2010-04-29 | Real D | Stereoscopic image format with depth information |
RU2513912C2 (ru) | 2008-12-09 | 2014-04-20 | Сони Корпорейшн | Устройство и способ обработки изображений |
JP4947389B2 (ja) * | 2009-04-03 | 2012-06-06 | ソニー株式会社 | 画像信号復号装置、画像信号復号方法、および画像信号符号化方法 |
US8982183B2 (en) * | 2009-04-17 | 2015-03-17 | Lg Electronics Inc. | Method and apparatus for processing a multiview video signal |
US20110012993A1 (en) | 2009-07-14 | 2011-01-20 | Panasonic Corporation | Image reproducing apparatus |
JP5663857B2 (ja) | 2009-10-08 | 2015-02-04 | 株式会社ニコン | 画像表示装置および画像表示方法 |
JP2011010255A (ja) | 2009-10-29 | 2011-01-13 | Sony Corp | 立体画像データ送信方法、立体画像データ受信装置および立体画像データ受信方法 |
JP4823349B2 (ja) | 2009-11-11 | 2011-11-24 | パナソニック株式会社 | 三次元映像復号装置及び三次元映像復号方法 |
CN102334339A (zh) | 2009-12-28 | 2012-01-25 | 松下电器产业株式会社 | 显示装置和方法、记录介质、发送装置和方法、以及再现装置和方法 |
KR20110088334A (ko) * | 2010-01-28 | 2011-08-03 | 삼성전자주식회사 | 3차원 멀티미디어 서비스를 제공하기 위한 데이터스트림 생성 방법 및 장치, 3차원 멀티미디어 서비스를 제공하기 위한 데이터스트림 수신 방법 및 장치 |
DE102010009291A1 (de) | 2010-02-25 | 2011-08-25 | Expert Treuhand GmbH, 20459 | Verfahren und Vorrichtung für ein anatomie-adaptiertes pseudoholographisches Display |
JP5577823B2 (ja) | 2010-04-27 | 2014-08-27 | ソニー株式会社 | 送信装置、送信方法、受信装置および受信方法 |
MX2013002135A (es) | 2010-09-03 | 2013-04-03 | Sony Corp | Dispositivo de codificacion y metodo de codificacion, asi como dispositivo de decodificacion y metodo de decodificacion. |
MX2013008311A (es) * | 2011-02-16 | 2013-09-06 | Panasonic Corp | Codificador de video, metodo de codificacion de video, programa de codificacion de video, metodo de reproduccion de video y programa de reproduccion de video. |
US9525858B2 (en) * | 2011-07-06 | 2016-12-20 | Telefonaktiebolaget Lm Ericsson (Publ) | Depth or disparity map upscaling |
US20130188013A1 (en) * | 2011-07-22 | 2013-07-25 | Qualcomm Incorporated | Mvc based 3dvc codec supporting inside view motion prediction (ivmp) mode |
US9288505B2 (en) * | 2011-08-11 | 2016-03-15 | Qualcomm Incorporated | Three-dimensional video with asymmetric spatial resolution |
JP6192902B2 (ja) | 2011-11-11 | 2017-09-06 | サターン ライセンシング エルエルシーSaturn Licensing LLC | 画像データ送信装置、画像データ送信方法、画像データ受信装置および画像データ受信方法 |
-
2012
- 2012-11-05 KR KR1020137016903A patent/KR102009049B1/ko active IP Right Grant
- 2012-11-05 EP EP12847913.6A patent/EP2645724A4/en not_active Ceased
- 2012-11-05 BR BR112013017322A patent/BR112013017322A2/pt not_active Application Discontinuation
- 2012-11-05 WO PCT/JP2012/078637 patent/WO2013069608A1/ja active Application Filing
- 2012-11-05 US US13/997,448 patent/US9693033B2/en not_active Expired - Fee Related
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH089421A (ja) * | 1994-06-20 | 1996-01-12 | Sanyo Electric Co Ltd | 立体映像装置 |
JPH09138384A (ja) | 1995-11-15 | 1997-05-27 | Sanyo Electric Co Ltd | 立体画像観察用眼鏡の制御方法 |
JP2009004940A (ja) * | 2007-06-20 | 2009-01-08 | Victor Co Of Japan Ltd | 多視点画像符号化方法、多視点画像符号化装置及び多視点画像符号化プログラム |
JP2011519209A (ja) * | 2008-04-10 | 2011-06-30 | ポステック・アカデミー‐インダストリー・ファウンデーション | 眼鏡なし3次元立体テレビのための高速多視点3次元立体映像の合成装置及び方法 |
WO2010116955A1 (ja) * | 2009-04-07 | 2010-10-14 | ソニー株式会社 | 情報処理装置、情報処理方法、再生装置、再生方法、およびプログラム |
Non-Patent Citations (1)
Title |
---|
See also references of EP2645724A4 * |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103391448A (zh) * | 2013-07-12 | 2013-11-13 | 上海环鼎影视科技有限公司 | 多视区裸眼3d播放系统及其播放方法 |
CN103391448B (zh) * | 2013-07-12 | 2015-12-23 | 上海环鼎影视科技有限公司 | 多视区裸眼3d播放系统及其播放方法 |
JP2019513320A (ja) * | 2016-03-24 | 2019-05-23 | ノキア テクノロジーズ オーユー | ビデオの符号化・復号装置、方法、およびコンピュータプログラム |
US10863182B2 (en) | 2016-03-24 | 2020-12-08 | Nokia Technologies Oy | Apparatus, a method and a computer program for video coding and decoding of a monoscopic picture |
Also Published As
Publication number | Publication date |
---|---|
EP2645724A4 (en) | 2014-08-06 |
EP2645724A1 (en) | 2013-10-02 |
KR20140095011A (ko) | 2014-07-31 |
KR102009049B1 (ko) | 2019-08-08 |
BR112013017322A2 (pt) | 2017-03-01 |
US9693033B2 (en) | 2017-06-27 |
US20140055561A1 (en) | 2014-02-27 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP6034420B2 (ja) | Method and apparatus for generating a 3D video data stream into which additional information for 3D video reproduction is inserted, and method and apparatus for receiving a 3D video data stream into which additional information for 3D video reproduction is inserted | |
JP6192902B2 (ja) | Image data transmitting device, image data transmitting method, image data receiving device, and image data receiving method | |
CA2760100C (en) | Broadcast transmitter, broadcast receiver and 3d video data processing method thereof | |
US20120075421A1 (en) | Image data transmission device, image data transmission method, and image data receiving device | |
WO2013105401A1 (ja) | Transmitting device, transmitting method, receiving device, and receiving method | |
CA2772927C (en) | Cable broadcast receiver and 3d video data processing method thereof | |
WO2013161442A1 (ja) | Image data transmitting device, image data transmitting method, image data receiving device, and image data receiving method | |
US8953019B2 (en) | Method and apparatus for generating stream and method and apparatus for processing stream | |
WO2013089024A1 (ja) | Transmitting device, transmitting method, receiving device, and receiving method | |
WO2013069608A1 (ja) | Transmitting device, transmitting method, receiving device, and receiving method | |
KR101977260B1 (ko) | Digital broadcast receiving method and receiving apparatus capable of stereoscopic image display | |
WO2013073455A1 (ja) | Image data transmitting device, image data transmitting method, image data receiving device, and image data receiving method | |
JP5928118B2 (ja) | Transmitting device, transmitting method, receiving device, and receiving method | |
WO2013054775A1 (ja) | Transmitting device, transmitting method, receiving device, and receiving method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
WWE | Wipo information: entry into national phase |
Ref document number: 13997448 Country of ref document: US |
|
REEP | Request for entry into the european phase |
Ref document number: 2012847913 Country of ref document: EP |
|
WWE | Wipo information: entry into national phase |
Ref document number: 2012847913 Country of ref document: EP |
|
ENP | Entry into the national phase |
Ref document number: 20137016903 Country of ref document: KR Kind code of ref document: A |
|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 12847913 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
REG | Reference to national code |
Ref country code: BR Ref legal event code: B01A Ref document number: 112013017322 Country of ref document: BR |
|
ENP | Entry into the national phase |
Ref document number: 112013017322 Country of ref document: BR Kind code of ref document: A2 Effective date: 20130704 |