CN110073657B - Method for transmitting data relating to a three-dimensional image - Google Patents

Method for transmitting data relating to a three-dimensional image

Info

Publication number
CN110073657B
CN110073657B (application CN201780077783.5A)
Authority
CN
China
Prior art date
Legal status
Active
Application number
CN201780077783.5A
Other languages
Chinese (zh)
Other versions
CN110073657A (en)
Inventor
E.伊普
崔秉斗
宋在涓
Current Assignee
Samsung Electronics Co Ltd
Original Assignee
Samsung Electronics Co Ltd
Priority date
Filing date
Publication date
Application filed by Samsung Electronics Co Ltd
Publication of CN110073657A
Application granted
Publication of CN110073657B

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/10Processing, recording or transmission of stereoscopic or multi-view image signals
    • H04N13/106Processing image signals
    • H04N13/172Processing image signals image signals comprising non-image signal components, e.g. headers or format information
    • H04N13/178Metadata, e.g. disparity information
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T19/00Manipulating 3D models or images for computer graphics
    • G06T19/006Mixed reality
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00Stereoscopic video systems; Multi-view video systems; Details thereof
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/10Processing, recording or transmission of stereoscopic or multi-view image signals
    • H04N13/106Processing image signals
    • H04N13/172Processing image signals image signals comprising non-image signal components, e.g. headers or format information
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/10Processing, recording or transmission of stereoscopic or multi-view image signals
    • H04N13/194Transmission of image signals
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/30Image reproducers
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/30Image reproducers
    • H04N13/363Image reproducers using image projection screens
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/236Assembling of a multiplex stream, e.g. transport stream, by combining a video stream with other content or additional data, e.g. inserting a URL [Uniform Resource Locator] into a video stream, multiplexing software data into a video stream; Remultiplexing of multiplex streams; Insertion of stuffing bits into the multiplex stream, e.g. to obtain a constant bit-rate; Assembling of a packetised elementary stream
    • H04N21/23614Multiplexing of additional data and video streams
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/81Monomedia components thereof
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/81Monomedia components thereof
    • H04N21/816Monomedia components thereof involving special video data, e.g 3D video
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/10Processing, recording or transmission of stereoscopic or multi-view image signals
    • H04N13/106Processing image signals
    • H04N13/139Format conversion, e.g. of frame-rate or size

Abstract

A method of displaying a three-dimensional (3D) image by a device is provided. The method comprises the following steps: sending information about a viewport of the device to a server; receiving, from the server, data on at least one second region corresponding to the viewport from among a plurality of regions of a packed 2D image; and displaying the 3D image based on the received data, wherein the packed 2D image is generated by modifying or rearranging at least a portion of a plurality of regions of a 2D image projected from the 3D image, and wherein the at least one second region is identified based on an index of each of at least one first region corresponding to the viewport among a plurality of regions of the 3D image, and on information about a relationship between the index of each of the plurality of regions of the 3D image and the index of each of the plurality of regions of the packed 2D image.

Description

Method for transmitting data relating to a three-dimensional image
Technical Field
The present invention relates to a method and apparatus for transmitting data related to a three-dimensional (3D) image, and more particularly, to a method and apparatus for selectively transmitting data regarding a specific portion of a 3D image.
Background
With the development of Virtual Reality (VR) or Augmented Reality (AR) technologies, technologies related to processing and transmitting 3D images (or omnidirectional images) for display on VR or AR capable devices are also advancing.
To provide an omnidirectional image to a VR device wearer, the 3D image-related data, which contains the omnidirectional image data, may be very large in size. Therefore, transmitting 3D image-related data may overload the transmission system because of its data size. In particular, the size of the 3D image-related data may pose significant limitations on providing 3D images in real time.
Disclosure of Invention
Technical problem
One approach to reducing the amount of transmitted 3D image-related data is to transmit not the entire 3D image but only the partial area of the 3D image related to the area currently displayed, or to be displayed, on the VR device, i.e., the viewport. However, since the transmission of 3D image-related data is performed based on the 2D image projected from the 3D image, identifying the region on the 2D image corresponding to the viewport identified on the 3D image, and identifying the region of the 2D image to be transmitted, may place an additional burden on the VR system.
An object of the present invention is to provide a method and apparatus for efficiently transmitting or receiving 3D image-related data.
It is another object of the present invention to provide a method and apparatus for easily identifying at least one region to be selectively transmitted on a 2D image projected from a 3D image.
The object of the present invention is not limited to the foregoing, and other objects not mentioned will be apparent to those of ordinary skill in the art from the following description.
Technical scheme
To achieve the foregoing object, according to an embodiment of the present invention, a method for displaying a three-dimensional (3D) image by a device includes: sending information about a viewport of the device to a server; receiving, from the server, data on at least one second region corresponding to the viewport from among a plurality of regions of a packed 2D image; and displaying the 3D image based on the received data, wherein the packed 2D image is generated by modifying or rearranging at least a portion of a plurality of regions of a 2D image projected from the 3D image, and wherein the at least one second region is identified based on an index of each of at least one first region corresponding to the viewport among a plurality of regions of the 3D image, and on information about a relationship between the index of each of the plurality of regions of the 3D image and the index of each of the plurality of regions of the packed 2D image.
According to another embodiment of the present invention, an apparatus for displaying a 3D image includes: a communication interface and a processor coupled to the communication interface, wherein the processor is configured to send information about a viewport of the device to a server, receive, from the server, data about at least one second region of a plurality of regions of a packed 2D image corresponding to the viewport, and display the 3D image based on the received data, wherein the packed 2D image is generated by modifying or rearranging at least a portion of a plurality of regions of a 2D image projected from the 3D image, and wherein the at least one second region is identified based on an index of each of at least one first region corresponding to the viewport among a plurality of regions of the 3D image, and on information about a relationship between the index of each of the plurality of regions of the 3D image and the index of each of the plurality of regions of the packed 2D image.
According to still another embodiment of the present invention, a method for transmitting data of a 3D image by a server includes: receiving, from a device, information about a viewport of the device, identifying at least one second region corresponding to the viewport among a plurality of regions of a packed 2D image, and sending data about the at least one second region to the device, wherein the packed 2D image is generated by modifying or rearranging at least a part of a plurality of regions of a 2D image projected from the 3D image, and wherein the at least one second region is identified based on an index of each of at least one first region corresponding to the viewport among a plurality of regions of the 3D image, and on information about a relationship between the index of each of the plurality of regions of the 3D image and the index of each of the plurality of regions of the packed 2D image.
According to still another embodiment of the present invention, a server for transmitting data of a three-dimensional (3D) image includes: a communication interface and a processor coupled to the communication interface, wherein the processor is configured to receive, from a device, information about a viewport of the device, identify at least one second region of a plurality of regions of a packed 2D image corresponding to the viewport, and send data about the at least one second region to the device, wherein the packed 2D image is generated by modifying or rearranging at least a portion of a plurality of regions of a 2D image projected from the 3D image, and wherein the at least one second region is identified based on an index of each of at least one first region corresponding to the viewport among a plurality of regions of the 3D image, and on information about a relationship between the index of each of the plurality of regions of the 3D image and the index of each of the plurality of regions of the packed 2D image.
The details of other embodiments are set forth in the detailed description and the accompanying drawings.
Advantageous technical effects
The embodiment of the present invention exhibits at least the following effects.
That is, 3D image-related data can be efficiently provided.
Further, at least one region to be selectively transmitted on the 2D image projected from the 3D image can be easily identified.
The effects of the present invention are not limited thereto, and the present invention encompasses various other effects.
Drawings
FIG. 1 shows a system of a transmitter for transmitting 3D image-related data according to an embodiment of the invention;
FIG. 2 illustrates projecting a 3D image into a 2D image and packing the projected 2D image according to an embodiment of the invention;
FIG. 3 shows a system of a receiver for receiving 3D image-related data according to an embodiment of the invention;
FIG. 4 shows a viewport on a 3D image segmented into regions, in accordance with an embodiment of the invention;
FIG. 5 illustrates a 2D image projected from the 3D image of FIG. 4 in an equirectangular projection (ERP) scheme;
FIG. 6 shows a 2D image packed from the ERP-projected 2D image of FIG. 5;
FIG. 7 shows a 2D image projected from the 3D image of FIG. 4 in an octahedral projection (OHP) scheme;
FIG. 8 shows a 2D image packed from the OHP-projected 2D image of FIG. 7;
FIG. 9 shows a group of regions of a 3D image according to an embodiment of the invention;
FIG. 10 is a flowchart illustrating the operation of a receiver according to an embodiment of the present invention;
FIG. 11 is a flowchart illustrating the operation of a device and a server according to another embodiment of the present invention;
FIG. 12 is a flowchart illustrating the operation of a device and a server according to yet another embodiment of the present invention;
FIG. 13 is a flowchart illustrating the operation of a device and a server according to yet another embodiment of the present invention;
FIG. 14 illustrates an example method of representing coordinates of a particular point on a spherical 3D image; and
FIG. 15 is a block diagram illustrating a receiver according to an embodiment of the present invention.
Detailed Description
Advantages and features of the present invention, and methods for accomplishing the same, may be understood from the following description of embodiments taken in conjunction with the accompanying drawings. However, the present invention is not limited to the embodiments disclosed herein, and various changes may be made to them. The embodiments disclosed herein are provided merely so that this disclosure fully conveys the scope of the present invention to those of ordinary skill in the art. The invention is limited only by the appended claims.
Although the terms "first" and "second" are used to describe various components, these components are not limited by these terms. These terms are provided merely to distinguish one element from another. Therefore, within the technical spirit of the present invention, the first component mentioned herein may also be the second component.
Fig. 1 shows a system of a transmitter for transmitting 3D image related data according to an embodiment of the invention.
Fig. 1 shows a system of transmitters according to an embodiment of the invention. The transmitter may be a server for providing 3D image related data or services. Here, the 3D image may refer to both a dynamic image and a static image. The transmitter may generate or receive a 3D image (110). The transmitter may generate a 3D image by stitching images captured by the plurality of cameras in various directions. The transmitter may receive data on the 3D image that has been generated externally. The 3D image may be in the shape of any of a sphere, cube, cylinder, or octahedron. However, the enumerated shapes of the 3D image are merely examples, and various shapes of the 3D image available in the related art may be generated or received.
The transmitter may project the 3D image into a 2D image (120). Any of various projection schemes available in the related art, such as equirectangular projection (ERP), octahedral projection, cylindrical projection, and cubic projection, may be used to project the 3D image into the 2D image.
The transmitter may package (130) the projected 2D image. Packing may refer to generating a new 2D image (i.e., a packed 2D image) by modifying and/or rearranging at least some of the plurality of regions that make up the projected 2D image. Here, modifying a region may refer to, for example, resizing, transforming, rotating, and/or resampling the region (e.g., upsampling, downsampling, and differentially sampling depending on the location in the region).
The projection 120 and packing 130 are described in more detail below with reference to fig. 2. Fig. 2 illustrates projecting a 3D image into a 2D image and packing the projected 2D image according to an embodiment of the present invention. In fig. 2, an example 3D image 210 may be spherical. In an example ERP scheme, the projected 2D image 220 may be generated by projecting the 3D image 210. The projected 2D image 220 may be segmented into a plurality of regions 221, 222, 223, and 224. According to embodiments, there may be various methods of segmenting the projected 2D image 220.
A packed 2D image 230 may be generated from the projected 2D image 220. The packed 2D image 230 may be generated by modifying and/or rearranging the plurality of regions 221, 222, 223, and 224 of the projected 2D image 220. The plurality of regions 231, 232, 233, and 234 of the packed 2D image 230 may sequentially correspond to the plurality of regions 221, 222, 223, and 224 of the projected 2D image 220, respectively. The modification and rearrangement of the plurality of regions 231, 232, 233, and 234 of the packed 2D image 230 shown in fig. 2 are merely examples, and various types of modification and rearrangement may be performed according to the embodiment.
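The modification and rearrangement of regions described above can be sketched in code. The following Python fragment is a hypothetical illustration of region-wise packing, not the patent's implementation; the region rectangles, placements, and nearest-neighbour resampling are assumptions chosen for simplicity.

```python
import numpy as np

def pack_regions(projected, regions, placements):
    """Build a packed 2D image by resizing/relocating regions of a projected image.

    projected:  H x W array (e.g., the ERP-projected image)
    regions:    list of (top, left, height, width) source rectangles
    placements: list of (dst_top, dst_left, scale) destinations; resizing is
                done by nearest-neighbour resampling for simplicity
    """
    packed = np.zeros_like(projected)
    for (t, l, h, w), (dt, dl, s) in zip(regions, placements):
        src = projected[t:t + h, l:l + w]
        dh, dw = int(h * s), int(w * s)
        # Nearest-neighbour resample of the region to its packed size.
        rows = (np.arange(dh) / s).astype(int)
        cols = (np.arange(dw) / s).astype(int)
        packed[dt:dt + dh, dl:dl + dw] = src[rows][:, cols]
    return packed
```

For example, packing a 4x4 region of an 8x8 projected image at half scale places a 2x2 downsampled copy at the top-left of the packed image, leaving the rest of the area empty.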
Referring back to fig. 1, the transmitter may encode the packed 2D image (140). The packed 2D image may be segmented into a plurality of regions. The plurality of regions of the packed 2D image may be encoded separately. In some embodiments, encoding may be performed only on one or more regions to be transmitted among the plurality of regions of the packed 2D image. In some embodiments, encoding may be performed on a group image of two or more regions among the plurality of regions of the packed 2D image. In some embodiments, the entire packed 2D image may be encoded. The encoding may be performed using a known 2D image encoding scheme.
The transmitter may encapsulate the encoded data (150). Encapsulation may refer to processing the encoded data to conform to a predetermined transmission protocol by, for example, segmenting the encoded data and processing (e.g., adding a header to the segmentation). The transmitter may transmit the encapsulated data.
The receiver is described below with reference to fig. 3. Fig. 3 shows a system of a receiver for receiving 3D image related data according to an embodiment of the invention. The receiver may be a VR device or an AR device. More generally, the receiver may be any type of device capable of receiving and playing back 3D image-related data.
The receiver may receive 3D image related data from the transmitter. The receiver may decapsulate the received data (310). The encoded data generated by the encoding 140 of fig. 1 may be obtained by decapsulation 310.
The receiver may decode the decapsulated (310) data (320). The packed 2D image may be restored by decoding (320).
The receiver may unpack (330) the decoded data (i.e., the packed 2D image). The 2D image generated by the projection 120 of fig. 1 may be recovered by unpacking. The unpacking may be an inverse transformation of the modification and/or rearrangement of the plurality of regions of the projected 2D image performed in the packing 130 of fig. 1. For this purpose, the receiver needs to know the method of the packing 130. The method of the packing 130 may be predetermined between the receiver and the transmitter. In some embodiments, the transmitter may convey information about the method of the packing 130 to the receiver through a separate message (such as metadata). In some embodiments, the transmission data generated via the encapsulation 150 may include information about the method of the packing 130, for example, in a header.
The receiver may project the unpacked 2D image into a 3D image (340). To do so, the receiver may use the inverse of the projection (120) of fig. 1 used to generate the 2D image, but is not necessarily limited thereto. The receiver may project the unpacked 2D image into a 3D image, thereby generating a 3D image.
The receiver may display at least a portion of the 3D image via a display device (350). For example, the receiver may extract only data corresponding to a current field-of-view (FOV) of the 3D image and render the data.
In some embodiments, only data regarding a portion of the 3D image may be transmitted to reduce the 3D image-related data transmission load. For example, the transmitter may segment the packed 2D image into a plurality of regions and transmit, among the plurality of regions of the packed 2D image, only the data of the one or more regions containing the viewport of the receiver. Here, the plurality of regions into which the packed 2D image is divided for transmission may be set independently of the plurality of regions into which the projected 2D image is divided for packing. In this case, identifying, among the plurality of regions of the packed 2D image segmented for transmission, a region corresponding to a viewport on the 3D image may increase the computational burden on the processor of the receiver. Therefore, there is a need for a method of identifying, in a simplified manner, the regions corresponding to a viewport on the 3D image. A method of identifying regions corresponding to a viewport on a 3D image according to an embodiment of the present invention is described below with reference to figs. 4-6.
Fig. 4 shows a viewport on a 3D image segmented into a plurality of regions according to an embodiment of the invention. The example 3D image 400 may be presented in a spherical shape. The 3D image 400 may be divided into a plurality of regions. The plurality of regions of the 3D image 400 may each be divided to have a predetermined latitude angle range and a predetermined longitude angle range, but are not necessarily limited thereto. In the example of fig. 4, each of the plurality of regions of the 3D image 400 has been set to have a longitude angle range of 45 degrees and a latitude angle range of 30 degrees. An index may be assigned to each of the plurality of regions of the 3D image 400. The index of each of the plurality of regions may be represented in a form such as [x, y], where x and y are the row and column, respectively, of the region in the matrix formed by the regions, but is not necessarily limited thereto. The viewport 420 may be positioned in the first to sixth regions 411, 412, 413, 414, 415, and 416 among the plurality of regions of the 3D image 400. The indexes of the first to sixth regions of the 3D image 400 may be [0, 1], [1, 1], [2, 1], [0, 2], [1, 2], and [2, 2], respectively.
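The mapping from a viewport's angular extent to region indexes lends itself to a simple computation. The sketch below is a hypothetical illustration, assuming (per the example of fig. 4) 45-degree longitude bins indexed by x and 30-degree latitude bins indexed by y; the function name and argument convention are not from the patent.

```python
def regions_for_viewport(lon_range, lat_range, lon_step=45, lat_step=30):
    """Return the [x, y] indexes of every region overlapped by a viewport.

    lon_range, lat_range: (start, end) angles in degrees, with longitude
    in [0, 360) and latitude in [0, 180). Assumes the viewport does not
    wrap around the 0/360-degree seam.
    """
    lon0, lon1 = lon_range
    lat0, lat1 = lat_range
    # Subtract a tiny epsilon so an end angle on a bin boundary does not
    # pull in the next (non-overlapped) bin.
    xs = range(int(lon0 // lon_step), int((lon1 - 1e-9) // lon_step) + 1)
    ys = range(int(lat0 // lat_step), int((lat1 - 1e-9) // lat_step) + 1)
    return [[x, y] for y in ys for x in xs]
```

With these assumptions, a viewport spanning longitudes 10 to 120 degrees and latitudes 40 to 80 degrees covers exactly the six regions [0, 1], [1, 1], [2, 1], [0, 2], [1, 2], and [2, 2] of fig. 4.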
Fig. 5 illustrates a 2D image projected from the 3D image of fig. 4 in an ERP scheme. The projected 2D image 500 may be segmented into a plurality of regions corresponding to the plurality of regions of the 3D image 400. In the case where the plurality of areas of the 3D image 400 have the same latitude angle range and the same longitude angle range as described in connection with fig. 4, the plurality of areas of the 2D image 500 projected in the ERP scheme corresponding to the plurality of areas of the 3D image 400 may be rectangles of the same size. Regions 511, 512, 513, 514, 515, and 516 of the plurality of regions including a region 517 corresponding to a viewport on the projected 2D image 500 may correspond to the first to sixth regions 411, 412, 413, 414, 415, and 416 of the 3D image 400.
Fig. 6 shows a 2D image packed from the ERP-projected 2D image of fig. 5. The packed 2D image 600 may be generated from the projected 2D image 500 in any packing scheme. The regions of the projected 2D image 500 that are segmented for packing need not be the same as the regions of the projected 2D image 500 shown in fig. 5. For transmission purposes, the packed 2D image 600 may be divided into a plurality of regions, and fig. 6 shows an example in which the packed 2D image 600 is divided into eight regions. An index may be assigned to each of the regions of the packed 2D image 600. In the example of fig. 6, the regions are indexed 1 through 8. The regions 611, 612, 613, 615, and 616 of the packed 2D image 600 that correspond to the regions 411, 412, 413, 414, 415, and 416 of the 3D image 400 including the viewport 420 (i.e., corresponding to the regions 511, 512, 513, 514, 515, and 516 of the projected 2D image 500) may be included in the regions 631, 632, and 633 indexed as 2, 7, and 8 among the regions segmented for transmission purposes. The regions 620a and 620b in the packed 2D image 600 corresponding to the viewport 420 may also be included in the regions 631, 632, and 633 indexed as 2, 7, and 8 among the regions segmented for transmission purposes. Accordingly, the transmitter may transmit the data required by the receiver to display the viewport 420 by transmitting the data related to the regions 631, 632, and 633 indexed as 2, 7, and 8 among the plurality of regions of the packed 2D image 600. The transmitter or receiver may know the relationship between the plurality of regions of the 3D image 400 and the plurality of regions of the packed 2D image 600. Accordingly, the transmitter or the receiver may identify the corresponding regions of the packed 2D image 600 from the respective indexes of the plurality of regions of the 3D image 400 without complicated calculation.
For example, a look-up table (LUT) as shown in table 1 may be used to identify a region of the packed 2D image 600 corresponding to a region of the 3D image 400.
[Table 1]
[The table is rendered as an image in the original document; it is a look-up table mapping each region index of the 3D image 400 to the index(es) of the corresponding region(s) of the packed 2D image 600.]
Such a LUT enables easier identification of areas 631, 632, 633 (which have been indexed to 2, 7, and 8) on the packed 2D image 600, which correspond to areas 411, 412, 413, 414, 415, and 416 (which have been indexed to [0, 1], [1, 1], [2, 1], [0, 2], [1, 2], and [2, 2]) on the 3D image 400.
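Such a LUT can be represented as a plain mapping from 3D-image region indexes to packed-2D-image region indexes. In the sketch below, the per-region entries are hypothetical (Table 1 itself is reproduced only as an image in the original); they are chosen merely so that their union over the six viewport regions is {2, 7, 8}, as in the example of figs. 4-6.

```python
# Hypothetical LUT entries: 3D-image region index (x, y) -> set of
# packed-2D-image region indexes. Only the six regions of the fig. 4
# example are filled in, and the per-region values are assumptions.
REGION_LUT = {
    (0, 1): {2}, (1, 1): {2}, (2, 1): {2},
    (0, 2): {7}, (1, 2): {7, 8}, (2, 2): {8},
}

def packed_regions_for(viewport_regions, lut):
    """Union of packed-2D-image region indexes for the given 3D-image regions."""
    out = set()
    for idx in viewport_regions:
        out |= lut[tuple(idx)]
    return out
```

A single dictionary lookup per region replaces any geometric computation on the sphere, which is the point of signaling the relationship as a LUT.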
Thus, identifying a region on a packed 2D image corresponding to a region on a 3D image using an index may be applied not only where the 3D image is projected into the 2D image in the ERP scheme but also where the projection is performed in other schemes. An embodiment of projecting a 3D image into a 2D image using the octahedral projection (OHP) scheme is described below in conjunction with figs. 7 and 8.
Fig. 7 shows a 2D image projected from the 3D image of fig. 4 in the OHP scheme. Regions 711, 712, 713, 714, 715, and 716 in the OHP-projected 2D image 700 may correspond to the regions 411, 412, 413, 414, 415, and 416 of the 3D image 400 that include the viewport 420. The regions 720a and 720b corresponding to the viewport 420 in the OHP-projected 2D image 700 may be included in the regions 711, 712, 713, 714, 715, and 716.
Fig. 8 shows a 2D image packed from the OHP-projected 2D image of fig. 7. The packed 2D image 800 may be generated from the projected 2D image 700 of fig. 7 in any packing scheme. For transmission purposes, the packed 2D image 800 may be segmented into a plurality of regions, e.g., four regions indexed 1 through 4. The regions 711, 712, 713, 714, 715, and 716 of the projected 2D image 700 of fig. 7 may be included in the region 831 indexed as 2 and the region 832 indexed as 3 among the regions of the packed 2D image 800. The regions corresponding to the viewport 420 may likewise be placed in the region 831 indexed as 2 and the region 832 indexed as 3 among the regions of the packed 2D image 800. Accordingly, the transmitter may transmit the data about the region 831 indexed as 2 and the region 832 indexed as 3 among the regions of the packed 2D image 800 to transmit the data about the viewport 420. Similar to figs. 5 and 6, the transmitter or the receiver may know the relationship between the plurality of regions of the 3D image 400 and the regions of the packed 2D image 800, and may accordingly identify the corresponding regions of the packed 2D image 800 from the region indexes of the 3D image 400. A LUT similar to Table 1 may be used to easily identify the relationship between the plurality of regions of the 3D image 400 and the regions of the packed 2D image 800.
In some embodiments, the receiver needs to know the scheme by which the 3D image is segmented into the plurality of regions. The transmitter may notify the receiver in advance of the method of segmenting the 3D image into the plurality of regions. In some embodiments, information about the method of segmenting the 3D image into the plurality of regions may be transmitted as metadata. The transmitter may signal the method of dividing the 3D image into the plurality of regions through, for example, the syntax shown in Table 2 below.
[Table 2]
[The syntax table is rendered as an image in the original document; the fields it defines are described below.]
The semantics of the syntax of table 2 are as follows.
sphere _ tile _ groups-a parameter defining the number of sphere groups (i.e. regions of the 3D image), including video data in which the surface of a sphere (i.e. the spherical 3D image) is segmented
sphere _ tile _ group _ ID-a parameter for defining an identifier of a group of sphere blocks, including video data in which the surface of the sphere is segmented
hor active range start/end-parameter indicating the horizontal extent (i.e. longitude) of the group of spherical blocks given by the start angle and the end angle in the direction defined by θ
vert _ active _ range _ start/end-for indicating the time at which the message is sent
Figure BDA0002095104360000101
The start angle and the end angle in a defined direction give a parameter of the vertical extent (i.e. latitude) of the set of spherical blocks
sphere_tiles_enabled_flag - a flag indicating whether the spherical tile group is additionally divided into an arrangement of spherical tiles
num_sphere_tile_columns - a parameter specifying the number of spherical tile columns in the spherical tile group
num_sphere_tile_rows - a parameter specifying the number of spherical tile rows in the spherical tile group
uniform_spacing_flag - a flag indicating, when equal to 1, that the spherical tile column and row boundaries are distributed uniformly over the picture and, when equal to 0, that the spherical tile boundaries are specified by column_width_angle and row_height_angle
column_width_angle[i] - a parameter specifying the width (i.e., longitude) of the i-th spherical tile column in the spherical tile group, in degrees, in the direction defined by θ
row_height_angle[i] - a parameter specifying the height (i.e., latitude) of the i-th spherical tile row in the spherical tile group, in degrees, in the direction defined by φ
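By way of illustration, the Table 2 parameters could be modeled as the following sketch. The class and method names are not part of the patent; this is a minimal sketch assuming angles in degrees and the uniform_spacing_flag semantics described above.

```python
from dataclasses import dataclass, field

@dataclass
class SphereTileGroup:
    """One spherical tile group, mirroring the Table 2 syntax parameters."""
    group_id: int
    hor_active_range_start: float   # longitude start, degrees (theta direction)
    hor_active_range_end: float     # longitude end, degrees
    vert_active_range_start: float  # latitude start, degrees
    vert_active_range_end: float    # latitude end, degrees
    uniform_spacing: bool = True    # uniform_spacing_flag
    column_width_angles: list = field(default_factory=list)  # used when non-uniform
    row_height_angles: list = field(default_factory=list)    # used when non-uniform
    num_sphere_tile_columns: int = 1
    num_sphere_tile_rows: int = 1

    def tile_angles(self):
        """Return (column widths, row heights) in degrees, expanding the
        uniform_spacing_flag == 1 case into equal-sized tiles."""
        hor_span = self.hor_active_range_end - self.hor_active_range_start
        vert_span = self.vert_active_range_end - self.vert_active_range_start
        if self.uniform_spacing:
            cols = [hor_span / self.num_sphere_tile_columns] * self.num_sphere_tile_columns
            rows = [vert_span / self.num_sphere_tile_rows] * self.num_sphere_tile_rows
            return cols, rows
        return self.column_width_angles, self.row_height_angles
```

For example, a group spanning 180 degrees of longitude and 90 degrees of latitude with four uniform columns and three uniform rows yields the 12 regions per group described for fig. 9.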
The syntax of table 2 is an example for the case where the 3D image is spherical. The syntax example of table 2 also illustrates grouping the regions of the 3D image for notification. Grouping the regions of a 3D image is described with reference to fig. 9. Fig. 9 illustrates groups of a plurality of regions of a 3D image according to an embodiment of the present invention. Referring to fig. 9, the plurality of regions on the 3D image 900 may be divided into four groups 910, 920, 930, and 940. The regions corresponding to each of the groups 910, 920, 930, and 940 of the 3D image 900 may be set to have a longitude angle range of 180 degrees and a latitude angle range of 90 degrees. The 3D image 900 may be divided into a plurality of regions in a scheme similar to that of the 3D image 400 of fig. 4. Each of the groups 910, 920, 930, and 940 of the 3D image 900 may include 12 regions.
In some embodiments, rather than grouping the plurality of regions, the method of segmenting the 3D image into the plurality of regions may represent a set of respective pieces of information for the plurality of regions.
Fig. 10 is a flowchart illustrating an operation of a receiver according to an embodiment of the present invention. Referring to fig. 10, the receiver may send information related to the viewport to the server (1010). The viewport-related information may be information directly indicating the placement of the viewport on the 3D image, information including the index of each of at least one region including the viewport among the plurality of regions of the 3D image, or information including the index of each of at least one region of the packed 2D image corresponding to the at least one region including the viewport, among the plurality of regions of the packed 2D image.
The receiver may receive data corresponding to at least one region corresponding to the viewport among the plurality of regions of the packed 2D image from the server (1020).
The receiver may display a 3D image based on the received data (1030). The receiver may display only the region of the 3D image corresponding to the viewport.
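The three alternative forms of viewport-related information described for step 1010 could be encoded, for example, as follows. The field names and values are invented for illustration only; the packed-region indices 2, 7, and 8 and the 3D-image region indices 411-416 echo the examples used elsewhere in this description.

```python
# Three hypothetical encodings of viewport-related information (step 1010):
# a direct placement description, a list of 3D-image region indices, and a
# list of packed-2D-image region indices.
viewport_info_examples = [
    {"type": "placement", "center": (0.0, 90.0),     # (latitude, longitude) in degrees
     "hor_range": 120.0, "vert_range": 90.0},        # angular extents from sphere center
    {"type": "3d_region_ids", "indices": [411, 412, 413, 414, 415, 416]},
    {"type": "packed_region_ids", "indices": [2, 7, 8]},
]
```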
The operations of the receiver and the transmitter according to various embodiments are described below with reference to figs. 11 to 13.
Fig. 11 is a flowchart illustrating operations of a device and a server according to another embodiment of the present invention. The device 1110 and the server 1120 may correspond to the receiver and the transmitter, respectively. In the embodiment of fig. 11, the device 1110 may know information about a relationship between indexes of a plurality of regions of a 3D image and indexes of a plurality of regions of a packed 2D image.
The server 1120 may send the region segmentation information to the device 1110 (1130). The region segmentation information may include information on indexes of a plurality of regions and a method of segmenting the 3D image into the plurality of regions. The region segmentation information may be transmitted in the form of a syntax as in table 2.
The server 1120 may transmit, to the device 1110, a LUT including information about the relationship between the indexes of the plurality of regions of the 3D image and the indexes of the plurality of regions of the packed 2D image (1140). In some embodiments, information regarding this relationship may be transmitted in a form other than a LUT. In some embodiments, the device 1110 may receive the LUT from a server other than the server 1120 or from another device. The LUT-related information may be transmitted as metadata. The LUT-related information may be expressed by the example syntax shown in table 3.
[ Table 3]
[The syntax of table 3 appears as an image in the source document.]
The semantics of the syntax parameters of table 3 are as follows.
num_HEVC_tiles - a parameter indicating the number of High Efficiency Video Coding (HEVC) tiles (i.e., regions of the projected 2D image) in which the image is encoded
HEVC_tile_column_index - a parameter indicating the index of a particular HEVC tile column
HEVC_tile_row_index - a parameter indicating the index of a particular HEVC tile row
num_spherical_tiles - a parameter specifying the number of spherical tiles that contribute video data to an HEVC tile (i.e., a parameter specifying the number of spherical tiles associated with a particular HEVC tile)
spherical_tile_column_index - a parameter indicating the index of a particular spherical tile column associated with the HEVC tile
spherical_tile_row_index - a parameter indicating the index of a particular spherical tile row associated with the HEVC tile
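The Table 3 relationship can be pictured as a mapping from each HEVC tile to the spherical tiles that contribute video data to it; inverting that mapping gives the device a lookup from a spherical tile index to the packed-2D tile to request. The concrete tile indices below are invented for illustration and do not come from the patent.

```python
# Hypothetical Table 3 content: each HEVC tile (col, row) of the packed 2D
# image lists the spherical tiles (col, row) of the 3D image it carries.
hevc_to_spherical = {
    (0, 0): [(0, 0), (1, 0)],  # one HEVC tile may carry several spherical tiles
    (1, 0): [(2, 0)],
    (0, 1): [(0, 1)],
}

def spherical_to_hevc_lut(table):
    """Invert the HEVC-tile -> spherical-tiles mapping so a device can go
    from a spherical tile (identified from the viewport) to the HEVC tile
    of the packed 2D image that must be requested."""
    lut = {}
    for hevc_idx, spherical_tiles in table.items():
        for s_idx in spherical_tiles:
            lut[s_idx] = hevc_idx
    return lut
```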
The device 1110 may identify at least one first region corresponding to the viewport among a plurality of regions of the 3D image (1140). In the example of fig. 4, the areas 411, 412, 413, 414, 415, and 416 may be identified as at least one first area.
The device 1110 may identify an index of each of the at least one second region corresponding to the at least one first region among the plurality of regions of the packed 2D image based on the LUT (1145). In the example of fig. 6, indices 2, 7, and 8 may be identified as respective indices of the at least one second region.
The device 1110 may send, to the server 1120, information related to the viewport including the index of each of the at least one second region.
The server 1120 may identify the at least one second region based on the respective indexes of the at least one second region. Accordingly, the server 1120 may transmit data regarding the at least one second region to the device 1110.
The device 1110 may display at least a portion of the 3D image (e.g., the viewport) based on the received data regarding the at least one second region.
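The device-side mapping step of fig. 11 (identifying the second regions from the first regions via the LUT) could be sketched as below. The concrete LUT is hypothetical, chosen so that the fig. 4/fig. 6 example regions 411-416 map onto the packed-2D indices 2, 7, and 8 mentioned above; the function name is not from the patent.

```python
# Hypothetical LUT: 3D-image region index -> packed-2D-image region index,
# consistent with the fig. 4 / fig. 6 example in the description.
example_lut = {411: 2, 412: 2, 413: 7, 414: 7, 415: 8, 416: 8}

def second_region_indices(first_regions, lut):
    """Map the 3D-image regions covering the viewport to the de-duplicated
    packed-2D region indexes to request, preserving encounter order."""
    seen, out = set(), []
    for region in first_regions:
        idx = lut[region]
        if idx not in seen:
            seen.add(idx)
            out.append(idx)
    return out
```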
Fig. 12 is a flowchart illustrating operations of a device and a server according to still another embodiment of the present invention. In the embodiment of fig. 12, the device 1210 may know a method of segmenting the 3D image into a plurality of regions and information on indexes of the plurality of regions.
The server 1220 may send region segmentation information to the device 1210 (1230). The region segmentation information may include information on the indexes of the plurality of regions and the method of segmenting the 3D image into the plurality of regions. The region segmentation information may be transmitted in the form of a syntax as in table 2.
The device 1210 may identify at least one first region corresponding to the viewport among a plurality of regions of the 3D image (1235).
The device 1210 may send, to the server 1220, information related to the viewport including the respective indexes of the at least one first region. The respective indexes of the at least one first region may be identified based on the region segmentation information.
The server 1220 may identify, based on the LUT, at least one second region corresponding to the at least one first region among the plurality of regions of the packed 2D image according to the respective indexes of the at least one first region (1245). Here, the LUT may be in the form of information indicating the relationship between the indexes of the plurality of regions of the 3D image and the indexes of the plurality of regions of the packed 2D image, and in some embodiments, such information may not have the form of a LUT.
The server 1220 may transmit data regarding the identified at least one second region to the device 1210 (1250).
Device 1210 may display a 3D image based on the received data (1255).
Fig. 13 is a flowchart illustrating operations of a device and a server according to still another embodiment of the present invention. In the embodiment of fig. 13, device 1310 may not obtain region segmentation information and LUT related information.
Device 1310 may send information to server 1320 regarding the placement of the viewport on the 3D image. The information about the placement of the viewport may directly indicate the location and the area in which the viewport is arranged on the 3D image. For example, where the 3D image is shaped as a sphere, the information about the placement of the viewport may be represented using coordinates on the surface of the sphere. Referring to fig. 14, which illustrates a method of representing the coordinates of a specific point on a spherical 3D image, the position of a point P on the surface of the sphere in a 3D coordinate system may be represented by the radius r of the sphere, the latitude θ, and the longitude φ. Since the radius r of the spherical 3D image is known to both the server 1320 and the device 1310, the device 1310 may notify the server of a particular point on the 3D image through the latitude θ and the longitude φ. The device 1310 may use various methods for representing the placement of the viewport. In some embodiments, the device 1310 may represent the placement of the viewport by the coordinates of the viewport edges. In some embodiments, the device 1310 may represent the placement of the viewport using the coordinates of the viewport edges and the coordinates of points on the viewport boundary between the edges. In some embodiments, the device 1310 may represent the placement of the viewport using the coordinates of the viewport center, a value indicating the rotation of the viewport about its center, and values indicating the angular extent of the viewport from the center of the sphere (e.g., a vertical angular range and a horizontal angular range based on the value indicating the viewport rotation). The above-described methods for representing viewport placement are merely examples, and the device 1310 may use various methods to represent the placement of the viewport on the 3D image.
Referring back to fig. 13, the server 1320 may identify, based on the information regarding the placement of the viewport, at least one first region corresponding to the viewport among the plurality of regions of the 3D image (1330). The server may identify the at least one first region using the region segmentation information described above in connection with figs. 11 and 12.
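Server-side identification of the first regions (step 1330) can be pictured as an angular-interval overlap test between the viewport's extents and each region's extents from the region segmentation information. The sketch below is an assumption about one plausible realization; it ignores longitude wrap-around for brevity, and the region table is invented for illustration.

```python
def regions_for_viewport(regions, vp_lon_range, vp_lat_range):
    """Return the indexes of the 3D-image regions whose longitude/latitude
    extents overlap the viewport's angular ranges. `regions` maps a region
    index to ((lon_start, lon_end), (lat_start, lat_end)) in degrees."""
    def overlaps(a, b):
        # open-interval overlap test for two angular ranges
        return a[0] < b[1] and b[0] < a[1]
    return [idx for idx, (lon, lat) in regions.items()
            if overlaps(lon, vp_lon_range) and overlaps(lat, vp_lat_range)]
```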
The server 1320 may identify at least one second region corresponding to the at least one first region based on the LUT (1335). In other words, the server 1320 may identify the at least one second region by obtaining the identifier of each of the at least one second region corresponding to the identifier of each of the at least one first region from the LUT.
The server 1320 may transmit data regarding the identified at least one second region to the device 1310 (1340).
The device 1310 may display a 3D image based on the received data (1345).
The structure of the receiver is described below with reference to fig. 15. Fig. 15 is a block diagram illustrating a receiver according to an embodiment of the present invention. The receiver 1500 may include a memory 1510, a communication interface 1520, and a processor 1530. The receiver 1500 may be configured to perform the operations of the receiver (i.e., the device) described above in connection with the embodiments. The processor 1530 may be communicatively and electrically connected to the memory 1510 and the communication interface 1520. The receiver 1500 may transmit and receive data through the communication interface 1520. The memory 1510 may store various pieces of information for the operation of the receiver 1500. The memory 1510 may store commands or code for controlling the processor 1530. Additionally, the memory 1510 may store temporary or non-temporary data required for the operation of the processor 1530. The processor 1530 may be one processor, and in some embodiments, the processor 1530 may refer to a collection of multiple processors divided by function. The processor 1530 may be configured to control the operation of the receiver 1500. The above-described operations of the receiver 1500 may be substantially processed and performed by the processor 1530. Although transmission and reception of signals are performed through the communication interface 1520 and storage of data and commands is performed by the memory 1510, the operations of the communication interface 1520 and the memory 1510 may be controlled by the processor 1530, and thus transmission and reception of signals and storage of data and commands may also be considered to be performed by the processor 1530. Although not shown, the receiver 1500 may further include a display device for displaying the 3D image.
Similar to the receiver 1500, the transmitter may include a memory, a communication interface, and a processor. The description of the memory, communication interface, and processor of the transmitter is similar to the description of the corresponding elements of the receiver 1500.
Although embodiments of the present invention have been described with reference to the accompanying drawings, it will be understood by those of ordinary skill in the art that the present invention may be embodied in various other specific forms without departing from the spirit or technical scope of the present invention. Therefore, it should be noted that the above-described embodiments are provided as examples and should not be construed as limiting.

Claims (8)

1. A method of displaying a three-dimensional (3D) image by a device, the method comprising:
receiving information on a relationship between indexes of a plurality of regions of the 3D image and indexes of a plurality of regions of the packed 2D image from the server;
identifying at least one first region comprising a viewport of the device among a plurality of regions of the 3D image;
identifying an index of each of at least one second region corresponding to the at least one first region among the plurality of regions of the packed 2D image based on the received information on the relationship between the indexes of the plurality of regions of the 3D image and the indexes of the plurality of regions of the packed 2D image;
sending information related to a viewport of the device to the server, wherein the information related to the viewport of the device comprises an index for each of the at least one second region;
receiving data relating to the at least one second region in response to sending the information relating to the viewport of the device; and
displaying a 3D image based on the received data,
wherein the packed 2D image is generated by modifying or rearranging at least a portion of a plurality of regions of the 2D image projected from the 3D image.
2. The method of claim 1, wherein the information on the relationship between the indexes of the plurality of regions of the 3D image and the indexes of the plurality of regions of the packed 2D image has a form of a look-up table (LUT).
3. An apparatus for displaying a three-dimensional (3D) image, comprising:
a communication interface; and
a processor coupled to the communication interface and configured to,
wherein the processor is configured to:
receiving information on a relationship between indexes of a plurality of regions of the 3D image and indexes of a plurality of regions of the packed 2D image from the server;
identifying at least one first region comprising a viewport of the device among a plurality of regions of the 3D image;
identifying an index of each of at least one second region corresponding to the at least one first region among the plurality of regions of the packed 2D image based on the received information on the relationship between the indexes of the plurality of regions of the 3D image and the indexes of the plurality of regions of the packed 2D image;
sending information related to a viewport of the device to the server, wherein the information related to the viewport of the device comprises an index for each of the at least one second region;
receiving data relating to the at least one second region in response to sending the information relating to the viewport of the device; and
displaying a 3D image based on the received data,
wherein the packed 2D image is generated by modifying or rearranging at least a portion of a plurality of regions of the 2D image projected from the 3D image.
4. The apparatus of claim 3, wherein the information on the relationship between the indexes of the plurality of regions of the 3D image and the indexes of the plurality of regions of the packed 2D image has a form of a look-up table (LUT).
5. A method of transmitting data of a three-dimensional (3D) image by a server, the method comprising:
transmitting information about a relationship between indexes of the plurality of regions of the 3D image and indexes of the plurality of regions of the packed 2D image to the device;
receiving, from the device, information related to a viewport of the device, wherein the information related to the viewport of the device includes an index of each of at least one second region corresponding to the viewport among a plurality of regions of a packed 2D image;
identifying the at least one second region based on the received information about the viewport of the device; and
transmitting data regarding the at least one second area to the device,
wherein the packed 2D image is generated by modifying or rearranging at least a part of a plurality of regions of the 2D image projected from the 3D image, and
Wherein the index of the at least one second region is identified based on an index of each of the at least one first region including the viewport among the plurality of regions of the 3D image and information on a relationship between the index of the plurality of regions of the 3D image and the index of the plurality of regions of the packed 2D image.
6. The method of claim 5, wherein the information on the relationship between the indexes of the plurality of regions of the 3D image and the indexes of the plurality of regions of the packed 2D image has a form of a look-up table (LUT).
7. A server for transmitting data of a three-dimensional (3D) image, comprising:
a communication interface; and
a processor coupled to the communication interface and configured to,
wherein the processor is configured to:
transmitting information about a relationship between indexes of the plurality of regions of the 3D image and indexes of the plurality of regions of the packed 2D image to the device;
receiving, from the device, information related to a viewport of the device, wherein the information related to the viewport of the device includes an index of each of at least one second region corresponding to the viewport among a plurality of regions of a packed 2D image;
identifying, based on the received information about the viewport of the device, at least one second region corresponding to the viewport among a plurality of regions of the packed 2D image; and
transmitting data regarding the at least one second area to the device,
wherein the packed 2D image is generated by modifying or rearranging at least a part of a plurality of regions of the 2D image projected from the 3D image, and
Wherein the index of the at least one second region is identified based on an index of each of the at least one first region including the viewport among the plurality of regions of the 3D image and information on a relationship between the index of the plurality of regions of the 3D image and the index of the plurality of regions of the packed 2D image.
8. The server of claim 7, wherein the information on the relationship between the indexes of the plurality of regions of the 3D image and the indexes of the plurality of regions of the packed 2D image has a form of a look-up table (LUT).
CN201780077783.5A 2016-12-16 2017-11-14 Method for transmitting data relating to a three-dimensional image Active CN110073657B (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US201662435348P 2016-12-16 2016-12-16
US62/435,348 2016-12-16
PCT/KR2017/012899 WO2018110839A1 (en) 2016-12-16 2017-11-14 Method for transmitting data relating to three-dimensional image
KR1020170151519A KR102397574B1 (en) 2016-12-16 2017-11-14 Method and apparatus for transmitting data related to 3 dimensional image

Publications (2)

Publication Number Publication Date
CN110073657A CN110073657A (en) 2019-07-30
CN110073657B true CN110073657B (en) 2022-01-21

Family

ID=62558935

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201780077783.5A Active CN110073657B (en) 2016-12-16 2017-11-14 Method for transmitting data relating to a three-dimensional image

Country Status (3)

Country Link
KR (1) KR102397574B1 (en)
CN (1) CN110073657B (en)
WO (1) WO2018110839A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP7271672B2 (en) 2018-12-14 2023-05-11 中興通訊股▲ふん▼有限公司 Immersive video bitstream processing
CN115543083A (en) * 2022-09-29 2022-12-30 歌尔科技有限公司 Image display method and device and electronic equipment

Family Cites Families (7)

Publication number Priority date Publication date Assignee Title
KR19980035726U (en) * 1996-12-13 1998-09-15 김영귀 Pipe penetration sealing structure of low dash panel
KR101146620B1 (en) * 2010-08-17 2012-05-16 옥영선 Displaying device with three dimensional viewing function by using tilting sensor and its displaying system using the same
KR20120093693A (en) * 2011-02-15 2012-08-23 엘지디스플레이 주식회사 Stereoscopic 3d display device and method of driving the same
KR101596415B1 (en) * 2014-02-28 2016-02-22 동의대학교 산학협력단 System and Method for Monitoring Around View with Multiple Scopes
US9710881B2 (en) * 2014-04-05 2017-07-18 Sony Interactive Entertainment America Llc Varying effective resolution by screen location by altering rasterization parameters
KR101844032B1 (en) * 2014-12-26 2018-05-14 주식회사 케이티 Method for sending video in region of interest from panoramic-video, server and device
KR102313485B1 (en) * 2015-04-22 2021-10-15 삼성전자주식회사 Method and apparatus for transmitting and receiving image data for virtual reality streaming service

Non-Patent Citations (1)

Title
signaling of vr video information in isobmff;SEJIN OH等;《MOTION PICTURE EXPERT GROUP OR ISO/IEC JTC1/SC29/WG11》;20160630;正文第1节-第3节 *

Also Published As

Publication number Publication date
KR20180070460A (en) 2018-06-26
KR102397574B1 (en) 2022-05-13
WO2018110839A1 (en) 2018-06-21
CN110073657A (en) 2019-07-30

Similar Documents

Publication Publication Date Title
CN109478313B (en) Method and apparatus for processing three-dimensional image
EP3542530B1 (en) Suggested viewport indication for panoramic video
US10360721B2 (en) Method and apparatus for signaling region of interests
US11647177B2 (en) Method, apparatus and stream for volumetric video format
KR20210000331A (en) Spherical rotation for encoding wide view video
EP3562159A1 (en) Method, apparatus and stream for volumetric video format
EP3606084A1 (en) Method for transmitting data about three-dimensional image
CN110073657B (en) Method for transmitting data relating to a three-dimensional image
US11113870B2 (en) Method and apparatus for accessing and transferring point cloud content in 360-degree video environment
US10827160B2 (en) Method for transmitting data relating to three-dimensional image
WO2019191202A1 (en) Method, apparatus and stream for volumetric video format
KR102331041B1 (en) Method and apparatus for transmitting data related to 3 dimensional image
KR20200111089A (en) Method and apparatus for point cloud contents access and delivery in 360 video environment
KR102664649B1 (en) Spherical rotation technique for encoding wide-field video
WO2019193011A1 (en) Region description for 360 or spherical video

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant