WO2020138536A1 - System and method for transmitting image on basis of hybrid network - Google Patents

System and method for transmitting image on basis of hybrid network Download PDF

Info

Publication number
WO2020138536A1
WO2020138536A1 (application PCT/KR2018/016726)
Authority
WO
WIPO (PCT)
Prior art keywords
layer stream
resolution
base layer
unit
eye image
Prior art date
Application number
PCT/KR2018/016726
Other languages
French (fr)
Korean (ko)
Inventor
김동호
서봉석
Original Assignee
서울과학기술대학교 산학협력단
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 서울과학기술대학교 산학협력단 filed Critical 서울과학기술대학교 산학협력단
Publication of WO2020138536A1 publication Critical patent/WO2020138536A1/en

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N 21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N 21/234 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • H04N 21/2343 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
    • H04N 21/234327 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements by decomposing into layers, e.g. base layer and one or more enhancement layers
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 13/00 Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N 13/10 Processing, recording or transmission of stereoscopic or multi-view image signals
    • H04N 13/106 Processing image signals
    • H04N 13/161 Encoding, multiplexing or demultiplexing different image signal components
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 13/00 Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N 13/10 Processing, recording or transmission of stereoscopic or multi-view image signals
    • H04N 13/194 Transmission of image signals
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 13/00 Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N 13/20 Image signal generators
    • H04N 13/204 Image signal generators using stereoscopic image cameras
    • H04N 13/239 Image signal generators using stereoscopic image cameras using two 2D image sensors having a relative position equal to or related to the interocular distance
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N 21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N 21/234 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • H04N 21/2343 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N 21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N 21/239 Interfacing the upstream path of the transmission network, e.g. prioritizing client content requests
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/60 Network structure or processes for video distribution between server and client or between remote clients; Control signalling between clients, server and network components; Transmission of management data between server and client, e.g. sending from server to client commands for recording incoming content stream; Communication details between server and client
    • H04N 21/63 Control signaling related to video distribution between client, server and network components; Network processes for video distribution between server and clients or between remote clients, e.g. transmitting basic layer and enhancement layers over different transmission paths, setting up a peer-to-peer communication via Internet between remote STB's; Communication protocols; Addressing
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/60 Network structure or processes for video distribution between server and client or between remote clients; Control signalling between clients, server and network components; Transmission of management data between server and client, e.g. sending from server to client commands for recording incoming content stream; Communication details between server and client
    • H04N 21/63 Control signaling related to video distribution between client, server and network components; Network processes for video distribution between server and clients or between remote clients, e.g. transmitting basic layer and enhancement layers over different transmission paths, setting up a peer-to-peer communication via Internet between remote STB's; Communication protocols; Addressing
    • H04N 21/631 Multimode Transmission, e.g. transmitting basic layers and enhancement layers of the content over different transmission paths or transmitting with different error corrections, different keys or with different transmission protocols
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 7/00 Television systems
    • H04N 7/01 Conversion of standards, e.g. involving analogue television standards or digital television standards processed at pixel level
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 7/00 Television systems
    • H04N 7/01 Conversion of standards, e.g. involving analogue television standards or digital television standards processed at pixel level
    • H04N 7/0102 Conversion of standards, e.g. involving analogue television standards or digital television standards processed at pixel level involving the resampling of the incoming video signal

Definitions

  • The present invention relates to a hybrid network-based video transmission system and method, and more particularly to a technology that can provide mono HD-resolution 360 VR (Virtual Reality), mono UHD-resolution 360-degree VR, and stereo UHD-resolution 360-degree VR video services over the ATSC 3.0 ROUTE/DASH protocols using a single encoder/decoder.
  • ATSC 3.0 is the world's first IP-based broadcasting system.
  • One of the ultimate objectives of ATSC 3.0 is compatibility with both existing fixed TVs and users' mobile devices, enabling service to both categories of users through a single transmission method.
  • The transmission methods used by ATSC 3.0 fall largely into two categories: MMTP (MPEG Media Transport Protocol) and ROUTE (Real-Time Object delivery over Unidirectional Transport). Last year, an experimental ATSC 3.0 broadcast using the ROUTE method was carried out successfully.
  • MMTP MPEG Media Transport Protocol
  • ROUTE Real-Time Object delivery over Unidirectional Transport
  • 360-degree VR Virtual Reality
  • The present invention seeks to provide a hybrid network-based video transmission system and method that can reduce complexity and maximize compression efficiency by using the MPD structure of a single encoder and decoder.
  • By performing signaling with an MPD structure for a base layer stream, an enhancement layer stream, and an up-sampled base layer stream derived from high-resolution left-eye and right-eye images captured through a 360-degree camera, the present invention further seeks to provide a hybrid network-based video transmission system and method capable of offering immersive media services to the viewer, such as mono HD-resolution 360 VR, mono UHD-resolution 360-degree VR, and stereo UHD-resolution 360-degree VR.
  • The hybrid network-based video transmission system according to an embodiment of the present invention includes:
  • an image acquisition unit that acquires a high-resolution left-eye image and a high-resolution right-eye image using a 360 camera;
  • a pre-processing unit that tiles each of the acquired left-eye and right-eye images at a predetermined size;
  • a plurality of down-sampling units that down-sample the tiled left-eye and right-eye images at a predetermined number of unit resolutions;
  • an encoder that encodes each of the down-sampled left-eye and right-eye images of each unit resolution to output a base layer stream and an enhancement layer stream;
  • an up-sampling unit that up-samples the base layer stream to a high resolution and outputs the base layer stream, the enhancement layer stream, and the up-sampled base layer stream; and
  • a broadcasting unit that standardizes and multiplexes at least one of the base layer stream, the enhancement layer stream, and the up-sampled base layer stream according to a broadcast network or Internet standard and then transmits it.
  • The plurality of down-sampling units may be provided to set an arbitrary resolution among the predetermined original resolutions as a reference resolution, then sequentially down-sample the tiled right-eye image at unit resolutions below the set reference resolution and sequentially down-sample the tiled left-eye image at unit resolutions above the reference resolution.
  • The encoder may be provided to encode the down-sampled right-eye image of each unit resolution to output a base layer stream of that unit resolution, and to encode the down-sampled left-eye image of each unit resolution, based on the base layer stream, to output an enhancement layer stream of that unit resolution.
  • The broadcasting unit may be provided to generate MPD (Media Presentation Description) information for the base layer stream and the enhancement layer stream and then propagate the MPD information together with at least one of the base layer stream, the enhancement layer stream, and the up-sampled base layer stream.
  • MPD Media Presentation Description
  • The MPD includes tiling information, provided as a spatial relationship description (SRD), that describes the viewpoint and position of each tile of the captured left-eye and right-eye images; and dependency information that describes, for each of the enhancement layer streams generated per unit resolution on the basis of the base layer streams, the ID of the base layer stream of the unit resolution on which it depends.
  • SRD spatial relationship description
  • A hybrid network-based video transmission method according to another embodiment includes: an image acquisition step of acquiring a high-resolution left-eye image and a high-resolution right-eye image using a 360 camera; a pre-processing step of tiling each of the acquired left-eye and right-eye images at a predetermined size; a plurality of down-sampling steps of down-sampling the tiled left-eye and right-eye images at predetermined unit resolutions; an encoding step of encoding each of the down-sampled left-eye and right-eye images to output a base layer stream and an enhancement layer stream; an up-sampling step of up-sampling the base layer stream to a high resolution and outputting the base layer stream, the enhancement layer stream, and the up-sampled base layer stream; and a broadcasting step of standardizing and multiplexing at least one of the base layer stream, the enhancement layer stream, and the up-sampled base layer stream according to a broadcast network or Internet network standard specification and then transmitting it.
  • The plurality of down-sampling steps include: setting an arbitrary resolution among the predetermined original resolutions as a reference resolution; sequentially down-sampling the tiled right-eye image at unit resolutions below the set reference resolution; and sequentially down-sampling the tiled left-eye image at unit resolutions above the reference resolution.
  • By performing signaling with an MPD structure for the base layer stream, the enhancement layer stream, and the up-sampled base layer stream derived from high-resolution left-eye and right-eye images captured through a 360-degree camera, immersive media such as mono HD-resolution 360 VR, mono UHD-resolution 360-degree VR, and stereo UHD-resolution 360-degree VR can be provided.
  • complexity of the system can be reduced and compression efficiency can be improved by deriving a base layer stream, an enhancement layer stream, and an up-sampled base layer stream using one encoder.
  • FIG. 1 is a block diagram of an image transmission system of a hybrid network in this embodiment.
  • FIG. 2 is a detailed configuration diagram of a downsampling unit of the system of this embodiment.
  • FIGS. 3 and 4 are exemplary diagrams of the MPD structure of the system of the present embodiment.
  • FIG. 5 is a flowchart showing an image transmission process according to another embodiment.
  • A transport stream applied to a broadcast network may be ROUTE (Real-time Object Delivery over Unidirectional Transport) or MMTP (MPEG Media Transport Protocol).
  • ROUTE and MMTP represent a broadcast network transport stream multiplexing standard that is currently being standardized in ATSC 3.0.
  • the transport stream transmitted through the Internet network conforms to the MPEG-DASH (Dynamic Adaptive Streaming over HTTP) standard.
  • High-resolution left-eye and right-eye images captured through a 360-degree camera are tiled according to the service request mode; the tiled right-eye image is down-sampled at each unit resolution below a predetermined reference resolution and encoded into base layer streams, the tiled left-eye image is down-sampled at each unit resolution above the reference resolution and encoded into enhancement layer streams based on the base layer streams, and the base layer streams are up-sampled.
  • On this basis, immersive media such as HD 360 VR, mono UHD-resolution 360-degree VR, and stereoscopic UHD-resolution 360-degree VR are provided to the user.
  • The hybrid network is a network that delivers the video stream over a broadcast network and/or the Internet.
  • FIG. 1 is a configuration diagram of a hybrid network-based video transmission system according to an embodiment of the present invention, and FIG. 2 is a detailed configuration diagram of the down-sampling unit 300 illustrated in FIG. 1.
  • Referring to FIGS. 1 and 2, the system S according to an embodiment of the present invention includes an image acquisition unit 100, a pre-processing unit 200, a down-sampling unit 300, an encoder 400, an up-sampling unit 500, and a broadcasting unit 600.
  • The image acquisition unit 100 outputs a left-eye image and a right-eye image captured through two cameras that support a predetermined resolution (for example, 8K resolution). The image captured by each camera may be used as-is as a UHD (Ultra High Definition) or HD image, or images of some regions may be used as a stereoscopic UHD image in 360-degree VR.
  • The pre-processing unit 200 tiles each of the acquired left-eye and right-eye images according to the service requirements and delivers the tiled left-eye and right-eye images to the down-sampling unit 300.
  • Here, the left-eye and right-eye images are divided into tiles of the same size and number, as illustrated in the sketch below.
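A minimal Python sketch of the tiling step is given below. It is not part of the patent; the frame size (8K, 7680x4320) and the 4x4 grid of 16 tiles are assumptions taken from the SRD example discussed later in this description.

```python
# Minimal sketch of the pre-processing (tiling) step, assuming an 8K
# (7680x4320) frame split into a 4x4 grid of 16 equally sized tiles.

def tile_frame(frame_width=7680, frame_height=4320, cols=4, rows=4):
    """Return (x, y, w, h) rectangles for equally sized tiles, row by row."""
    tile_w, tile_h = frame_width // cols, frame_height // rows
    tiles = []
    for row in range(rows):
        for col in range(cols):
            tiles.append((col * tile_w, row * tile_h, tile_w, tile_h))
    return tiles

if __name__ == "__main__":
    # The left-eye and right-eye frames are tiled with the same size and number.
    left_tiles = tile_frame()
    right_tiles = tile_frame()
    print(left_tiles[0], left_tiles[1])  # (0, 0, 1920, 1080) (1920, 0, 1920, 1080)
```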
  • The down-sampling unit 300 comprises a plurality of down-sampling modules 311 to 318. It sets an arbitrary resolution (preferably the intermediate 4K resolution) of the predetermined original resolution (8K) as the reference resolution, then sequentially down-samples the tiled right-eye image at each unit resolution (1K to 3K) below the set reference resolution and sequentially down-samples the tiled left-eye image at each unit resolution (4K to 8K) at or above the reference resolution.
  • Here, the unit resolution is an arbitrarily chosen 1K step: the right-eye image is down-sampled to 1K, 2K, and 3K, and the left-eye image is down-sampled to 4K, 5K, 6K, 7K, and 8K.
  • Although the unit resolution is described here in 1K steps, it can be changed according to the specifications of the system, as those skilled in the art related to this embodiment will understand.
  • Each down-sampled left-eye and right-eye image is delivered to the encoder 400; the assignment of unit resolutions to the two eyes is summarized in the sketch below.
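The following sketch only illustrates how the unit resolutions are assigned around the 4K reference resolution, as described above; it is not an actual image scaler, and the "K" bookkeeping is an assumption used for illustration.

```python
# Sketch of the down-sampling plan of FIG. 2: with an 8K original and a 4K
# reference resolution, the right-eye tiles are down-sampled to the unit
# resolutions below the reference (1K-3K) and the left-eye tiles to the
# unit resolutions at or above it (4K-8K).

ORIGINAL_K = 8      # original resolution, in "K" units
REFERENCE_K = 4     # reference resolution (middle of the ladder)

def downsampling_plan(original_k=ORIGINAL_K, reference_k=REFERENCE_K):
    ladder = range(1, original_k + 1)                      # 1K .. 8K
    right_eye = [k for k in ladder if k < reference_k]     # 1K, 2K, 3K  -> base layers
    left_eye = [k for k in ladder if k >= reference_k]     # 4K .. 8K    -> enhancement layers
    return right_eye, left_eye

right_eye_resolutions, left_eye_resolutions = downsampling_plan()
print(right_eye_resolutions)  # [1, 2, 3]
print(left_eye_resolutions)   # [4, 5, 6, 7, 8]
```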
  • The encoder 400 encodes the down-sampled left-eye and right-eye images of each unit resolution and outputs them in stream form.
  • the encoder 400 is provided with a scalable high efficiency video codec (SHVC).
  • SHVC scalable high efficiency video codec
  • That is, the encoder 400 encodes the down-sampled right-eye image of each unit resolution to output a base layer stream of that unit resolution, and encodes the down-sampled left-eye image of each unit resolution, based on the base layer stream of that unit resolution, to output an enhancement layer stream. Accordingly, multiple enhancement layer streams of different unit resolutions are output for a base layer stream of a given unit resolution.
  • The base layer stream is the transport stream of the right-eye image, and the enhancement layer stream is the transport stream of the left-eye image, which provides high-resolution (for example, 8K) quality for the user-selected view of interest (VoI).
  • the base layer stream of the encoder 400 is delivered to the up-sampling unit 500.
  • The up-sampling unit 500 up-samples the base layer stream of the encoder 400 to a high resolution and then delivers the up-sampled base layer stream to the broadcasting unit 600. The broadcasting unit 600 thus receives the base layer stream of each unit resolution, the enhancement layer stream of each unit resolution based on each base layer stream, and the up-sampled base layer stream of each unit resolution.
  • When the broadcast stream is to be decoded, the broadcasting unit 600 performs MPD (Media Presentation Description) signaling, which describes the tiling information and the information on the base layer stream, the enhancement layer stream, and the up-sampled base layer stream, and generates the MPD information.
  • MPD Media Presentation Description
  • Referring to FIG. 3, the MPD information includes tiling information and dependency information between the base layer and enhancement layer streams.
  • The tiling information includes a spatial relationship description (SRD) that indicates the viewpoint and position of each tile of the captured left-eye and right-eye images; the SRD of each tile is a component that can be encoded and decoded independently and is expressed as an adaptation set.
  • The SRD is specified as a schemeIdUri and a value using a SupplementalProperty within each adaptation set. During decoding, the schemeIdUri and value of the components in the corresponding adaptation set identify whether a component is a tile and, if so, the position and viewpoint of that tile.
  • For example, the adaptation sets include Adaptation Set right tile 0 of the base layer, Adaptation Set left tile 0-1 of the enhancement layer for 3K resolution, Adaptation Set right tile 0-2 of the enhancement layer for 2K resolution, and so on.
  • The SupplementalProperty of Adaptation Set right tile 0 of the base layer includes representation 1 at 5 Mbps (3K), representation 2 at 2 Mbps (2K), representation 3 at 1 Mbps (1K), and representation 4 at 0.5 Mbps (SD); a simple rendering of this ladder is sketched below.
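The bitrate/resolution pairs below are taken from the example just quoted; the selection function is only an illustrative assumption about how a receiver might use such a ladder, not something specified by the patent.

```python
# Representation ladder of "Adaptation Set right tile 0" as quoted above:
# representation id -> (bandwidth in bit/s, nominal resolution class).
RIGHT_TILE0_REPRESENTATIONS = {
    1: (5_000_000, "3K"),
    2: (2_000_000, "2K"),
    3: (1_000_000, "1K"),
    4: (500_000, "SD"),
}

def pick_representation(available_bps):
    """Pick the highest-bandwidth representation that fits the link (illustrative)."""
    fitting = [(bw, rep) for rep, (bw, _) in RIGHT_TILE0_REPRESENTATIONS.items()
               if bw <= available_bps]
    if not fitting:
        return None
    return max(fitting)[1]

print(pick_representation(3_000_000))  # 2  -> 2 Mbps (2K)
print(pick_representation(400_000))    # None (below the lowest representation)
```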
  • The schemeIdUri for the SRD is "urn:mpeg:dash:srd:2014", and its value is expressed as "source_id, object_x, object_y, object_width, object_height, total_width, total_height, spatial_set_id", from which the corresponding tile ID, the horizontal and vertical position of its upper-left corner, the width and height of the tile, the width and height of the original image, and the ID of the grouped tiles can be identified.
  • That is, when the 8K original resolution is tiled into 16 tiles, each tile has a resolution of 1920x1080. Accordingly, as in the 19th line, the value of the SupplementalProperty of the first tile of the left image is (2, 0, 0, 1920, 1080, 7680, 4320), and the value of the SupplementalProperty of the second tile is (2, 1920, 0, 1920, 1080, 7680, 4320), as illustrated in the sketch below.
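A minimal sketch of how the SRD value strings quoted above can be generated for a 16-tile 8K frame follows. The source_id of 2 and the seven-field value (without spatial_set_id) simply mirror the example values in the text; the code is illustrative, not part of the MPD specification.

```python
# Sketch: generate the SupplementalProperty value strings of the SRD
# (urn:mpeg:dash:srd:2014) for 16 tiles of a 7680x4320 frame.
# value = "source_id, object_x, object_y, object_width, object_height,
#          total_width, total_height", matching (2, 0, 0, 1920, 1080, 7680, 4320)
# for the first tile.

TOTAL_W, TOTAL_H = 7680, 4320
TILE_W, TILE_H = 1920, 1080
SOURCE_ID = 2  # taken from the example values in the text

def srd_values():
    values = []
    for row in range(TOTAL_H // TILE_H):        # 4 rows of tiles
        for col in range(TOTAL_W // TILE_W):    # 4 columns of tiles
            x, y = col * TILE_W, row * TILE_H
            values.append(f"{SOURCE_ID},{x},{y},{TILE_W},{TILE_H},{TOTAL_W},{TOTAL_H}")
    return values

print(srd_values()[0])  # 2,0,0,1920,1080,7680,4320      (first tile)
print(srd_values()[1])  # 2,1920,0,1920,1080,7680,4320   (second tile)
```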
  • Meanwhile, referring to FIG. 4, the MPD information further includes dependency information between the base layer and the enhancement layer for transmission over the hybrid network.
  • For example, when decoding an enhancement layer stream, the dependency ID indicating which base layer stream of which unit resolution the enhancement layer stream depends on is expressed in the Representation of the adaptation set.
  • That is, since a base layer stream of each unit resolution has multiple enhancement layer streams, the relationship between the base layer and the enhancement layer must be specified in the Representation; because the dependency ID is the signal that determines which Representation a given Representation depends on, the dependency ID of the enhancement layer must match the Representation value of the base layer stream of the corresponding unit resolution, as illustrated in the sketch below.
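To illustrate the dependency signaling, the hedged sketch below builds a tiny MPD fragment in which an enhancement-layer Representation carries a dependencyId equal to the id of the base-layer Representation it depends on. The element and attribute names follow MPEG-DASH, but the ids, bandwidths, and resolutions are invented for illustration and are not taken from the patent's figures.

```python
# Sketch: an enhancement-layer Representation whose dependencyId matches the
# Representation id of the base layer stream it depends on (MPEG-DASH style).
import xml.etree.ElementTree as ET

mpd_fragment = """
<AdaptationSet id="right_tile0">
  <Representation id="base_3K" bandwidth="5000000" width="2880" height="1620"/>
</AdaptationSet>
<AdaptationSet id="left_tile0">
  <Representation id="enh_8K" dependencyId="base_3K" bandwidth="20000000"
                  width="7680" height="4320"/>
</AdaptationSet>
"""

root = ET.fromstring(f"<Period>{mpd_fragment}</Period>")
base_ids = {r.get("id") for r in root.iter("Representation") if r.get("dependencyId") is None}
for rep in root.iter("Representation"):
    dep = rep.get("dependencyId")
    if dep is not None:
        # The enhancement layer's dependencyId must match a base-layer Representation id.
        assert dep in base_ids, f"unknown base layer {dep}"
        print(f'{rep.get("id")} depends on {dep}')
```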
  • Having received the base layer stream, the enhancement layer stream, the up-sampled base layer stream, and the MPD information, the broadcasting unit 600 standardizes and multiplexes them according to the ATSC 3.0 broadcast platform (ROUTE) and the Internet platform (DASH) and transmits them, so that immersive media such as a mono UHD-resolution 360-degree VR broadcast service, an HD-resolution 360-degree VR broadcast service, and a stereoscopic UHD-resolution 360-degree VR broadcast service are provided to viewers.
  • The base layer stream, the enhancement layer stream, and the up-sampled base layer stream applied to the broadcast network may be multiplexed by ROUTE.
  • the broadcast network transport stream multiplexing standard follows ROUTE standardized in ATSC 3.0, and the transport stream transmitted through the Internet network follows a dynamic adaptive streaming over HTTP (DASH) standard.
  • DASH dynamic adaptive streaming over HTTP
  • The broadcasting unit 600 multiplexes the base layer stream, the enhancement layer stream, and the up-sampled base layer stream by ROUTE/DASH to generate transport streams.
  • The transport streams are transmitted on the respective channels or PLPs for providing the mono UHD 360-degree VR broadcast service, the HD 360-degree VR broadcast service, and the stereoscopic 360-degree VR broadcast service. The transport streams are produced and transmitted, for each base layer stream, as enhancement layer streams of each unit resolution from 4K to 8K.
  • The base layer stream alone is played back as a mono HD-resolution 360 video on a mobile or low-resolution display.
  • The base layer stream and the enhancement layer stream are played back as a mono UHD-resolution 360 video on a high-resolution display.
  • The base layer stream, the up-sampled base layer stream, and the enhancement layer stream are played back as a stereo UHD-resolution 360 video on a 3D display when bandwidth conditions are good.
  • The viewer can thus be provided with a mono UHD- or HD-resolution 360 VR broadcast service and a stereo UHD-resolution 360 VR broadcast service, as summarized in the decision sketch below.
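The three playback combinations described above can be summarized in a short decision sketch. The function name, argument names, and the display/bandwidth checks are assumptions used only to illustrate the mapping from received streams to the three service tiers.

```python
# Sketch: which 360 VR service tier can be played from the received streams,
# following the combinations described above. The display and bandwidth checks
# are illustrative assumptions, not requirements stated by the patent.

def service_tier(has_base, has_enhancement, has_upsampled_base,
                 display_3d=False, bandwidth_ok=False):
    if has_base and has_enhancement and has_upsampled_base and display_3d and bandwidth_ok:
        return "stereo UHD 360 VR"   # base + up-sampled base + enhancement, 3D display
    if has_base and has_enhancement:
        return "mono UHD 360 VR"     # base + enhancement, high-resolution display
    if has_base:
        return "mono HD 360 VR"      # base layer only, mobile/low-resolution display
    return "no service"

print(service_tier(True, False, False))                                   # mono HD 360 VR
print(service_tier(True, True, False))                                    # mono UHD 360 VR
print(service_tier(True, True, True, display_3d=True, bandwidth_ok=True)) # stereo UHD 360 VR
```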
  • Signaling is performed with an MPD structure for the base layer stream, the enhancement layer stream, and the up-sampled base layer stream derived from the high-resolution left-eye and right-eye images captured through a 360-degree camera.
  • This makes it possible to provide immersive media such as HD-resolution 360 VR, mono UHD-resolution 360-degree VR, and stereo UHD-resolution 360-degree VR.
  • complexity of the system may be reduced and compression efficiency may be improved by deriving a base layer stream, an enhancement layer stream, and an up-sampled base layer stream using one encoder.
  • Referring to FIG. 5, a hybrid network-based video transmission method may include an image acquisition step 10, a pre-processing step 20, a down-sampling step 30, an encoding step 40, an up-sampling step 50, and a broadcasting step 60.
  • In the image acquisition step 10, a high-resolution left-eye image and a high-resolution right-eye image are acquired using a 360 camera, and each of the acquired left-eye and right-eye images is transmitted to the pre-processing unit 200.
  • In the pre-processing step 20, tiling is performed on each of the acquired left-eye and right-eye images at a predetermined size, and the tiled left-eye and right-eye images are transmitted to the down-sampling unit 300.
  • In the down-sampling step 30, an arbitrary resolution (the intermediate 4K resolution) among the predetermined 8K original resolution is set as the reference resolution for the tiled left-eye and right-eye images.
  • Down-sampling is then sequentially performed on the tiled right-eye image at each unit resolution (1K, 2K, 3K) below the set reference resolution.
  • The down-sampling step 30 also sequentially down-samples the tiled left-eye image at each unit resolution (4K, 5K, 6K, 7K, 8K) exceeding the reference resolution.
  • The right-eye and left-eye images down-sampled at each unit resolution are delivered to the encoder 400.
  • In the encoding step 40, encoding is performed on each of the right-eye and left-eye images down-sampled at each unit resolution to output a base layer stream and an enhancement layer stream. That is, a base layer stream of each unit resolution is output for the right-eye image down-sampled at that unit resolution, and an enhancement layer stream of each unit resolution is output based on each base layer stream.
  • The base layer stream is delivered to the up-sampling step 50, in which the base layer stream is up-sampled to a high resolution and the base layer stream, the enhancement layer stream, and the up-sampled base layer stream are output.
  • At least one of the base layer stream, the enhancement layer stream, and the up-sampled base layer stream is delivered to the broadcasting step 60, in which it is standardized and multiplexed according to the standard specification of a broadcast network or the Internet and then transmitted.
  • Accordingly, immersive media such as mono HD-resolution 360 VR, mono UHD-resolution 360-degree VR, and stereo UHD-resolution 360-degree VR can be provided.
  • complexity of the system may be reduced and compression efficiency may be improved by deriving a base layer stream, an enhancement layer stream, and an up-sampled base layer stream using one encoder.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)
  • Testing, Inspecting, Measuring Of Stereoscopic Televisions And Televisions (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

The present invention relates to a system and method for transmitting stereoscopic images on the basis of a hybrid network. A specific embodiment of the present invention can provide immersive media such as monoscopic 360 VR in HD resolution, monoscopic 360-degree VR in UHD resolution, and stereoscopic 360-degree VR in UHD resolution by signaling, with an MPD structure, a base layer stream, an enhancement layer stream, and an up-sampled base layer stream derived from high-resolution left-eye and right-eye images captured by a 360-degree camera. It can also reduce system complexity and improve compression efficiency by employing a single encoder to derive the base layer stream, the enhancement layer stream, and the up-sampled base layer stream.

Description

Hybrid network-based video transmission system and method
The present invention relates to a hybrid network-based video transmission system and method, and more particularly to a technology that can provide mono HD-resolution 360 VR (Virtual Reality), mono UHD-resolution 360-degree VR, and stereo UHD-resolution 360-degree VR video services over the ATSC 3.0 ROUTE/DASH protocols using a single encoder/decoder.
With the recent increase in IP-based requirements, standardization of ATSC 3.0, the world's first IP-based broadcasting system, has been completed. One of the ultimate objectives of ATSC 3.0 is compatibility with both existing fixed TVs and users' mobile devices, enabling service to both categories of users through a single transmission method.
The transmission methods used by ATSC 3.0 fall largely into two categories: MMTP (MPEG Media Transport Protocol) and ROUTE (Real-Time Object delivery over Unidirectional Transport). Last year, an experimental ATSC 3.0 broadcast using the ROUTE method was carried out successfully.
Recently, as virtual reality devices have become widespread, demand for immersive media such as 360-degree VR (Virtual Reality) video has also increased. Such content is currently served to users mainly through video-sharing sites and is also used in various fields such as education, navigation, and games. Many scholars expect these 360-degree VR videos to strongly influence the broadcasting industry in the future and to completely change the existing broadcasting paradigm.
However, the recently completed ATSC 3.0 broadcast protocol does not consider transmission of high-resolution stereoscopic 360-degree VR video.
The present invention seeks to provide a hybrid network-based video transmission system and method that can reduce complexity and maximize compression efficiency by using the MPD structure of a single encoder and decoder.
The present invention also seeks to provide a hybrid network-based video transmission system and method that can offer immersive media services such as mono HD-resolution 360 VR, mono UHD-resolution 360-degree VR, and stereo UHD-resolution 360-degree VR by performing signaling with an MPD structure for a base layer stream, an enhancement layer stream, and an up-sampled base layer stream derived from high-resolution left-eye and right-eye images captured through a 360-degree camera.
The objects of the present invention are not limited to those mentioned above; other objects and advantages not mentioned here can be understood from the following description and will become clearer through the embodiments of the present invention. It will also be readily appreciated that the objects and advantages of the present invention can be realized by the means indicated in the claims and combinations thereof.
Accordingly, a hybrid network-based video transmission system according to an embodiment of the present invention is characterized by including:
an image acquisition unit that acquires a high-resolution left-eye image and a high-resolution right-eye image using a 360 camera; a pre-processing unit that tiles each of the acquired left-eye and right-eye images at a predetermined size; a plurality of down-sampling units that down-sample the tiled left-eye and right-eye images at a predetermined number of unit resolutions; an encoder that encodes each of the down-sampled left-eye and right-eye images of each unit resolution to output a base layer stream and an enhancement layer stream; an up-sampling unit that up-samples the base layer stream to a high resolution and outputs the base layer stream, the enhancement layer stream, and the up-sampled base layer stream; and a broadcasting unit that standardizes and multiplexes at least one of the base layer stream, the enhancement layer stream, and the up-sampled base layer stream according to a broadcast network or Internet standard and then transmits it.
Preferably, the plurality of down-sampling units may be provided to set an arbitrary resolution among the predetermined original resolutions as a reference resolution, then sequentially down-sample the tiled right-eye image at unit resolutions below the set reference resolution and sequentially down-sample the tiled left-eye image at unit resolutions above the reference resolution.
The encoder may be provided to encode the down-sampled right-eye image of each unit resolution to output a base layer stream of that unit resolution, and to encode the down-sampled left-eye image of each unit resolution, based on the base layer stream, to output an enhancement layer stream of that unit resolution.
Preferably, the broadcasting unit may be provided to generate MPD (Media Presentation Description) information for the base layer stream and the enhancement layer stream and then propagate the MPD information together with at least one of the base layer stream, the enhancement layer stream, and the up-sampled base layer stream.
Preferably, the MPD includes tiling information, provided as a spatial relationship description (SRD), that describes the viewpoint and position of each tile of the captured left-eye and right-eye images; and dependency information that describes, for each of the enhancement layer streams generated per unit resolution on the basis of the base layer streams, the ID of the base layer stream of the unit resolution on which it depends.
According to another embodiment of the present invention, a hybrid network-based video transmission method includes: an image acquisition step of acquiring a high-resolution left-eye image and a high-resolution right-eye image using a 360 camera; a pre-processing step of tiling each of the acquired left-eye and right-eye images at a predetermined size; a plurality of down-sampling steps of down-sampling the tiled left-eye and right-eye images at predetermined unit resolutions; an encoding step of encoding each of the down-sampled left-eye and right-eye images to output a base layer stream and an enhancement layer stream; an up-sampling step of up-sampling the base layer stream to a high resolution and outputting the base layer stream, the enhancement layer stream, and the up-sampled base layer stream; and a broadcasting step of standardizing and multiplexing at least one of the base layer stream, the enhancement layer stream, and the up-sampled base layer stream according to a broadcast network or Internet network standard specification and then transmitting it.
Preferably, the plurality of down-sampling steps include: setting an arbitrary resolution among the predetermined original resolutions as a reference resolution; sequentially down-sampling the tiled right-eye image at unit resolutions below the set reference resolution; and sequentially down-sampling the tiled left-eye image at unit resolutions above the reference resolution.
According to the present invention, by performing signaling with an MPD structure for the base layer stream, the enhancement layer stream, and the up-sampled base layer stream derived from high-resolution left-eye and right-eye images captured through a 360-degree camera, immersive media such as mono HD-resolution 360 VR, mono UHD-resolution 360-degree VR, and stereo UHD-resolution 360-degree VR can be provided.
According to the present invention, the complexity of the system can be reduced and compression efficiency can be improved by deriving the base layer stream, the enhancement layer stream, and the up-sampled base layer stream with a single encoder.
The following drawings attached to this specification illustrate preferred embodiments of the present invention and, together with the detailed description below, serve to further explain the technical idea of the present invention; the present invention should therefore not be construed as limited to what is shown in the drawings.
FIG. 1 is a configuration diagram of the hybrid-network video transmission system of this embodiment.
FIG. 2 is a detailed configuration diagram of the down-sampling unit of the system of this embodiment.
FIGS. 3 and 4 are exemplary diagrams of the MPD structure used by the system of this embodiment.
FIG. 5 is a flowchart showing a video transmission process according to another embodiment.
Hereinafter, embodiments of the present invention will be described in detail with reference to the accompanying drawings so that those of ordinary skill in the art to which the present invention pertains can easily practice them. The present invention may, however, be implemented in many different forms and is not limited to the embodiments described herein. Parts irrelevant to the description are omitted from the drawings for clarity, and like reference numerals denote like parts throughout the specification.
Throughout the specification, when a part is said to "include" a component, this means that it may further include other components rather than excluding them, unless specifically stated otherwise.
In an embodiment of the present invention, the transport stream applied to the broadcast network may be ROUTE (Real-time Object Delivery over Unidirectional Transport) or MMTP (MPEG Media Transport Protocol). ROUTE and MMTP are broadcast-network transport stream multiplexing specifications currently being standardized in ATSC 3.0. The transport stream transmitted over the Internet conforms to the MPEG-DASH (Dynamic Adaptive Streaming over HTTP) specification.
In the present invention, high-resolution left-eye and right-eye images captured through a 360-degree camera are tiled according to the service requirements; the tiled right-eye image is down-sampled at each unit resolution below a predetermined reference resolution and encoded into base layer streams; the tiled left-eye image is down-sampled at each unit resolution above the reference resolution and encoded, based on the base layer streams, into enhancement layer streams; and the base layer streams are up-sampled. Based on the base layer streams, the enhancement layer streams, and the up-sampled base layer streams, immersive media such as HD 360 VR, mono UHD-resolution 360-degree VR, and stereoscopic UHD-resolution 360-degree VR are provided to the user.
Here, the hybrid network is a network that delivers the video stream over a broadcast network and/or the Internet.
FIG. 1 is a configuration diagram of a hybrid network-based video transmission system according to an embodiment of the present invention, and FIG. 2 is a detailed configuration diagram of the down-sampling unit 300 shown in FIG. 1. Referring to FIGS. 1 and 2, the system S according to an embodiment of the present invention includes an image acquisition unit 100, a pre-processing unit 200, a down-sampling unit 300, an encoder 400, an up-sampling unit 500, and a broadcasting unit 600.
The image acquisition unit 100 outputs a left-eye image and a right-eye image captured through two cameras that support a predetermined resolution (for example, 8K resolution). The image captured by each camera may be used as-is as a UHD (Ultra High Definition) or HD image, or images of some regions may be used as a stereoscopic UHD image in 360-degree VR.
The pre-processing unit 200 tiles each of the acquired left-eye and right-eye images according to the service requirements and delivers the tiled left-eye and right-eye images to the down-sampling unit 300. Here, the left-eye and right-eye images are divided into tiles of the same size and number.
The down-sampling unit 300 comprises a plurality of down-sampling modules 311 to 318. It sets an arbitrary resolution (preferably the intermediate 4K resolution) of the predetermined original resolution (8K) as the reference resolution, then sequentially down-samples the tiled right-eye image at each unit resolution (1K to 3K) below the set reference resolution and sequentially down-samples the tiled left-eye image at each unit resolution (4K to 8K) at or above the reference resolution.
Here, the unit resolution is an arbitrarily chosen 1K step: the right-eye image is down-sampled to 1K, 2K, and 3K, and the left-eye image is down-sampled to 4K, 5K, 6K, 7K, and 8K. Although the unit resolution is described here in 1K steps, it can be changed according to the specifications of the system, as those of ordinary skill in the art related to this embodiment will understand. Each down-sampled left-eye and right-eye image is delivered to the encoder 400.
The encoder 400 encodes the down-sampled images of each unit resolution and outputs them in stream form. Here, the encoder 400 is implemented as an SHVC (Scalable High-efficiency Video Codec) encoder.
That is, the encoder 400 encodes the down-sampled right-eye image of each unit resolution to output a base layer stream of that unit resolution, and encodes the down-sampled left-eye image of each unit resolution, based on the base layer streams, to output an enhancement layer stream of that unit resolution. Accordingly, multiple enhancement layer streams of different unit resolutions are output for a base layer stream of a given unit resolution.
The base layer stream is the transport stream of the right-eye image, and the enhancement layer stream is the transport stream of the left-eye image, which provides high-resolution (for example, 8K) quality for the user-selected view of interest (VoI).
The base layer streams of the encoder 400 are delivered to the up-sampling unit 500.
The up-sampling unit 500 up-samples the base layer streams of the encoder 400 to a high resolution and delivers the up-sampled base layer streams to the broadcasting unit 600. The broadcasting unit 600 thus receives the base layer stream of each unit resolution, the enhancement layer stream of each unit resolution generated on the basis of each base layer stream, and the up-sampled base layer stream of each unit resolution.
When the broadcast stream is to be decoded, the broadcasting unit 600 performs MPD (Media Presentation Description) signaling, which describes the tiling information and the information on the base layer stream, the enhancement layer stream, and the up-sampled base layer stream, and generates the MPD information.
FIGS. 3 and 4 are exemplary diagrams showing the structure of the MPD information. Referring to FIG. 3, the MPD information includes tiling information and dependency information between the base layer and enhancement layer streams.
That is, the tiling information includes a spatial relationship description (SRD) that indicates the viewpoint and position of each tile of the captured left-eye and right-eye images. The SRD of each tile is a component that can be encoded and decoded independently and is expressed as an adaptation set. For example, the SRD is specified as a schemeIdUri and a value using a SupplementalProperty within each adaptation set; during decoding, the schemeIdUri and value of the components in the corresponding adaptation set identify whether a component is a tile and, if so, the position and viewpoint of that tile. For example, the adaptation sets include Adaptation Set right tile 0 of the base layer, Adaptation Set left tile 0-1 of the enhancement layer for 3K resolution, Adaptation Set right tile 0-2 of the enhancement layer for 2K resolution, and so on, and the SupplementalProperty of Adaptation Set right tile 0 of the base layer includes representation 1 at 5 Mbps (3K), representation 2 at 2 Mbps (2K), representation 3 at 1 Mbps (1K), and representation 4 at 0.5 Mbps (SD).
For example, the schemeIdUri for the SRD is "urn:mpeg:dash:srd:2014", and its value is expressed as "source_id, object_x, object_y, object_width, object_height, total_width, total_height, spatial_set_id", from which the corresponding tile ID, the horizontal and vertical position of its upper-left corner, the width and height of the tile, the width and height of the original image, and the ID of the grouped tiles can be identified.
That is, when the 8K original resolution is tiled into 16 tiles, each tile has a resolution of 1920x1080. Accordingly, as in the 19th line, the value of the SupplementalProperty of the first tile of the left image is (2, 0, 0, 1920, 1080, 7680, 4320), and the value of the SupplementalProperty of the second tile is (2, 1920, 0, 1920, 1080, 7680, 4320).
Meanwhile, referring to FIG. 4, the MPD information further includes dependency information between the base layer and the enhancement layer for transmission over the hybrid network. For example, when decoding an enhancement layer stream, the dependency ID indicating which base layer stream of which unit resolution the enhancement layer stream depends on is expressed in the Representation of the adaptation set.
That is, as shown in FIGS. 3 and 4, since a base layer stream of each unit resolution has multiple enhancement layer streams, the relationship between the base layer and the enhancement layer must be specified in the Representation; because the dependency ID is the signal that determines which Representation a given Representation depends on, the dependency ID of the enhancement layer must match the Representation value of the base layer stream of the corresponding unit resolution.
On receiving the base layer stream, the enhancement layer stream, the up-sampled base layer stream, and the MPD information, the broadcasting unit 600 standardizes and multiplexes them according to the ATSC 3.0 broadcast platform (ROUTE) and the Internet platform (DASH) and transmits them. In this way, realistic media such as a mono UHD-resolution 360-degree VR broadcast service, an HD-resolution 360-degree VR broadcast service, and a stereoscopic UHD-resolution 360-degree VR broadcast service are provided to viewers.
For example, in an embodiment of the present invention, the base layer stream, the enhancement layer stream, and the up-sampled base layer stream applied to the broadcast network may be multiplexed by ROUTE. The broadcast network transport stream multiplexing follows ROUTE as standardized in ATSC 3.0, and the transport stream transmitted over the Internet network follows the DASH (Dynamic Adaptive Streaming over HTTP) specification.
To this end, the broadcasting unit 600 multiplexes the base layer stream, the enhancement layer stream, and the up-sampled base layer stream with ROUTE/DASH to generate a transport stream, and the transport stream is transmitted on a channel or PLP for each of the mono UHD 360-degree VR broadcast service, the HD 360-degree VR broadcast service, and the stereoscopic 360-degree VR broadcast service. Accordingly, the transport stream is produced and transmitted as the base layer stream together with an enhancement layer stream for each unit resolution from 4K to 8K. The base layer stream alone is played back as mono HD-resolution 360 video on a mobile or low-resolution display; the base layer stream and the enhancement layer stream are played back as mono UHD-resolution 360 video on a high-resolution display; and, when bandwidth conditions are good, the base layer stream, the up-sampled base layer stream, and the enhancement layer stream are played back as stereo UHD-resolution 360 video on a 3D display.
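A receiver-side view of this behavior is sketched below in Python: depending on the display type and the available bandwidth, a different combination of the transmitted streams is selected for playback. The function name, the display labels, and the bandwidth thresholds are illustrative assumptions and are not specified in this description.

```python
# Illustrative sketch: choose which layers to decode for each service class.
# Thresholds and names are assumptions, not values taken from the description.

def select_streams(display: str, bandwidth_mbps: float) -> dict:
    if display == "stereo_3d" and bandwidth_mbps >= 20:
        # stereo UHD 360 VR: base + up-sampled base + enhancement layers
        return {"service": "stereo UHD 360 VR",
                "streams": ["base", "upsampled_base", "enhancement"]}
    if display == "uhd" and bandwidth_mbps >= 10:
        # mono UHD 360 VR: base + enhancement layers
        return {"service": "mono UHD 360 VR",
                "streams": ["base", "enhancement"]}
    # mobile or low-resolution display: base layer only (mono HD 360 VR)
    return {"service": "mono HD 360 VR", "streams": ["base"]}

print(select_streams("stereo_3d", 25.0))
print(select_streams("mobile", 3.0))
```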
That is, as the base layer stream, the enhancement layer stream, the up-sampled base layer stream, and the MPD information are additionally transmitted over the broadcast network, viewers can be provided with a mono UHD- or HD-resolution 360 VR broadcast service and a stereo UHD-resolution 360 VR broadcast service.
Therefore, according to the present embodiment, by performing signaling with the MPD structure for the base layer stream, the enhancement layer stream, and the up-sampled base layer stream derived from the high-resolution left-eye and right-eye images captured by a 360-degree camera, realistic media such as mono HD-resolution 360 VR, mono UHD-resolution 360-degree VR, and stereo UHD-resolution 360-degree VR can be provided.
In addition, since the base layer stream, the enhancement layer stream, and the up-sampled base layer stream are derived using a single encoder, the complexity of the system can be reduced and the compression efficiency can be improved.
FIG. 5 is an overall flowchart showing a hybrid network-based image transmission process according to another embodiment of the present invention. Referring to FIG. 5, the hybrid network-based image transmission method according to another embodiment of the present invention may include an image acquisition step 10, a pre-processing step 20, a down-sampling step 30, an encoding step 40, an up-sampling step 50, and a broadcasting step 60.
In the image acquisition step 10, a high-resolution left-eye image and a high-resolution right-eye image are each acquired using a 360 camera, and each of the acquired left-eye and right-eye images is transferred to the pre-processing unit 200.
Each of the acquired left-eye and right-eye images is then tiled to a predetermined size, and the tiled left-eye and right-eye images are transferred to the down-sampling unit 300.
That is, in the down-sampling step 30, an intermediate resolution (4K) of the predetermined 8K original resolution is set as the reference resolution for the tiled left-eye and right-eye images, and down-sampling of the tiled right-eye image is then performed sequentially at each unit resolution below the set reference resolution (1K, 2K, 3K).
In the down-sampling step 30, down-sampling of the tiled left-eye image is also performed sequentially at each unit resolution exceeding the reference resolution (4K, 5K, 6K, 7K, 8K), and the right-eye and left-eye images down-sampled at each unit resolution are transferred to the encoder 400.
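The asymmetric schedule described above, in which the right-eye tiles are down-sampled below the 4K reference resolution and the left-eye tiles are handled at the resolutions the description lists from 4K to 8K, can be enumerated as follows. The Python sketch below only lists the (eye, resolution) pairs handed to the encoder; it performs no actual resampling, and its names are illustrative.

```python
# Sketch of the down-sampling schedule described above (no actual image processing).
# Reference resolution: 4K, an intermediate resolution of the 8K original.

REFERENCE = "4K"
UNIT_RESOLUTIONS = ["1K", "2K", "3K", "4K", "5K", "6K", "7K", "8K"]

def sampling_schedule():
    ref_idx = UNIT_RESOLUTIONS.index(REFERENCE)
    right_eye = UNIT_RESOLUTIONS[:ref_idx]   # 1K, 2K, 3K (below the reference)
    left_eye = UNIT_RESOLUTIONS[ref_idx:]    # 4K ... 8K  (as listed in the description)
    return [("right", r) for r in right_eye] + [("left", r) for r in left_eye]

for eye, res in sampling_schedule():
    print(f"{eye}-eye tiles -> {res}")
```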
Each of the right-eye and left-eye images down-sampled at each unit resolution is encoded in the encoding step 40 to output a base layer stream and an enhancement layer stream. That is, a base layer stream of each unit resolution is output for the right-eye image down-sampled at that unit resolution, and an enhancement layer stream of each unit resolution is output for each base layer stream of that unit resolution.
The base layer stream is then transferred to the up-sampling step 50, in which the base layer stream is up-sampled to a high resolution so that the base layer stream, the enhancement layer stream, and the up-sampled base layer stream are output.
At least one of the base layer stream, the enhancement layer stream, and the up-sampled base layer stream is transferred to the broadcasting step 60, in which it is standardized and multiplexed according to the standard specification of a broadcast network or an Internet network and then transmitted.
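Taken together, steps 10 through 60 form a simple processing chain. The following Python sketch outlines that chain as a sequence of stub functions; it performs no real image acquisition, encoding, up-sampling, or ROUTE/DASH multiplexing, and every name in it is an assumption made here for illustration.

```python
# Structural sketch of the method of FIG. 5 (steps 10-60); all stages are stubs.

def acquire_images():            # step 10: 360 camera, left/right eye
    return {"left": "L", "right": "R"}

def tile(images):                # step 20: tiling to a predetermined size (16 tiles assumed)
    return {eye: [f"{img}_tile{i}" for i in range(16)] for eye, img in images.items()}

def down_sample(tiles):          # step 30: per-eye, per-unit-resolution down-sampling
    return {"right": ["1K", "2K", "3K"], "left": ["4K", "5K", "6K", "7K", "8K"]}

def encode(samples):             # step 40: base layers (right eye) + enhancement layers (left eye)
    return {"base": samples["right"], "enhancement": samples["left"]}

def up_sample(streams):          # step 50: up-sample the base layer to high resolution
    return dict(streams, upsampled_base=[f"{r}_up" for r in streams["base"]])

def broadcast(streams):          # step 60: ROUTE/DASH standardization and multiplexing
    return f"transport stream with {sorted(streams)}"

print(broadcast(up_sample(encode(down_sample(tile(acquire_images()))))))
```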
Accordingly, according to the present embodiment, by performing signaling with the MPD structure for the base layer stream, the enhancement layer stream, and the up-sampled base layer stream derived from the high-resolution left-eye and right-eye images captured by a 360-degree camera, realistic media such as mono HD-resolution 360 VR, mono UHD-resolution 360-degree VR, and stereo UHD-resolution 360-degree VR can be provided.
In addition, since the base layer stream, the enhancement layer stream, and the up-sampled base layer stream are derived using a single encoder, the complexity of the system can be reduced and the compression efficiency can be improved.
Although the foregoing description has been made with reference to preferred embodiments of the present invention, those skilled in the art will understand that the present invention may be variously modified and changed without departing from the spirit and scope of the present invention as set forth in the claims below.

Claims (5)

  1. A hybrid network-based image transmission system comprising: an image acquisition unit that acquires a high-resolution left-eye image and a high-resolution right-eye image using a 360 camera;
    a pre-processing unit that performs tiling to a predetermined size on each of the acquired left-eye image and right-eye image; a plurality of down-sampling units that perform down-sampling on the tiled left-eye and right-eye images;
    an encoder that performs encoding on each of the down-sampled left-eye and right-eye images and outputs a base layer stream and an enhancement layer stream;
    an up-sampling unit that up-samples the base layer stream to a high resolution and outputs the base layer stream, the enhancement layer stream, and the up-sampled base layer stream; and
    a broadcasting unit that standardizes and multiplexes at least one of the base layer stream, the enhancement layer stream, and the up-sampled base layer stream according to a broadcast network standard specification and then transmits it.
  2. The hybrid network-based image transmission system of claim 1, wherein the plurality of down-sampling units
    set an arbitrary resolution among predetermined original resolutions as a reference resolution, then sequentially perform down-sampling on the tiled right-eye image at unit resolutions below the set reference resolution, and sequentially perform down-sampling on the tiled left-eye image at unit resolutions exceeding the reference resolution, and
    wherein the encoder
    performs encoding on the down-sampled right-eye image of each unit resolution to output a base layer stream of that unit resolution, and outputs an enhancement layer stream of each unit resolution for the down-sampled left-eye image of that unit resolution on the basis of the base layer stream.
  3. The hybrid network-based image transmission system of claim 2, wherein the broadcasting unit
    generates MPD (Media Presentation Description) information for the base layer stream and the enhancement layer stream and then propagates the MPD information together with at least one of the base layer stream, the enhancement layer stream, and the up-sampled base layer stream, and
    wherein the MPD information includes
    tiling information that is provided as an SRD (Spatial Relationship Description) and describes the viewpoint and position of the tiles of the captured left-eye image and right-eye image; and dependency information that describes, for each of the plurality of enhancement layer streams generated based on the base layer stream of each unit resolution, the ID of the base layer stream of that unit resolution on which it depends.
  4. A hybrid network-based image transmission method comprising: an image acquisition step of acquiring a high-resolution left-eye image and a high-resolution right-eye image using a 360 camera;
    a pre-processing step of performing tiling to a predetermined size on each of the acquired left-eye image and right-eye image;
    a plurality of down-sampling steps of performing down-sampling on the tiled left-eye and right-eye images;
    an encoding step of performing encoding on each of the down-sampled left-eye and right-eye images and outputting a base layer stream and an enhancement layer stream;
    an up-sampling step of up-sampling the base layer stream to a high resolution and outputting the base layer stream, the enhancement layer stream, and the up-sampled base layer stream; and
    a broadcasting step of standardizing and multiplexing at least one of the base layer stream, the enhancement layer stream, and the up-sampled base layer stream according to a broadcast network standard specification and then transmitting it.
  5. The hybrid network-based image transmission method of claim 4, wherein the plurality of down-sampling steps include
    setting an arbitrary resolution among predetermined original resolutions as a reference resolution; sequentially performing down-sampling on the tiled right-eye image at unit resolutions below the set reference resolution; and sequentially performing down-sampling on the tiled left-eye image at unit resolutions exceeding the reference resolution, and
    wherein the encoding step includes
    performing encoding on the down-sampled right-eye image of each unit resolution to output a base layer stream of that unit resolution; and outputting an enhancement layer stream of each unit resolution on the basis of the base layer stream for the down-sampled left-eye image of that unit resolution.
PCT/KR2018/016726 2018-12-24 2018-12-27 System and method for transmitting image on basis of hybrid network WO2020138536A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
KR1020180168045A KR102158007B1 (en) 2018-12-24 2018-12-24 System and method for transmissing images based on hybrid network
KR10-2018-0168045 2018-12-24

Publications (1)

Publication Number Publication Date
WO2020138536A1 true WO2020138536A1 (en) 2020-07-02

Family

ID=71126521

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2018/016726 WO2020138536A1 (en) 2018-12-24 2018-12-27 System and method for transmitting image on basis of hybrid network

Country Status (2)

Country Link
KR (1) KR102158007B1 (en)
WO (1) WO2020138536A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120033040A1 (en) * 2009-04-20 2012-02-09 Dolby Laboratories Licensing Corporation Filter Selection for Video Pre-Processing in Video Applications
US20140226952A1 (en) * 2010-05-18 2014-08-14 Enforcement Video, Llc Method and system for split-screen video display
US8923403B2 (en) * 2011-09-29 2014-12-30 Dolby Laboratories Licensing Corporation Dual-layer frame-compatible full-resolution stereoscopic 3D video delivery
KR20170138994A (en) * 2015-04-22 2017-12-18 엘지전자 주식회사 Broadcast signal transmission apparatus, broadcast signal reception apparatus, broadcast signal transmission method, and broadcast signal reception method
KR20180107149A (en) * 2016-02-17 2018-10-01 엘지전자 주식회사 360 how to transfer video, 360 how to receive video, 360 video transmission device, 360 video receiving device

Also Published As

Publication number Publication date
KR102158007B1 (en) 2020-09-22
KR20200078818A (en) 2020-07-02


Legal Events

Date Code Title Description
121 Ep: The EPO has been informed by WIPO that EP was designated in this application

Ref document number: 18944465

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: PCT application non-entry in European phase

Ref document number: 18944465

Country of ref document: EP

Kind code of ref document: A1