KR101910609B1

KR101910609B1 - System and method for providing user selective view

Info

Publication number: KR101910609B1
Application number: KR1020170093436A
Authority: KR
Inventors: 조용범; 이기승
Original assignee: 건국대학교 산학협력단
Priority date: 2017-07-24
Filing date: 2017-07-24
Publication date: 2018-10-23

Abstract

A system for providing user-selectable images, capable of compressing a plurality of image data captured from multiple viewpoints more efficiently by using a plurality of prediction methods, may comprise: an image encoder that receives a plurality of image data from a plurality of video cameras, encodes image data belonging to the same group among the received image data by a first prediction method, encodes image data belonging to different groups by a second prediction method, and transmits the encoded image data; and an image decoder that receives the encoded image data, receives a user input selecting at least one of the image data captured by the video cameras, and decodes the encoded image data based on the user input to output the decoded image data.

Description

SYSTEM AND METHOD FOR PROVIDING USER SELECTIVE VIEW

The present application relates to a system and method for providing user-selectable images.

A stereoscopic image (3D image) is a three-dimensional image that simultaneously provides shape information about depth and space. Whereas a stereo image provides images of different viewpoints to the left and right eyes, a stereoscopic image provides an image that appears to be seen from a different direction each time the observer changes viewpoint. Generating a stereoscopic image therefore requires images captured from multiple viewpoints (views).

In general, in the multi-view video coding process (Multi-View HEVC (high efficiency video coding)), a real scene is captured using two or more cameras, the resulting MVV (Multi-View Video) sequence is encoded, the bitstream is transmitted to the receiver side as MV-HEVC, and after decoding a 3D image can be displayed.

In this process, the images captured from multiple viewpoints to generate a stereoscopic image amount to an enormous volume of data. Therefore, given the network infrastructure, terrestrial broadcast bandwidth, and other resources needed to deliver stereoscopic images, implementation is heavily constrained even when the data is compressed with single-view video coding such as MPEG-2 or H.264/AVC, or with an encoder optimized for conventional MVV compression.

However, the images captured at each observer viewpoint are related to one another and therefore contain a large amount of redundant information. Accordingly, using an encoder optimized for multi-view video, one that removes inter-view redundancy and takes motion estimation (ME) along the spatio-temporal axes into account, allows less data to be transmitted and improves data transmission efficiency.

In addition, although a real scene can be captured with two or more cameras and the captured image data secured, constraints such as network infrastructure and broadcast bandwidth have conventionally limited the user terminal to receiving and outputting only the camera image unilaterally provided by the broadcasting station. Users were thus guaranteed a variety of channels to choose from, but not a choice among the camera images within a single channel.

The background art of the present application is disclosed in Korean Patent Laid-Open Publication No. 2017-0044637 (publication date: 2017.04.25).

The present application is intended to solve the problems of the prior art described above, and aims to provide a user-selectable image providing system and method capable of compressing a plurality of image data captured from multiple viewpoints more efficiently by using a plurality of prediction methods.

The present application is also intended to solve the problems of the prior art described above, and aims to provide a user-selectable image providing system and method that can deliver video at a more optimized speed by letting the user select a desired image from among images of various viewpoints and decoding only the encoded data of that image.

However, the technical problems to be addressed by the embodiments of the present application are not limited to those described above, and other technical problems may exist.

As a technical means for achieving the above technical objects, a user-selectable image providing system may include: an image encoding apparatus that receives a plurality of image data from a plurality of video cameras, encodes image data belonging to the same group among the received image data by a first prediction method, encodes image data belonging to different groups by a second prediction method, and transmits the encoded image data; and an image decoding apparatus that receives the encoded image data, receives a user input selecting at least one of the plurality of image data captured by the plurality of video cameras, and decodes and outputs the encoded image data based on the user input.

Also, according to an embodiment of the present application, the image encoding apparatus may encode by the first prediction method when the similarity of capture viewpoints between the plurality of image data is equal to or greater than a preset value, and by the second prediction method when that similarity is less than the preset value.

Also, according to an embodiment of the present application, the first prediction method may be MVC inter prediction, and the second prediction method may be simulcast inter prediction.

As a technical means for achieving the above technical objects, a user-selectable image encoding apparatus may include: an image data receiving unit that receives a plurality of image data captured by a plurality of video cameras; an encoding unit that encodes image data belonging to the same group among the received image data by a first prediction method and encodes image data belonging to different groups by a second prediction method; and a transmission unit that transmits the encoded image data.

Also, according to an embodiment of the present application, the encoding unit may encode by the first prediction method when there is redundancy in the capture viewpoint (view) between the plurality of image data.

Also, according to an embodiment of the present application, the encoding unit may encode by the first prediction method when the similarity of capture viewpoints between the plurality of image data is equal to or greater than a preset value, and by the second prediction method when that similarity is less than the preset value.

Also, according to an embodiment of the present application, the first prediction method may be MVC inter prediction, and the second prediction method may be simulcast inter prediction.

Also, according to an embodiment of the present application, the encoding unit may group the received image data according to the positional proximity of the plurality of video cameras, encode image data belonging to the same group by the first prediction method, and encode image data belonging to different groups by the second prediction method.

As a technical means for achieving the above technical objects, a user-selectable image decoding apparatus may include: a receiving unit that receives a plurality of encoded image data; a user input unit that receives a user input selecting at least one of the plurality of image data; and a decoding unit that decodes and outputs the encoded image data based on the user input, wherein, among the plurality of image data captured by a plurality of video cameras, image data belonging to the same group may have been encoded by a first prediction method and image data belonging to different groups may have been encoded by a second prediction method.

Also, according to an embodiment of the present application, the decoding unit may decode and output first image data selected from among the plurality of encoded image data based on the user input.

Also, according to an embodiment of the present application, the decoding unit may decode all of the encoded image data but output only the first image data selected based on the user input.

Also, according to an embodiment of the present application, the decoding unit may, based on the received user input, decode the image data of a first video camera associated with the first image data selected from among the plurality of image data, together with the image data of the video cameras belonging to the same group as the first video camera.
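The three decoder behaviors described above (decode only the selected view, decode everything and output only the selection, decode the selected camera's whole group) can be compared in a minimal Python sketch. The function name `views_to_decode`, the strategy labels, and the representation of groups as lists of camera indices are illustrative assumptions and do not appear in the disclosure.

```python
def views_to_decode(groups, selected, strategy="selected_only"):
    """Return the camera indices a decoder would decode for one user selection.

    groups   -- list of lists of camera indices (hypothetical grouping result)
    selected -- camera index chosen by the user
    strategy -- hypothetical label for one of the three described behaviors
    """
    all_views = sorted(cam for group in groups for cam in group)
    if selected not in all_views:
        raise ValueError(f"unknown camera index: {selected}")

    if strategy == "selected_only":
        # Decode only the view the user picked.
        return [selected]
    if strategy == "all":
        # Decode every view; only the selected one would be output.
        return all_views
    if strategy == "selected_group":
        # Decode the selected camera together with its group members.
        for group in groups:
            if selected in group:
                return sorted(group)
    raise ValueError(f"unknown strategy: {strategy}")


if __name__ == "__main__":
    groups = [[0, 1, 2], [3, 4]]
    for name in ("selected_only", "all", "selected_group"):
        print(name, views_to_decode(groups, 3, name))
```

Decoding the whole group costs more than decoding a single view, but it may make switching between neighboring cameras in the same group quicker; that trade-off is an interpretation, not a statement from the source.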

As a technical means for achieving the above technical objects, a method for providing user-selectable images may include: receiving a plurality of image data captured by a plurality of video cameras; encoding image data belonging to the same group among the received image data by a first prediction method and encoding image data belonging to different groups by a second prediction method; transmitting the encoded image data; receiving a user input selecting at least one of the plurality of image data; decoding the encoded image data; and transmitting the decoded image data.

Also, according to an embodiment of the present application, the decoding step may decode first image data selected from among the plurality of image data based on the user input, and the step of outputting the image data may output the decoded first image data.

Also, according to an embodiment of the present application, the decoding step may decode all of the encoded image data, and the step of outputting the image data may output the first image data selected based on the user input.

Also, according to an embodiment of the present application, the encoding step may encode by the first prediction method when there is redundancy in the capture viewpoint (view) between the plurality of image data.

Also, according to an embodiment of the present application, the encoding step may encode by the first prediction method when the similarity of capture viewpoints between the plurality of image data is equal to or greater than a preset value, and by the second prediction method when that similarity is less than the preset value.

Also, according to an embodiment of the present application, the first prediction method may be MVC inter prediction, and the second prediction method may be simulcast inter prediction.

The means for solving the problems described above are merely illustrative and should not be construed as limiting the present application. In addition to the illustrative embodiments described above, additional embodiments may exist in the drawings and the detailed description of the invention.

According to the above means for solving the problems, it is possible to provide a user-selectable image providing system and method capable of compressing a plurality of image data captured from multiple viewpoints more efficiently by using a plurality of prediction methods.

In addition, according to the above means for solving the problems, it is possible to provide a user-selectable image providing system and method that can deliver video at a more optimized speed by letting the user select a desired image from among images of various viewpoints and decoding only that image.

In addition, according to the above means for solving the problems, it is possible to provide a user-selectable image providing system and method that raise multi-view compression efficiency by using a plurality of video prediction methods simultaneously or selectively and that, by optimizing the speed between the encoder and the decoder, can decode only the image selected by the user.

FIG. 1 is a diagram schematically illustrating the configuration of a user-selectable image providing system according to an embodiment of the present application.
FIG. 2 is a diagram schematically illustrating an example of classifying a plurality of image data according to an embodiment of the present application.
FIG. 3A is a diagram illustrating an example MVC inter prediction pattern, which is the first prediction method according to an embodiment of the present application.
FIG. 3B is a diagram illustrating an example simulcast inter prediction pattern, which is the second prediction method according to an embodiment of the present application.
FIG. 4 is a block diagram illustrating a schematic configuration of an image encoding apparatus and an image decoding apparatus according to an embodiment of the present application.
FIG. 5 is a flowchart schematically illustrating a first embodiment of a method for providing user-selectable images according to an embodiment of the present application.
FIG. 6 is a flowchart schematically illustrating a second embodiment of a method for providing user-selectable images according to an embodiment of the present application.

Hereinafter, embodiments of the present application are described in detail with reference to the accompanying drawings so that those of ordinary skill in the art to which the present application pertains can readily practice them. The present application may, however, be implemented in many different forms and is not limited to the embodiments described here. To describe the present application clearly, parts of the drawings that are irrelevant to the description are omitted, and similar parts are given similar reference numerals throughout the specification.

Throughout this specification, when a part is said to be "connected" to another part, this includes not only the case where it is "directly connected" but also the case where it is "electrically connected" with another element in between.

Throughout this specification, when a member is said to be located "on", "above", "at the top of", "under", "below", or "at the bottom of" another member, this includes not only the case where the member is in contact with the other member but also the case where yet another member is present between the two.

Throughout this specification, when a part is said to "include" a component, this means that it may further include other components rather than excluding them, unless specifically stated otherwise.

FIG. 1 schematically illustrates the configuration of a user-selectable image providing system 100 according to an embodiment of the present application, FIG. 2 schematically illustrates an example of classifying a plurality of image data according to an embodiment of the present application, FIG. 3A illustrates an example MVC inter prediction pattern, which is the first prediction method according to an embodiment of the present application, and FIG. 3B illustrates an example simulcast inter prediction pattern, which is the second prediction method according to an embodiment of the present application.

Referring to FIG. 1, the user-selectable image providing system 100 may include an image encoding apparatus 110, an image decoding apparatus 120, a plurality of cameras 200, and a user terminal 300. Although not shown in FIG. 1, the user-selectable image providing system 100 may further include, for example, a broadcast server and apparatus, a device for receiving broadcast signals transmitted by a broadcasting station, and the like. Some components of the user-selectable image providing system 100 may also be provided in a single device or server. For example, the image decoding apparatus 120 and the user terminal 300 may be included in one device.

The plurality of cameras 200 may be video cameras that capture video at a plurality of positions. For example, the plurality of video cameras 200 may be video cameras that capture many individual areas or persons at a large venue such as a stadium or a concert. As one example, the plurality of cameras 200 may be arranged in a one-dimensional parallel, two-dimensional parallel, or one-dimensional array layout, and may be video cameras that capture video at positions spaced at a predetermined interval. The video cameras may include any camera capable of capturing video, such as a binocular camera, a camera using a horizontal rig, or a camera using an orthogonal rig.

The image encoding apparatus 110 may receive the plurality of image data captured by each of the plurality of video cameras 200. The image data received from the plurality of video cameras 200 may differ from one another in image data characteristics, degree of zoom-in/zoom-out, camera viewpoint (view), camera position, and so on. For example, the image data characteristics may include resolution, color, background similarity, number of pixels, number of image frames, and the like.

The image encoding apparatus 110 may encode the image data received from the video cameras 200. The image encoding apparatus 110 may also determine the encoding method for the received image data according to the characteristics of the received image data, the camera viewpoint, the degree of zoom-in/zoom-out, the camera position, and so on.

The image encoding apparatus 110 may encode image data belonging to the same group among the received image data by the first prediction method, and encode image data belonging to different groups by the second prediction method. According to an embodiment of the present application, image data may be determined to belong to the same group when, for example, there is redundancy in the capture viewpoint (view) between the received image data. More specifically, when the similarity of capture viewpoints between image data is equal to or greater than a preset value, the image encoding apparatus 110 may determine that the image data captured the same viewpoint and therefore belong to the same group. For example, the computation of capture viewpoint similarity may include the degree of similarity of the background contained in the image data. Alternatively, image data may be determined to belong to the same group when the extent to which they contain image data of the same object 1, or the degree of zoom-in/zoom-out on the same object 1, is equal to or greater than a preset value.

As another example, whether image data belong to the same group may be determined according to the positional proximity of the video cameras. For example, when the difference in distance, capture direction, or capture angle between video cameras is equal to or less than a preset reference value, the image data captured by those cameras may be determined to belong to the same group. The distance between video cameras may be determined from each camera's position value, GPS value, and the like, and the capture direction or capture angle may be determined from, for example, the angle of rotation relative to a preset reference line.
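The proximity-based grouping described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the `Camera` fields, the use of planar coordinates instead of GPS values, the requirement that both the distance and the heading difference stay within their thresholds, and the single-linkage (union-find) merging are all choices made for this example, not details given in the disclosure.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Camera:
    # Hypothetical camera description; the field names are assumptions.
    cam_id: int
    x: float            # position in metres on a flat plane
    y: float
    heading_deg: float  # rotation relative to a preset reference line


def angle_diff(a, b):
    """Smallest absolute difference between two headings, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def is_close(a, b, max_dist, max_angle):
    """True when two cameras are within both assumed thresholds."""
    dist = hypot(a.x - b.x, a.y - b.y)
    return dist <= max_dist and angle_diff(a.heading_deg, b.heading_deg) <= max_angle


def group_cameras(cameras, max_dist, max_angle):
    """Merge cameras into groups with union-find (single linkage)."""
    parent = {c.cam_id: c.cam_id for c in cameras}

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, a in enumerate(cameras):
        for b in cameras[i + 1:]:
            if is_close(a, b, max_dist, max_angle):
                parent[find(a.cam_id)] = find(b.cam_id)

    groups = {}
    for c in cameras:
        groups.setdefault(find(c.cam_id), []).append(c.cam_id)
    return sorted(sorted(g) for g in groups.values())


if __name__ == "__main__":
    cams = [
        Camera(0, 0.0, 0.0, 0.0),
        Camera(1, 1.0, 0.0, 5.0),
        Camera(2, 2.0, 0.0, 350.0),
        Camera(3, 50.0, 0.0, 0.0),
        Camera(4, 51.0, 0.0, 180.0),
    ]
    print(group_cameras(cams, max_dist=2.0, max_angle=20.0))
```

Single linkage means two cameras can end up in one group through a chain of close neighbors even if they are not close to each other directly; a stricter grouping rule could be substituted if that is undesirable.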

The image encoding apparatus 110 may encode image data belonging to the same group by the first prediction method. All of the video acquired from the plurality of cameras 200 must be compressed for transmission, yet the amount of data acquired from the plurality of cameras 200 may be very large. Because the image data of cameras belonging to the same group have temporal and spatial redundancy and allow motion estimation, encoding them with the first prediction method, whose compression efficiency is relatively high, raises the overall compression efficiency. In contrast, for image data belonging to different groups, that is, when there is no redundancy between the views of the image data or the similarity of capture viewpoints falls below the preset value, inter-view prediction between the image data is excluded and the data may be encoded by the second prediction method, whose compression efficiency is relatively low. In this way, the image encoding apparatus 110 of the present invention encodes and compresses a plurality of image data while selectively applying the encoding method according to the similarity and correlation between the image data, thereby improving compression efficiency and compression speed.

According to an embodiment of the present application, the image encoding apparatus 110 may encode by the first prediction method when the similarity of capture viewpoints between the plurality of image data is equal to or greater than a preset value, and by the second prediction method when that similarity is less than the preset value. Here, the capture viewpoint may include the capture angle or viewpoint as seen by an objective observer, or the degree of zoom-in/zoom-out. As an example, the capture viewpoint similarity threshold may be preset to 30%. When image analysis shows that the capture viewpoint similarity between first image data and second image data is equal to or greater than the preset value (for example, 30%), they may be encoded by the first prediction method. In contrast, when the capture viewpoint similarity between third image data and fourth image data is less than the preset value (for example, 30%), they may be encoded by the second prediction method.
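The threshold rule above can be illustrated with a short Python sketch. The 30% threshold comes from the example in the text; everything else is an assumption for illustration, in particular the `view_similarity` function, which uses a normalized histogram intersection over pixel intensities as a stand-in for the unspecified image analysis.

```python
from enum import Enum


class Prediction(Enum):
    MVC_INTER = "MVC inter prediction"
    SIMULCAST_INTER = "simulcast inter prediction"


def histogram(pixels, bins=8, max_value=256):
    """Normalized intensity histogram of a flat list of integer pixel values."""
    if not pixels:
        raise ValueError("pixels must not be empty")
    counts = [0] * bins
    for p in pixels:
        counts[p * bins // max_value] += 1
    return [c / len(pixels) for c in counts]


def view_similarity(pixels_a, pixels_b, bins=8):
    """Histogram intersection in [0, 1]; a hypothetical similarity measure."""
    ha, hb = histogram(pixels_a, bins), histogram(pixels_b, bins)
    return sum(min(x, y) for x, y in zip(ha, hb))


def select_prediction(similarity, threshold=0.30):
    """At or above the threshold use MVC inter prediction, otherwise simulcast."""
    if not 0.0 <= similarity <= 1.0:
        raise ValueError("similarity must lie in [0, 1]")
    if similarity >= threshold:
        return Prediction.MVC_INTER
    return Prediction.SIMULCAST_INTER


if __name__ == "__main__":
    a = [0, 10, 200, 250]
    b = [5, 20, 210, 100]
    s = view_similarity(a, b)
    print(s, select_prediction(s).value)
```

A real system would presumably compare backgrounds, objects, or zoom levels rather than raw intensity histograms; only the comparison against a preset value is taken from the text.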

As an example, the first prediction method may be MVC inter prediction, and the second prediction method may be simulcast inter prediction. MVC inter prediction, the first prediction method, is the inter prediction of multi-view video coding, and may raise compression efficiency through prediction not only between frames along the time axis but also between the image data of adjacent cameras. Simulcast inter prediction, the second prediction method, may be a coding method that excludes inter-view prediction between adjacent camera image data and uses only inter-frame prediction within the domain of a single camera's video, based on existing H.264/AVC and H.265/HEVC.
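The difference between the two prediction methods can be made concrete by listing which frames each method may draw on when predicting the frame of view `s` at time `t`. The sketch below is a deliberate simplification under stated assumptions: it only enumerates candidate reference frames (previous frame of the same camera, plus same-time frames of adjacent cameras for MVC) and does not reproduce the actual reference structure or coding-order constraints of H.264 MVC or MV-HEVC. The function name and the `"mvc"` / `"simulcast"` labels are hypothetical.

```python
def reference_candidates(view, time, method, num_views):
    """List (view, time) pairs that could serve as prediction references."""
    if method not in ("mvc", "simulcast"):
        raise ValueError(f"unknown method: {method}")

    refs = []
    if time > 0:
        # Temporal reference: previous frame of the same camera.
        refs.append((view, time - 1))
    if method == "mvc":
        # Inter-view references: same time instant, adjacent cameras.
        for neighbor in (view - 1, view + 1):
            if 0 <= neighbor < num_views:
                refs.append((neighbor, time))
    return refs


if __name__ == "__main__":
    print(reference_candidates(1, 2, "mvc", num_views=3))
    print(reference_candidates(1, 2, "simulcast", num_views=3))
```

With fewer candidates to search, simulcast does less comparison work per frame, which is consistent with the speed argument made in the text; the price is that inter-view redundancy goes unexploited.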

According to an embodiment of the present application, when the redundancy or similarity between the plurality of image data is equal to or greater than a predetermined value, the image encoding apparatus 110 raises compression efficiency by encoding with the first prediction method, which predicts by referring to the frames of each image data. When there is no redundancy or similarity between the plurality of image data, or it is less than the predetermined value, the apparatus can raise compression speed by selectively using the second prediction method, which predicts using only the frames of a single camera's video without referring to neighboring image data. Because the simulcast inter prediction method performs fewer inter-frame comparisons than the MVC inter prediction method, it may be used when the similarity between frames is relatively low. When there are a plurality of video cameras, cameras that capture relatively similar scenes are formed into groups; video from cameras in the same group may be compressed using the MVC inter prediction method, and video across groups may be compressed using the simulcast inter prediction method. Because the simulcast inter prediction method is faster than the MVC inter prediction method, according to an embodiment of the present application, the simulcast inter prediction method and the MVC inter prediction method may be applied selectively according to the inter-frame similarity of the plurality of camera videos, thereby improving compression efficiency and compression speed.

According to an embodiment of the present application, MVC inter prediction, the first prediction method, may be a prediction structure used to efficiently compress and encode images obtained with two or more cameras. Referring to FIGS. 3A and 3B, Sm denotes the camera or image data of the m-th view, and Tn may denote the temporally n-th picture or image frame. The arrows may denote predictive reference relationships between neighboring pictures. To maintain compatibility with existing systems, a view that can be reconstructed independently of the other views is called an I view (I frame), a view that is predictively encoded with reference to only one already-encoded view is called a P view (P frame), and a view that is predictively encoded with reference to the two views on either side of it is called a B view (B frame). For example, assume that view S0 is the I view, that S2, S4, S6, and S7 are P views, and that S1, S3, and S5 are B views. In this prediction structure, the I view is encoded first, followed by the P views and then the B views. For example, encoding may be performed in the order S0-S2-S1-S4-S3-S6-S5-S7, in which each B view is encoded once both of the views it references have been encoded. Anchor pictures are placed at regular intervals for random access. These anchor pictures may be encoded using only inter-view prediction.

Inter prediction is a coding technique based on the similarities between consecutive frames in a video sequence. It encodes an image by estimating and compensating the motion of the current frame block by block, using one or more reference frames. A block similar to the current block is searched for in the reference frame, and a motion vector is extracted. Encoding the residual between the current block and the similar block in the reference frame raises the compression ratio of the image encoding. Because the motion vector is needed to decode an image encoded with inter prediction, the motion vector is encoded as well.
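The view coding order in the example can be reproduced with a short sketch. The reference relationships in `EXAMPLE_REFS` are an assumption chosen to match the I/P/B assignment above (each P view referencing the previous I or P view, each B view referencing its two neighbors); FIGS. 3A and 3B may define them differently, and `view_coding_order` is a hypothetical helper rather than part of any codec API.

```python
# Assumed reference structure: view index -> views it is predicted from.
EXAMPLE_REFS = {
    0: (),            # I view: no references
    2: (0,), 4: (2,), 6: (4,), 7: (6,),   # P views: one reference each
    1: (0, 2), 3: (2, 4), 5: (4, 6),      # B views: two references each
}


def view_coding_order(refs):
    """Return an order in which every view follows the views it references.

    I and P views are visited in index order; after each one, any B view
    whose two references are now available is encoded immediately.
    """
    anchors = sorted(v for v, r in refs.items() if len(r) <= 1)
    pending_b = sorted(v for v, r in refs.items() if len(r) == 2)
    coded = []
    for view in anchors:
        if any(r not in coded for r in refs[view]):
            raise ValueError(f"view S{view} references a view not yet coded")
        coded.append(view)
        # Copy the ready list first so pending_b can be modified safely.
        for b in [b for b in pending_b if all(r in coded for r in refs[b])]:
            coded.append(b)
            pending_b.remove(b)
    if pending_b:
        raise ValueError(f"B views with unavailable references: {pending_b}")
    return coded


print("-".join(f"S{v}" for v in view_coding_order(EXAMPLE_REFS)))
# prints "S0-S2-S1-S4-S3-S6-S5-S7"
```

Encoding each B view as soon as both of its references exist keeps the number of views that must be held in memory small, which is one plausible reason for the interleaved order in the example.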

According to an embodiment of the present application, the second prediction method may be simulcast inter prediction. In inter-frame coding, a P frame is a predicted frame that references a past frame; typically the picture does not change as a whole from one frame to the next, but certain portions (pixels) move. The second prediction method may be a coding method that excludes inter-view prediction and uses only inter-frame prediction within a single camera's domain, based on the existing H.264/AVC and H.265/HEVC. The second prediction method only needs to encode the difference values between preceding and following frames.
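The temporal-only prediction described above can be illustrated with a minimal block-matching sketch, under the assumption of a full search that minimizes the sum of absolute differences (SAD). Real H.264/AVC and H.265/HEVC encoders use far more elaborate search and partitioning schemes; the function names and the toy frames below are hypothetical.

```python
def get_block(frame, top, left, size):
    """Extract a size x size block from a frame stored as a list of rows."""
    return [row[left:left + size] for row in frame[top:top + size]]


def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def motion_search(current, reference, top, left, size, radius):
    """Find the motion vector and residual for one block of `current`.

    Every displacement (dy, dx) within `radius` is tried against the
    reference frame of the same camera; the best match gives the motion
    vector, and the residual is what would actually be encoded.
    """
    target = get_block(current, top, left, size)
    height, width = len(reference), len(reference[0])
    best = None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            y, x = top + dy, left + dx
            if 0 <= y <= height - size and 0 <= x <= width - size:
                candidate = get_block(reference, y, x, size)
                cost = sad(target, candidate)
                if best is None or cost < best[0]:
                    best = (cost, (dy, dx), candidate)
    _, vector, prediction = best
    residual = [[t - p for t, p in zip(t_row, p_row)]
                for t_row, p_row in zip(target, prediction)]
    return vector, residual


# Toy frames: the current frame is the reference shifted right by one pixel.
reference = [[10 * y + x for x in range(6)] for y in range(6)]
current = [[row[0]] + row[:-1] for row in reference]
vector, residual = motion_search(current, reference, top=2, left=2, size=2, radius=2)
print(vector, residual)  # prints "(0, -1) [[0, 0], [0, 0]]"
```

In this toy case the residual is all zeros, so only the motion vector carries information, which is the source of the compression gain.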

The illustrations and descriptions of FIGS. 3A and 3B correspond to one embodiment; the image reference order and the prediction combination of each image frame for MVC inter prediction, the first prediction method, and simulcast inter prediction, the second prediction method, may be changed.

In addition, the image encoding apparatus 110 may transmit the plurality of encoded and compressed image data to the image decoding apparatus 120. According to an embodiment of the present application, the plurality of image data may be compressed into at least one bitstream and transmitted to the image decoding apparatus 120 over a network.

The image decoding apparatus 120 may receive the encoded image data. In addition, according to an embodiment of the present application, the image decoding apparatus 120 may decode and output the encoded image data based on a user input. According to an embodiment of the present application, the image decoding apparatus 120 may decode and output only the first image data selected by the user, based on the user input, from among the plurality of encoded image data. The image decoding apparatus 120 may receive a user input selecting at least one image data from among the plurality of image data captured by the plurality of image cameras 200. The user input may include camera position data, a camera identifier, a flag-based image selection input, and the like. Using the camera position data or flag selected by the user, the image decoding apparatus 120 may extract from the bitstream received from the image encoding apparatus 110 only the image data associated with the selected camera and decode it on its own. Accordingly, by decoding only the image data selected by the user from among the plurality of encoded image data, the image decoding apparatus 120 may improve decoding efficiency and speed.

In addition, based on the received user input, the image decoding apparatus 120 may decode the image data of the first image camera, which is associated with the first image data selected from among the plurality of image data, and of the image cameras belonging to the same group as the first image camera. For example, the image decoding apparatus 120 may extract and decode only the selected first image data and the image data that has an MVC inter prediction relationship with the first image data.
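One way to picture which image data must be extracted is to follow the MVC reference relationships outward from the selected camera. The sketch below assumes that references never cross group boundaries (which follows from using simulcast between groups) and uses a made-up reference table for the cameras of FIG. 2; `views_to_decode` and `ASSUMED_REFS` are hypothetical names, not part of the described system.

```python
# Assumed inter-view references inside each group (camera id -> referenced cameras).
ASSUMED_REFS = {
    211: (), 213: (211,), 212: (211, 213),   # first group 210
    221: (), 223: (221,), 222: (221, 223),   # second group 220
}


def views_to_decode(selected, refs):
    """Return the selected camera plus every camera it depends on, directly or indirectly."""
    needed, stack = set(), [selected]
    while stack:
        view = stack.pop()
        if view not in needed:
            needed.add(view)
            stack.extend(refs.get(view, ()))
    return needed


print(sorted(views_to_decode(212, ASSUMED_REFS)))  # prints "[211, 212, 213]"
print(sorted(views_to_decode(221, ASSUMED_REFS)))  # prints "[221]"
```

Under these assumptions the set to decode never exceeds the selected camera's own group, so the streams of every other group can be skipped entirely.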

For example, referring to FIG. 2, the camera associated with the first image data may be the first image camera 211. The first image camera 211 may be a camera belonging to the first group 210. The image cameras belonging to the same group as the first image camera may be the second and third image cameras 212 and 213. Between the image data of the first image camera, which is associated with the first image data selected based on the user input, and the image data of the cameras belonging to the same group, the redundancy and similarity of the shooting viewpoints are at or above the preset value. The spatial redundancy between images of the same group can therefore be exploited to provide high compression efficiency and decoding efficiency.

For example, when there are a plurality of image cameras, the images of the image cameras 211 to 213 of the first group 210, which capture relatively similar scenes, may be compressed using the MVC inter prediction method. Between the images of the image cameras 211 to 213 of the first group 210 and the images of the image cameras 221 to 223 of the second group 220, which are relatively dissimilar, compression may use the simulcast inter prediction method.

According to another embodiment of the present application, the image decoding apparatus 120 may decode all of the encoded image data and output all of the decoded image data. The decoded image data may be output to the user terminal 300. The image decoding apparatus 120 may output each image data in association with the identifier of the corresponding image camera 200. In addition, the image decoding apparatus 120 or the user terminal 300 may receive an input selecting one image data desired by the user from among the plurality of image data. Illustratively, the user may make an input selecting only one desired image from among the images output to the user terminal 300. Based on the user input, the image decoding apparatus 120 may output only the selected image from among the decoded image data and stop outputting the image data that was not selected.

The user terminal 300 may receive the input of a user selecting at least one image data from among the plurality of image data captured by the plurality of cameras. The user terminal 300 may receive the decoded image data from the image decoding apparatus 120. The user may use an input interface. For example, the user may select at least one image data from among the plurality of images using a remote control or a touch screen, but the input interface is not limited thereto.

In this way, according to an embodiment of the present application, the user may select and view, through the user terminal 300, only the image data of the image camera that the user wishes to watch from among the image data captured by the plurality of image cameras.

Illustratively, the user terminal 300 is a device that interworks with the image decoding apparatus 120 over a network. It may be, for example, any kind of wireless communication device, such as a smartphone, a smart pad, a tablet PC, a wearable device, or a PCS (Personal Communication System), GSM (Global System for Mobile communication), PDC (Personal Digital Cellular), PHS (Personal Handyphone System), PDA (Personal Digital Assistant), IMT (International Mobile Telecommunication)-2000, CDMA (Code Division Multiple Access)-2000, W-CDMA (Wideband Code Division Multiple Access), or WiBro (Wireless Broadband Internet) terminal, or it may be a fixed terminal such as a desktop computer or a smart TV.

A network may be used to share image data among the components of the image providing system 100. Examples of such a network include a LAN (Local Area Network), a wireless LAN (Wireless Local Area Network), a WAN (Wide Area Network), a PAN (Personal Area Network), a Wi-Fi network, a Bluetooth network, an NFC (Near Field Communication) network, 3G, LTE (Long Term Evolution), a 5G network, and a WiMAX (Worldwide Interoperability for Microwave Access) network.

FIG. 4 is a block diagram showing a schematic configuration of an image encoding apparatus and an image decoding apparatus according to an embodiment of the present application.

Referring to FIG. 4, the user-selectable image providing system 100 may include the image encoding apparatus 110 and the image decoding apparatus 120. The image encoding apparatus 110 may include an image data receiving unit 111, an encoding unit 112, and a transmitting unit 113. The image decoding apparatus 120 may include a user input unit 121, a decoding unit 122, and a receiving unit 123.

The image data receiving unit 111 may receive the plurality of image data captured by each of the plurality of image cameras 200. The plurality of image data may differ from one another in the characteristics of the image data and in the viewpoint, position, and so on of the image camera.

The encoding unit 112 may encode with the first prediction method between image data belonging to the same group among the received plurality of image data, and with the second prediction method between image data belonging to different groups.

The encoding unit 112 may encode with the first prediction method when the shooting viewpoints (views) of the plurality of image data overlap. Because many similar images or frames exist among image data with overlapping views, the first prediction method, which references the images of multiple cameras, can raise compression efficiency when the image data is compressed.

The encoding unit 112 may encode with the first prediction method when the similarity of the shooting viewpoint between the plurality of image data is equal to or greater than a preset value. The encoding unit 112 may encode with the second prediction method when the similarity of the shooting viewpoint between the plurality of image data is less than the preset value.

For example, similarity of the shooting viewpoint between a plurality of image data may mean that their backgrounds are similar. As one example, the preset similarity value may be 30%: the encoding unit 112 may encode with the first prediction method when the similarity of the shooting viewpoint (background) between the plurality of image data is 30% or more. Illustratively, referring to FIG. 2, the shooting viewpoints (backgrounds) captured by the image cameras 211, 212, and 213 included in the first group 210, or by the image cameras 221, 222, and 223 included in the second group 220, may have relatively high similarity to one another. In contrast, the shooting viewpoint (background) captured by the first group 210 and that captured by the second group 220 may have relatively low similarity to each other.
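The threshold rule above can be sketched as follows. The description does not say how background similarity is measured, so `background_similarity` uses a purely illustrative measure (the fraction of co-located pixels whose values differ by at most a tolerance); only the 30% threshold and the choice between the two prediction methods come from the text.

```python
SIMILARITY_THRESHOLD = 0.30  # the 30% example value from the description


def background_similarity(frame_a, frame_b, tolerance=8):
    """Illustrative measure: share of co-located pixels that are nearly equal."""
    total = matches = 0
    for row_a, row_b in zip(frame_a, frame_b):
        for a, b in zip(row_a, row_b):
            total += 1
            matches += abs(a - b) <= tolerance
    return matches / total if total else 0.0


def choose_prediction(similarity, threshold=SIMILARITY_THRESHOLD):
    """First method at or above the threshold, second method below it."""
    return "mvc_inter" if similarity >= threshold else "simulcast_inter"


frame_a = [[10, 20], [30, 40]]
frame_b = [[12, 200], [31, 90]]
similarity = background_similarity(frame_a, frame_b)
print(similarity, choose_prediction(similarity))  # prints "0.5 mvc_inter"
```

The comparison is inclusive at the threshold, matching the "30% or more" wording.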

According to the positional proximity of the plurality of image cameras 200, the encoding unit 112 may encode with the first prediction method between image data belonging to the same group among the received plurality of image data, and with the second prediction method between image data belonging to different groups. The positional proximity of the plurality of image cameras 200 may mean how close the cameras are to one another. If a first camera and a second camera are positioned adjacent to each other, their image data may capture the same viewpoint. Illustratively, referring to FIG. 2, the cameras belonging to the first group 210 may be cameras positioned close to one another. As one example, the cameras belonging to the first group 210 may be arranged at regular intervals. The image cameras 200 belonging to the same group may be arranged in a one-dimensional parallel, one-dimensional convergent, one-dimensional arc, two-dimensional parallel, or two-dimensional array layout, among others. Because adjacent cameras in the same group capture similar scenes at the same time, their image data carries nearly the same information apart from parallax and slight illumination differences, so the inter-view correlation may be high. In contrast, the image data captured by the image cameras of the first group 210 and of the second group 220 may be image data belonging to different groups. The image cameras 211 to 213 included in the first group 210 and the image cameras 221 to 223 included in the second group 220 are farther from each other than the cameras within each group are, so the similarity or correlation of their images (backgrounds) may be low, and they may be encoded with the second prediction method.
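Grouping by positional proximity can be sketched for the simplest layout, cameras placed along a line. The positions, the `max_gap` value, and both function names are assumptions made for illustration; the description does not fix a distance metric or a reference value.

```python
def group_by_proximity(positions, max_gap):
    """Group cameras on a line: a gap larger than max_gap starts a new group.

    positions maps camera id -> coordinate along the line (arbitrary units).
    """
    groups = []
    last_pos = None
    for camera, pos in sorted(positions.items(), key=lambda item: item[1]):
        if last_pos is not None and pos - last_pos <= max_gap:
            groups[-1].append(camera)
        else:
            groups.append([camera])
        last_pos = pos
    return groups


def prediction_between(camera_a, camera_b, groups):
    """MVC inter prediction inside a group, simulcast across groups."""
    same_group = any(camera_a in g and camera_b in g for g in groups)
    return "mvc_inter" if same_group else "simulcast_inter"


# Hypothetical positions loosely modeled on the two groups of FIG. 2.
positions = {211: 0.0, 212: 1.0, 213: 2.0, 221: 10.0, 222: 11.0, 223: 12.0}
groups = group_by_proximity(positions, max_gap=1.5)
print(groups)                                 # prints "[[211, 212, 213], [221, 222, 223]]"
print(prediction_between(211, 213, groups))   # prints "mvc_inter"
print(prediction_between(213, 221, groups))   # prints "simulcast_inter"
```

A two-dimensional layout would need a real clustering step in place of the sorted gap test, but the mapping from group membership to prediction method would stay the same.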

The transmitting unit 113 may transmit the encoded image data. The transmitting unit 113 may transmit the image data encoded by the encoding unit 112 to the image decoding apparatus 120.

The user input receiving unit 121 may receive a user input selecting at least one image data from among the plurality of image data. The user input receiving unit 121 may pass the user input information to the decoding unit 122.

The decoding unit 122 may decode the image data encoded by the encoding unit 112.

According to an embodiment of the present application, the decoding unit 122 may decode all of the image data encoded by the encoding unit 112 with the first prediction method and the second prediction method. According to another embodiment of the present application, the decoding unit 122 may decode only the image data (the first image data) selected based on the user input information received by the user input receiving unit 121.

The receiving unit 123 may receive the plurality of encoded image data. The receiving unit 123 may receive from the transmitting unit 113 all of the plurality of image data encoded by the encoding unit 112.

FIG. 5 is a flowchart schematically showing a first embodiment of a method for providing a user-selectable image according to an embodiment of the present application. The method shown in FIG. 5 may be performed by the operation of the user-selectable image providing system 100 described with reference to FIGS. 1 to 4. Therefore, even where omitted below, the description of the user-selectable image providing system 100 given with reference to FIGS. 1 to 4 applies equally to FIG. 5, and the details are not repeated.

Referring to FIG. 5, according to an embodiment of the present application, in step S510 the plurality of cameras 200 may transmit the plurality of image data they have captured to the image encoding apparatus 110. In step S520, the image encoding apparatus 110 may receive the plurality of image data transmitted from the plurality of cameras 200.

In step S530, the image encoding apparatus 110 may determine the encoding method for the received image data according to the characteristics of the received image data and the viewpoint, position, and so on of the image cameras. In addition, according to those characteristics, viewpoints, and positions, the image encoding apparatus 110 may encode with the first prediction method between image data belonging to the same group among the received plurality of image data, and with the second prediction method between image data belonging to different groups. For example, the first prediction method may be MVC inter prediction, and the second prediction method may be simulcast inter prediction.

In step S540, the image encoding apparatus 110 may transmit the encoded image data to the image decoding apparatus 120. In step S550, the image decoding apparatus 120 may receive the plurality of encoded image data.

In step S560, the image decoding apparatus 120 may receive a user input selecting at least one image data from among the plurality of image data. In step S570, based on the user input information, the image decoding apparatus 120 may decode the first image data selected by the user from among the plurality of image data encoded by the image encoding apparatus 110.

In step S580, the image decoding apparatus 120 may output the decoded image data (the first image data). According to an embodiment of the present application, the image decoding apparatus 120 may be provided with all of the decoded image data while outputting only the image selected by the user.

FIG. 6 is a flowchart schematically showing a second embodiment of a method for providing a user-selectable image according to an embodiment of the present application. The method shown in FIG. 6 may be performed by the operation of the user-selectable image providing system 100 described with reference to FIGS. 1 to 4. Therefore, even where omitted below, the description of the user-selectable image providing system 100 given with reference to FIGS. 1 to 4 applies equally to FIG. 6, and the details are not repeated.

According to another embodiment of the present application, in step S610 the plurality of cameras 200 may transmit the plurality of captured image data to the image encoding apparatus 110. In step S620, the image encoding apparatus 110 may receive the plurality of image data transmitted from the plurality of cameras 200.

In step S630, the image encoding apparatus 110 may determine the encoding method for the received image data according to the characteristics of the received image data and the viewpoint, position, and so on of the image cameras. The image encoding apparatus 110 may encode with the first prediction method between image data belonging to the same group among the received plurality of image data. The image encoding apparatus 110 may encode with the second prediction method between image data belonging to different groups. The first prediction method may be MVC inter prediction, and the second prediction method may be simulcast inter prediction.

In step S640, the image encoding apparatus 110 may transmit the encoded image data to the image decoding apparatus 120. In step S650, the image decoding apparatus 120 may receive the plurality of encoded image data. The image decoding apparatus 120 may receive all of the plurality of image data encoded by the image encoding apparatus 110.

In step S660, the image decoding apparatus 120 may decode all of the plurality of encoded image data. In step S670, the image decoding apparatus 120 may receive a user input. The image decoding apparatus 120 may output all of the decoded image data to the user terminal 300 and may then receive an input selecting the image data that the user wants.

In step S680, the image decoding apparatus 120 may output the selected image data based on the user input. The image decoding apparatus 120 may stop outputting the remaining image data that was not selected and output only the image data selected by the user input.
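The difference between the flows of FIG. 5 and FIG. 6 lies in where the user selection takes effect, which the toy sketch below makes concrete. Here `decode` is a stand-in that merely reverses bytes, each stream is assumed to be decodable on its own (as with simulcast), and all names are hypothetical.

```python
def decode(payload):
    """Stand-in for a real video decoder (hypothetical): reverses the bytes."""
    return payload[::-1]


def selective_flow(encoded, selected):
    """FIG. 5 style: take the selection first, then decode only that stream."""
    output = {selected: decode(encoded[selected])}
    return output, 1  # one stream decoded


def decode_all_flow(encoded, selected):
    """FIG. 6 style: decode every stream, then output only the selected one."""
    decoded = {camera: decode(data) for camera, data in encoded.items()}
    output = {selected: decoded[selected]}
    return output, len(decoded)  # every stream decoded


encoded = {211: b"abc", 212: b"def", 221: b"ghi"}
print(selective_flow(encoded, 212))   # prints "({212: b'fed'}, 1)"
print(decode_all_flow(encoded, 212))  # prints "({212: b'fed'}, 3)"
```

Both flows deliver the same picture to the user. The first spends less decoding work, while the second already holds every decoded view and so could switch views without waiting for a new decode.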

The method for providing a user-selectable image according to an embodiment of the present application may be implemented in the form of program instructions executable by various computer means and recorded on a computer-readable medium. The computer-readable medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the medium may be specially designed and configured for the present invention or may be known and available to those skilled in computer software. Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include not only machine code such as that produced by a compiler but also high-level language code that a computer can execute using an interpreter or the like. The hardware devices described above may be configured to operate as one or more software modules in order to perform the operations of the present invention, and vice versa.

The foregoing description of the present application is illustrative, and a person of ordinary skill in the art to which the present application pertains will understand that it can easily be modified into other specific forms without changing its technical idea or essential features. The embodiments described above should therefore be understood as illustrative in all respects and not restrictive. For example, each component described as a single unit may be implemented in a distributed manner, and components described as distributed may likewise be implemented in a combined form.

The scope of the present application is indicated by the claims below rather than by the detailed description above, and all changes or modifications derived from the meaning and scope of the claims and their equivalents should be construed as falling within the scope of the present application.

100: image providing apparatus
110: image encoding device
111: image data receiving unit
112: encoding unit
113: transmitting unit
120: image decoding device
121: user input unit
122: decoding unit
123: receiving unit
200: plurality of image cameras
300: user terminal

Claims

A user-selectable image providing system comprising:
an image encoding device that receives a plurality of image data from a plurality of image cameras, classifies the plurality of image data as image data belonging to the same group when the similarity of photographing viewpoint among the received image data is equal to or greater than a preset reference value and the positional proximity is equal to or less than a preset reference value, encodes the image data belonging to the same group with a first prediction method, encodes image data belonging to different groups with a second prediction method, and transmits the encoded image data; and
an image decoding device that receives the encoded image data, receives a user input for selecting at least one image data among the plurality of image data captured by the plurality of image cameras, and decodes the encoded image data based on the user input to output the decoded image data,
wherein the similarity of photographing viewpoint is calculated in consideration of the background included in the plurality of image data, the degree to which image data relating to the same object is included, and zoom-in/zoom-out,
wherein the positional proximity is determined in consideration of the distance between the image cameras and the difference in photographing direction and photographing angle, and
wherein the distance between the image cameras is determined based on a position value of each camera, and the photographing direction or photographing angle is determined based on an angle of rotation relative to a preset reference line.

delete

The system according to claim 1,
wherein the first prediction method is MVC inter prediction and the second prediction method is simulcast inter prediction.

A user-selectable image encoding device comprising:
an image data receiving unit that receives a plurality of image data captured by a plurality of image cameras;
an encoding unit that classifies the plurality of image data as image data belonging to the same group when the similarity of photographing viewpoint among the plurality of image data is equal to or greater than a preset reference value and the positional proximity of the plurality of image cameras is equal to or less than a preset reference value, encodes the image data belonging to the same group using a first prediction method, and encodes image data belonging to different groups using a second prediction method; and
a transmitting unit that transmits the encoded image data,
wherein the similarity of photographing viewpoint is calculated in consideration of the background included in the plurality of image data, the degree to which image data relating to the same object is included, and zoom-in/zoom-out,
wherein the positional proximity is determined in consideration of the distance between the image cameras and the difference in photographing direction and photographing angle, and
wherein the distance between the image cameras is determined based on a position value of each camera, and the photographing direction or photographing angle is determined based on an angle of rotation relative to a preset reference line.

delete

The image encoding device according to claim 4,
wherein the first prediction method is MVC inter prediction and the second prediction method is simulcast inter prediction.

delete

A user-selectable image decoding device comprising:
a receiving unit that receives a plurality of encoded image data;
a user input unit that receives a user input for selecting at least one image data among the plurality of image data; and
a decoding unit that decodes the encoded image data based on the user input and outputs the decoded image data,
wherein the plurality of encoded image data are generated such that, among a plurality of image data captured by a plurality of image cameras, image data for which the similarity of photographing viewpoint is equal to or greater than a preset reference value and the positional proximity of the image cameras is equal to or less than a preset reference value are classified into the same group, image data belonging to the same group are encoded with a first prediction method, and image data belonging to different groups are encoded with a second prediction method,
wherein the similarity of photographing viewpoint is calculated in consideration of the background included in the plurality of image data, the degree to which image data relating to the same object is included, and zoom-in/zoom-out,
wherein the positional proximity is determined in consideration of the distance between the image cameras and the difference in photographing direction and photographing angle, and
wherein the distance between the image cameras is determined based on a position value of each camera, and the photographing direction or photographing angle is determined based on an angle of rotation relative to a preset reference line.

The image decoding device according to claim 9,
wherein the decoding unit decodes first image data selected from among the plurality of encoded image data based on the user input and outputs the decoded first image data.

The image decoding device according to claim 9,
wherein the decoding unit decodes all of the encoded image data and outputs first image data selected based on the user input.

The image decoding device according to claim 9,
wherein the first prediction method is MVC inter prediction and the second prediction method is simulcast inter prediction.

The image decoding device according to claim 9,
wherein the decoding unit decodes, based on the received user input, image data of a first image camera associated with the selected first image data and image data of an image camera belonging to the same group as the first image camera.

A user-selectable image providing method comprising:
receiving a plurality of image data captured by a plurality of image cameras;
classifying the plurality of image data as image data belonging to the same group when the similarity of photographing viewpoint among the plurality of image data is equal to or greater than a preset reference value and the positional proximity of the plurality of image cameras is equal to or less than a preset reference value, encoding the image data belonging to the same group using a first prediction method, and encoding image data belonging to different groups using a second prediction method;
transmitting the encoded image data;
receiving a user input for selecting at least one of the plurality of image data;
decoding the encoded image data; and
outputting the decoded image data,
wherein the similarity of photographing viewpoint is calculated in consideration of the background included in the plurality of image data, the degree to which image data relating to the same object is included, and zoom-in/zoom-out,
wherein the positional proximity is determined in consideration of the distance between the image cameras and the difference in photographing direction and photographing angle, and
wherein the distance between the image cameras is determined based on a position value of each camera, and the photographing direction or photographing angle is determined based on an angle of rotation relative to a preset reference line.

The method according to claim 14,
wherein the decoding step decodes first image data selected from among the plurality of image data based on the user input, and
wherein the outputting step outputs the decoded first image data.

The method according to claim 14,
wherein the decoding step decodes all of the encoded image data, and
wherein the outputting step outputs first image data selected based on the user input.

delete

The method according to claim 14,
wherein the first prediction method is MVC inter prediction and the second prediction method is simulcast inter prediction.

A computer-readable recording medium storing a program for causing a computer to execute the method of any one of claims 14 to 16 and 19.
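
The grouping criterion recited in the claims above (similarity at or above a reference value and positional proximity at or below a reference value) can be illustrated with the following sketch. It is not the claimed implementation. The `Camera` fields, the weighted-sum form of `positional_proximity`, the `angle_weight` value, the caller-supplied `similarity` function, and the transitive merging of pairs into groups are all assumptions made for illustration. Only the two threshold comparisons and the mapping of same-group pairs to MVC inter prediction and different-group pairs to simulcast inter prediction come from the claims.

```python
import math
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence


@dataclass(frozen=True)
class Camera:
    cam_id: int
    x: float          # position value of the camera (assumed 2D)
    y: float
    angle_deg: float  # rotation relative to a preset reference line


def positional_proximity(a: Camera, b: Camera, angle_weight: float = 0.1) -> float:
    """Smaller means closer. The weighted sum is an assumed formula."""
    distance = math.hypot(a.x - b.x, a.y - b.y)
    diff = abs(a.angle_deg - b.angle_deg) % 360.0
    angle_diff = min(diff, 360.0 - diff)  # shortest angular difference
    return distance + angle_weight * angle_diff


def group_cameras(
    cameras: Sequence[Camera],
    similarity: Callable[[Camera, Camera], float],
    sim_threshold: float,
    prox_threshold: float,
) -> List[List[int]]:
    """Merge cameras into groups when both threshold conditions hold."""
    parent: Dict[int, int] = {c.cam_id: c.cam_id for c in cameras}

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, a in enumerate(cameras):
        for b in cameras[i + 1:]:
            if (similarity(a, b) >= sim_threshold
                    and positional_proximity(a, b) <= prox_threshold):
                parent[find(b.cam_id)] = find(a.cam_id)

    groups: Dict[int, List[int]] = {}
    for c in cameras:
        groups.setdefault(find(c.cam_id), []).append(c.cam_id)
    return list(groups.values())


def prediction_method(cam_a: int, cam_b: int, groups: List[List[int]]) -> str:
    """Same group: first prediction method; different groups: second."""
    same_group = any(cam_a in g and cam_b in g for g in groups)
    return "MVC inter prediction" if same_group else "simulcast inter prediction"
```

With two nearby cameras facing almost the same direction and a third placed far away, the first two would fall into one group and the third into its own group under these assumed weights and thresholds.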