KR20230053229A

KR20230053229A - Device and Method for Performing Distributed Parallel-Transcoding

Info

Publication number: KR20230053229A
Application number: KR1020210136531A
Authority: KR
Inventors: 박성수; 김동원; 김재일; 문정미; 황태승
Original assignee: 에스케이텔레콤 주식회사
Priority date: 2021-10-14
Filing date: 2021-10-14
Publication date: 2023-04-21

Abstract

A distributed parallel transcoding method and device are disclosed. According to one aspect of the present invention, a computer-implemented method for parallel transcoding of video content that contains key frames at given frame intervals comprises: dividing the playback time of the video content into a plurality of time intervals; adjusting at least one of a start point or an end point of one of the plurality of time intervals based on the positions of the key frames; decoding the video content part corresponding to the adjusted time interval using one transcoder among a plurality of transcoders; and encoding the decoded video content part within the range of the one time interval using the one transcoder. The present invention thus divides video content into a plurality of data chunks and assigns them to a plurality of transcoders that transcode the divided video content in parallel, thereby reducing the time required for transcoding.

Description

Distributed parallel transcoding method and apparatus {Device and Method for Performing Distributed Parallel-Transcoding}

Embodiments of the present invention relate to a distributed transcoding method and apparatus.

The content described in this section merely provides background information on the present invention and does not constitute prior art.

Media services such as video on demand (VOD) systems and real-time streaming services have recently been expanding.

In a system that provides media services, a server compresses media content in response to a user's request and provides the compressed media content to a user terminal. The user terminal restores the original media content from the compressed media content.

In general, a separate encoding server is provided in addition to the server that manages media content, and a compression algorithm such as MPEG2 or H.264 is applied to the media content. In particular, because encoding and decoding video content consumes a large amount of computational resources, one item of media content has conventionally been assigned to, and encoded on, a single high-performance dedicated server.

However, a system that maps content 1:1 to a dedicated transcoding server is limited in its ability to provide a high-definition VOD service or a real-time streaming service. That is, in the conventional system, the increased time required for encoding and decoding prevents smooth service.

Furthermore, demand is growing for media services that deliver consistent quality to users in various environments. To meet this demand, the original media content needs to be compressed in consideration of each user's environment. That is, media content needs to be transcoded in consideration of various conditions such as the performance of the user terminal, display information, network conditions, and service scenarios.

Here, transcoding means converting original media content into a different format to suit a user's environment, or changing the resolution, frame rate, bitrate, and the like of the original media content. That is, transcoding refers to the process of converting video data and audio data created by a producer into a form that a user can view.

As such, a system that maps content 1:1 to a dedicated transcoding server is limited in accommodating users' diverse environments and in providing high-quality media services.

A main object of embodiments of the present invention is to provide a method and apparatus that divide video content into a plurality of data chunks and assign the chunks to a plurality of transcoders so that the divided video content is transcoded in parallel.

An object of other embodiments of the present invention is to provide a method and apparatus for preventing frame loss during decoding when a frame interval, which is the interval between key frames in the video content, does not coincide with a time interval divided from the playback time of the video content.

According to one aspect of the present invention, there is provided a computer-implemented method for parallel transcoding of video content containing key frames at given frame intervals, the method comprising: dividing the playback time of the video content into a plurality of time intervals; adjusting at least one of a start point or an end point of one of the plurality of time intervals based on the positions of the key frames; decoding the video content part corresponding to the adjusted time interval using one transcoder among a plurality of transcoders; and encoding the decoded video content part within the range of the one time interval using the one transcoder.
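The adjusting, decoding, and encoding steps above can be sketched as follows. This is a minimal illustration, not the claimed implementation: the function names are hypothetical, times are in milliseconds, and the sketch assumes that the adjustment moves the start point back to the nearest key frame at or before it (the text above only states that a start or end point is adjusted based on key frame positions).

```python
from bisect import bisect_right

def adjust_start_to_key_frame(interval, key_frames):
    """Move the start of (start, end) back to the closest key frame at or before it.

    key_frames is a sorted list of key frame timestamps in milliseconds.
    Snapping backwards is an assumption made for this sketch.
    """
    start, end = interval
    idx = bisect_right(key_frames, start) - 1
    adjusted_start = key_frames[idx] if idx >= 0 else start
    return (adjusted_start, end)

def plan_chunk(interval, key_frames):
    """Return the range to decode (adjusted) and the range to encode (original)."""
    return {
        "decode": adjust_start_to_key_frame(interval, key_frames),
        "encode": interval,
    }

if __name__ == "__main__":
    key_frames = [0, 2000, 4000, 6000, 8000]    # one key frame every 2 s
    print(plan_chunk((3000, 6000), key_frames))  # decode starts at the 2000 ms key frame
```

Decoding from the earlier key frame gives the decoder the reference it needs for frames at the start of the interval, while encoding stays within the original interval so that adjacent chunks do not overlap.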

According to another aspect of the present embodiment, there is provided an apparatus comprising a memory that stores instructions and at least one processor, wherein the at least one processor, by executing the instructions, divides the playback time of video content containing key frames at given frame intervals into a plurality of time intervals; adjusts at least one of a start point or an end point of one of the plurality of time intervals based on the positions of the key frames; decodes the video content part corresponding to the adjusted time interval using one transcoder among a plurality of transcoders; and encodes the decoded video content part within the range of the one time interval using the one transcoder.

As described above, according to an embodiment of the present invention, video content is divided into a plurality of data chunks and assigned to a plurality of transcoders so that the divided video content is transcoded in parallel, which reduces the time required for transcoding.

According to another embodiment of the present invention, video content is transcoded in consideration of various conditions, such as the user's video playback environment and the network conditions between the user terminal and the server, so that an optimal media service can be provided to the user.

According to another embodiment of the present invention, frame loss during decoding can be prevented when a frame interval, which is the interval between key frames in the video content, does not coincide with a time interval divided from the playback time of the video content.

FIG. 1 is a diagram showing a transcoding process according to an embodiment of the present invention.
FIG. 2 is a diagram showing the configuration of a transcoding system according to an embodiment of the present invention.
FIG. 3 is a block diagram showing a distributed parallel transcoding system according to an embodiment of the present invention.
FIG. 4 is a flowchart illustrating a parallel transcoding process according to an embodiment of the present invention.
FIG. 5 is a flowchart illustrating the operation of an analysis unit according to an embodiment of the present invention.
FIG. 6 is a flowchart illustrating the operation of a transcoding unit according to an embodiment of the present invention.
FIG. 7 is a flowchart illustrating the operation of a merging unit according to an embodiment of the present invention.
FIG. 8 is a flowchart illustrating the operation of a control unit according to an embodiment of the present invention.
FIG. 9 is a diagram illustrating the frame-by-frame configuration of video content.
FIG. 10 is a diagram showing the configuration of frame groups of video content.
FIGS. 11A and 11B are diagrams illustrating transcoding sections of frames.
FIG. 12 is a diagram showing transcoding sections of frames according to an embodiment of the present invention.
FIG. 13 is a flowchart illustrating a parallel transcoding process according to an embodiment of the present invention.

Hereinafter, some embodiments of the present invention are described in detail with reference to exemplary drawings. In assigning reference numerals to the components of each drawing, note that the same components are given the same numerals wherever possible, even when they appear in different drawings. In describing the present invention, detailed descriptions of related known configurations or functions are omitted where they could obscure the gist of the present invention.

In describing the components of the present invention, terms such as first, second, A, B, (a), and (b) may be used. These terms serve only to distinguish one component from another and do not limit the nature, sequence, or order of the components. Throughout the specification, when a part is said to 'include' or 'comprise' a component, this means that it may further include other components rather than excluding them, unless specifically stated otherwise. In addition, terms such as 'unit' and 'module' used in the specification refer to a unit that processes at least one function or operation, which may be implemented in hardware, software, or a combination of hardware and software.

FIG. 1 is a diagram showing a transcoding process according to an embodiment of the present invention.

Referring to FIG. 1, a demultiplexer 100, a video transcoder 110, an audio transcoder 120, and a multiplexer 130 are shown. The components shown in FIG. 1 may be included in a cloud server in a cloud computing environment.

The input video is data that includes video content and audio content, and it is received in an encoded state. The input video is fed to the demultiplexer 100 and separated into video content and audio content. The video content is input to the video transcoder 110, and the audio content is input to the audio transcoder 120.

Because the input video content is encoded, the video transcoder 110 first decodes it using a decoder. The video transcoder 110 then sets encoding parameters for the decoded video content in consideration of the user terminal's environment and the network conditions. For example, in consideration of the performance of the user terminal, the display environment, the network conditions, the service scenario, and the like, the video transcoder 110 may set the encoding parameters so as to convert the codec of the input video to a different codec or to change the resolution, frame rate, bitrate, file format, media container, and the like of the input video. Here, a frame is an image, or a representation of an image, that constitutes video content, and the term may be used interchangeably with picture. The video transcoder 110 encodes the decoded video content using the set encoding parameters and an encoder.

Like the video content, the audio content is decoded and then re-encoded by the audio transcoder 120.

The encoded video content and the encoded audio content are input to the multiplexer 130 and merged into a single output video.

The user terminal does not receive the input video as it was supplied to the server; instead, it receives the output video transcoded in consideration of the user terminal's environment, the network conditions, and the like. The user can thus watch output video that has been transcoded to suit the user terminal.
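The flow of FIG. 1 (demultiplex, transcode each stream, multiplex) can be sketched as below. This is a toy model under stated assumptions: `Stream` and every function name are hypothetical, and the "transcoding" only relabels the codec instead of decoding and re-encoding real media.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Stream:
    kind: str       # "video" or "audio"
    codec: str      # codec label, e.g. "h264"
    payload: bytes  # stand-in for the encoded data

def demultiplex(streams):
    """Separate the input into its video and audio streams."""
    video = next(s for s in streams if s.kind == "video")
    audio = next(s for s in streams if s.kind == "audio")
    return video, audio

def transcode(stream, target_codec):
    """Placeholder for decode + re-encode; a real transcoder rewrites the payload."""
    return Stream(stream.kind, target_codec, stream.payload)

def multiplex(video, audio):
    """Merge the transcoded streams back into one output."""
    return [video, audio]

def transcode_input(streams, video_codec, audio_codec):
    video, audio = demultiplex(streams)
    return multiplex(transcode(video, video_codec), transcode(audio, audio_codec))

if __name__ == "__main__":
    source = [Stream("video", "h264", b"v"), Stream("audio", "aac", b"a")]
    print(transcode_input(source, "hevc", "opus"))
```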

FIG. 2 is a diagram showing the configuration of a transcoding system according to an embodiment of the present invention.

Referring to FIG. 2, a server 200, video content 202, a server video codec 204, audio content 206, a server audio codec 208, a container 210, a user terminal 220, a terminal video codec 222, a terminal audio codec 224, and an output video 230 are shown.

The video content 202 is video data captured by a camera, and the audio content 206 is audio data collected by a microphone. Hereinafter, media content refers to a composition that includes both the video content 202 and the audio content 206.

The server 200 may store the video content 202 in advance or receive it in real time over a network. The server 200 decodes and encodes the video content 202 using the server video codec 204. The server 200 compresses the video content 202 when encoding it.

Here, as examples of codec types, the codec formats used to compress the video content 202 include MPEG-1, MPEG-2, MPEG-4, MPEG-7, and MPEG-21, standardized by the Moving Picture Experts Group (MPEG). These codec formats compress the video content 202 by removing redundant pixels in two-dimensional space. To this end, a discrete cosine transform (DCT), which converts an image signal on the time axis to the frequency axis, is applied. In addition to compression within each frame, compression that exploits the correlation between frames is applied. A motion compensation method may further be applied to remove the temporal redundancy of a video sequence. The motion compensation method performs bidirectional prediction using two frame memories. In addition, compression uses a group of pictures (GOP) structure to enable random access.

The server 200 may store the audio content 206 in advance or receive it in real time over a network. The server 200 decodes and encodes the audio content 206 using the server audio codec 208. The server 200 compresses the audio content 206 when encoding it.

Here, the audio codec may use a modified discrete cosine transform (MDCT) to transcode audio content. To decode the leading pulse-code modulation (PCM) samples correctly, a block of samples of a certain length may be added before encoding. That is, audio content may be encoded in conjunction with video content.

The server 200 creates the container 210 by combining the encoded video content and the encoded audio content.

The container 210 includes content information and detailed information about the video content 202 or the audio content 206. The content information includes the image data of the video content 202 or the sound data of the audio content 206. The detailed information includes codec information, bitrate, frame rate, audio track, audio channel information, recording date, recording location, number of streams, playback duration of the media content, stream location information, and other metadata.

The container 210 holds the encoded media content in the form of streams and is transmitted from the server 200 to the user terminal 220.

The user terminal 220 receives the container 210 and decodes the encoded media content using the terminal video codec 222 and the terminal audio codec 224, thereby obtaining decoded video content and decoded audio content. From these, the user terminal 220 generates the output video 230 that the user can watch and provides it to the user.

Meanwhile, when the server 200 is a single server, it takes a long time for the server 200 to transcode the video content 202 and the audio content 206. In particular, the server 200 spends a great deal of time processing high-resolution content and video technologies such as high dynamic range (HDR).

According to an embodiment of the present invention, when a codec requiring a large amount of computation is applied to video content, or when the video content is of high quality, the video content is divided and assigned to a plurality of transcoders for parallel transcoding, which reduces the time required for transcoding.

FIG. 3 is a block diagram showing a distributed parallel transcoding system according to an embodiment of the present invention.

Referring to FIG. 3, a transcoding system 30, a storage unit 310, a control unit 320, an analysis unit 330, a transcoding unit 340, and a merging unit 350 are shown.

To provide the plurality of transcoders, the transcoding system 30 may include a hardware device dedicated to parallel processing or a plurality of software-based encoder devices.

Each component of the transcoding system 30 may include its own memory and at least one processor. That is, the components of the transcoding system 30 may be separate computing devices.

The components of the transcoding system 30 are connected through a network. They transmit and receive media data, detailed information about the media data, and control signals in the form of Internet Protocol (IP) packets. To this end, each component may include a communication module.

Meanwhile, the user terminal receives encoded video content from the transcoding system 30. The user terminal may be an electronic device equipped with a decoder. For example, the user terminal may be a smartphone, a mobile phone, a navigation device, a computer, a laptop, a digital broadcasting terminal, a personal digital assistant (PDA), a portable multimedia player (PMP), a tablet PC, a game console, a wearable device, an Internet of Things (IoT) device, a virtual reality (VR) device, an augmented reality (AR) device, or the like.

Here, the media data includes video content and may further include audio content. The description below is given in terms of video content, but audio content may be processed in the same way.

The detailed information about the media data includes detailed information about the video content or detailed information about the audio content. The detailed information about the video content includes offset information of the video content, video codec information, video bitrate, video frame rate (frames per second, FPS), GOP information, total playback time, resolution, aspect ratio, average bitrate, number of streams, stream location information, metadata, and the like. The detailed information about the audio content includes audio codec information, audio track information, audio channel information, audio bitrate, total playback time, offset information of the audio content, sampling rate, language information, metadata, and the like. A control signal is a signal that the control unit 320 transmits to each component to control the transcoding system 30.

The video content received by the transcoding system 30 may be in an encoded state.

Hereinafter, video content may be divided into a plurality of data chunks, where division means logical division rather than physical division. Specifically, the transcoding system 30 divides the playback time of the video content into a plurality of time intervals and stores the start point and end point of each time interval. The time intervals may all have the same length. Rather than loading the entire video content, the components of the transcoding system 30 may use the start point and end point of a time interval to load only the part of the video content that corresponds to that interval. Setting the offset information of each data chunk in this way is called logical division, and the data divided by time interval is called a data chunk.
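A minimal sketch of logical division follows. The function name `logical_split` and the millisecond units are assumptions made for illustration; only start and end points are produced, and no media data is copied or cut.

```python
def logical_split(duration_ms, interval_ms):
    """Return (start, end) offsets that cover [0, duration_ms) in fixed-size intervals.

    The last interval is shortened when the duration is not a multiple
    of the interval length.
    """
    if duration_ms <= 0 or interval_ms <= 0:
        raise ValueError("duration_ms and interval_ms must be positive")
    return [
        (start, min(start + interval_ms, duration_ms))
        for start in range(0, duration_ms, interval_ms)
    ]

if __name__ == "__main__":
    # A 30 s clip divided into 8 s intervals (the interval length is illustrative).
    for start, end in logical_split(30_000, 8_000):
        print(f"chunk: {start} ms to {end} ms")
```

Each component can then request only the byte range or frames that fall inside one `(start, end)` pair instead of loading the whole file.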

The storage unit 310 is configured to store video content and the information needed for parallel transcoding of the video content.

Specifically, the storage unit 310 stores video content, detailed information of the video content, offset information of the video content, encoded video content, and the like. The storage unit 310 may also store audio content, detailed information of the audio content, offset information of the audio content, encoded audio content, and the like.

When video content is uploaded, the storage unit 310 may notify the control unit 320 that the upload is complete or send an event corresponding to the completed upload.

The control unit 320 is configured to generate control signals for parallel transcoding and transmit them to the other components.

When the control unit 320 receives a transcoding request for video content, it sends an analysis request for the video content to the analysis unit 330. The analysis unit 330 then generates offset information, that is, the boundary values of the time intervals used to divide the video content into a plurality of data chunks over time.

The control unit 320 sets transcoding parameters according to the analysis result and transmits the transcoding parameters and the offset information to the transcoding unit 340. Here, the transcoding parameters include decoding parameters and encoding parameters. The decoding parameters are those items of the detailed information that are needed to decode the video content. The encoding parameters are those items of the detailed information that are needed to encode the video content. For example, the encoding parameters include video codec information, video bitrate, video frame rate (frames per second, FPS), average bitrate, number of streams, stream location information, and the like.
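One way to picture the transcoding parameters is as a pair of records derived from the analysis result. The sketch below is illustrative only: the class names, field names, and the `set_transcoding_parameters` helper are hypothetical, and only a few of the parameters listed above are modeled.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DecodingParameters:
    codec: str          # codec of the source video
    frame_rate: float   # source frames per second

@dataclass(frozen=True)
class EncodingParameters:
    codec: str          # codec of the transcoded video
    bitrate_kbps: int
    frame_rate: float

@dataclass(frozen=True)
class TranscodingParameters:
    decoding: DecodingParameters
    encoding: EncodingParameters

def set_transcoding_parameters(details, target):
    """Build parameters from analysed source details and requested target values.

    Any encoding value missing from `target` falls back to the source value.
    """
    decoding = DecodingParameters(details["codec"], details["frame_rate"])
    encoding = EncodingParameters(
        codec=target.get("codec", details["codec"]),
        bitrate_kbps=target.get("bitrate_kbps", details["bitrate_kbps"]),
        frame_rate=target.get("frame_rate", details["frame_rate"]),
    )
    return TranscodingParameters(decoding, encoding)

if __name__ == "__main__":
    details = {"codec": "h264", "frame_rate": 30.0, "bitrate_kbps": 8000}
    print(set_transcoding_parameters(details, {"codec": "hevc", "bitrate_kbps": 4000}))
```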

According to an embodiment of the present invention, the transcoding parameters are set in consideration of at least one of the playback environment of the user terminal or the network conditions. To this end, a component that receives information about the playback environment of the user terminal, or a component that measures the network conditions, may additionally be used.

According to another embodiment of the present invention, the transcoding parameters include the time interval between key frames in the transcoded video content. During encoding, each transcoder generates key frames in consideration of this time interval. Key frames serve as reference points for playback operations such as forward playback, reverse playback, and skipping.
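As a rough illustration of how independent transcoders could keep a shared key frame interval, the sketch below places key frames on a grid measured from the start of the whole video rather than from the start of each chunk. The function name and the global-grid rule are assumptions for this sketch; the text above only says that each transcoder takes the key frame interval into account.

```python
def key_frame_times(chunk_start_ms, chunk_end_ms, key_interval_ms):
    """List key frame timestamps inside [chunk_start_ms, chunk_end_ms).

    Timestamps are multiples of key_interval_ms counted from time 0 of the
    full video, so chunks encoded separately still line up after merging.
    """
    if key_interval_ms <= 0:
        raise ValueError("key_interval_ms must be positive")
    # Round the chunk start up to the next multiple of the interval.
    first = -(-chunk_start_ms // key_interval_ms) * key_interval_ms
    return list(range(first, chunk_end_ms, key_interval_ms))

if __name__ == "__main__":
    chunks = [(0, 3000), (3000, 6000), (6000, 9000)]
    merged = [t for start, end in chunks for t in key_frame_times(start, end, 2000)]
    print(merged)  # key frame times across all chunks, in playback order
```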

Meanwhile, the video content is divided into a plurality of data chunks according to the offset information and is transcoded chunk by chunk.

The control unit 320 then requests the merging unit 350 to merge the transcoded data chunks. The control unit 320 stores the transcoded video content in the storage unit 310.
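The sequence coordinated by the control unit (analyze, transcode per chunk, merge, store) can be sketched as a small orchestration function. All names here are hypothetical, and the four collaborators are passed in as plain callables so that the sketch stays independent of any real storage or codec API.

```python
def run_transcoding_job(content_id, analyze, transcode_chunk, merge, store):
    """Drive one transcoding job through its stages.

    analyze(content_id)            -> list of (start, end) offsets
    transcode_chunk(content_id, o) -> one transcoded chunk
    merge(chunks)                  -> merged output
    store(content_id, output)      -> persists the merged output
    """
    offsets = analyze(content_id)
    # Chunks are processed sequentially here for clarity; the system
    # described above distributes them across transcoders.
    chunks = [transcode_chunk(content_id, offset) for offset in offsets]
    output = merge(chunks)
    store(content_id, output)
    return output

if __name__ == "__main__":
    storage = {}
    result = run_transcoding_job(
        "movie",
        analyze=lambda cid: [(0, 10), (10, 20)],
        transcode_chunk=lambda cid, o: f"{cid}:{o[0]}-{o[1]}",
        merge=lambda parts: "|".join(parts),
        store=storage.__setitem__,
    )
    print(result, storage)
```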

The analysis unit 330 is configured to load the video content stored in the storage unit 310 and to obtain detailed information and offset information about the video content.

Here, the offset information indicates the start point and end point of each time interval obtained when the playback time of the video content is divided into a plurality of time intervals. In other words, the offset information gives the boundary values of the data chunks into which the video content is divided. For example, when video content with a playback time of 30 seconds is divided into time intervals of 0.5 ms each, the offset information may indicate the start point and end point of each of those time intervals.
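The division of the playback time into time intervals can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `split_into_intervals` and the 10-second interval used in the example are assumptions chosen for readability.

```python
def split_into_intervals(duration_s: float, interval_s: float) -> list[tuple[float, float]]:
    """Return (start, end) offsets that cover [0, duration_s] in steps of interval_s."""
    if duration_s <= 0 or interval_s <= 0:
        raise ValueError("duration and interval must be positive")
    offsets = []
    index = 0
    while index * interval_s < duration_s:
        start = index * interval_s
        # The last interval is clipped so that it never runs past the end of the content.
        end = min(start + interval_s, duration_s)
        offsets.append((start, end))
        index += 1
    return offsets


# Hypothetical example: 30 seconds of content split into 10-second intervals.
offsets = split_into_intervals(30.0, 10.0)
```

Each tuple plays the role of the offset information for one data chunk: its first element is the start point and its second element is the end point.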

According to an embodiment of the present invention, the analysis unit 330 decodes the video content to check whether it can be decoded, and checks the video content for errors by comparing the detailed information of the video content with the decoding result. This is described in detail with reference to FIG. 4.

The transcoding unit 340 creates a plurality of video transcoders and performs parallel transcoding by assigning the data chunks divided from the video content to the transcoders. Here, the video transcoders may be implemented as independent processors.

Specifically, the transcoding unit 340 creates as many video transcoders as there are data chunks, where the number of data chunks is determined by the playback time of the video content and the given time interval. The transcoding unit 340 receives the transcoding parameters and the offset information and assigns them to the video transcoders.

Based on its assigned offset information, each video transcoder loads the data chunk covering the corresponding time interval of the video content. Each video transcoder decodes the data chunk based on the detailed information of the video content, and then encodes the decoded data chunk using the transcoding parameters.

According to an embodiment of the present invention, the transcoding unit 340 may evaluate the quality of a transcoded data chunk and, depending on that quality, adjust the transcoding parameters and encode the decoded data chunk again. In particular, it may adjust the encoding parameters before re-encoding. This is described in detail with reference to FIG. 6.

According to another embodiment of the present invention, the transcoding parameters include the time interval between key frames in the transcoded video content. That is, each video transcoder performs encoding so that the final transcoded video content contains a key frame at every fixed interval. Key frames at regular intervals enable functions such as skipping and double-speed playback while the user is watching the media content.
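As a rough illustration of a key frame interval expressed as a transcoding parameter, the sketch below converts a time interval into the frame indices at which an encoder would be asked to place key frames. The function name and the example values (30 FPS, one key frame every 2 seconds) are assumptions, not values taken from the description.

```python
def keyframe_indices(total_frames: int, fps: float, keyframe_interval_s: float) -> list[int]:
    """Frame indices at which a key frame should be placed."""
    # Number of frames between consecutive key frames, never less than one.
    step = max(1, round(fps * keyframe_interval_s))
    return list(range(0, total_frames, step))


# Hypothetical example: 10 seconds of 30 FPS video, one key frame every 2 seconds.
positions = keyframe_indices(300, 30.0, 2.0)
```

Because index 0 is always included, every chunk encoded this way begins with a key frame, which is what later allows the chunks to be joined without referring to frames in a neighboring chunk.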

Meanwhile, the transcoding unit 340 may transcode audio content using a plurality of audio transcoders. Because transcoding audio content requires relatively little computation, the transcoding unit 340 may either transcode the audio content as a whole or transcode it in a distributed, parallel manner.

The transcoding unit 340 stores the transcoded data chunks in the storage unit 310.

The merging unit 350 is configured to generate the transcoded video content by merging the transcoded data chunks. The merging unit 350 determines the order of the transcoded data chunks based on the offset information and merges them in that order. A key frame is located at the start of each transcoded data chunk.
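The ordering step can be sketched as follows. The sketch assumes that each transcoded chunk is keyed by its (start, end) offset and treats the chunk payloads as opaque bytes; plain byte concatenation stands in for a container-aware merge, which a real system would need.

```python
def merge_chunks(chunks: dict[tuple[float, float], bytes]) -> bytes:
    """Concatenate transcoded chunks in ascending order of their start offset."""
    ordered = sorted(chunks.items(), key=lambda item: item[0][0])
    return b"".join(payload for _, payload in ordered)


# Hypothetical example: chunks arrive out of order because transcoders finish at different times.
merged = merge_chunks({
    (10.0, 20.0): b"B",
    (0.0, 10.0): b"A",
    (20.0, 30.0): b"C",
})
```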

According to an embodiment of the present invention, the merging unit 350 may verify the integrity of the transcoded video content by comparing the length information of the transcoded video content with that of the transcoded audio content. That is, when the two lengths do not match, the merging unit 350 can detect frame loss or frame duplication. This is described in detail with reference to FIG. 7.

FIG. 4 is a flowchart illustrating a parallel transcoding process according to an embodiment of the present invention.

Referring to FIG. 4, the storage unit receives video content (S400).

The storage unit may hold previously received video content, or it may receive video content in real time.

When storage of the video content is complete, the storage unit notifies the control unit that the video content has been uploaded (S402).

Along with the upload notification, the storage unit transmits identification information (ID) of the uploaded video content to the control unit. Alternatively, when storage of the video content is complete, the storage unit may generate a transcoding request event and transmit it to the control unit.

In response to the upload notification, the control unit requests the analysis unit to analyze the video content (S404).

The control unit transmits the ID of the video content and the analysis request to the analysis unit.

Using the ID of the video content, the analysis unit loads the video content from the storage unit into local storage (S406).

The analysis unit may identify the location of the video content in the storage unit using Uniform Resource Locator (URL) information that includes the ID of the video content. The analysis unit loads the video content into local storage in its encoded state.

The analysis unit performs an error check on the loaded video content and divides the video content into a plurality of data chunks (S408).

The analysis unit may divide the video content into a plurality of data chunks by dividing its playback time into preset time intervals. The analysis unit stores the start point and end point of each time interval as the offset information of the corresponding data chunk. The analysis unit may also extract detailed information about the video content.

The analysis unit notifies the control unit that analysis of the video content is complete, together with the detailed information, the error check result, and the offset information (S410).

The control unit generates identification information (ID) for the transcoders according to the number of data chunks (S412).

The control unit requests the transcoding unit to perform transcoding (S414).

The control unit transmits the offset information for the data chunks and the transcoding parameters to the transcoding unit.

In response to the transcoding request, the transcoding unit loads the data chunks from the storage unit (S416).

The transcoding unit creates transcoders according to the number of data chunks, allocates computing resources to the transcoders, and transcodes the data chunks in parallel using the transcoders, the offset information, and the transcoding parameters (S418). Each transcoder transcodes the data chunk corresponding to its assigned offset information.
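The fan-out in step S418 can be sketched with a worker pool that holds one worker per data chunk. This is only an illustration under stated assumptions: threads stand in for the independent transcoders, `transcode_chunk` is a hypothetical placeholder that does no real decoding or encoding, and the parameter dictionary is invented for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def transcode_chunk(offset: tuple[float, float], params: dict) -> dict:
    """Placeholder for one transcoder: a real one would decode and re-encode the chunk."""
    start, end = offset
    return {"offset": offset, "codec": params["codec"], "duration": end - start}


def transcode_parallel(offsets: list[tuple[float, float]], params: dict) -> list[dict]:
    """Create one worker per data chunk and transcode all chunks concurrently."""
    with ThreadPoolExecutor(max_workers=max(1, len(offsets))) as pool:
        # pool.map returns results in the order of the input offsets,
        # regardless of which worker finishes first.
        return list(pool.map(lambda off: transcode_chunk(off, params), offsets))


results = transcode_parallel([(0.0, 10.0), (10.0, 20.0)], {"codec": "h264"})
```

Keeping the results in offset order means the later merge step does not have to re-sort them, although sorting by offset again is a cheap safeguard.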

According to an embodiment of the present invention, each transcoder transcodes its assigned data chunk, evaluates the quality of the transcoded data chunk, and re-encodes any data chunk that falls short of the quality criterion. The re-encoding may be performed with changed encoding parameters.

After transcoding of the data chunks is complete, the transcoding unit notifies the control unit of the completion (S420).

The transcoding unit stores the transcoded data chunks in the storage unit (S422).

The control unit requests the merging unit to merge the transcoded data chunks (S424).

In response to the merge request, the merging unit loads the transcoded data chunks from the storage unit (S426).

The merging unit loads the transcoded data chunks into memory and merges them (S428).

Specifically, the merging unit may generate the transcoded video content by merging the transcoded data chunks in order, using their key frames.

The merging unit notifies the control unit that merging is complete (S430).

The merging unit stores the transcoded video content in the storage unit (S432).

Meanwhile, the transcoding system may apply the same transcoding process to audio content as to video content. However, because transcoding audio content requires less computation than transcoding video content, the transcoding system may transcode the audio content in a single pass, or it may divide the audio content into chunks whose size is an integer multiple of the video data chunk size and transcode them in parallel.

FIG. 5 is a flowchart illustrating the operation of the analysis unit according to an embodiment of the present invention.

Referring to FIG. 5, the analysis unit receives a request to analyze video content from the control unit (S500).

The analysis request may include the ID of the video content to be analyzed, in URL form. The analysis unit can use the URL to access the video content stored in the storage unit.

The analysis unit copies the video content from the storage unit to local storage within the analysis unit (S502).

The analysis unit extracts detailed information about the video content and generates offset information for the data chunks (S504).

Specifically, the analysis unit may extract the detailed information so that the video content can be checked for errors and so that the control unit can set the encoding parameters.

The analysis unit may divide the video content into a plurality of data chunks along the time axis. It may do so by dividing the playback time of the video content into preset time intervals. Alternatively, the analysis unit may set the time interval variably in consideration of, for example, the complexity of the video content.

The analysis unit stores the start point and end point of each time interval as the offset information of the corresponding data chunk. That is, the start point and end point of each time interval may be the start point and end point of a data chunk. Specifically, the analysis unit decodes the frames included in the video content. When the current frame being decoded is the frame closest to the start point of the next data chunk, the analysis unit sets the current frame as the start point of the next data chunk and as the end point of the current data chunk.

The analysis unit decodes the video content frame by frame (S506).

Because the video content in the storage unit is stored in an encoded state, the analysis unit searches for key frames in the video content and decodes frame by frame starting from a key frame.

The analysis unit determines whether a frame contains an error during decoding (S508).

Specifically, according to an embodiment of the present invention, the analysis unit checks for a first error by attempting to decode the video content and confirming whether decoding is possible. The analysis unit checks for a second error by determining whether the decoded video content matches the detailed information of the video content. In addition, the analysis unit may check for a third error by inspecting the video content for syntax errors.

If it determines that a frame of the video content contains an error, the analysis unit notifies the control unit of the error in the video content (S510).

If it determines that the frame does not contain an error, the analysis unit determines whether the frame is the last frame of the video content (S512).

If the current frame is not the last frame of the video content, the analysis unit determines whether the current playback time of the video content corresponds to an offset boundary of a data chunk (S514).

If the current frame being decoded is the frame closest to the start point of the next data chunk, the analysis unit sets the current frame as the start point of the next data chunk and as the end point of the current data chunk (S516).

If the current frame being decoded is not the frame closest to the start point of the next data chunk, the analysis unit decodes the next frame.

Meanwhile, if the current frame is the last frame of the video content, the analysis unit records the current frame as the end point of the data chunk (S518).
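The boundary decisions in steps S512 to S518 can be sketched as a single pass over the frame timestamps. The sketch assumes a non-empty, ascending list of timestamps in integer milliseconds, nominal boundaries at multiples of the interval, and an interval longer than the spacing between frames; the function name and the sample timestamps are illustrative only.

```python
def snap_offsets(frame_times_ms: list[int], interval_ms: int) -> list[tuple[int, int]]:
    """Snap nominal chunk boundaries to the nearest actual frame timestamp."""
    offsets = []
    chunk_start = frame_times_ms[0]
    boundary = interval_ms              # nominal start point of the next data chunk
    last = len(frame_times_ms) - 1
    for i, t in enumerate(frame_times_ms):
        if i == last:
            # S518: the last frame closes the final data chunk.
            offsets.append((chunk_start, t))
        elif abs(t - boundary) <= abs(frame_times_ms[i + 1] - boundary):
            # S516: the current frame is the closest one to the boundary, so it ends
            # the current data chunk and starts the next one.
            offsets.append((chunk_start, t))
            chunk_start = t
            boundary += interval_ms
        # Otherwise the loop simply moves on to the next frame.
    return offsets


# Hypothetical example: frames every 300 ms, nominal boundary every 1000 ms.
snapped = snap_offsets([0, 300, 600, 900, 1200, 1500, 1800, 2100], 1000)
```

In this example the nominal boundary at 1000 ms is moved to the frame at 900 ms, because that frame is closer to the boundary than the frame at 1200 ms.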

Finally, the analysis unit transmits the detailed information and the offset information of the video content to the storage unit (S520).

FIG. 6 is a flowchart illustrating the operation of the transcoding unit according to an embodiment of the present invention.

Referring to FIG. 6, the transcoding unit receives a transcoding request and creates transcoders according to the number of data chunks calculated by the analysis unit (S600).

The transcoding request includes the transcoding parameters and the offset information for each data chunk. The number of data chunks is calculated by dividing the playback time of the video content by a time interval of a given size.

The transcoding unit loads, for each transcoder, the data chunk indicated by the offset information (S602).

Specifically, the transcoding unit assigns the data chunks to the transcoders. Each transcoder loads the data chunk, that is, the portion of the video content that corresponds to its assigned offset information.

The transcoding unit transcodes the data chunks using the transcoding parameters included in the transcoding request (S604).

Through transcoding, the data chunks are converted into data suited to the user's environment and the network state. Because the data chunks are transcoded in parallel, the time required for transcoding can be reduced compared with using a single transcoder.

The transcoding unit evaluates the quality of each transcoded data chunk (S606).

The transcoding unit may evaluate the quality of each transcoded data chunk with reference to the video content. Specifically, it may use a video quality measurement algorithm such as VMAF (Video Multimethod Assessment Fusion).

The transcoding unit determines whether each transcoded data chunk passes the quality criterion (S608).

If a transcoded data chunk falls short of the quality criterion, the transcoding unit changes the transcoding parameters for that data chunk (S610).

Specifically, when a transcoded data chunk falls short of the quality criterion, the transcoding unit identifies the data chunk corresponding to that transcoded data chunk as a target data chunk. That is, the target data chunk is the data from which the transcoded data chunk was produced, before transcoding.

The transcoding unit calculates transcoding parameter values with which the target data chunk can be transcoded to satisfy the target quality criterion, and changes the transcoding parameters for the target data chunk accordingly. The transcoding unit may change the transcoding parameters of the target data chunk to higher-quality settings. For example, the transcoding unit may determine the distribution of compressed data across the video content and adjust the encoding bitrate of each data chunk while maintaining the average bitrate. As another example, the transcoding unit may achieve the target quality by adjusting the CRF (Constant Rate Factor) value among the transcoding parameters. The transcoding unit transcodes the target data chunk again using the adjusted transcoding parameters and evaluates its quality. If the re-transcoded target data chunk still does not satisfy the quality criterion, the transcoding unit may raise the average bitrate of the video content and transcode again. In this way, by evaluating the quality of each transcoded data chunk and re-transcoding with parameters changed according to that quality, the transcoding unit can guarantee the quality of the video content that the user views.
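The evaluate-and-retry loop of steps S606 to S610 can be sketched for the CRF case as follows. The encoder and the quality metric are passed in as callables because the description does not fix a particular tool; `encode`, `measure`, the target score of 90, and the CRF bounds are all assumptions made for illustration. The sketch relies only on the general convention that a lower CRF yields higher quality.

```python
from typing import Any, Callable


def encode_until_quality(
    encode: Callable[[int], Any],
    measure: Callable[[Any], float],
    crf: int = 28,
    target: float = 90.0,
    min_crf: int = 16,
    step: int = 2,
) -> tuple[Any, int, float]:
    """Re-encode a target data chunk with a lower CRF until the quality target is met."""
    while True:
        encoded = encode(crf)
        score = measure(encoded)
        # Stop when the chunk passes, or when lowering the CRF again would go below the floor.
        if score >= target or crf - step < min_crf:
            return encoded, crf, score
        crf -= step


# Hypothetical stand-ins: the "encoder" records the CRF, the "metric" rewards lower CRF.
fake_encode = lambda crf: {"crf": crf}
fake_measure = lambda enc: 144 - 2 * enc["crf"]
result = encode_until_quality(fake_encode, fake_measure)
```

With these stand-ins, the first attempt at CRF 28 scores 88 and fails, and the second attempt at CRF 26 scores 92 and is accepted. A further fallback, such as raising the average bitrate as the description mentions, could be added where the loop gives up.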

The transcoding unit determines whether the currently transcoded data chunk is the last data chunk (S612).

If the currently transcoded data chunk is not the last data chunk, the transcoding unit evaluates the quality of the next transcoded data chunk.

If the currently transcoded data chunk is the last data chunk, the transcoding unit stores each transcoded data chunk and the offset information (S614).

Meanwhile, the transcoding unit may perform the same process on audio content as on video content. The transcoding unit may transcode the audio content as a single unit, or it may divide the audio content into a plurality of audio data chunks and transcode them in parallel. In the latter case, the size of each audio data chunk may be an integer multiple of the size of a video data chunk. That is, both the video content and the audio content may be encoded in a distributed manner, but in some cases it is also possible to encode the audio content in a single pass and apply distributed encoding only to the video content.
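One way to obtain audio data chunks whose size is an integer multiple of the video data chunk size is to group consecutive video offsets. The sketch below does this; the function name and the multiple of 2 used in the example are assumptions.

```python
def audio_offsets(video_offsets: list[tuple[float, float]], multiple: int) -> list[tuple[float, float]]:
    """Group every `multiple` consecutive video offsets into one audio offset."""
    if multiple < 1:
        raise ValueError("multiple must be at least 1")
    grouped = []
    for i in range(0, len(video_offsets), multiple):
        group = video_offsets[i:i + multiple]
        # The audio chunk spans from the start of the first video chunk in the group
        # to the end of the last one.
        grouped.append((group[0][0], group[-1][1]))
    return grouped


# Hypothetical example: four video chunks, audio chunks twice as long.
audio = audio_offsets([(0, 10), (10, 20), (20, 30), (30, 35)], 2)
```

Setting `multiple` to the number of video chunks reproduces the single-pass case in which the audio content is transcoded as one unit.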

FIG. 7 is a flowchart illustrating the operation of the merging unit according to an embodiment of the present invention.

First, the merging unit receives from the control unit a request to merge the transcoded data chunks.

Referring to FIG. 7, the merging unit sequentially merges the transcoded data chunks using the offset information (S700).

The transcoded video content is generated by merging the transcoded data chunks in offset order.

The merging unit loads the transcoded audio content (S702).

Here, the transcoded audio content may exist as a single piece of data or as a plurality of data chunks.

The merging unit determines whether the length information of the transcoded video content matches the length information of the audio content (S704).

Specifically, the merging unit determines whether the total playback time of the transcoded video content equals the total playback time of the transcoded audio content.

If the length information of the transcoded video content matches that of the audio content, the merging unit merges the video content and the audio content (S708).

The merging unit may temporally interleave the transcoded video content and the transcoded audio content segment by segment to produce a single encoded file.
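Temporal interleaving of segments can be sketched as a merge of two lists that are already sorted by start time. The segment representation, a (start time, label) tuple, is an assumption; a real muxer would write the segments into a container format instead of returning a list.

```python
import heapq


def interleave(video_segments: list[tuple[float, str]],
               audio_segments: list[tuple[float, str]]) -> list[tuple[float, str]]:
    """Merge two time-sorted segment lists into a single time-ordered sequence."""
    # heapq.merge keeps the inputs' relative order; on equal start times the
    # video segment comes first because it is the first iterable.
    return list(heapq.merge(video_segments, audio_segments, key=lambda seg: seg[0]))


# Hypothetical example: 2-second video segments and 1-second audio segments.
muxed = interleave(
    [(0, "v0"), (2, "v1")],
    [(0, "a0"), (1, "a1"), (2, "a2"), (3, "a3")],
)
```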

The merging unit stores the merged media content (S710).

Meanwhile, if the length information of the transcoded video content does not match that of the audio content, the merging unit reports a transcoding error (S706).

In this case, the merging unit determines that frame loss or frame duplication occurred while the video content was being transcoded. That is, the merging unit determines that the integrity of the transcoded video content has not been verified. Transcoding of the video content may then be attempted again.
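The length comparison in step S704 reduces to a single check, sketched below. The description only says that the two lengths must match; the optional tolerance parameter is an assumption added because video and audio durations in real files rarely agree to the last sample.

```python
def lengths_match(video_duration_s: float, audio_duration_s: float,
                  tolerance_s: float = 0.0) -> bool:
    """Return True when the video and audio playback times agree within the tolerance."""
    return abs(video_duration_s - audio_duration_s) <= tolerance_s


# Hypothetical example: 0.1 seconds of video went missing during transcoding.
ok = lengths_match(30.0, 30.0)
suspect = lengths_match(29.9, 30.0)
```

A `False` result corresponds to the error branch (S706), where frame loss or duplication is suspected and transcoding may be retried.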

FIG. 8 is a flowchart illustrating the operation of the control unit according to an embodiment of the present invention.

First, the control unit receives from the storage unit either a notification that upload of the video content is complete or a transcoding request event.

Referring to FIG. 8, the control unit requests the analysis unit to extract detailed information about the video content and to divide it into data chunks (S800).

The control unit may also request the analysis unit to check the video content for errors. After the analysis unit completes its analysis, the control unit is notified of the completion by the analysis unit.

The control unit requests the transcoding unit to transcode the video content and the audio content (S802).

The control unit transmits the transcoding parameters and the offset information together with the transcoding request.

The control unit requests the merging unit to merge the transcoded video content and the transcoded audio content (S804).

The control unit delivers the transcoded media content to a streaming server (S806).

Alternatively, the control unit may deliver the transcoded media content to a storage server.

FIG. 9 is a diagram illustrating the frame-level structure of video content.

A codec for compressing video extracts from each frame the information that a receiving device needs to reconstruct the frame, and transmits the extracted information to the receiving device. Because the amount of extracted information is smaller than the amount of information in the frame itself, the codec reduces the amount of data transmitted.

A codec can extract encoded data from every frame individually, but for greater compression it can also use information about what changes between two frames. Specifically, the codec extracts from a first frame the first information needed to reconstruct the first frame, extracts information about the difference between the first frame and a second frame as second information, and transmits both to the receiving device. The receiving device reconstructs the first frame from the first information and reconstructs the second frame from the first information and the second information. This is called motion compensation.
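The first-information and second-information scheme can be illustrated with plain sample-wise differencing. This is a deliberately simplified sketch: frames are modeled as short lists of integer samples, and no motion vectors or block matching are involved, so it shows only the difference-coding idea and not a real codec's motion compensation.

```python
def encode_pair(frame1: list[int], frame2: list[int]) -> tuple[list[int], list[int]]:
    """First information is the first frame itself; second information is the difference."""
    first_info = list(frame1)
    second_info = [b - a for a, b in zip(frame1, frame2)]
    return first_info, second_info


def decode_pair(first_info: list[int], second_info: list[int]) -> tuple[list[int], list[int]]:
    """Rebuild the first frame directly and the second frame from first + difference."""
    frame1 = list(first_info)
    frame2 = [a + d for a, d in zip(frame1, second_info)]
    return frame1, frame2


# Hypothetical 4-sample frames that differ in only two samples.
first, second = encode_pair([10, 10, 12, 200], [10, 11, 12, 198])
restored = decode_pair(first, second)
```

The difference list here is mostly zeros, which is why it compresses better than the second frame itself.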

Referring to FIG. 9, consecutive frames of a video are shown. Each frame represents a still image.

In the motion compensation method, the codec can compress video using three types of frames: the I-frame (Intra frame), the B-frame (Bi-directional predicted frame), and the P-frame (Predicted frame).

An I-frame serves as a key frame and can be encoded and decoded independently, from that single frame alone, without any other frame. Because an I-frame can be encoded and decoded independently, it can be used as a reference point for random access. For example, when a user watching a video designates the position of an arbitrary frame as the playback start point, playback may begin from an I-frame located near the designated point. An I-frame may be compressed using the JPEG (Joint Photographic Experts Group) compression method.
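Random access to an I-frame can be sketched as a binary search over the sorted I-frame positions. The description only says that playback starts from an I-frame near the requested point; choosing the nearest I-frame at or before that point is an assumption made here, as are the function name and the sample positions.

```python
import bisect


def seek_start(i_frame_indices: list[int], requested_index: int) -> int:
    """Return the index of the last I-frame at or before the requested frame."""
    pos = bisect.bisect_right(i_frame_indices, requested_index) - 1
    # Clamp to the first I-frame if the request precedes every I-frame.
    return i_frame_indices[max(pos, 0)]


# Hypothetical example: an I-frame every 12 frames, user seeks to frame 30.
start = seek_start([0, 12, 24, 36], 30)
```

Decoding then proceeds forward from the returned I-frame until the requested frame is reached.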

A P-frame can be encoded using either an I-frame or a P-frame that precedes it. For example, the difference between a P-frame and the preceding I-frame may be calculated as change information, and the change information may be compressed using a discrete cosine transform (DCT). The receiving device may reconstruct the I-frame and then predict the P-frame from the I-frame using the change information.

A B-frame is a frame that is bidirectionally predicted by referring to either an I-frame or a P-frame located before the B-frame and either an I-frame or a P-frame located after the B-frame.

As such, I-frames can be encoded and decoded independently, whereas P-frames and B-frames are encoded and decoded with reference to other frames.

In exchange for referring to other frames, P-frames and B-frames require less information to reconstruct than I-frames. The size of a P-frame corresponds to one third of the size of an I-frame, and the size of a B-frame corresponds to one third of the size of a P-frame.
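As a worked example of these ratios, the sketch below adds up the relative size of one group of frames, taking an I-frame as size 1. The frame pattern `IBBBPBBBPBBB` is an assumed example and is not taken from the description. Real frame sizes vary with content, so the one-third ratios are used here only as stated above.

```python
from fractions import Fraction

I_SIZE = Fraction(1)   # an I-frame is the unit of size
P_SIZE = I_SIZE / 3    # one third of an I-frame
B_SIZE = P_SIZE / 3    # one third of a P-frame


def relative_size(pattern):
    """Sum the relative sizes of the frames in a pattern such as 'IBBP'."""
    sizes = {"I": I_SIZE, "P": P_SIZE, "B": B_SIZE}
    return sum(sizes[frame_type] for frame_type in pattern)


pattern = "IBBBPBBBPBBB"                      # assumed 12-frame pattern
mixed = relative_size(pattern)                # 1 + 2/3 + 9/9 = 8/3
all_intra = relative_size("I" * len(pattern)) # 12
ratio = mixed / all_intra                     # 2/9
```

Under these assumptions, the mixed pattern needs 2/9 of the data that twelve I-frames would need.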

A set of frames compressed through the above process may be referred to as a Group of Pictures (GOP). More precisely, a GOP spans the interval from one I-frame to the next I-frame, and thus represents the period of the I-frames.

FIG. 10 is a diagram showing the configuration of frame groups of video content.

The frames of video content may consist of I-frames only, or may further include P-frames or B-frames.

The GOP size (rendered literally as the "number of GOPs" in this description) is the number of frames located between one I-frame and the next I-frame. In other words, the GOP size may mean the total number of P-frames and B-frames within one I-frame period. As the GOP size increases, the compression ratio increases but image quality deteriorates. Conversely, as the GOP size decreases, the compression ratio decreases but image quality improves. The GOP size can be set to any value.

One GOP includes one or more I-frames, and the I-frames serve as reference points for playback. The smaller the GOP size, the higher the density of I-frames; the larger the GOP size, the lower the density of I-frames. A smaller GOP size is better suited to random access, high-speed playback, and reverse playback.

Meanwhile, the GOP size may be equal to the number of frames corresponding to about 0.5 seconds, based on the frames per second of the video content. A GOP generated according to the number of frames corresponding to about 0.5 seconds is referred to as a long GOP. In contrast, a GOP generated every 6 frames is referred to as a short GOP.
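The long-GOP rule is simple arithmetic on the frame rate. The following sketch computes it. Rounding to the nearest whole frame is an assumption, since the description says only "about 0.5 seconds", and the names `long_gop_size` and `SHORT_GOP_FRAMES` are hypothetical.

```python
SHORT_GOP_FRAMES = 6  # a short GOP is generated every 6 frames


def long_gop_size(fps, seconds=0.5):
    """Number of frames covering roughly `seconds` of video at `fps`."""
    # Rounding to a whole frame count is an assumption of this sketch.
    return round(fps * seconds)


size_24 = long_gop_size(24)  # 12 frames
size_30 = long_gop_size(30)  # 15 frames
```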

Meanwhile, a transport stream (TS) file, which is a recording of a broadcast source, includes the GOP size and the number of consecutive B-frames.

Referring to FIG. 10, the GOP size (N) and the number of consecutive B-frames (N) are illustrated. The frames shown at the top have a GOP size of 12 and 3 consecutive B-frames. The frames shown at the bottom have a GOP size of 15 and 5 consecutive B-frames.
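One way to picture these two parameters is to generate a frame-type pattern from them. The layout rule below (an I-frame first, then runs of B-frames each closed by a P-frame) is an assumption for illustration. The exact arrangement drawn in FIG. 10 is not recoverable from the text, and `gop_pattern` is a hypothetical helper.

```python
def gop_pattern(gop_size, b_run):
    """Build a frame-type string for one GOP.

    Position 0 is the I-frame; every (b_run + 1)-th position after it is
    a P-frame; all remaining positions are B-frames.
    """
    frames = []
    for i in range(gop_size):
        if i == 0:
            frames.append("I")
        elif i % (b_run + 1) == 0:
            frames.append("P")
        else:
            frames.append("B")
    return "".join(frames)


top = gop_pattern(12, 3)     # "IBBBPBBBPBBB"
bottom = gop_pattern(15, 5)  # "IBBBBBPBBBBBPBB"
```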

GOP information, including the GOP size and the number of consecutive B-frames, may be transmitted together with the video stream after a sequence header. Here, the sequence header may further include information such as screen resolution, aspect ratio, frames per second, bit rate, or buffer size.

With this configuration, when a receiving device randomly accesses a specific position in a video, it can locate the sequence header and reconstruct the video by referring to the I-frame in the GOP. The receiving device can then reconstruct the P-frames and B-frames using the reconstructed I-frame.

Meanwhile, according to an embodiment of the present invention, the transcoding system can perform parallel transcoding by logically dividing video content into a plurality of data chunks along the time axis and allocating the data chunks to transcoders. To this end, the transcoding system may divide the playback time of the video content into a plurality of time intervals. Each transcoder performs decoding and encoding based on its allocated time interval.
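A minimal sketch of the time division might look like the following. It assumes whole-second durations and lets the final interval be shorter than the others, which the description does not specify. `split_playback_time` is a hypothetical name.

```python
def split_playback_time(duration, interval):
    """Split [0, duration) into consecutive (start, end) time intervals.

    Every interval has length `interval`, except possibly the last one,
    which ends at `duration`.
    """
    bounds = []
    start = 0
    while start < duration:
        end = min(start + interval, duration)
        bounds.append((start, end))
        start = end
    return bounds


intervals = split_playback_time(duration=10, interval=4)
# Each tuple would be handed to one transcoder as its data chunk.
```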

However, video content is composed of a plurality of frames and is decoded using I-frames as reference points. That is, video content is decoded based on I-frames arranged at a frame interval determined by the GOP size.

Therefore, when each transcoder of the transcoding system decodes the data chunk for its allocated time interval, a mismatch may arise between the start point of the allocated time interval and the position of an I-frame.

FIGS. 11A and 11B are diagrams illustrating transcoding intervals of frames.

Referring to FIG. 11A, frames within video content are shown.

I-frames are placed at a given frame interval. The given frame interval is determined by the playback time of the video content, the GOP size, the number of frames per second, and the like.

Video content may be divided into time intervals of equal size. For example, each time interval may have a size of 0.5 seconds. In this case, the (T-1)-th time interval corresponds to the (T-1)-th data chunk, and the T-th time interval corresponds to the T-th data chunk.

In FIG. 11A, the frame intervals between I-frames in the video content coincide with the divided time intervals. That is, the start point of each frame interval coincides with the start point of the corresponding time interval.

The T-th transcoder may load the data chunk corresponding to the T-th time interval and begin decoding from the I-frame located at the decoding start point of the T-th time interval.

The T-th transcoder encodes the decoded data chunk according to the encoding parameters. Here, the T-th transcoder may encode the frame located at the decoding start point of the T-th time interval as an I-frame.

For example, when the size of the time interval is set to 4 seconds and the video content has 24 frames per second, the start point of the frame interval and the start point of the time interval can coincide. The first time interval contains 96 frames, and the second time interval starts at the 97th frame.

In contrast, referring to FIG. 11B, the frame intervals between I-frames in the video content do not coincide with the divided time intervals. The start point of the T-th frame interval does not coincide with the start point of the T-th time interval.

For example, when the size of the time interval is 4 seconds and the number of frames per second is 23.96, the start point of the second time interval falls between the 96th frame and the 97th frame. That is, the start point of the second time interval may differ from the start point of the second frame interval.
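The difference between the 24 fps and 23.96 fps examples comes down to whether the interval length times the frame rate is a whole number of frames. The sketch below checks this with exact rational arithmetic to avoid floating-point rounding. The helper names are hypothetical.

```python
from fractions import Fraction


def boundary_in_frames(interval_seconds, fps):
    """Exact number of frames elapsed at the end of one time interval."""
    return Fraction(interval_seconds) * Fraction(str(fps))


def is_aligned(interval_seconds, fps):
    """True if the interval boundary falls exactly on a frame boundary."""
    return boundary_in_frames(interval_seconds, fps).denominator == 1


aligned_24 = is_aligned(4, 24)          # 4 * 24 = 96 frames exactly
aligned_2396 = is_aligned(4, 23.96)     # 4 * 23.96 = 95.84 frames
offset = boundary_in_frames(4, 23.96)   # Fraction(2396, 25)
```

At 23.96 frames per second the boundary lands 0.84 of the way through the 96th frame, so the second time interval cannot start on a frame boundary, let alone on an I-frame.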

The T-th transcoder loads its data chunk starting from the B-frame located at the start point of the T-th time interval. That is, the first frame of the data chunk loaded by the T-th transcoder is a B-frame. In this case, the T-th transcoder cannot begin decoding at the start point of the T-th time interval.

The T-th transcoder searches for I-frames within the loaded data chunk and starts decoding from the earliest I-frame it finds.

Accordingly, the frames located between the start point of the T-th time interval and the decoding start point of the T-th time interval are lost during transcoding, even though they were loaded into the transcoder. This degrades the video quality experienced by users watching the video content.
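The number of frames lost this way can be counted directly from the key frame positions. The sketch below works in frame indices and assumes a key frame every 12 frames, which is an illustrative value only. `frames_lost_at_start` is a hypothetical helper.

```python
from bisect import bisect_left


def frames_lost_at_start(chunk_start, chunk_end, key_frames):
    """Count frames skipped before the first key frame inside a chunk.

    chunk_start is inclusive, chunk_end is exclusive, and key_frames is a
    sorted list of key frame indices.
    """
    i = bisect_left(key_frames, chunk_start)
    if i < len(key_frames) and key_frames[i] < chunk_end:
        return key_frames[i] - chunk_start
    # No key frame inside the chunk: nothing in it can be decoded.
    return chunk_end - chunk_start


key_frames = list(range(0, 97, 12))  # assumed: key frame every 12 frames
lost_misaligned = frames_lost_at_start(20, 44, key_frames)  # frames 20..23
lost_aligned = frames_lost_at_start(24, 48, key_frames)     # starts on a key frame
```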

The transcoding system could adjust the number of frames per second so that the start point of each time interval matches the start point of a frame interval, but this is not an effective solution because it may affect the quality of the video content.

FIG. 12 is a diagram illustrating transcoding intervals of frames according to an embodiment of the present invention.

Referring to FIG. 12, the transcoding system obtains the positions of the key frames located at each given frame interval. The transcoding system divides the video content into a plurality of time intervals.

The transcoding system then determines whether the start point of the T-th time interval coincides with the start point of the T-th frame interval.

If the start point of the T-th time interval does not coincide with the start point of the T-th frame interval, the transcoding system adjusts the start point of the T-th time interval based on the positions of the key frames.

The transcoding system may adjust the start point of the T-th time interval to the position of the key frame closest to that start point.

Specifically, the transcoding system may adjust the start point of the T-th time interval to the key frame that is closest to the start point among the key frames located before the start point.

Similarly, the transcoding system may adjust the end point of the T-th time interval to the position of the key frame closest to that end point.

Specifically, the transcoding system may adjust the end point of the T-th time interval to the key frame that is closest to the end point among the key frames located after the end point.

Meanwhile, the transcoding system may use the start point and the end point of the video content itself as they are, without adjustment.
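The adjustment rules above can be sketched with a sorted list of key frame positions and a binary search. Positions are frame indices, the key frame spacing of 12 is an assumed value, and `adjust_interval` is a hypothetical name. A point that already sits on a key frame is left where it is, and a point with no key frame on the required side (such as the end of the content) is left unadjusted.

```python
from bisect import bisect_left, bisect_right


def adjust_interval(start, end, key_frames):
    """Widen (start, end) so that both ends sit on key frames.

    The start moves back to the nearest key frame at or before it; the
    end moves forward to the nearest key frame at or after it.
    """
    i = bisect_right(key_frames, start) - 1
    j = bisect_left(key_frames, end)
    new_start = key_frames[i] if i >= 0 else start
    new_end = key_frames[j] if j < len(key_frames) else end
    return new_start, new_end


key_frames = [0, 12, 24, 36, 48]              # assumed key frame positions
widened = adjust_interval(20, 40, key_frames)    # (12, 48)
unchanged = adjust_interval(24, 36, key_frames)  # already aligned
tail = adjust_interval(40, 60, key_frames)       # end of content kept as is
```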

The transcoding system transmits information about the unadjusted or adjusted time intervals to the transcoders. A transcoder that receives an adjusted time interval loads and decodes the video content part corresponding to the adjusted time interval within the video content. Because a key frame is located at the start point of the time interval allocated to each transcoder, the transcoders can decode the frames of their allocated time intervals without frame loss.

The T-th transcoder loads frames from the decoding start point of the T-th time interval and decodes them. The T-th transcoder continues decoding up to the adjusted end point of the T-th time interval.

The transcoder then performs encoding according to the unadjusted time interval rather than the adjusted one. In other words, the video content part decoded over the adjusted time interval is encoded over the original time interval.

The T-th transcoder starts encoding from the frame located at the start point of the T-th time interval, not from the decoding start point of the T-th time interval. That is, the frame located at the start point of the T-th time interval is encoded as a key frame.
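The split between the decoding range and the encoding range can be sketched as follows. Frames are represented only by their indices, decoding is a stand-in that lists the indices, and labeling every non-first output frame as "P" is an assumption made for brevity; the description only requires the first encoded frame to be a key frame. `transcode_chunk` is a hypothetical name.

```python
def transcode_chunk(original, adjusted):
    """Decode over the adjusted interval, encode over the original one.

    Both intervals are (start, end) frame indices with an exclusive end.
    Returns a list of (frame_index, frame_type) for the encoded chunk.
    """
    o_start, o_end = original
    a_start, a_end = adjusted
    if not (a_start <= o_start and o_end <= a_end):
        raise ValueError("adjusted interval must cover the original interval")

    decoded = list(range(a_start, a_end))  # stand-in for real decoding

    encoded = []
    for index in decoded:
        if o_start <= index < o_end:       # keep only the original range
            frame_type = "I" if index == o_start else "P"
            encoded.append((index, frame_type))
    return encoded


chunk = transcode_chunk(original=(20, 40), adjusted=(12, 48))
```

Frames 12 to 19 and 40 to 47 are decoded only so that frames 20 to 39 can be reconstructed correctly; they are not written to this chunk's output.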

As a result, the data chunk corresponding to each time interval is transcoded without frame loss.

The transcoder then stores the data chunks transcoded according to the given time intervals. The transcoded data chunks are merged to form the transcoded video content.
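Merging can be sketched as ordering the chunks by start point and concatenating them. The contiguity check is an addition of this sketch, not something the description requires, and the `(start, end, payload)` tuple layout is hypothetical.

```python
def merge_chunks(chunks):
    """Concatenate transcoded chunks in playback order.

    Each chunk is (start, end, payload), where payload is a list of
    encoded frames and end is exclusive.
    """
    merged = []
    expected_start = None
    for start, end, payload in sorted(chunks, key=lambda c: c[0]):
        if expected_start is not None and start != expected_start:
            raise ValueError(f"gap or overlap before frame {start}")
        merged.extend(payload)
        expected_start = end
    return merged


# Chunks may finish out of order when transcoders run in parallel.
chunks = [(4, 8, ["e", "f", "g", "h"]), (0, 4, ["a", "b", "c", "d"])]
video = merge_chunks(chunks)
```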

FIG. 13 is a flowchart illustrating a parallel transcoding process according to an embodiment of the present invention.

The parallel transcoding method according to an embodiment of the present invention is applied to video content that includes a key frame at each given frame interval.

Referring to FIG. 13, a computing device for parallel transcoding divides the playback time of the video content into a plurality of time intervals (S1300).

According to an embodiment of the present invention, the computing device may determine the size of the time intervals in consideration of at least one of computing resources, the playback environment of the user terminal, or the network state.

In addition, the computing device generates a plurality of transcoders based on the number of time intervals.

Based on the positions of the key frames, the computing device may adjust at least one of the start point or the end point of one time interval among the plurality of time intervals.

The computing device adjusts the start point of the one time interval to the position of the key frame closest to the start point among the key frames located before the start point (S1302).

The computing device adjusts the end point of the one time interval to the position of the key frame closest to the end point among the key frames located after the end point (S1304).

Alternatively, the computing device may adjust the start point to the position of the key frame closest to the start point, or adjust the end point to the position of the key frame closest to the end point.

The computing device decodes the video content part corresponding to the adjusted time interval using one transcoder among the plurality of transcoders (S1306).

The computing device may have the video content part decoded by transmitting the adjusted time interval, together with transcoding parameters, to the one transcoder.

The transcoder loads and decodes the video content part corresponding to the adjusted time interval within the video content. Because a key frame is located at the start point of the adjusted time interval, the transcoder can begin decoding from that start point. In addition, because the end point of the adjusted time interval corresponds to the start point of the next time interval and of the next frame interval, the transcoder can decode up to the start point of the next time interval.

Meanwhile, according to another embodiment of the present invention, the transcoding parameters include a time interval between key frames in the transcoded video content. During encoding, each transcoder generates key frames in consideration of the time interval between key frames in the transcoded video content. The key frames serve as reference points for playback operations such as playback, reverse playback, and skipping.

The computing device uses the transcoder to encode the decoded video content part within the range of the one time interval (S1308).

The transcoder encodes the decoded video content part within the range corresponding to the one time interval. The encoded video content part becomes a transcoded data chunk.

Finally, transcoded data chunks are generated for each time interval without frame loss. The transcoded data chunks are merged into the transcoded video content, which is transmitted to the user terminal.
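Steps S1300 to S1308 can be tied together in a short end-to-end sketch. Everything here is illustrative: positions are frame indices, the worker threads stand in for transcoders, decoding and encoding are reduced to selecting frame indices, and all function names are hypothetical. One worker per time interval mirrors the statement that transcoders are generated based on the number of time intervals.

```python
from bisect import bisect_left, bisect_right
from concurrent.futures import ThreadPoolExecutor


def split(total_frames, interval_frames):
    """S1300: divide the content into (start, end) intervals."""
    return [(s, min(s + interval_frames, total_frames))
            for s in range(0, total_frames, interval_frames)]


def adjust(interval, key_frames):
    """S1302/S1304: snap start back and end forward to key frames."""
    start, end = interval
    i = bisect_right(key_frames, start) - 1
    j = bisect_left(key_frames, end)
    return (key_frames[i] if i >= 0 else start,
            key_frames[j] if j < len(key_frames) else end)


def transcode(original, adjusted):
    """S1306/S1308: decode the adjusted range, encode the original range."""
    decoded = range(adjusted[0], adjusted[1])
    return [f for f in decoded if original[0] <= f < original[1]]


def parallel_transcode(total_frames, interval_frames, key_frames):
    intervals = split(total_frames, interval_frames)
    adjusted = [adjust(iv, key_frames) for iv in intervals]
    # One worker per time interval; map() returns results in input order.
    with ThreadPoolExecutor(max_workers=len(intervals)) as pool:
        chunks = list(pool.map(transcode, intervals, adjusted))
    return [frame for chunk in chunks for frame in chunk]  # merge


key_frames = list(range(0, 100, 12))  # assumed: key frame every 12 frames
result = parallel_transcode(100, 20, key_frames)
```

Every source frame appears exactly once in `result`, even though the interval boundaries (every 20 frames) do not line up with the key frames (every 12 frames).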

FIGS. 4 to 8 and FIG. 13 describe the processes as being executed sequentially, but this merely illustrates the technical idea of an embodiment of the present invention. In other words, a person of ordinary skill in the art to which an embodiment of the present invention pertains could, without departing from the essential characteristics of the embodiment, change the order described in FIGS. 4 to 8 and FIG. 13 or execute one or more of the processes in parallel, applying various modifications and variations. FIGS. 4 to 8 and FIG. 13 are therefore not limited to a time-series order.

Meanwhile, the processes shown in FIGS. 4 to 8 and FIG. 13 can be implemented as computer-readable code on a computer-readable recording medium. A computer-readable recording medium includes all types of recording devices that store data readable by a computer system. That is, such a computer-readable recording medium includes non-transitory media such as ROM, RAM, CD-ROM, magnetic tape, floppy disks, and optical data storage devices. In addition, the computer-readable recording medium may be distributed over computer systems connected through a network, so that the computer-readable code is stored and executed in a distributed manner.

In addition, the components of the present invention may use integrated circuit structures such as a memory, a processor, a logic circuit, or a look-up table. Such integrated circuit structures execute each of the functions described herein under the control of one or more microprocessors or other control devices. The components of the present invention may also be embodied by a program or a portion of code that includes one or more executable instructions for performing a specific logical function and that is executed by one or more microprocessors or other control devices. The components of the present invention may include, or be implemented by, a central processing unit (CPU), a microprocessor, or the like that performs each function. The components of the present invention may also store instructions executed by one or more processors in one or more memories.

The above description merely illustrates the technical idea of the present embodiment, and a person of ordinary skill in the art to which the present embodiment pertains could make various modifications and variations without departing from its essential characteristics. Accordingly, the present embodiments are intended to explain, not to limit, the technical idea of the present embodiment, and the scope of that technical idea is not limited by these embodiments. The scope of protection of the present embodiment should be construed according to the claims below, and all technical ideas within a scope equivalent thereto should be construed as falling within the scope of rights of the present embodiment.

Claims

A computer-implemented method for parallel transcoding of video content that includes key frames at given frame intervals, the method comprising:
dividing a playback time of the video content into a plurality of time intervals;
adjusting at least one of a start point or an end point of one time interval among the plurality of time intervals based on positions of the key frames;
decoding a video content part corresponding to the adjusted time interval by using one transcoder among a plurality of transcoders; and
encoding the decoded video content part within a range of the one time interval by using the one transcoder.

The method of claim 1,
wherein the adjusting of at least one of the start point or the end point of the one time interval comprises:
adjusting the start point to a position of a key frame closest to the start point, or adjusting the end point to a position of a key frame closest to the end point.

The method of claim 1,
wherein the adjusting of the start point of the one time interval comprises:
adjusting the start point to a position of a key frame closest to the start point among key frames located before the start point.

The method of claim 1,
wherein the adjusting of the end point of the one time interval comprises:
adjusting the end point to a position of a key frame closest to the end point among key frames located after the end point.

The method of claim 1,
wherein the one transcoder loads and decodes the video content part corresponding to the adjusted time interval within the video content.

The method of claim 1,
wherein the encoding of the decoded video content part comprises:
setting, as a key frame, a frame located at the start point of the one time interval among frames included in the decoded video content part.

The method of claim 1, further comprising:
determining a size of the time intervals in consideration of at least one of a computing resource, a playback environment of a user terminal, or a network state.

The method of claim 1, further comprising:
generating the plurality of transcoders based on the number of the time intervals.

A device comprising:
a memory storing instructions; and
at least one processor,
wherein the at least one processor, by executing the instructions:
divides a playback time of video content that includes key frames at given frame intervals into a plurality of time intervals;
adjusts at least one of a start point or an end point of one time interval among the plurality of time intervals based on positions of the key frames;
decodes a video content part corresponding to the adjusted time interval by using one transcoder among a plurality of transcoders; and
encodes the decoded video content part within a range of the one time interval by using the one transcoder.

A computer-readable recording medium on which a computer program for executing the method of any one of claims 1 to 8 is recorded.