KR101823321B1

KR101823321B1 - Systems and methods of encoding multiple video streams with adaptive quantization for adaptive bitrate streaming

Info

Publication number: KR101823321B1
Application number: KR1020157036373A
Authority: KR
Inventors: Sam Orton-Jay; Ivan Vladimirovich Naletov
Original assignee: Sonic IP, Inc.
Priority date: 2013-05-24
Filing date: 2014-05-23
Publication date: 2018-01-31
Also published as: KR20160021141A; CN105359511A; EP3005689A4; EP3005689A1; JP2016526336A; KR20180010343A; WO2014190308A1

Abstract

A method for encoding source video as alternative video streams includes: collecting statistics on source video data in a first pass through received multimedia content and writing the statistics to a shared memory, where the statistics include complexity measures of blocks of pixels; determining initial encoding information for the source video data during the first pass and writing the initial encoding information to the shared memory; and encoding the source video data in parallel, using the collected statistics and initial encoding information, to produce the alternative video streams during a second pass. The parallel encoding processes reuse additional encoding information that has already been determined for a portion of the video and generate additional encoding information that has not already been determined for a portion of the video, where the additional encoding information includes quantization parameters for blocks of pixels.

Description

SYSTEMS AND METHODS FOR ENCODING MULTIPLE VIDEO STREAMS WITH ADAPTIVE QUANTIZATION FOR ADAPTIVE BITRATE STREAMING

The present invention relates generally to video encoding and, more specifically, to systems and methods for efficiently encoding multiple streams of video content for adaptive bitrate streaming from a source video stream.

The term streaming media describes the playback of media on a playback device, where the media is stored on a server and is continuously sent to the playback device over a network during playback. Typically, the playback device stores a sufficient quantity of media in a buffer at any given time during playback to prevent disruption of playback caused by the playback device completing playback of all the buffered media before it receives the next portion of media. Adaptive bitrate streaming, or adaptive streaming, involves detecting the present streaming conditions (e.g., the user's network bandwidth and CPU capacity) in real time and adjusting the quality of the streamed media accordingly. Typically, the source media is encoded at multiple bitrates, and the playback device or client switches between streaming the different encodings depending on available resources. When a playback device commences adaptive bitrate streaming, it typically starts by requesting portions of media from the lowest bitrate stream (where alternative streams are available). As the playback device downloads the requested media, it can measure the available bandwidth. If additional bandwidth is available, the playback device can switch to higher bitrate streams.
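The client-side behavior described above (start at the lowest bitrate, measure bandwidth, switch up when bandwidth allows) can be sketched as follows. This is only an illustration, not the selection logic of any particular player: the function name `pick_stream` and the `safety_factor` headroom parameter are assumptions that do not appear in the source.

```python
from typing import Optional, Sequence


def pick_stream(bitrates_bps: Sequence[int],
                measured_bandwidth_bps: Optional[float],
                safety_factor: float = 0.8) -> int:
    """Return the bitrate of the alternative stream to request next.

    `safety_factor` is a hypothetical headroom margin, not from the source.
    """
    ladder = sorted(bitrates_bps)
    if measured_bandwidth_bps is None:
        # No measurement yet: begin with the lowest bitrate stream.
        return ladder[0]
    budget = measured_bandwidth_bps * safety_factor
    affordable = [b for b in ladder if b <= budget]
    # Highest stream that fits the budget, else fall back to the lowest.
    return affordable[-1] if affordable else ladder[0]
```

A client would call `pick_stream` again after each downloaded portion, passing the bandwidth it measured during that download.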

In adaptive streaming systems, the source media is typically stored on a media server as a top-level index file indicating a number of alternative streams that contain the actual video and audio data. Each stream is typically stored in one or more container files. Different adaptive streaming solutions typically utilize different index and media containers. The Matroska container is a media container developed as an open standard project by the Matroska non-profit organization of Aussonne, France. The Matroska container is based upon Extensible Binary Meta Language (EBML), which is a binary derivative of the Extensible Markup Language (XML). Decoding of the Matroska container is supported by many consumer electronics (CE) devices. The DivX Plus file format developed by DivX, LLC of San Diego, California utilizes an extension of the Matroska container format, including elements that are not specified within the Matroska format. Other commonly used media container formats are the MP4 container format specified in MPEG-4 Part 14 (i.e., ISO/IEC 14496-14) and the MPEG transport stream (TS) container specified in MPEG-2 Part 1 (i.e., ISO/IEC Standard 13818-1). The MP4 container format is utilized in IIS Smooth Streaming and Flash Dynamic Streaming. The TS container is used in HTTP Adaptive Bitrate Streaming. The video in the alternative streams can be encoded pursuant to a variety of block-oriented video compression standards (or codecs), such as High Efficiency Video Coding (HEVC/H.265), specified jointly by the ISO/IEC Moving Picture Experts Group (MPEG) and the International Telecommunication Union Telecommunication Standardization Sector (ITU-T) of Geneva, Switzerland, and the H.264/MPEG-4 AVC (Advanced Video Coding) standard specified by the ITU-T.

The present invention relates to systems and methods for efficiently encoding multiple streams of video content for adaptive bitrate streaming from a source video stream.

Systems and methods for encoding multiple video streams for adaptive bitrate streaming in accordance with embodiments of the invention are disclosed. In one embodiment, a source encoder configured to encode source video as a plurality of alternative video streams includes a source encoder application, a shared memory, and a parallel processing system, where the parallel processing system is configured by the source encoding application to: receive multimedia content, where the multimedia content includes source video data having a primary resolution; collect statistics on the source video data in a first pass through the received multimedia content and write the statistics to the shared memory, where the statistics include complexity measures of blocks of pixels; determine initial encoding information for the source video data during the first pass through the received multimedia content and write the initial encoding information to the shared memory; and encode the source video data in parallel, using the collected statistics and initial encoding information, to produce a plurality of alternative video streams during a second pass through the received multimedia content with a plurality of parallel encoding processes, where the encoding of the source video utilizes additional encoding information, the parallel encoding processes are configured to reuse additional encoding information that has already been determined for a portion of video by another parallel encoding process and stored in the shared memory, the parallel encoding processes are configured to generate additional encoding information that has not already been determined for a portion of video by another parallel encoding process and to store the generated additional encoding information in the shared memory, and the additional encoding information includes quantization parameters for blocks of pixels.
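The two-pass arrangement in this embodiment can be illustrated with a toy sketch: a first pass writes per-block complexity statistics to a shared store, and parallel second-pass workers either reuse a quantization parameter already placed in the store or generate and store it. Everything here is an assumption made for illustration, including the `SharedStore` class, the mean-absolute-deviation complexity measure, and the `toy_qp` mapping; the source does not specify these details, and a real encoder would also scale shared information between resolutions.

```python
from concurrent.futures import ThreadPoolExecutor
from threading import Lock


class SharedStore:
    """Hypothetical stand-in for the shared memory described above."""

    def __init__(self):
        self._lock = Lock()
        self.stats = {}    # first-pass complexity measure per block index
        self.extra = {}    # additional encoding information (here: QP) per block
        self.computed = 0  # how many times information had to be generated

    def get_or_create(self, key, factory):
        # Reuse information another process already stored, else generate it.
        with self._lock:
            if key not in self.extra:
                self.extra[key] = factory()
                self.computed += 1
            return self.extra[key]


def toy_qp(complexity):
    # Placeholder mapping (an assumption); 51 is the top of the 8-bit QP range.
    return min(51, 20 + int(complexity))


def first_pass(blocks, store):
    # Complexity measure assumed here: mean absolute deviation of the pixels.
    for idx, pixels in enumerate(blocks):
        mean = sum(pixels) / len(pixels)
        store.stats[idx] = sum(abs(p - mean) for p in pixels) / len(pixels)


def encode_stream(name, blocks, store):
    # Second pass for one alternative stream; "encoding" is reduced to QP choice.
    out = []
    for idx in range(len(blocks)):
        qp = store.get_or_create(idx, lambda: toy_qp(store.stats[idx]))
        out.append((idx, qp))
    return name, out


def encode_all(blocks, stream_names):
    store = SharedStore()
    first_pass(blocks, store)
    with ThreadPoolExecutor(max_workers=len(stream_names)) as pool:
        futures = [pool.submit(encode_stream, n, blocks, store) for n in stream_names]
        streams = dict(f.result() for f in futures)
    return store, streams
```

Because every worker goes through `get_or_create`, each block's quantization parameter is generated once regardless of how many streams are encoded, which is the source of the time savings the text describes.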

In a further embodiment, the statistics on the source video data include statistics selected from the group consisting of: average quantization parameter, size of header bits, size of texture bits, number of intra blocks, number of inter blocks, and number of skip blocks.

In another embodiment, the parallel processing system being configured to determine initial encoding information for the source video data also includes the parallel processing system being configured to calculate a frame complexity measure.

In a further embodiment, the parallel encoding processes being configured to generate additional encoding information that has not already been determined for a portion of video by another parallel encoding process includes determining a Coding Tree Unit (CTU) size for encoding a portion of a frame of video in the source video data.

In another embodiment, determining a CTU size for encoding a portion of a frame of video in the source video data includes: selecting a portion of a frame of video to encode as at least one output CTU in a first output stream; checking whether a size has been determined for a similar CTU; selecting a CTU size if a size has not been determined for the similar CTU; selecting a previously determined CTU size that was determined for a second output stream and comparing the resolution of the first output stream to the resolution of the second output stream if a size has been determined for a similar CTU; scaling the CTU size if the resolution of the first output stream is not the same as the resolution of the second output stream; determining whether the selected CTU size is acceptable for the output CTU; selecting a smaller CTU size when the selected CTU size is not acceptable; and applying the selected CTU size to the portion of the frame of video if the selected CTU size is acceptable for the output CTU.
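The CTU size decision above can be sketched as a small function. This is a minimal sketch under stated assumptions: resolution is compared by frame width only, scaled sizes are snapped down to the CTU sizes HEVC permits (64, 32, 16), and the acceptability test is passed in as a caller-supplied predicate because the source does not define it. The names `choose_ctu_size`, `snap_to_valid`, and `is_acceptable` are hypothetical.

```python
VALID_CTU_SIZES = (64, 32, 16)  # CTU sizes permitted by HEVC, largest first


def snap_to_valid(size):
    """Largest valid CTU size not exceeding `size` (never below the smallest)."""
    for valid in VALID_CTU_SIZES:
        if valid <= size:
            return valid
    return VALID_CTU_SIZES[-1]


def choose_ctu_size(previous, out_width, is_acceptable, default=64):
    """Pick a CTU size for a portion of a frame in the first output stream.

    previous: None, or (size, width) already determined for the similar CTU
              in a second output stream.
    is_acceptable: hypothetical predicate standing in for the encoder's test.
    """
    if previous is None:
        size = default                       # nothing to reuse: select a size
    else:
        prev_size, prev_width = previous
        size = prev_size                     # reuse the earlier decision
        if prev_width != out_width:          # resolutions differ: scale it
            size = snap_to_valid(prev_size * out_width / prev_width)
    # Fall back to smaller sizes until the size is acceptable.
    while not is_acceptable(size) and size > VALID_CTU_SIZES[-1]:
        size //= 2
    return size
```

For example, a 64x64 decision made for a 1920-pixel-wide stream scales to 32x32 when reused for a 960-pixel-wide stream.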

In a further embodiment, the parallel processing system being configured to determine initial encoding information for the source video data also includes the parallel processing system being configured to determine a mode distribution for at least one frame of video in at least one of the plurality of alternative video streams.

In another embodiment, the parallel processing system being configured to encode the source video data in parallel using the collected statistics and initial encoding information to produce a plurality of alternative video streams also includes the parallel processing system being configured to: maintain a count of blocks processed in a frame of video in an alternative video stream; determine a threshold number of blocks based on the mode distribution; and adjust criteria for block type decisions if the count of blocks meets the threshold number of blocks.
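One way to picture the count, threshold, and adjustment steps is the sketch below. The source does not say how the criteria are adjusted, so the approach here is purely an assumption: once a block type has reached the number of blocks predicted by the mode distribution, its cost is multiplied by a penalty so other types are favored. The class name `BlockTypeBudget` and the `penalty` parameter are hypothetical.

```python
class BlockTypeBudget:
    """Tracks per-type block counts against thresholds from a mode distribution."""

    def __init__(self, mode_distribution, total_blocks, penalty=1.5):
        # Threshold number of blocks per type, derived from the distribution.
        self.thresholds = {mode: round(fraction * total_blocks)
                           for mode, fraction in mode_distribution.items()}
        self.counts = {mode: 0 for mode in mode_distribution}
        self.penalty = penalty  # assumed adjustment factor, not from the source

    def choose(self, costs):
        """Pick the cheapest block type after adjusting the decision criteria."""
        adjusted = {}
        for mode, cost in costs.items():
            met = self.counts[mode] >= self.thresholds[mode]
            adjusted[mode] = cost * (self.penalty if met else 1.0)
        mode = min(adjusted, key=adjusted.get)
        self.counts[mode] += 1  # maintain the count of processed blocks
        return mode
```

With a distribution of 25% intra, 50% inter, and 25% skip over four blocks, the first intra decision uses up the intra budget, so later blocks with similar costs lean toward inter.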

In a still further embodiment, the parallel encoding processes being configured to reuse additional encoding information that has already been determined for a portion of video by another encoding process and stored in the shared memory also includes the parallel encoding processes being configured to: determine whether a motion vector exists for a second corresponding block in a second alternative stream when encoding a first block in a video frame in a first alternative stream; determine whether the first alternative stream and the second alternative stream are the same resolution; scale the motion vector if the first alternative stream and the second alternative stream are not the same resolution; refine the motion vector; and apply the motion vector in encoding the first block in the video frame in the first alternative stream.
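The motion vector reuse steps can be sketched as follows. This is an illustration under assumptions: motion vectors are integer `(dx, dy)` pairs, resolutions are `(width, height)` pairs, and refinement is a small exhaustive search around the scaled candidate using a caller-supplied cost function. The names `reuse_motion_vector` and `refine_around` are hypothetical, and the source does not specify the refinement method.

```python
def refine_around(candidate, cost, radius=1):
    """Return the lowest-cost vector within `radius` of the candidate."""
    cx, cy = candidate
    neighbours = [(cx + ox, cy + oy)
                  for ox in range(-radius, radius + 1)
                  for oy in range(-radius, radius + 1)]
    return min(neighbours, key=cost)


def reuse_motion_vector(mv, src_res, dst_res, cost):
    """Adapt a motion vector found for the corresponding block in another stream.

    mv: (dx, dy) from the second stream, or None if no vector exists.
    src_res, dst_res: (width, height) of the second and first streams.
    cost: assumed matching-cost function for a candidate vector.
    """
    if mv is None:
        return None  # nothing to reuse; a full motion search would be needed
    dx, dy = mv
    if src_res != dst_res:
        # Resolutions differ: scale each component by the size ratio.
        dx = round(dx * dst_res[0] / src_res[0])
        dy = round(dy * dst_res[1] / src_res[1])
    return refine_around((dx, dy), cost)
```

Refining around a scaled candidate examines only a handful of positions, compared with the much larger window a search from scratch would cover.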

In still another embodiment, the initial encoding information also includes header size, macroblock size, and the relative proportion of header size to macroblock size.

In a further additional embodiment, the initial encoding information also includes hypothetical reference decoder data.

In another additional embodiment, each of the parallel encoding processes encodes at a different resolution.

In a further embodiment, each of the parallel encoding processes encodes one or more alternative video streams, and each of the alternative video streams encoded by a parallel encoding process is at a different bitrate.

In another embodiment, each of the parallel encoding processes encodes from the source video data into each stream in a subset of the plurality of alternative video streams sequentially, one after another.

In a still further embodiment, the additional encoding information includes rate distortion information and quantization parameters.

Still another embodiment includes: receiving multimedia content using a source encoder, where the multimedia content includes source video data having a primary resolution; collecting statistics on the source video data in a first pass through the received multimedia content and writing the statistics to a shared memory using the source encoder; determining initial encoding information for the source video data during the first pass through the received multimedia content and writing the initial encoding information to the shared memory using the source encoder, where the statistics include complexity measures of blocks of pixels; encoding the source video data in parallel, using the collected statistics, initial encoding information, and additional encoding information, to produce a plurality of alternative video streams during a second pass through the received multimedia content with a plurality of parallel encoding processes using the source encoder, where encoding the source video also includes reusing additional encoding information that has already been determined for a portion of video by another parallel encoding process and stored in the shared memory using at least one of the plurality of parallel encoding processes, and generating additional encoding information that has not already been determined for a portion of video by another of the plurality of parallel encoding processes; and storing the generated additional encoding information in the shared memory using a parallel encoder process, where the additional encoding information includes quantization parameters for blocks of pixels.

In a further additional embodiment, the statistics on the source video data include statistics selected from the group consisting of: average quantization parameter, size of header bits, size of texture bits, number of intra blocks, number of inter blocks, and number of skip blocks.

In another additional embodiment, determining initial encoding information for the source video data also includes calculating a frame complexity measure.

In a still further embodiment, generating additional encoding information that has not already been determined for a portion of video by another parallel encoding process also includes determining a Coding Tree Unit (CTU) size for encoding a portion of a frame of video in the source video data.

In still another embodiment, determining a CTU size for encoding a portion of a frame of video in the source video data also includes: selecting a portion of a frame of video to encode as at least one output CTU in a first output stream; checking whether a size has been determined for a similar CTU; selecting a CTU size if a size has not been determined for the similar CTU; selecting a previously determined CTU size that was determined for a second output stream and comparing the resolution of the first output stream to the resolution of the second output stream if a size has been determined for a similar CTU; scaling the CTU size if the resolution of the first output stream is not the same as the resolution of the second output stream; determining whether the selected CTU size is acceptable for the output CTU; selecting a smaller CTU size if the selected CTU size is not acceptable; and applying the selected CTU size to the portion of the frame of video if the selected CTU size is acceptable for the output CTU.

In a further additional embodiment, determining initial encoding information for the source video data also includes determining a mode distribution for at least one frame of video in at least one of the plurality of alternative video streams.

In another additional embodiment, encoding the source video data in parallel using the collected statistics, initial encoding information, and additional encoding information to produce a plurality of alternative video streams also includes: maintaining a count of blocks processed in a frame of video in an alternative video stream; determining a threshold number of blocks based on the mode distribution; and adjusting criteria for block type decisions if the count of blocks meets the threshold number of blocks.

In a still further additional embodiment, reusing additional encoding information that has already been determined for a portion of video by another parallel encoding process and stored in the shared memory also includes: determining whether a motion vector exists for a second corresponding block in a second alternative stream when encoding a first block in a video frame in a first alternative stream; determining whether the first alternative stream and the second alternative stream are the same resolution; scaling the motion vector if the first alternative stream and the second alternative stream are not the same resolution; refining the motion vector; and applying the motion vector in encoding the first block in the video frame in the first alternative stream.

In still another additional embodiment, the initial encoding information also includes header size, macroblock size, and the relative proportion of header size to macroblock size.

In a still further embodiment, the initial encoding information also includes hypothetical reference decoder data.

In a still further embodiment, each of the parallel encoding processes encodes at a different resolution.

In a further additional embodiment, each of the parallel encoding processes encodes one or more alternative video streams, and each of the alternative video streams encoded by a parallel encoding process is at a different bitrate.

In another additional embodiment, each of the parallel encoding processes encodes a block from the source video data into each stream in a subset of the plurality of alternative video streams sequentially, one after another.

In a still further additional embodiment, the additional encoding information includes rate distortion information and quantization parameters.

In a still further additional embodiment, the blocks of pixels are coding tree units.

In still another additional embodiment, the blocks of pixels are coding units.

In yet another further embodiment, the blocks of pixels are transform units.

In yet another further embodiment, the quantization parameters for the blocks of pixels are generated using the complexity measures for the blocks of pixels.

In yet another further embodiment, the quantization parameters for the blocks of pixels are generated using previously generated quantization parameters.

In still another additional embodiment, the quantization parameters for the blocks of pixels are generated using a distortion rate and a bit rate.

FIG. 1 is a system diagram of an adaptive streaming system in accordance with embodiments of the invention.
FIG. 2 conceptually illustrates a media server configured to encode streams of video data for use in adaptive streaming systems in accordance with embodiments of the invention.
FIG. 3 is a flow chart illustrating a process for reusing block size information in encoding streams of video in accordance with embodiments of the invention.
FIG. 4 is a flow chart illustrating a process for adjusting block type decisions in encoding streams of video in accordance with embodiments of the invention.
FIG. 5 is a flow chart illustrating a process for reusing motion vectors in encoding streams of video in accordance with embodiments of the invention.
FIG. 6 is a flow chart illustrating a process for sharing statistics and encoding information in the encoding of alternative streams of video data in accordance with embodiments of the invention.
FIG. 7 is a flow chart illustrating a process for generating a complexity measure for a block of pixels in accordance with embodiments of the invention.
FIG. 8 is a flow chart illustrating a process for generating a quantization parameter for a block of pixels in accordance with embodiments of the invention.
FIG. 9 is a flow chart illustrating a process for generating a quantization parameter for a block of pixels using the quantization parameter for a similar block of pixels in accordance with embodiments of the invention.
FIG. 10A is a graph showing a plot of complexity measures of CTU blocks.
FIG. 10B is a graph showing a plot of complexity measures and quantization parameters of CTU blocks.

Turning now to the drawings, systems and methods for encoding multiple video streams for adaptive bitrate streaming in accordance with embodiments of the invention are illustrated. In accordance with embodiments of the invention, encoders can analyze media content for statistics about the content, determine encoding information used to encode the content, and encode the content as multiple video streams at different resolutions and bitrates. Although the present invention is described below with respect to adaptive streaming systems and block-based video encoding techniques such as HEVC/H.265 and H.264/MPEG-4 AVC, the systems and methods described are equally applicable in conventional streaming systems, where different streams of video data are selected based upon a network client's connection quality, and to video encoding techniques that are not block-based.

In adaptive streaming systems, multimedia content is encoded as a set of alternative streams of video data. Because each alternative stream of video data is encoded using the same source multimedia content, similar encoding information is determined in the encoding of each alternative stream. Encoding information can include, but is not limited to, frame complexity measures, selection of block sizes, block mode distribution, and motion estimation results. Systems and methods in accordance with many embodiments of the invention reuse encoding information determined in the encoding of one alternative stream of video data in the encoding of at least one other alternative stream. By reusing encoding information across several alternative streams, the encoding of the alternative streams can be improved significantly; in particular, significant time savings can be realized in accordance with embodiments of the invention.

Adaptive streaming systems are configured to stream multimedia content encoded at different maximum bitrates and resolutions over a network, such as the Internet. Adaptive streaming systems stream the highest quality multimedia content that can be supported based upon current streaming conditions. Multimedia content typically includes video and audio data, subtitles, and other related metadata. In order to provide the highest quality video experience independent of the network data rate, adaptive streaming systems are configured to switch between the available sources of video data throughout the delivery of the video data according to a variety of factors, including, but not limited to, the available network data rate and video decoder performance. When streaming conditions deteriorate, an adaptive streaming system typically attempts to switch to multimedia streams encoded at lower maximum bitrates. In the event that the available network data rate cannot support streaming of the stream encoded at the lowest maximum bitrate, playback is often disrupted until a sufficient amount of content can be buffered to restart playback. Systems and methods for switching between video streams during playback that can be utilized in an adaptive streaming system in accordance with embodiments of the invention are described in U.S. Patent Application Serial No. 13/221,682, entitled "Systems and Methods for Adaptive Bitrate Streaming of Media Stored in Matroska Container Files Using Hypertext Transfer Protocol," to Braness et al., filed August 30, 2011, the entirety of which is incorporated by reference.

To create the multiple sources of video data utilized in adaptive streaming systems, a source encoder can be configured to encode a plurality of alternative streams of video data from a source video contained in a piece of multimedia content. Systems and methods for encoding a source video for use in adaptive streaming systems are disclosed in U.S. Patent Application No. 13/221,794, entitled "Systems and Methods for Encoding Source Media in Matroska Container Files for Adaptive Bitrate Streaming Using Hypertext Transfer Protocol," to Braness et al., filed August 30, 2011, the entirety of which is incorporated by reference. In accordance with embodiments of the invention, a source encoder can be implemented using a media source and/or a media server.

As described above, alternative streams of video data based on the same source video contain similar content. Therefore, statistics determined from the source content and encoding information determined for one alternative stream of video data can be used in the encoding of one or more of the other alternative streams of video data. In accordance with embodiments of the invention, a set of alternative streams of video data based on the same source video may contain video data at the same resolution but at different bitrates. In many embodiments of the invention, the motion estimation results calculated for the encoding of a particular alternative stream of video data can be reused among the other alternative streams of video data. As is discussed below, a variety of statistics and encoding information determined in the encoding of alternative streams of video data can be reused among the alternative streams. Systems and methods for sharing statistics and encoding information when encoding alternative streams of video data in accordance with embodiments of the invention are discussed further below.

Adaptive Streaming System Architecture

Adaptive streaming systems in accordance with embodiments of the invention are configured to generate multiple streams of video to be made available for streaming to user devices. In many embodiments of the invention, an adaptive streaming system includes a source encoding server that performs the encoding of multiple streams of video from source media. An adaptive streaming system in accordance with embodiments of the invention is illustrated in FIG. 1. The illustrated adaptive streaming system 10 includes a source encoding server 12 configured to encode source media as a plurality of alternative streams. The source media may be stored on the encoding server 12 or retrieved from a media server 13. As is discussed further below, the source encoding server 12 generates container files containing the encoded streams, at least a plurality of which are alternative streams of encoded video. The encoding server makes a first pass to collect statistics on the content at each output resolution and a second pass to encode the content into multiple output streams, where the streams can have various resolutions and bitrates. In some embodiments, the first pass is completed before the second pass begins. In other embodiments, the second pass can begin before the first pass is completed. In other words, the computational processes for the first and second passes can run simultaneously, provided that frames are processed by the first pass process(es) before they are processed by the second pass process(es). The container files are uploaded to a content server 14, which can be an HTTP server. A variety of playback devices 18, 20, and 22 can then request portions of the encoded streams from the content server 14 via a network 16 such as the Internet.

Although a specific adaptive streaming system for delivering media content streams is discussed above with respect to FIG. 1, any of a variety of streaming systems can be utilized to deliver media content streams in accordance with embodiments of the invention.

Source Encoders

In the illustrated embodiment, the adaptive bitrate streaming system includes one or more source encoders capable of encoding a source stream of video content into alternative streams of encoded video having different resolutions and/or bitrates. In many embodiments, the source encoder can be implemented using any device capable of encoding streams of multimedia, where the streams are encoded at different resolutions, sampling rates, and/or maximum bitrates. The basic architecture of an adaptive streaming system source encoder in accordance with an embodiment of the invention is illustrated in FIG. 2. The source encoder 200 includes a processor 210 in communication with memory 230 and a network interface 240. In the illustrated embodiment, the volatile memory 230 includes a source encoding application 250. The processor is configured by the source encoding application 250 to encode a plurality of streams of video data from source video data 260, which is also in volatile memory. The source video data 260 may already be present in memory or may be received via the network interface 240.

In a number of embodiments, the source encoder includes multiple processors and the encoding process can be distributed among the multiple processors. In many embodiments, the source encoding application 250 can launch multiple processes that execute on one or more processors, where each process is an encoder controller 270 that encodes one or more output streams. In further embodiments, each encoder controller encodes multiple output streams at the same resolution and at different bitrates. In several embodiments, an encoder controller for each of three output resolutions is launched to be executed on one or more processors, where the output resolutions are 768×432, 1280×720, and 1920×1080. In some embodiments, an encoder controller 270 encodes 768×432 output streams at two different bitrates, an encoder controller 280 encodes 1280×720 output streams at three different bitrates, and an encoder controller 290 encodes 1920×1080 output streams at three different bitrates. The encoder controllers 270, 280, and 290 typically reside in memory 230 while they are executing. In accordance with many embodiments of the invention, the encoder controllers 270, 280, and 290 have shared data buffers 295 in memory for the exchange of statistics, encoding information, and other information between the controllers.
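The controller layout described above can be pictured with a minimal sketch. The class name `EncoderController`, the bitrate values, and the dictionary standing in for the shared data buffers 295 are illustrative assumptions and are not taken from the specification; only the three resolutions and the 2/3/3 split of bitrates come from the text.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class EncoderController:
    """Hypothetical stand-in for one encoder controller process.

    Each controller handles a single output resolution and encodes that
    resolution at several bitrates.
    """
    width: int
    height: int
    bitrates_kbps: List[int]  # illustrative values only


# Resolutions and stream counts follow the text; the bitrates are assumed.
controllers = [
    EncoderController(768, 432, [600, 1000]),           # controller 270
    EncoderController(1280, 720, [1500, 2400, 3500]),   # controller 280
    EncoderController(1920, 1080, [4500, 6000, 8000]),  # controller 290
]

# Assumed shape for the shared data buffers: frame index -> shared record.
shared_buffers: Dict[int, dict] = {}


def output_streams(ctrls: List[EncoderController]) -> List[Tuple[int, int, int]]:
    """List every (width, height, bitrate) output stream across all controllers."""
    return [(c.width, c.height, b) for c in ctrls for b in c.bitrates_kbps]
```

With this layout, `output_streams(controllers)` enumerates eight alternative streams, and any controller could read or write `shared_buffers` to exchange statistics with the others.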

Although a specific architecture for a source encoder is illustrated in FIG. 2, any of a variety of architectures, including architectures in which the video encoder 250 is located on disk or some other form of storage and is loaded into memory 230 at runtime, can be utilized to encode multimedia content in accordance with embodiments of the invention. Systems and methods for the reuse of statistics and encoding information in the encoding of alternative streams of video data in accordance with embodiments of the invention are discussed further below.

Collecting and Using Statistics and Encoding Information

In many embodiments of the invention, statistics and encoding information are determined for a piece of media content before the content is encoded into multiple output streams. As is discussed in greater detail further below, the statistics and encoding information can be saved and shared between encoding processes to accelerate decision making about how to encode the content. Statistics to collect in a first pass can include (but are not limited to): average quantization parameter, size of header bits, size of texture bits, number of intra macroblocks/CTUs, number of inter macroblocks/CTUs, and number of skip macroblocks/CTUs. Encoding information can include (but is not limited to): frame complexity measures, coding tree unit (CTU) structure, mode distribution, and motion information. The collection and use of encoding information in the encoding of alternative streams of video in accordance with embodiments of the invention are discussed below.

Frame Complexity Measure

Before encoding source content, a frame complexity measure can be assigned to each frame of the content. The frame complexity measure represents the level of complexity of the visual information in a frame and thereby gives an indication of the amount of data (i.e., in bits) that will result in the output content stream from encoding that frame. Algorithms, including those known in the art, can be utilized in accordance with embodiments of the invention to calculate a frame complexity measure. Such algorithms may consider inter (i.e., between) and intra (i.e., within) frame measurements, such as the deviation of values such as color and brightness from average values across pixels in a frame or corresponding pixels in multiple frames, and/or similarities between pixels and/or blocks of pixels in a frame and those in previous frames. Furthermore, the algorithms may calculate measurements similar to those used in encoding content when performing motion estimation.

As will be discussed further below, the frame complexity measure can be used in choosing the parameters with which to encode a frame from an input stream into a frame in an output stream. In many embodiments of the invention, a frame complexity measure is assigned to each frame of the content and is expressed as an integer, where a greater value indicates greater complexity within the frame. In further embodiments of the invention, a frame complexity measure is assigned to a frame in an input stream and the same measure is used to choose the parameters for encoding the corresponding frames in a plurality of alternative output streams.

Bitrate control algorithms, including those known in the art, can be utilized in accordance with embodiments of the invention to determine how many bits to allocate to each encoded frame of video and the buffer levels of the output stream, and to choose the quantization levels to apply when encoding each frame of video. In accordance with embodiments of the invention, these algorithms can utilize the frame complexity measure that was computed once for an input content stream and reuse the measure to determine the quantization level for multiple output streams encoded from that input stream.
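The idea of computing a complexity measure once and reusing it for several output bitrates can be sketched as follows. This is a toy illustration under stated assumptions: the mean-absolute-deviation measure, the logarithmic rate model, its constants, and the names `frame_complexity` and `pick_qp` are all hypothetical and do not come from the specification, which leaves the choice of algorithm open.

```python
import math
from typing import List


def frame_complexity(luma_rows: List[List[int]]) -> int:
    """Assumed integer complexity: mean absolute deviation of luma from the frame mean."""
    pixels = [p for row in luma_rows for p in row]
    mean = sum(pixels) / len(pixels)
    return round(sum(abs(p - mean) for p in pixels) / len(pixels))


def pick_qp(complexity: int, target_kbps: int, fps: float = 30.0,
            qp_min: int = 10, qp_max: int = 51) -> int:
    """Toy rate model (an assumption): QP rises with complexity, falls with bit budget."""
    bits_per_frame = target_kbps * 1000 / fps
    qp = 6.0 * math.log2((complexity + 1) * 4096 / bits_per_frame) + 26
    return max(qp_min, min(qp_max, round(qp)))


# A small synthetic frame of high-contrast vertical stripes.
frame = [[16, 235] * 8 for _ in range(16)]

# The measure is computed once for the input frame ...
complexity = frame_complexity(frame)

# ... and reused to pick a quantization level for each output bitrate.
qp_by_bitrate = {kbps: pick_qp(complexity, kbps) for kbps in (600, 1500, 4500)}
```

In this sketch the lower-bitrate streams receive coarser quantization from the same complexity value, so the per-frame analysis is not repeated for each stream.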

Block Size Decisions

In block-oriented encoding standards, pixels are taken as partitions or "blocks" in the encoding process. In some current standards, the block size is variable, so that partitions in a frame can have different sizes. The HEVC standard uses partitions known as coding tree units (CTUs), where the block size of a CTU is typically 64×64, 32×32, 16×16, or 8×8 pixels. A frame of video can include a mixture of CTUs of different sizes and arrangements. Often, the block sizes of the CTUs in a frame (or other "block" partitions of pixels according to the relevant video compression standard) are chosen based upon efficiency and the resulting size in bits of the encoded frame. An encoder in accordance with embodiments of the invention can accelerate decisions on the block sizes to use when encoding frames of content for an output stream when similar decisions have already been made for the corresponding portions of frames in another alternative output stream.

An encoder typically chooses block sizes based at least in part on the size (in bits) of the resulting encoded output frame while maintaining the image quality of the frame. Generally, a smaller output frame (i.e., a more efficient one) is desired and can be achieved with higher compression by using larger block sizes in encoding. The encoder can try to use larger block sizes when encoding a portion of a frame and, if the quality or accuracy of the resulting image does not meet a threshold, try progressively smaller block sizes for that portion of the frame. For example, a 32×32 pixel portion of a frame can be tested for encoding as four 16×16 pixel CTUs. If any of the 16×16 pixel CTUs is not desirable, it can be further divided into four 8×8 pixel CTUs. In other video standards, the larger and smaller block sizes specified in the particular standard can be tested in a similar manner.
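The "try large, then split" search described above can be sketched as a recursive quadtree partition. The function name `partition` and the `acceptable` callback are hypothetical; in a real encoder the callback would stand for whatever quality or rate-distortion test is applied, which the specification does not prescribe.

```python
from typing import Callable, List, Tuple

Block = Tuple[int, int, int]  # (x, y, size) of a square block in pixels


def partition(x: int, y: int, size: int,
              acceptable: Callable[[int, int, int], bool],
              min_size: int = 8) -> List[Block]:
    """Keep a block if it passes the (assumed) quality test, else split it in four."""
    if size <= min_size or acceptable(x, y, size):
        return [(x, y, size)]
    half = size // 2
    blocks: List[Block] = []
    for dy in (0, half):
        for dx in (0, half):
            blocks += partition(x + dx, y + dy, half, acceptable, min_size)
    return blocks


def example_test(x: int, y: int, size: int) -> bool:
    """Illustrative test: 32x32 always fails, and the top-left 16x16 also fails."""
    return size <= 16 and not (x < 16 and y < 16)


# A 32x32 portion is split into 16x16 blocks; one of them is split again into 8x8.
blocks = partition(0, 0, 32, example_test)
```

Here the 32×32 portion ends up as three 16×16 blocks and four 8×8 blocks, mirroring the example in the text where only the undesirable 16×16 block is divided further.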

In accordance with embodiments of the invention, certain data can be retained and shared between encoder controllers that are encoding alternative output streams from the same input stream. The shared data can include the block sizes that were tested for portions of frames. For example, if an encoder controller has chosen to process a 32×32 pixel portion of a frame as 16×16 CTUs (or any other combination of CTU sizes), then when that encoder controller or another encoder controller encodes the same portion of the frame at the same resolution but a different bitrate, it can skip the previously made decision(s) and use the same size CTUs and/or test other CTU sizes. At higher bitrates, testing larger block sizes may be unnecessary, because a higher bitrate can accommodate more stored information and the higher compression obtained from larger block sizes is therefore not needed. Therefore, when an encoder controller encodes the same portion of the frame at a higher bitrate, it can skip the earlier test/decision not to encode with 32×32 pixel CTUs and simply use 16×16 CTUs and/or test whether 8×8 pixel CTUs would achieve the same or better quality while meeting the target bitrate.

Past CTU size choices can also be useful in choosing CTU sizes in a stream of a different resolution. For example, suppose that a CTU size of 32×32 was tested and rejected in favor of partitions of 16×16 or smaller for a portion of a frame in a stream with resolution 1920×1080. The same portion of the frame in a stream with resolution 3840×2160 would cover a size of 64×64. The encoder can skip checking a CTU size of 64×64 for that portion in the higher resolution frame, because the corresponding size of 32×32 was rejected in the lower resolution stream. The encoder can start by testing CTU sizes of 32×32 and smaller.
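The cross-resolution reasoning in this example reduces to a small calculation, sketched below. The helper name `candidate_sizes` is hypothetical, and scaling by the ratio of frame widths is an assumption that holds when both streams share the same aspect ratio.

```python
from typing import List

CTU_SIZES = (64, 32, 16, 8)  # sizes named in the text, largest first


def candidate_sizes(rejected_size: int, src_width: int, dst_width: int) -> List[int]:
    """Return the CTU sizes still worth testing in the destination stream.

    A size rejected in the source stream is scaled by the width ratio; that
    scaled size and anything larger are skipped in the destination stream.
    """
    scaled_rejected = rejected_size * dst_width // src_width
    return [s for s in CTU_SIZES if s < scaled_rejected]


# 32x32 rejected at 1920x1080 corresponds to 64x64 at 3840x2160,
# so testing in the higher resolution stream starts at 32x32.
uhd_candidates = candidate_sizes(32, 1920, 3840)
```

For the numbers in the text, `uhd_candidates` is `[32, 16, 8]`, which matches the statement that the encoder can begin with 32×32 and smaller sizes.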

A process for reusing block size information when encoding a stream of video in accordance with embodiments of the invention is illustrated in FIG. 3. The process includes checking (310) whether a size has been determined for a similar CTU (i.e., a corresponding CTU in a corresponding frame in another stream). If not, a CTU size is chosen (312) to be tested, typically starting from a larger size. If a size has been determined for a similar CTU, the resolution of the source stream of the similar CTU is compared (312) to that of the current stream. If the resolutions differ, the CTU size is scaled (314) to the current stream. The predetermined CTU size (if one was predetermined) or the chosen CTU size (if none was predetermined) is tested (316) for the current CTU being encoded. If the size is not acceptable, a smaller CTU size is tried (318). If the size is acceptable, the CTU size is applied (320) to the current CTU.
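The flow of FIG. 3 can be sketched as a single function. Everything concrete here is an assumption made for illustration: the `shared` dictionary standing in for the shared data buffers, the `key` that identifies a "similar CTU" across streams, scaling by frame width, snapping a scaled size down to the nearest valid CTU size, and the `acceptable` callback that represents the encoder's quality test.

```python
from typing import Callable, Dict, Hashable, Tuple

CTU_SIZES = (64, 32, 16, 8)  # largest first


def choose_ctu_size(shared: Dict[Hashable, Tuple[int, int]], key: Hashable,
                    cur_width: int, acceptable: Callable[[int], bool]) -> int:
    """Pick a CTU size, reusing a prior decision for a similar CTU when one exists."""
    prior = shared.get(key)                      # 310: was a size already determined?
    if prior is None:
        start = CTU_SIZES[0]                     # 312: choose a size, largest first
    else:
        prior_size, prior_width = prior
        start = prior_size
        if prior_width != cur_width:             # 312: compare resolutions
            start = prior_size * cur_width // prior_width  # 314: scale to this stream
        start = max(CTU_SIZES[-1], min(CTU_SIZES[0], start))

    chosen = CTU_SIZES[-1]
    for size in CTU_SIZES:
        if size > start:
            continue                             # sizes above the starting point are skipped
        # 316: test the size; the smallest size is accepted without testing
        if size == CTU_SIZES[-1] or acceptable(size):
            chosen = size
            break
        # 318: otherwise fall through and try the next smaller size

    shared[key] = (chosen, cur_width)            # 320: apply and record the decision
    return chosen
```

A first call for a given key walks down from 64×64, while a later call for the same key in another stream starts at the recorded (and, if needed, scaled) size, which is where the time saving comes from in this sketch.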

Although a specific process for reusing CTU size decisions when encoding a stream of video is discussed above with respect to FIG. 3, any of a variety of processes can be utilized to reuse CTU (or other "block" partitions of pixels according to the relevant video compression standard) size decisions when encoding multiple media content streams in accordance with embodiments of the invention. Moreover, one skilled in the art will appreciate that "block" partitions in video standards other than HEVC may have available sizes other than those discussed here, and that the techniques for choosing block sizes discussed here are equally applicable. As will be discussed further below, decisions made by one encoder controller (i.e., a software process encoding multiple streams at the same resolution and different bitrates) can be stored and reused by that encoder controller or by another encoder controller encoding a different resolution.

Mode Distribution and Mode Choices

Mode distribution refers to the overall ratio of intra, inter, and skip macroblocks (or other "block" partitions of pixels according to the relevant video compression standard) within a frame of video. Predicted frames such as P frames and B frames can contain intra (encoded using only information within the frame) and/or inter (encoded with reference to information in another frame) macroblocks. Under different video compression standards, block pixel partitions may be referred to variously as macroblocks or coding tree units (CTUs). Mode distribution can refer to block types regardless of whether the block itself is called a macroblock or a CTU. Tests and/or algorithms, including those known in the art, can be utilized to analyze each macroblock in a frame to determine which block type is the most efficient for encoding that block while maintaining image quality. For example, a measure of encoding efficiency can be the number of bits needed to encode a particular macroblock.

For encoding efficiency and image quality, it is often desirable to keep the mode distribution as close to the same as possible across corresponding frames in the alternative streams. For example, given a frame p with 30% intra blocks, 60% inter blocks, and 10% skip blocks, the corresponding frame p1 in a first alternative stream at the same resolution and a different bitrate should have a similar distribution of intra, inter, and skip blocks, i.e., a target mode distribution. A corresponding frame p2 in a second alternative stream at a different resolution and a different bitrate would ideally also have a similar distribution of block types.

The mode distribution can be used to accelerate decisions made by the encoder about which block type to use when processing the blocks in a frame. By seeking to conform to a determined mode distribution, the number of decisions can be reduced. For example, toward the end of processing a particular frame, the mode distribution of the frame may deviate from the target mode distribution. Blocks can therefore be chosen to be of a certain type (e.g., intra, inter, skip) so as to bring the resulting mode distribution after encoding those blocks closer to the target mode distribution, while maintaining good quality and integrity of the image. In various embodiments of the invention, the decisions made in choosing block types may differ between streams at different bitrates, because more complexity (e.g., more intra coded blocks) can be accommodated at higher bitrates.

A process for adjusting block type decisions in accordance with embodiments of the invention is illustrated in FIG. 4. The process 400 can be utilized when the blocks in a frame are being encoded, or when a frame is being analyzed and block types are being chosen without encoding. The process includes maintaining (410) a count of the blocks that have been processed (e.g., analyzed or encoded). If a threshold number of blocks, or alternatively a proportion of the total blocks in the frame, has been reached, the criteria for choosing block types can be adjusted (414) to favor block types that will maintain the target mode distribution of blocks in the frame. If the end of the frame has not been reached (416), processing continues for the blocks in the frame and the count of processed blocks is updated (410).
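The adjustment loop of FIG. 4 can be sketched as follows. The sketch is a simplification under stated assumptions: the function name `assign_block_types`, the percentage-based threshold, and the "largest remaining deficit" rule are hypothetical, and a real encoder would also weigh image quality before overriding its preferred block type.

```python
from typing import Dict, List, Tuple


def assign_block_types(preferred: List[str], target_pct: Dict[str, int],
                       threshold_pct: int = 50) -> Tuple[List[str], Dict[str, int]]:
    """Choose a block type per block, steering toward a target mode distribution.

    preferred:   the type the encoder would pick for each block on its own.
    target_pct:  target share of each type in percent (assumed to sum to 100).
    """
    total = len(preferred)
    counts = {t: 0 for t in target_pct}
    chosen: List[str] = []
    for processed, block_type in enumerate(preferred):   # 410: count of processed blocks
        if 100 * processed >= threshold_pct * total:     # threshold proportion reached
            # 414: adjusted criteria; integer deficits avoid floating-point rounding
            deficit = {t: target_pct[t] * total - 100 * counts[t] for t in target_pct}
            if deficit[block_type] <= 0:
                block_type = max(deficit, key=deficit.get)
        counts[block_type] += 1
        chosen.append(block_type)                        # 416: continue to end of frame
    return chosen, counts


# Target distribution from the text: 30% intra, 60% inter, 10% skip.
target = {"intra": 30, "inter": 60, "skip": 10}
types, counts = assign_block_types(["inter"] * 10, target)
```

In this run the encoder's own preference is always inter, but once half of the blocks have been processed and the inter quota is filled, the remaining blocks are steered to intra and skip so that the frame finishes at the target distribution.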

Although a specific process for adjusting block type decisions when encoding streams of video is discussed above with respect to FIG. 4, any of a variety of processes can be utilized to adjust block type decisions when encoding multiple media content streams in accordance with embodiments of the invention.

Motion Estimation

In video compression, predicted frames such as P frames and B frames draw information from a reference frame (which can be an I, P, or B frame) and contain information such as motion vectors and residual data used in motion estimation. A motion vector for a block of pixels (e.g., a macroblock, CTU, or similar partition) typically describes where in the predicted frame to place the indicated block of pixels from the reference frame. The residual data describes any differences between the block of pixels and the reference block of pixels. In many embodiments of the invention, motion estimation information such as motion vectors can be reused from one output stream to another output stream at a different resolution and/or bitrate.

In several embodiments of the invention, motion vectors are generated for at least one output stream at a given resolution and bitrate. The motion vectors are then applied and refined when encoding at least one other output stream at the same resolution at a different bitrate. In further embodiments of the invention, motion vectors that are generated for an output stream are applied and refined when encoding at least one other output stream at a higher resolution. The motion vectors are scaled in proportion to the difference in resolution between the first stream and the second stream and are then refined to take advantage of the higher precision available at the higher resolution. For example, motion vectors generated for a stream of resolution 768×432 can be used when encoding a stream of resolution 1280×720 by multiplying the horizontal component of each vector by 1280/768 and the vertical component by 720/432. The multipliers 1280/768 and 720/432 can be considered scale factors for that change in resolution. Furthermore, an encoder can determine that the encoding cost of the residual data and motion vector for a macroblock is very low and can skip checking the encoding cost for that macroblock in another output stream. In this way, motion vectors can be reused across multiple output streams, and motion estimation is accelerated because new motion vectors do not have to be generated. Other techniques for sharing and/or reusing motion estimation information across multiple output streams can be utilized in accordance with embodiments of the invention. Systems and methods for reusing motion estimation information when encoding alternative streams are disclosed in U.S. Patent Application No. 13/485,609 to Zurpal et al., filed May 31, 2012, entitled "Systems and Methods for the Reuse of Encoding Information in Encoding Alternative Streams of Video Data", the entirety of which is incorporated by reference.
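The scaling in the 768×432 to 1280×720 example can be worked through as follows. This is an illustrative sketch only: the function name `scale_motion_vector` and the rounding to whole sample positions are assumptions, and the refinement step that would follow is omitted.

```python
from fractions import Fraction

def scale_motion_vector(mv, src_res, dst_res):
    """Scale a (dx, dy) motion vector between resolutions given as (width, height)."""
    sx = Fraction(dst_res[0], src_res[0])   # horizontal scale factor, e.g. 1280/768
    sy = Fraction(dst_res[1], src_res[1])   # vertical scale factor, e.g. 720/432
    # Round to the nearest whole sample; a real encoder would refine afterwards.
    return (round(mv[0] * sx), round(mv[1] * sy))

print(scale_motion_vector((6, -3), (768, 432), (1280, 720)))  # prints (10, -5)
```

Both scale factors reduce to 5/3 here, so a vector of (6, -3) at the lower resolution maps to (10, -5) at the higher resolution.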

A process for reusing motion vectors when encoding a stream of video in accordance with embodiments of the invention is illustrated in FIG. 5. When encoding a frame in a stream of video, an encoder can check (510) whether a motion vector that can be reused in encoding the current frame has already been generated for the corresponding macroblock/CTU in a different stream of video. If not, a new motion vector is generated and stored (512). If a motion vector exists, the encoder can check (514) whether the other stream of video is at the same resolution as the current stream. If it is at the same resolution, the motion vector can be applied directly (516). The motion vector can also be refined with a search, such as a small diamond search over one or two neighboring pixels, or with sub-pixel refinement. Various other search patterns can alternatively be utilized to refine the motion vector. If it is at a different resolution, the motion vector can be scaled (518) to the current resolution and refined (520) before being applied.
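The decision flow of FIG. 5 can be sketched as a single lookup function. This is a minimal sketch under stated assumptions: the dictionary `store`, the block key, and the `estimate` and `refine` callables are hypothetical stand-ins for the motion search and refinement of an actual encoder.

```python
def motion_vector_for_block(key, resolution, store, estimate, refine):
    """Return a motion vector for block `key` at `resolution` (width, height).

    store maps a block key to (motion_vector, resolution_it_was_found_at).
    """
    cached = store.get(key)                          # (510) already generated?
    if cached is None:
        mv = estimate(key)                           # (512) generate and store
        store[key] = (mv, resolution)
        return mv
    mv, src_res = cached
    if src_res == resolution:                        # (514) same resolution?
        return refine(mv)                            # (516) apply, optionally refined
    scaled = (mv[0] * resolution[0] / src_res[0],    # (518) scale to current resolution
              mv[1] * resolution[1] / src_res[1])
    return refine(scaled)                            # (520) refine before applying
```

Passing the search and refinement steps as callables keeps the sketch independent of any particular search pattern, such as the small diamond search mentioned above.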

Although a specific process for reusing motion vectors when encoding a stream of video is discussed above with respect to FIG. 5, any of a variety of processes can be utilized to reuse motion vectors when encoding multiple media content streams in accordance with embodiments of the invention.

Two-Pass Encoding of Content for Adaptive Bitrate Streaming

Encoding content using a two-pass process can provide efficiencies through the sharing and reuse of information. In many embodiments of the invention, an encoder receives content, analyzes the content in a first pass, and encodes the content in a second pass. Statistics and other information collected in the first pass can be used to accelerate encoding in the second pass, and statistics and encoding information can be shared between encoding processes in the second pass. In some embodiments, the first pass is completed before the second pass begins. In other embodiments, the second pass can begin before the first pass is completed. In other words, the computational processes for the first and second passes can run concurrently, provided that each frame is processed by the first pass process(es) before it is processed by the second pass process(es).
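The concurrent variant, in which the second pass starts before the first pass has finished, can be sketched with a queue that hands each frame to the second pass only after the first pass has analyzed it. The function `run_two_pass` and the `analyze` and `encode` callables are hypothetical names introduced for this sketch; an actual encoder would run several second-pass processes.

```python
import queue
import threading

def run_two_pass(frames, analyze, encode):
    """Run analysis and encoding concurrently, frame by frame."""
    stats_q = queue.Queue()
    encoded = []

    def first_pass():
        for frame in frames:
            stats_q.put((frame, analyze(frame)))   # statistics become available
        stats_q.put(None)                          # sentinel: no more frames

    def second_pass():
        # A frame is only encoded after its first-pass statistics exist.
        while (item := stats_q.get()) is not None:
            frame, stats = item
            encoded.append(encode(frame, stats))

    workers = [threading.Thread(target=first_pass),
               threading.Thread(target=second_pass)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return encoded
```

The queue enforces the ordering constraint stated above without requiring the first pass to complete before the second pass begins.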

A process for encoding content in accordance with an embodiment of the invention is illustrated in FIG. 6. The encoding process 600 includes receiving media content (610). The content can be contained in a single input file (e.g., a multimedia file or container format file) or in a collection of media files. The content can also be an input stream of video received by the encoder. In several embodiments of the invention, the encoding process is implemented on an encoding server as an encoding application, and disk read/write operations are reduced because each input frame needs to be read only once.

The encoding process also includes making a first pass to collect statistics on the media stream (612). The first pass is intended to perform an initial analysis of the content in the source input video stream and to collect statistics for each resolution to be encoded. In many embodiments of the invention, the input stream is converted into a stream for each desired output resolution, and statistics are collected on each stream. The conversion can include color conversion (e.g., RGB to YUV, ITU 601 to 709, or format-specific conversions) and resizing the dimensions/resolution of each frame. Encoding information such as a frame complexity measure can be assigned to each frame in a stream based on the level of complexity of the image data in the frame, using techniques as discussed further above. A mode distribution can also be determined for each frame. The complexity measures and mode distributions can be used as input to the encoding of the final streams.

Other encoding information can include the size of the header data and the size of the macroblock data to be used in encoding the streams. Given the header size, the encoder can estimate how many bits are allocated to overhead data and how many bits are available to encode the content. The macroblock size can also contribute to better estimates of bit allocation when encoding the content. For effective rate control, the relative proportion of header size to macroblock size should be consistent across the output streams. The header size, the macroblock size, and/or the relative proportion of header size to macroblock size can be provided as encoding parameters to the encoding process.

Encoding information can also include hypothetical reference decoder (HRD) data. An HRD model is specified in some video compression standards to aid in calculating frame sizes and buffer fullness. The HRD reproduces decoder behavior on the encoder in order to avoid buffer overflow in the playback device buffer. HRD data can be determined in the first (analysis) pass and stored in shared buffers or local caches for use in the second (encoding) pass.

Other statistics collected for one resolution can be used to accelerate decisions for encoding streams at various bitrates at the same resolution and/or at another resolution. Statistics to collect in the first pass can include (but are not limited to): average quantization parameter, size of header bits, size of texture bits, number of intra macroblocks/CTUs, number of inter macroblocks/CTUs, and number of skip macroblocks/CTUs. In several embodiments of the invention, statistics can be cached internally (e.g., output to a single file) and/or stored in shared buffers in memory that are accessible to each encoder controller when encoding in the second pass.

In the second pass, the content is encoded (616) into output streams. In many embodiments of the invention, a separate thread or process is launched (614) by the encoding application for each output resolution, where each thread or process is referred to as an encoder controller. In further embodiments of the invention, each encoder controller encodes multiple output streams at the same resolution but at different bitrates. In several embodiments of the invention, an encoding server has multiple processors, and the encoder controllers can each run on a different processor or otherwise be distributed across the processors. Output streams can be encoded at a constant bit rate (CBR) or a variable bit rate (VBR).

Until the encoding of the streams is complete (618), the encoder controllers can run simultaneously, using information and statistics stored in internal caches and/or shared buffers in memory. An encoder controller can check (620) the memory for whether encoding information exists for the blocks and/or frames it is encoding and can utilize (622) that information when processing the current block or frame. As discussed further above, different types of encoding information can be used to allow the encoder to skip decisions or can be used as a starting point for determining the encoding information for processing the current block or frame. If the information does not yet exist, it can be generated (624), and encoding (616) of the stream continues.
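The check, reuse, and generate steps (620, 622, 624) can be sketched as a thread-safe lookup shared by the encoder controllers. This is an illustrative sketch only: the class `SharedEncodingInfo`, its `get_or_create` method, and the choice of a single lock are assumptions, and the sketch models encoder controllers as threads in one process.

```python
import threading

class SharedEncodingInfo:
    """Thread-safe store standing in for the shared buffers in memory."""

    def __init__(self):
        self._lock = threading.Lock()
        self._info = {}

    def get_or_create(self, key, generate):
        """Return stored encoding information for `key`, generating it once if absent."""
        with self._lock:
            if key in self._info:        # (620) information already exists
                return self._info[key]   # (622) reuse it
            value = generate()           # (624) generate it
            self._info[key] = value      # store it for the other controllers
            return value
```

Generating the value while holding the lock guarantees that each piece of encoding information is computed only once, at the cost of serializing generation. A finer-grained design could lock per key.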

Shared information can include motion estimation information such as motion vectors. Motion vectors generated for a stream can be reused when encoding another stream at the same resolution or at a different resolution. As discussed further above, a motion vector reused to encode a stream at a higher resolution can be scaled and refined to compensate for the difference in resolution. Some statistics may be specific to the resolution of the stream being encoded. As discussed above, CTU size choices made at a particular resolution and bitrate can be retained and applied to encoding the same resolution at a higher bitrate. When an encoder controller has made a decision on the CTU size for a portion of a frame, it can skip that decision when encoding the same portion of the frame at a higher bitrate and either reuse the CTU size or try smaller CTU sizes.

In many embodiments of the invention, a CTU in a given portion of a frame is encoded sequentially by the same encoder controller for a plurality of output streams at different bitrates. After an encoder controller has encoded a CTU for one output stream, the data required to encode that CTU is likely to still be in a processor cache or other local memory if the controller next encodes the CTU for a different output stream. In accordance with embodiments of the invention, an encoder controller encodes a CTU for each output stream sequentially, one after another. The corresponding CTUs for each output stream can be encoded using different quantization parameters for the streams at different bitrates.

Encoding with block-oriented video compression typically involves considering a quantization parameter that contributes to the compression of the image information within a frame. The quantization parameter for a frame is typically selected based upon factors such as the frame complexity measure and the mode distribution of the frame, and is chosen so that the encoded frame is of a target size (in bits) while maintaining the best possible image quality. As discussed further above, values for a frame complexity measure and a mode distribution can be determined for a frame of the input stream and utilized in processing the corresponding output frames in multiple output streams. The quantization parameter(s) from a previous frame can be used as a starting value and refined using the frame complexity measure and mode distribution for the current frame. For example, if the next frame has a lower complexity measure than the previous frame, the quantization parameters can be increased for that frame because a less complex image can be compressed further without severe loss of fidelity. Quantization parameters can be decreased when encoding the same frame for a higher-bitrate stream because of the greater capacity for data at the higher bitrate. In many embodiments of the invention, quantization parameters can be determined independently for the corresponding frames in each output stream using the same frame complexity measure and mode distribution while adjusting for the target bitrate of each output stream. Quantization parameters determined for CTUs in a frame of a lower-bitrate output stream can be referred to and decreased for the corresponding CTUs in a higher-bitrate output stream. Encoders in accordance with embodiments of the invention can calculate rate distortion information when encoding output streams and reuse the rate distortion information to choose quantization parameters when encoding an output stream at a higher bitrate. In several embodiments of the invention, each frame in each output stream is encoded such that its mode distribution is approximately equal to a target mode distribution determined using the source frame from the input stream. Other types of shared information can be accessed and used by different encoder controllers, as discussed further above.

In various embodiments of the invention, one or more frames of video are divided into blocks that encode the motion prediction residual. These blocks are referred to as transform units (TUs) in the HEVC standard and can have sizes of 4×4, 8×8, 16×16, or 32×32. Other types of block units for prediction residuals can be utilized depending on the relevant standard or video codec. Encoders in accordance with embodiments of the invention can use quantization parameters when encoding CTUs, TUs, coding units (CUs), and/or other types of encoding blocks defined by the video codec or standard for which the encoder is configured. The process of selecting quantization parameters for blocks in a frame of video based on statistics of the stream and/or previously calculated quantization parameters can be referred to as adaptive quantization. Quantization parameters for blocks of pixels can be calculated in the first pass and/or the second pass based on complexity measures for those blocks of pixels (similar to the frame complexity measures discussed further above) and/or quantization parameters already calculated for similar blocks of pixels. Quantization parameters and/or complexity measures calculated in the first pass can be stored, for example, in shared buffers accessible by the encoder controllers and utilized in the second pass. For example, values calculated for an input stream at one resolution can be utilized when encoding output streams at the same resolution at different bitrates.

In the process described above with respect to FIG. 6, statistics are collected or calculated for the input video stream (612). The calculated statistics can include a complexity measure for a block of pixels in a frame within the input video stream. In many embodiments of the invention, the complexity measure for a block of pixels can be stored and used when calculating a quantization parameter for encoding the block in the second pass. A process for generating a complexity measure in accordance with embodiments of the invention is illustrated in FIG. 7. The process includes selecting (710) a block of pixels in a frame of video. The process analyzes (712) the complexity of the block. A complexity measure can be calculated, for example in the AVC standard, from a histogram of luminance and/or a contrast level distribution. However, any of a number of techniques under any video encoding standard can be used to calculate a complexity measure in accordance with embodiments of the invention. The complexity measure can be stored (714) for use in generating a quantization parameter (e.g., in the second pass of encoding). FIG. 10A shows a plot of complexity measures of CTUs as calculated in the first pass over an input stream. As shown in the figure, complexity measures can vary between different CTUs.
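One possible histogram-based measure is sketched below. The description does not specify a formula, so the use of Shannon entropy over the luminance histogram, and the name `block_complexity`, are assumptions made purely for illustration.

```python
import math
from collections import Counter

def block_complexity(luma):
    """Shannon entropy (in bits) of the luminance histogram of one block.

    luma: flat sequence of luminance samples for the block, e.g. 8-bit values.
    """
    samples = list(luma)
    hist = Counter(samples)                 # histogram of luminance levels
    n = len(samples)
    # A flat block has entropy 0; a block with varied luminance scores higher.
    return sum(-(c / n) * math.log2(c / n) for c in hist.values())
```

Under this assumed measure, a uniform block scores 0 and a block split evenly between two luminance levels scores 1 bit, so the stored values (714) would differ between CTUs in the manner described for FIG. 10A.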

Referring again to FIG. 6, the process described further above includes generating (624) encoding information if it does not exist and reusing (622) encoding information if it does exist. Encoding information can include a quantization parameter for a particular block of pixels (such as a CTU or TU). A process for generating a quantization parameter (such as when generating (624) encoding information) in accordance with embodiments of the invention is illustrated in FIG. 8. The process includes selecting (810) a block of pixels. A stored complexity measure for the block is retrieved (812). A quantization parameter for encoding the block (such as a CTU or TU) in the second pass can be determined (814) using statistics such as the complexity measure calculated for the block in the first pass. The calculation can also take into account the deviation from the average base picture quantization parameter.
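The steps of FIG. 8 can be sketched as follows. The description does not give the mapping from complexity to quantization parameter, so the linear offset around a base picture quantization parameter, the `strength` value and its sign, and the name `qp_for_block` are assumptions. The clamp to 0 through 51 reflects the quantization parameter range of 8-bit AVC and HEVC.

```python
def qp_for_block(block_id, stored_complexity, base_qp, strength=2.0):
    """Derive a block quantization parameter from first-pass complexity.

    stored_complexity: dict mapping block id to its first-pass complexity measure.
    base_qp: picture-level quantization parameter used as the starting point.
    """
    complexity = stored_complexity[block_id]                # (812) retrieve
    mean = sum(stored_complexity.values()) / len(stored_complexity)
    offset = strength * (complexity - mean)                 # deviation from base QP
    return max(0, min(51, round(base_qp + offset)))         # (814) determine, clamped
```

In this sketch a block that is more complex than the picture average receives a higher quantization parameter, which makes the quantization parameter plot follow the general shape of the complexity plot. A different encoder could reasonably choose the opposite sign.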

Quantization parameters can utilize complexity measures determined in the first pass and/or quantization parameters determined for blocks in other output streams that have already been encoded (for example, in a lower or higher bitrate stream). A process for using a quantization parameter already generated for a similar block of pixels (such as when reusing (622) encoding information) is illustrated in FIG. 9. The process includes selecting (910) a block of pixels. A stored quantization parameter is retrieved (912) for a similar block. A similar block is a block in a frame of an output stream that was generated from the same block in the frame of the input stream as the selected block. For example, a similar block can be the corresponding block already encoded in an output stream of the same resolution but a different bitrate. A quantization parameter is calculated for the selected block using the retrieved quantization parameter. The quantization parameter can be further adjusted (916) for statistics calculated for the input stream and/or for already encoded frames of the output streams, such as, but not limited to, distortion rate and bitrate. The quantization parameter can be calculated to achieve minimum distortion and the target bitrate. FIG. 10B shows a plot of quantization parameters calculated for CTUs in three different output streams (at the same resolution and different bitrates), together with the plot of complexity measures shown in FIG. 10A. As shown in the figure, the complexity measure values were used in determining the quantization parameter values, and the plots therefore have a similar general shape because of that relationship.
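Reuse of a stored quantization parameter across bitrates, as in FIG. 9, can be sketched as follows. The adjustment rule is an assumption: it relies on the fact that in AVC and HEVC the quantizer step size doubles for every increase of 6 in the quantization parameter, and it further assumes, as a rough approximation only, that bitrate is inversely proportional to step size. The name `reuse_qp` is hypothetical.

```python
import math

def reuse_qp(stored_qp, ref_bitrate, target_bitrate):
    """Adapt a QP stored for a similar block to a stream at another bitrate."""
    # Assumed rule of thumb: each doubling of bitrate lowers the QP by 6.
    delta = 6.0 * math.log2(target_bitrate / ref_bitrate)
    return max(0, min(51, round(stored_qp - delta)))   # clamp to the 8-bit QP range
```

For example, a block encoded with a quantization parameter of 32 in a 1 Mbps stream would start from 26 in a 2 Mbps stream under this assumption, consistent with the statement above that quantization parameters can be decreased for higher-bitrate streams. Further adjustment (916) for measured distortion and bitrate is not modeled.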

To facilitate the switching of streams during playback, a set of alternative streams of video content should have intra frames (I frames) at the same positions in each stream. Where there is an I frame in one stream, the corresponding frames at the same position (i.e., time) in each of the other alternative streams should be I frames. Because an I frame contains all the information needed to render a frame, a playback device that needs to switch streams can switch at an I frame so that playback is seamless and no frames are lost. In many embodiments of the invention, I frames are encoded for frames at the same position in a plurality of alternative output streams. Where a P frame or B frame has been encoded for an output stream, the encoder can skip testing whether to encode the corresponding frame at the same position as an I frame when encoding another output stream and can encode the frame as a B frame or P frame.
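The alignment requirement can be checked with a short sketch. The representation of each stream as a string of frame types and the name `i_frames_aligned` are assumptions made for illustration.

```python
def i_frames_aligned(streams):
    """Return True if every stream has I frames at exactly the same positions.

    streams: list of frame-type strings such as "IPPBIPPB", one per
    alternative stream, with position i corresponding to the same time.
    """
    positions = [{i for i, t in enumerate(s) if t == "I"} for s in streams]
    return all(p == positions[0] for p in positions)
```

The P and B frame choices may differ between streams, as in "IPPBIPPB" and "IPBPIBBP"; only the I frame positions need to match for seamless switching.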

Although a specific process for encoding multiple media content streams utilizing shared statistics is discussed above with respect to FIG. 6, any of a variety of processes can be utilized to encode multiple media content streams for adaptive bitrate streaming in accordance with embodiments of the invention.

Although the present invention has been described in certain specific aspects, many additional modifications and variations would be apparent to those skilled in the art. It is therefore to be understood that the present invention may be practiced otherwise than specifically described, including various changes in the implementation such as utilizing encoders and decoders that support features beyond those specified within a particular standard with which they comply, without departing from the scope and spirit of the present invention. Thus, embodiments of the present invention should be considered in all respects as illustrative and not restrictive.

10: adaptive streaming system 12: source encoding server
13: media server 14: content server
16: network 18, 20, 22: playback device
200: source encoder 210: processor
230: memory 240: network interface
250: source encoding application 260: source video data
270: encoder controller 280: encoder controller
295: shared data buffer

Claims

1. A source encoder configured to encode source video as a plurality of alternative video streams, the source encoder comprising:
A memory containing a source encoding application;
A shared memory; and
A parallel processing system configured by the source encoding application to:
Receive multimedia content, wherein the multimedia content comprises source video data having a primary resolution;
Collect statistics on the source video data in a first pass through the received multimedia content and write the statistics to the shared memory, wherein the statistics include complexity measures of blocks of pixels, and the complexity measure of a block of pixels represents the level of complexity of the visual information within the block of pixels;
Determine initial encoding information for the source video data during the first pass through the received multimedia content and write the initial encoding information to the shared memory; and
Encode the source video data in parallel, using the collected statistics and the initial encoding information, to generate a plurality of alternative video streams during a second pass through the received multimedia content with a plurality of parallel encoding processes, wherein the encoding of the source video utilizes additional encoding information, the parallel encoding processes are configured to reuse additional encoding information that has already been determined for a portion of video by another parallel encoding process and stored in the shared memory, the parallel encoding processes are configured to generate additional encoding information that has not already been determined for a portion of video by another parallel encoding process and to store the generated additional encoding information in the shared memory, and the additional encoding information includes quantization parameters for blocks of pixels;
Wherein the parallel encoding processes being configured to generate additional encoding information that has not already been determined for a portion of video by another parallel encoding process comprises determining a coding tree unit (CTU) size for encoding a portion of a frame of video in the source video data.
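By way of illustration only, the reuse-or-generate behavior recited above could be sketched as follows. The class `SharedEncodingInfo`, the function `encode_block`, the key layout, and the placeholder QP formula are all hypothetical and do not appear in the claims; a thread lock stands in for whatever synchronization the shared memory actually uses.

```python
from threading import Lock


class SharedEncodingInfo:
    """Hypothetical stand-in for the shared memory that holds encoding information."""

    def __init__(self):
        self._lock = Lock()
        self._info = {}  # (frame_index, block_index) -> dict of encoding information

    def reuse_or_generate(self, key, generate):
        """Return (info, reused). The info is generated only if no process stored it yet."""
        with self._lock:
            if key in self._info:
                return self._info[key], True
            info = generate()
            self._info[key] = info
            return info, False


def encode_block(store, stream_name, frame_index, block_index, complexity):
    """One encoding process handling one block; returns (info, reused)."""

    def generate():
        # Placeholder decisions (assumption): a real encoder would derive these
        # from the first-pass statistics and the initial encoding information.
        return {
            "qp": 26 + round(complexity * 10),
            "ctu_size": 64,
            "generated_by": stream_name,
        }

    return store.reuse_or_generate((frame_index, block_index), generate)
```

In this sketch the first process to reach a block generates and stores the information, and every later process for the same block reuses it. Holding the lock while generating keeps the example short but serializes the processes; a practical design would likely lock per block.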

2. The source encoder of claim 1,
Wherein the statistics for the source video data comprise statistics selected from the group consisting of an average quantization parameter, a size of header bits, a size of texture bits, a number of intra blocks, a number of inter blocks, and a number of skip blocks.

3. The source encoder of claim 1,
Wherein the parallel processing system being configured to determine initial encoding information for the source video data further comprises the parallel processing system being configured to calculate a frame complexity measure.

4. (Deleted)

5. The source encoder of claim 1,
Wherein determining a CTU size for encoding a portion of a frame of video in the source video data comprises:
Selecting a portion of a frame of video to encode as at least one output CTU in a first output stream;
Checking whether a size has been determined for a similar CTU;
Selecting a CTU size if a size has not been determined for the similar CTU;
Selecting a previously determined CTU size that was determined for a second output stream, and comparing the resolution of the first output stream to the resolution of the second output stream, if a size has been determined for a similar CTU;
Scaling the CTU size if the resolution of the first output stream is not the same as the resolution of the second output stream;
Determining whether the selected CTU size is acceptable for the output CTU;
Selecting a smaller CTU size if the selected CTU size is not acceptable; and
Applying the selected CTU size to the portion of the frame of video if the selected CTU size is acceptable for the output CTU.
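As a non-limiting sketch, the CTU size determination recited above might proceed as in the following example. The function names, the `prior` tuple, the `is_acceptable` callback, and the use of frame width as the resolution measure are assumptions made for illustration; the candidate sizes 64, 32, and 16 are the CTU sizes that HEVC permits.

```python
VALID_CTU_SIZES = (64, 32, 16)  # largest first


def snap_to_valid(size):
    """Return the largest valid CTU size not exceeding `size` (or the smallest valid size)."""
    for candidate in VALID_CTU_SIZES:
        if candidate <= size:
            return candidate
    return VALID_CTU_SIZES[-1]


def choose_ctu_size(first_width, prior, is_acceptable, default_size=64):
    """Pick a CTU size for a portion of a frame in the first output stream.

    prior: None if no size was determined for a similar CTU, otherwise a tuple
           (ctu_size, second_width) taken from a second output stream.
    is_acceptable: hypothetical callback that judges a candidate size.
    """
    if prior is None:
        # No size determined yet for a similar CTU: select one directly.
        size = default_size
    else:
        # Start from the size determined for the second output stream.
        size, second_width = prior
        if second_width != first_width:
            # Resolutions differ: scale the size by the resolution ratio.
            size = snap_to_valid(size * first_width / second_width)
    # Fall back to smaller sizes until one is acceptable or none remain.
    while not is_acceptable(size) and size > VALID_CTU_SIZES[-1]:
        size = snap_to_valid(size // 2)
    return size
```

For example, a 64-pixel CTU determined for a 1920-wide stream scales to about 42.7 pixels for a 1280-wide stream, which this sketch snaps down to 32.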

6. The source encoder of claim 1,
Wherein the parallel processing system being configured to determine initial encoding information for the source video data further comprises the parallel processing system being configured to determine a mode distribution for at least one frame of video in at least one of the plurality of alternative video streams.

7. The source encoder of claim 6,
Wherein the parallel processing system being configured to encode the source video data in parallel using the collected statistics and the initial encoding information to generate a plurality of alternative video streams further comprises the parallel processing system being configured to:
Maintain a count of blocks processed in a frame of video in an alternative video stream;
Determine a threshold number of blocks based on the mode distribution; and
Adjust criteria for block type decisions if the count of blocks meets the threshold number of blocks.
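One possible reading of the count-and-threshold behavior recited above is sketched below, purely as an illustration. It assumes that the count tracks blocks already given the skip type, that the threshold is the number of skip blocks expected from the mode distribution, and that the adjusted criterion is a tighter cost limit; none of these specifics is stated in the claims, and all names are hypothetical.

```python
def decide_block_types(block_costs, mode_distribution,
                       base_skip_limit=10.0, tightened_skip_limit=5.0):
    """Assign "skip" or "coded" to each block of one frame.

    block_costs: hypothetical per-block cost values (lower means easier to skip).
    mode_distribution: dict holding the expected fraction of skip blocks.
    """
    # Threshold number of blocks derived from the mode distribution.
    threshold = round(mode_distribution["skip"] * len(block_costs))
    skip_count = 0
    decisions = []
    for cost in block_costs:
        # Once the count meets the threshold, apply the stricter criterion.
        limit = tightened_skip_limit if skip_count >= threshold else base_skip_limit
        if cost < limit:
            decisions.append("skip")
            skip_count += 1
        else:
            decisions.append("coded")
    return decisions
```

With an expected skip fraction of 0.4 over five blocks, the criterion tightens after two skip blocks, so a later block with cost 8 is coded even though it would have been skipped earlier in the frame.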

8. The source encoder of claim 1,
Wherein the parallel encoding processes being configured to reuse additional encoding information that has already been determined for a portion of video by another parallel encoding process and stored in the shared memory further comprises the parallel encoding processes being configured to:
Determine whether a motion vector exists for a second corresponding block in a second alternative stream when encoding a first block in a video frame in a first alternative stream;
Determine whether the first alternative stream and the second alternative stream are of the same resolution;
Scale the motion vector if the first alternative stream and the second alternative stream are not of the same resolution;
Refine the motion vector; and
Apply the motion vector when encoding the first block in the video frame in the first alternative stream.
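The motion vector reuse recited above can be illustrated with the following minimal sketch. The function names, the (width, height) resolution tuples, the integer-pixel vectors, and the small search window used for refinement are assumptions; the claims do not specify how scaling or refinement is carried out.

```python
def scale_motion_vector(mv, src_res, dst_res):
    """Scale an (x, y) motion vector from src_res to dst_res, both (width, height)."""
    scale_x = dst_res[0] / src_res[0]
    scale_y = dst_res[1] / src_res[1]
    return (round(mv[0] * scale_x), round(mv[1] * scale_y))


def refine_motion_vector(mv, cost, radius=1):
    """Search a small window around mv and return the lowest-cost candidate."""
    candidates = [
        (mv[0] + dx, mv[1] + dy)
        for dx in range(-radius, radius + 1)
        for dy in range(-radius, radius + 1)
    ]
    return min(candidates, key=cost)


def reuse_motion_vector(prior_mv, prior_res, res, cost):
    """Return a motion vector for the first block, or None if there is none to reuse.

    prior_mv: vector of the corresponding block in the second stream, or None.
    cost: hypothetical matching-cost function for the first stream.
    """
    if prior_mv is None:
        return None  # nothing stored; the caller would run a full motion search
    mv = prior_mv
    if prior_res != res:
        mv = scale_motion_vector(mv, prior_res, res)
    return refine_motion_vector(mv, cost)
```

The refinement step matters because a scaled vector is only an estimate at the new resolution; searching a small neighborhood is far cheaper than repeating the full motion search.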

9. The source encoder of claim 1,
Wherein the initial encoding information further comprises a header size, a macroblock size, and a relative ratio of header size to macroblock size.

10. The source encoder of claim 1,
Wherein the initial encoding information further comprises hypothetical reference decoder data.

11. The source encoder of claim 1,
Wherein each of the parallel encoding processes encodes at a different resolution.

12. The source encoder of claim 11,
Wherein each of the parallel encoding processes encodes one or more alternative video streams, and each of the alternative video streams encoded by a parallel encoding process is at a different bit rate.

13. The source encoder of claim 1,
Wherein each of the parallel encoding processes sequentially encodes blocks from the source video data into respective streams in a subset of the plurality of alternative video streams.

14. The source encoder of claim 1,
Wherein the additional encoding information comprises rate distortion information and quantization parameters.

15. The source encoder of claim 1,
Wherein the blocks of pixels are coding tree units.

16. The source encoder of claim 1,
Wherein the blocks of pixels are coding units.

17. The source encoder of claim 1,
Wherein the blocks of pixels are transform units.

18. The source encoder of claim 1,
Wherein the quantization parameters for the blocks of pixels are generated using complexity measures for blocks of pixels.
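As a hedged illustration of generating a quantization parameter from a block complexity measure, the sketch below uses pixel variance as the complexity measure and raises the QP for blocks that are more complex than the frame average. Both choices are common adaptive quantization heuristics assumed here for the example, not details taken from the claims. The function name and the `strength` parameter are hypothetical; the 0 to 51 clamp is the QP range for 8-bit H.264 and HEVC video.

```python
import math


def block_qp(base_qp, block_variance, frame_mean_variance,
             strength=1.0, qp_min=0, qp_max=51):
    """Offset the base QP by the log ratio of block complexity to frame complexity."""
    # Adding 1.0 avoids taking the logarithm of zero for perfectly flat blocks.
    ratio = (block_variance + 1.0) / (frame_mean_variance + 1.0)
    offset = strength * math.log2(ratio)
    return max(qp_min, min(qp_max, round(base_qp + offset)))
```

A block four times as complex as the frame average receives a QP two steps higher in this sketch, and a block a quarter as complex receives a QP two steps lower.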

19. The source encoder of claim 1,
Wherein the quantization parameters for blocks of pixels are generated using previously generated quantization parameters.

20. The source encoder of claim 1,
Wherein the quantization parameters for blocks of pixels are generated using a distortion rate and a bit rate.
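One conventional way to combine distortion and bit rate when choosing a quantization parameter is to minimize a Lagrangian cost J = D + lambda * R over candidate QPs. The sketch below assumes that approach for illustration only; the claims do not state that this formula is used, and the function name, the candidate table, and the value of lambda are hypothetical.

```python
def pick_qp(candidates, lam):
    """Return the QP whose cost J = distortion + lam * bits is lowest.

    candidates: dict mapping each QP to a (distortion, bits) pair measured
                or estimated for the block at that QP.
    lam: Lagrange multiplier trading bits against distortion.
    """
    def cost(qp):
        distortion, bits = candidates[qp]
        return distortion + lam * bits

    return min(candidates, key=cost)
```

A larger lambda penalizes bits more heavily and therefore favors a higher QP, while a smaller lambda favors a lower QP and less distortion.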

21. A method for encoding source video as a plurality of alternative video streams, the method comprising:
Receiving multimedia content using a source encoder, wherein the multimedia content comprises source video data having a primary resolution;
Collecting statistics on the source video data in a first pass through the received multimedia content using the source encoder and writing the statistics to a shared memory, wherein the statistics include complexity measures of blocks of pixels, and the complexity measure of a block of pixels represents the level of complexity of the visual information within the block of pixels;
Determining initial encoding information for the source video data during the first pass through the received multimedia content using the source encoder and writing the initial encoding information to the shared memory; and
Encoding the source video data in parallel, using the collected statistics, the initial encoding information, and additional encoding information, to generate a plurality of alternative video streams during a second pass through the received multimedia content with a plurality of parallel encoding processes using the source encoder, wherein the encoding of the source video further comprises:
Reusing additional encoding information that has already been determined for a portion of video by another parallel encoding process and stored in the shared memory, using at least one of the plurality of parallel encoding processes, wherein the additional encoding information includes quantization parameters for blocks of pixels;
Generating additional encoding information that has not already been determined for a portion of video by another of the plurality of parallel encoding processes, using at least one of the plurality of parallel encoding processes, wherein generating the additional encoding information further comprises determining a coding tree unit (CTU) size for encoding a portion of a frame of video in the source video data; and
Storing the generated additional encoding information in the shared memory using a parallel encoding process.

22. The method of claim 21,
Wherein the statistics for the source video data comprise statistics selected from the group consisting of an average quantization parameter, a size of header bits, a size of texture bits, a number of intra blocks, a number of inter blocks, and a number of skip blocks.

23. The method of claim 21,
Wherein determining the initial encoding information for the source video data further comprises calculating a frame complexity measure.

24. (Deleted)

25. The method of claim 21,
Wherein determining a CTU size for encoding a portion of a frame of video in the source video data further comprises:
Selecting a portion of a frame of video to encode as at least one output CTU in a first output stream;
Checking whether a size has been determined for a similar CTU;
Selecting a CTU size if a size has not been determined for the similar CTU;
Selecting a previously determined CTU size that was determined for a second output stream, and comparing the resolution of the first output stream to the resolution of the second output stream, if a size has been determined for a similar CTU;
Scaling the CTU size if the resolution of the first output stream is not the same as the resolution of the second output stream;
Determining whether the selected CTU size is acceptable for the output CTU;
Selecting a smaller CTU size if the selected CTU size is not acceptable; and
Applying the selected CTU size to the portion of the frame of video if the selected CTU size is acceptable for the output CTU.

26. The method of claim 21,
Wherein determining the initial encoding information for the source video data further comprises determining a mode distribution for at least one frame of video in at least one of the plurality of alternative video streams.

27. The method of claim 26,
Wherein encoding the source video data in parallel using the collected statistics, the initial encoding information, and the additional encoding information to generate a plurality of alternative video streams further comprises:
Maintaining a count of blocks processed in a frame of video in an alternative video stream;
Determining a threshold number of blocks based on the mode distribution; and
Adjusting criteria for block type decisions if the count of blocks meets the threshold number of blocks.

28. The method of claim 21,
Wherein reusing additional encoding information that has already been determined for a portion of video by another parallel encoding process and stored in the shared memory further comprises:
Determining whether a motion vector exists for a second corresponding block in a second alternative stream when encoding a first block in a video frame in a first alternative stream;
Determining whether the first alternative stream and the second alternative stream are of the same resolution;
Scaling the motion vector if the first alternative stream and the second alternative stream are not of the same resolution;
Refining the motion vector; and
Applying the motion vector when encoding the first block in the video frame in the first alternative stream.

29. The method of claim 21,
Wherein the initial encoding information further comprises a header size, a macroblock size, and a relative ratio of header size to macroblock size.

30. The method of claim 21,
Wherein the initial encoding information further comprises hypothetical reference decoder data.

31. The method of claim 21,
Wherein each of the parallel encoding processes encodes at a different resolution.

32. The method of claim 31,
Wherein each of the parallel encoding processes encodes one or more alternative video streams, and each of the alternative video streams encoded by a parallel encoding process is at a different bit rate.

33. The method of claim 21,
Wherein each of the parallel encoding processes sequentially encodes blocks from the source video data into respective streams in a subset of the plurality of alternative video streams.

34. The method of claim 21,
Wherein the additional encoding information comprises rate distortion information and quantization parameters.

35. The method of claim 21,
Wherein the blocks of pixels are coding tree units.

36. The method of claim 21,
Wherein the blocks of pixels are coding units.

37. The method of claim 21,
Wherein the blocks of pixels are transform units.

38. The method of claim 21,
Wherein the quantization parameters for blocks of pixels are generated using complexity measures for blocks of pixels.

39. The method of claim 21,
Wherein the quantization parameters for blocks of pixels are generated using previously generated quantization parameters.

40. The method of claim 21,
Wherein the quantization parameters for blocks of pixels are generated using a distortion rate and a bit rate.