KR100626419B1

KR100626419B1 - Switching between bit-streams in video transmission

Info

Publication number: KR100626419B1
Application number: KR1020037008568A
Authority: KR
Inventors: Marta Karczewicz; Ragip Kurceren
Original assignee: Nokia Corporation
Priority date: 2001-01-03
Filing date: 2002-01-03
Publication date: 2006-09-20
Also published as: KR20030065571A

Abstract

The present invention relates to a method for transmitting video information in which at least a first bit-stream (510) and a second bit-stream (520) are formed. The first bit-stream (510) comprises at least one video frame, and the second bit-stream (520) comprises at least one predictive video frame (524). At least partly different encoding parameters are used in encoding the frames of the first bit-stream (510) and the second bit-stream (520). At least one frame of the first bit-stream (510) is transmitted, and the transmission is switched from the first bit-stream (510) to the second bit-stream (520). When the transmission is switched from the first bit-stream (510) to the second bit-stream (520), a second switching frame (550) is transmitted; the second switching frame (550) is encoded using at least one reference frame from the first bit-stream (510) and the encoding parameters of the second bit-stream (520). The second switching frame (550) is used as a reference frame in the reconstruction of the at least one predictive video frame (524) of the second set of video frames. The invention also relates to an encoder for encoding video information, a decoder for decoding video information, and a signal representing encoded video information.

Description

Switching between bit-streams in video transmission

The present invention relates to a method for transmitting video information, in which at least a first bit-stream and a second bit-stream are formed from the video information, the first bit-stream comprising a first set of frames that includes at least one video frame, and the second bit-stream comprising a second set of frames that includes at least one predictive video frame; at least partly different encoding parameters are used in encoding the frames of the first bit-stream and the second bit-stream; at least one frame of the first bit-stream is transmitted; and the transmission is switched from the first bit-stream to the second bit-stream.

The invention also relates to an encoder comprising: means for forming at least a first bit-stream and a second bit-stream from video information, the first bit-stream comprising a first set of frames that includes at least one video frame, and the second bit-stream comprising a second set of frames that includes at least one predictive video frame; means for using at least partly different encoding parameters in encoding the frames of the first bit-stream and the second bit-stream; means for transmitting at least one frame of the first bit-stream; and means for switching the transmission from the first bit-stream to the second bit-stream.

The invention further relates to a decoder for decoding video information from a signal, the signal comprising frames from at least a first bit-stream and a second bit-stream formed from the video information, the first bit-stream comprising a first set of frames that includes at least one video frame, and the second bit-stream comprising a second set of frames that includes at least one predictive video frame, wherein at least partly different encoding parameters are used in encoding the frames of the first bit-stream and the second bit-stream.

The invention further relates to a signal that represents encoded video information and comprises frames from at least a first bit-stream and a second bit-stream formed from the video information, the first bit-stream comprising a first set of frames that includes at least one video frame, and the second bit-stream comprising a second set of frames that includes at least one predictive video frame, wherein at least partly different encoding parameters are used in encoding the frames of the first bit-stream and the second bit-stream.

In recent years, multimedia applications that include streaming of audio and video information have come into wider use. Several international standardization organizations have established and proposed standards for compressing/encoding and decompressing/decoding audio and video information. The MPEG standards, established by the Moving Picture Experts Group, are the most widely accepted international standards in the field of multimedia applications. VCEG is the "Video Coding Experts Group", which works under the direction of the International Telecommunication Union Telecommunication Standardization Sector (ITU-T). This group is working on the H.26L standard for the coding of moving pictures.

A typical video stream comprises a sequence of pictures, often referred to as frames. The frames comprise pixels arranged in a rectangular form. In existing video coding standards, such as H.261, H.262, H.263, H.26L and MPEG-4, three main types of pictures are defined: intra frames (I-frames), predictive frames (P-frames) and bi-directional frames (B-frames). Each picture type exploits a different type of redundancy in a sequence of images and consequently yields a different level of compression efficiency and, as explained in the following, provides different functionality within the encoded video sequence. An intra frame is a frame of video data that is coded by exploiting only the spatial correlation of the pixels within the frame itself, without using any information from past or future frames. Intra frames are used as the basis for decoding/decompressing other frames and provide access points to the coded sequence at which decoding can begin.

A predictive frame is a frame that is encoded/compressed using motion compensated prediction from a so-called reference frame, i.e. one or more previous/subsequent intra frames or predictive frames available in the encoder or decoder. A bi-directional frame is a frame that is encoded/compressed by prediction from a previous intra frame or predictive frame and/or a subsequent intra frame or predictive frame.

Since adjacent frames in a typical video sequence are highly correlated, higher compression can be achieved by using bi-directional or predictive frames instead of intra frames. On the other hand, when temporal predictive coding is employed within the coded video stream, B-frames and/or P-frames cannot be decoded without correctly decoding all the other previous and/or subsequent reference frames used in coding the bi-directional and predictive frames. In cases where the reference frame(s) used in the encoder and the corresponding reference frame(s) in the decoder are not identical, either because of errors during transmission or because of some intentional action on the transmitting side, the subsequent frames that make use of prediction from such a reference frame cannot be reconstructed on the decoding side so as to yield a decoded frame identical to that originally encoded on the encoding side. This mismatch is not confined to a single frame but propagates further in time because of the use of motion compensated coding.

FIGS. 1a to 1c illustrate the types of encoded/compressed video frames used in a typical video encoding/decoding system. Preferably, prior to encoding, the pictures of the video sequence are represented by matrices of multiple-bit numbers, one representing the luminance (brightness) of the image pixels and the other two each representing one of two chrominance (color) components. FIG. 1a depicts the way in which an intra frame (200) is encoded using only the image information present in the frame itself. FIG. 1b illustrates the construction of a predictive frame (210). Arrow 205a represents the use of motion compensated prediction to create the P-frame (210). FIG. 1c depicts the construction of bi-directional frames (220). B-frames are usually inserted between I-frames or P-frames. FIG. 2 represents a group of pictures in display order and illustrates how B-frames are inserted between I-frames and P-frames, as well as showing the direction in which motion compensation information flows. In FIGS. 1b, 1c and 2, arrows 205a depict the forward motion compensation prediction information necessary to reconstruct P-frames (210), whereas arrows 215a and 215b depict the motion compensation information used in reconstructing B-frames (220) in the forward direction (215a) and the backward direction (215b). In other words, arrows 205a and 215a show the flow of information when predictive frames are predicted from frames that are earlier in display order than the frame being reconstructed, and arrows 215b show the flow of information when predictive frames are predicted from frames that are later in display order than the frame being reconstructed.

In motion compensated prediction, the similarity between successive frames in a video sequence is exploited to improve coding efficiency. More specifically, so-called motion vectors are used to describe the way in which pixels or regions of pixels move between successive frames of the sequence. The motion vectors provide offset values and error data that refer to a past or future frame of video data having decoded pixel values; these may be used together with the error data to compress/encode or decompress/decode a given frame of video data.

The ability to decode/decompress P-frames requires the availability of the previous I- or P-reference frame; decoding a B-frame additionally requires the availability of the subsequent I- or P-reference frame. For example, if an encoded/compressed data stream has the following frame sequence or display order:

I₁ B₂ B₃ P₄ B₅ P₆ B₇ P₈ B₉ B₁₀ P₁₁ ... P_{n-3} B_{n-2} P_{n-1} I_n,

the corresponding decoding order is:

I₁ P₄ B₂ B₃ P₆ B₅ P₈ B₇ P₁₁ B₉ B₁₀ ... P_{n-1} B_{n-2} I_n.

The decoding order differs from the display order because B-frames require future I- or P-frames for their decoding. FIG. 2 displays the beginning of the above frame sequence and can be referred to in order to understand the dependencies of the frames, as described above. P-frames require that the previous I- or P-reference frame be available. For example, P₄ requires I₁ in order to be decoded. Similarly, frame P₆ requires that P₄ be available in order to be decoded/decompressed. B-frames, such as frame B₃, require a past and/or a future I- or P-reference frame, such as P₄ and I₁, in order to be decoded. B-frames are frames located between I- or P-frames during encoding.
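The reordering described above can be sketched in a few lines of Python. This is an illustrative sketch only, not part of the patent text: the function name and the string labels are hypothetical, and it assumes each B-frame depends on the nearest I- or P-frame on either side, so it is held back until that later reference has been emitted.

```python
def display_to_decoding_order(display):
    """Reorder frame labels from display order to decoding order.

    Assumption: a B-frame needs the next I- or P-frame in display order,
    so B-frames are buffered until that reference frame has been emitted.
    """
    decoding, pending_b = [], []
    for label in display:
        if label.startswith("B"):
            pending_b.append(label)       # wait for the future reference
        else:
            decoding.append(label)        # I- or P-frame goes out first
            decoding.extend(pending_b)    # then the B-frames that needed it
            pending_b.clear()
    decoding.extend(pending_b)            # trailing B-frames with no later reference
    return decoding


display = ["I1", "B2", "B3", "P4", "B5", "P6", "B7", "P8", "B9", "B10", "P11"]
print(display_to_decoding_order(display))
# ['I1', 'P4', 'B2', 'B3', 'P6', 'B5', 'P8', 'B7', 'P11', 'B9', 'B10']
```

The printed list reproduces the beginning of the decoding order given above.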

Prior art systems for encoding and decoding are shown in FIGS. 3 and 4. Referring to the encoder (300) of FIG. 3, the frame (301) being coded, I(x,y), called the current frame, is partitioned into rectangular regions of KxL pixels. The coordinates (x,y) denote the location of the pixels within the frame. Each block is encoded using either intra coding (i.e. using only the spatial correlation of the image data within the block) or inter coding (i.e. utilizing both spatial and temporal prediction). The following description considers the process by which inter-coded blocks are formed. Each inter-coded block is predicted (360) from one of the previously (or subsequently) coded and transmitted frames R(x,y) in the frame memory (350), called a reference frame. The motion information used for the prediction is obtained from the motion estimation and coding block (370) using the reference frame and the current frame (305). The motion information is represented by a two-dimensional motion vector (Δx, Δy), where Δx is a horizontal displacement and Δy is a vertical displacement. In the motion compensated (MC) prediction block, the motion vectors are used together with the reference frame to construct the prediction frame P(x,y):

P(x,y) = R(x+Δx, y+Δy)

Subsequently, the prediction error E(x,y), i.e. the difference between the current frame and the prediction frame P(x,y), is calculated (307) according to:

E(x,y) = I(x,y) - P(x,y)
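As a rough illustration of the two relations above, the following Python sketch forms the prediction and the prediction error for a single block. The helper names, the `frame[y][x]` indexing, and the reading of K as block width and L as block height are assumptions made for the example; the motion vector is assumed to keep the block inside the reference frame.

```python
def predict_block(ref, x0, y0, k, l, dx, dy):
    """P(x,y) = R(x+dx, y+dy) for the K x L block whose top-left pixel is (x0, y0).

    Frames are lists of rows, indexed frame[y][x] (an assumption of this sketch).
    """
    return [[ref[y + dy][x + dx] for x in range(x0, x0 + k)]
            for y in range(y0, y0 + l)]


def prediction_error(cur, pred, x0, y0):
    """E(x,y) = I(x,y) - P(x,y) over the same block."""
    return [[cur[y0 + j][x0 + i] - pred[j][i] for i in range(len(pred[0]))]
            for j in range(len(pred))]


# Reference frame R and a current frame I whose content moved one pixel to the right.
ref = [[10 * y + x for x in range(4)] for y in range(4)]
cur = [[ref[y][max(x - 1, 0)] for x in range(4)] for y in range(4)]

good = predict_block(ref, 1, 1, 2, 2, dx=-1, dy=0)   # vector that matches the motion
zero = predict_block(ref, 1, 1, 2, 2, dx=0, dy=0)    # no motion compensation
print(prediction_error(cur, good, 1, 1))  # [[0, 0], [0, 0]]
print(prediction_error(cur, zero, 1, 1))  # [[-1, -1], [-1, -1]]
```

With the matching vector the error block is all zeros, which is why a well-chosen motion vector leaves little residual information to transform and quantize.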

In the transform block (310), the prediction error for each KxL block is represented as a weighted sum of transform basis functions f_ij(x,y):

E(x,y) = Σ_i Σ_j c_err(i,j) f_ij(x,y)

The weights c_err(i,j) corresponding to the basis functions are called transform coefficients. These coefficients are subsequently quantized in the quantization block (320) to give:

I_err(i,j) = Q(c_err(i,j), QP)

where I_err(i,j) are the quantized transform coefficients. The quantization operation Q() introduces a loss of information, but the quantized coefficients can be represented with a smaller number of bits. The level of compression (loss of information) is controlled by adjusting the value of the quantization parameter QP.
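The text does not specify the form of Q(). Purely as an illustration, the sketch below assumes a uniform scalar quantizer whose step size equals QP, and an inverse that multiplies the level by the same step; both functions and the sample coefficients are hypothetical. It shows how a larger QP yields smaller levels (fewer bits) at the cost of a larger reconstruction error.

```python
def quantize(coeff, qp):
    """Q(c, QP): round a coefficient to the nearest multiple of qp and return the level.

    Assumption: uniform quantizer with step size qp (not taken from the source).
    """
    sign = -1 if coeff < 0 else 1
    return sign * ((abs(coeff) + qp // 2) // qp)


def dequantize(level, qp):
    """Inverse operation under the same assumption: scale the level back by the step."""
    return level * qp


coeffs = [52, -7, 3, 0]
for qp in (4, 16):
    levels = [quantize(c, qp) for c in coeffs]
    recon = [dequantize(lv, qp) for lv in levels]
    print(qp, levels, recon)
# 4 [13, -2, 1, 0] [52, -8, 4, 0]
# 16 [3, 0, 0, 0] [48, 0, 0, 0]
```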

Before the motion vectors and quantized transform coefficients are supplied to the multiplexer (380), they are further encoded using Variable Length Codes (VLC). This reduces the number of bits needed to represent the motion vectors and quantized transform coefficients. The encoded motion vectors and quantized transform coefficients, as well as other additional information needed to represent each coded frame, are multiplexed in the multiplexer (380), and the resulting bit-stream is transmitted (415) to the decoder. The quantized transform coefficients are also forwarded to the inverse quantization block (330) to obtain inverse quantized transform coefficients, and further to the inverse transform block (340) to obtain the prediction error information E_c(x,y) for the current frame. The prediction error information E_c(x,y) is added to the prediction frame P(x,y) in a summing element to obtain a video frame that can subsequently be stored in the frame memory (350).

In the following, the decoding of video frames is described with reference to FIG. 4. The decoder (400) receives the multiplexed video bit-stream (415) from the encoder, and the demultiplexer (410) demultiplexes the bit-stream to obtain the constituent parts of the video frames to be decoded. These parts comprise at least the coded quantized prediction error transform coefficients and the coded motion vectors, which are subsequently decoded (not shown) to obtain the quantized prediction error transform coefficients and motion vectors. The quantized transform coefficients are inverse quantized in the inverse quantization block (420) to obtain inverse quantized transform coefficients d_err(i,j) according to the following relation:

d_err(i,j) = Q^-1(I_err(i,j), QP)

In the inverse transform block (430), the inverse quantized transform coefficients are subjected to an inverse transform to obtain the prediction error E_c(x,y).

The pixels of the current coded frame are reconstructed by finding the prediction pixels in the reference frame R(x,y) obtained from the frame memory (440) and using the received motion vectors together with the reference frame in the motion compensated prediction block (450) to obtain the prediction frame P(x,y). The prediction frame P(x,y) and the prediction error information E_c(x,y) are summed in the summing element (435) according to the following relation:

I_c(x,y) = R(x+Δx, y+Δy) + E_c(x,y)

These values I_c(x,y) can be further filtered to obtain the decoded video frames (445). The values I_c(x,y) are also stored in the frame buffer (440). The reconstructed values I_c(x,y) can be filtered in a filtering block (not shown in FIG. 4) that follows the summation block (435).

Video streaming has emerged as an important application in the fixed Internet. It is further anticipated that video streaming will also become important in 3G wireless networks in the near future. In streaming applications, the transmitting server starts to transmit a pre-encoded video bit stream to a receiver via a transmission network upon a request from the receiver. The receiver plays back the video stream while receiving it. The best-effort nature of present networks causes variations in the effective bandwidth available to a user as network conditions change. To accommodate these variations, the transmitting server can scale the bit rate of the compressed video. In the case of a conversational service characterized by real-time encoding and point-to-point delivery, this can be achieved by adjusting the source encoding parameters on the fly. Such adjustable parameters can be, for example, the quantization parameter or the frame rate. The adjustment is preferably based on feedback from the transmission network. In typical streaming scenarios, in which a previously encoded video bit stream is transmitted to the receiver, this solution cannot be applied.

One solution for achieving bandwidth scalability in the case of pre-encoded sequences is to produce multiple independent streams having different bit rates and quality. The transmitting server then switches dynamically between the streams to accommodate variations in the available bandwidth. The following example illustrates this principle. Assume that multiple bit streams corresponding to the same video sequence are generated independently with different encoding parameters, such as the quantization parameter. Let {P_{1,n-1}, P_{1,n}, P_{1,n+1}} and {P_{2,n-1}, P_{2,n}, P_{2,n+1}} denote the sequences of frames decoded from bit streams 1 and 2, respectively. Since the encoding parameters differ between the two bit streams, frames reconstructed from them at the same time instant, for example frames P_{1,n-1} and P_{2,n-1}, are not identical. If it is now assumed that the server initially sends encoded frames from bit stream 1 up to time n, after which it sends encoded frames from bit stream 2, the decoder receives the frames {P_{1,n-2}, P_{1,n-1}, P_{2,n}, P_{2,n+1}, P_{2,n+2}}. In this case P_{2,n} cannot be decoded correctly, since its reference frame P_{2,n-1} has not been received. On the other hand, the frame P_{1,n-1}, which is received instead of P_{2,n-1}, is not identical to P_{2,n-1}. Switching between bit streams at arbitrary locations therefore leads to visual artifacts because of the mismatch between the reference frames used for motion compensated prediction in the different sequences. These visual artifacts are not confined to the frame at the switching point between the bit streams, but propagate in time because of the motion compensated coding that continues through the remaining part of the video sequence.
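The mismatch described above can be reproduced with a toy model. The sketch below is a simplification under stated assumptions, not the codec of FIGS. 3 and 4: each "frame" is a single integer sample, prediction is zero-motion from the previous reconstruction, there is no transform, the quantizer is uniform with step QP, and the first frame is predicted from 0 so that it acts as an intra frame. All names and sample values are hypothetical.

```python
def quantize(value, qp):
    """Uniform quantizer with step qp (an assumption of this sketch)."""
    sign = -1 if value < 0 else 1
    return sign * ((abs(value) + qp // 2) // qp)


def encode(frames, qp):
    """Return (levels, recon): residual levels and the encoder's own reconstructions."""
    levels, recon, ref = [], [], 0
    for f in frames:
        level = quantize(f - ref, qp)   # quantized prediction error
        ref += level * qp               # encoder tracks what the decoder will hold
        levels.append(level)
        recon.append(ref)
    return levels, recon


def decode(levels_with_qp):
    """Decode (level, qp) pairs, always predicting from the last decoded frame."""
    ref, out = 0, []
    for level, qp in levels_with_qp:
        ref += level * qp
        out.append(ref)
    return out


frames = [100, 103, 107, 110, 114, 117]
levels1, recon1 = encode(frames, qp=4)    # bit stream 1: finer quantization
levels2, recon2 = encode(frames, qp=10)   # bit stream 2: coarser quantization

n = 2  # switch from stream 1 to stream 2 at frame index n, at an ordinary P-frame
mixed = [(lv, 4) for lv in levels1[:n]] + [(lv, 10) for lv in levels2[n:]]
switched = decode(mixed)

print(recon2)    # [100, 100, 110, 110, 110, 120]  what stream 2 intended
print(switched)  # [100, 104, 114, 114, 114, 124]  what the decoder produces
```

From the switching point onward every decoded frame is off by 4 in this example: the error introduced by the wrong reference frame is carried forward by each subsequent prediction rather than disappearing after one frame.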

In current video coding standards, perfect (mismatch-free) switching between bit streams is possible only at positions where the current and future frames, or regions thereof, do not use any information prior to the current switching location, i.e. at I-frames. Furthermore, by placing I-frames at fixed intervals (e.g. 1 second), VCR functionality such as random access or "fast forward" and "fast backward" (increased playback rate) is achieved for streaming video content. The user may skip a portion of the video sequence and restart playback at any I-frame location. Similarly, an increased playback rate can be achieved by transmitting only I-frames. The drawback of using I-frames in these applications is that, since I-frames do not exploit any temporal redundancy, they require a much larger number of bits than P-frames at the same quality.

It is an object of the present invention to provide a new method and system for transmitting video images in variable transmission environments. The invention is based on the idea that correct (mismatch-free) switching between video streams is made possible by forming a new type of compressed video frame and inserting frames of this new type into the video bit-streams at locations where switching from one bit-stream to another is to be allowed. In this description, the new type of compressed video frame is referred to generally as an S-frame. More specifically, S-frames may be classified as SP-frames and SI-frames. SP-frames are formed at the decoder using motion compensated prediction from already decoded frames, using motion vector information. SI-frames are formed at the decoder using spatial (intra) prediction from already decoded neighbouring pixels within the frame being decoded. In general, an S-frame according to the invention is formed on a block-by-block basis and may comprise both inter-coded (SP) blocks and intra-coded (SI) blocks.

The method according to the present invention is primarily characterized in that, in switching the transmission from the first bit-stream to the second bit-stream, the second bit-stream comprises at least one first switching frame; a second switching frame is transmitted; the second switching frame is encoded using at least one reference frame from the first bit-stream and the encoding parameters of the second bit-stream; and the second switching frame is used in place of the first switching frame as a reference frame in the reconstruction of at least one predictive video frame of the second set of video frames.
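The transmission order implied by this characterization can be sketched as follows. This is only an illustration of which frames are sent, under stated assumptions: the frame labels, the `switch_frames` lookup table, and the function name are hypothetical, and the sketch does not show how the second switching frame itself is encoded.

```python
def transmit_with_switch(stream1, stream2, switch_frames, n):
    """Frames sent when switching from stream 1 to stream 2 at position n.

    stream2[n] is the first switching frame of stream 2; it is not sent.
    switch_frames[n] is the second switching frame for that position (encoded
    from stream-1 reference frames with stream-2 parameters) and is sent in
    its place. All containers and labels here are hypothetical.
    """
    if n not in switch_frames:
        raise ValueError(f"no switching frame available at position {n}")
    return stream1[:n] + [switch_frames[n]] + stream2[n + 1:]


stream1 = ["P1,0", "P1,1", "S1,2", "P1,3"]
stream2 = ["P2,0", "P2,1", "S2,2", "P2,3"]
switch_frames = {2: "S12,2"}   # second switching frame, stream 1 -> stream 2

print(transmit_with_switch(stream1, stream2, switch_frames, 2))
# ['P1,0', 'P1,1', 'S12,2', 'P2,3']
```

The decoder then reconstructs "P2,3" using the decoded "S12,2" as its reference frame in place of "S2,2".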

The encoder according to the present invention is primarily characterized in that the means for switching the transmission from the first bit-stream to the second bit-stream comprise means for encoding a second switching frame using reference frames from the first bit-stream and the encoding parameters of the second bit-stream, in order to enable the transmission to be switched from the first bit-stream to the second bit-stream.

The decoder according to the present invention is primarily characterized in that the decoder comprises means for decoding a second switching frame, the second switching frame having been encoded using at least one reference frame from the first bit-stream and the encoding parameters of the second bit-stream, and having been added to the signal in place of a first switching frame as a reference frame to be used in the reconstruction of at least one predictive video frame of the second set of video frames; and in that the means for decoding the second switching frame comprise means for using reference frames from the first bit-stream and the decoding parameters of the second bit-stream.

The signal according to the invention is primarily characterized in that, when transmission is switched from the first bit-stream to the second bit-stream, the second bit-stream comprises at least one first switching frame, the signal comprises a second switching frame encoded using the encoding parameters of the second bit-stream and at least one reference frame from the first bit-stream, and the second switching frame is used in place of the first switching frame as a reference frame in the reconstruction of at least one predictive video frame of the second set of video frames.

The present invention achieves considerable advantages over prior-art methods and systems. The invention allows switching between bit streams to take place not only at the positions of I-frames but also at the positions of SP-frames. The coding efficiency of an SP-frame is much better than that of a typical I-frame. Less bandwidth is therefore needed to transmit bit streams that have SP-frames at the positions where, according to the prior art, I-frames would be used, while sufficient adaptability to changing transmission conditions is still provided. Switching from one bit stream to another can be performed at the positions where an SP-frame according to the invention is placed in the encoded bit stream. The images reconstructed from the bit stream by the decoder do not deteriorate as a result of the change from one bit stream to another. The invention has the further advantage that random access, fast-forward and fast-rewind operations can be performed on the bit stream. The system according to the invention provides improved error recovery and resilience properties compared with the prior-art solutions described above.

These and other features, aspects and advantages of embodiments of the present invention will become apparent from the following detailed description considered in conjunction with the accompanying drawings. It is to be understood, however, that the drawings are designed solely for purposes of illustration and not as a definition of the limits of the invention, for which reference should be made to the appended claims.

In the following, the invention is described in more detail with reference to the accompanying drawings.

FIGS. 1A to 1C and FIG. 2 illustrate prior-art encoding/compression of video frames.

FIG. 3 is a block diagram of a generic motion-compensated predictive video coding system (encoder).

FIG. 4 is a block diagram of a generic motion-compensated predictive video coding system (decoder).

FIG. 5 illustrates switching between two different bit streams using S-frames according to the invention.

FIG. 6 is a block diagram of a decoder according to a preferred embodiment of the invention.

FIG. 7 illustrates random access using S-frames.

FIG. 8 illustrates a fast-forward process using S-frames.

FIG. 9 is a block diagram of a decoder according to another preferred embodiment of the invention.

FIG. 10 is a block diagram of a decoder according to yet another preferred embodiment of the invention.

FIG. 11 is a block diagram of an encoder according to a preferred embodiment of the invention.

FIG. 12 is a block diagram of a decoder according to another preferred embodiment of the invention.

FIG. 13 is a block diagram of a decoder according to a second embodiment of the invention.

FIG. 14 illustrates an error resilience/recovery process using SP-frames.

FIG. 15 is a block diagram of an encoder according to a third preferred embodiment of the invention.

FIG. 16 is a block diagram of an encoder according to yet another preferred embodiment of the invention.

In the following, the invention is described in a system in which multiple bit streams are formed from a video signal. The video signal can be any digital video signal comprising multiple images, i.e. an image sequence. The digital video signal is encoded in an encoder to form multiple bit streams. Each bit stream is encoded from the same video signal using at least partly different encoding parameters. For example, the bit rate can be altered by selecting the encoding parameters differently, and in this way bit streams with different bit rates can be formed. The encoding parameters can be, for example, the frame rate, the quantization parameter, the spatial resolution, or another factor affecting the size of the images, known as such to a person skilled in the art. The encoder also inserts at least one intra frame into each bit stream. Typically, at least the first frame of each bit stream is an intra frame, which enables the decoder to start reconstructing the video signal. The encoder used for encoding the I-frames, P-frames and B-frames can be any prior-art encoder that performs encoding of the video signal, or there may be more than one prior-art encoder, each using different encoding parameters, to form the multiple bit streams. However, to encode a video signal that also contains SP-frames and/or SI-frames according to the invention, new functionality is needed in the encoder. This is explained in greater detail below.

The encoder also inserts into the bit streams frames encoded using motion-compensated predictive coding (P-frames and, optionally, B-frames). In addition, the encoder inserts frames of a new type, referred to herein as S-frames, into each bit stream at the positions where switching between different bit streams is to be allowed according to the invention. The S-frames may be placed at positions where prior-art methods would insert an intra-coded frame, or they may be used in addition to intra-coded frames in the video sequence. Preferably, the different bit streams are stored in storage means for later use. It is also possible, however, for transmission to take place substantially immediately after encoding, in which case it is not necessary to store complete video sequences and it suffices to store the necessary reference frames. The encoded video stream may be transmitted, for example, by a transmission server that has means for retrieving the stored bit streams for transmission and/or means for receiving the bit streams directly from the encoder. The transmission server also has means for transmitting the bit stream to a transmission network (not shown).

In the following, a method according to a preferred embodiment of the invention is described. FIG. 5 shows part of a first bit stream 510 and part of a second bit stream 520 formed in the encoder. Only a few P-frames of each bit stream are shown. Specifically, the first bit stream 510 is shown to comprise P-frames 511, 512, 514 and 515, while the second bit stream 520 comprises the corresponding P-frames 521, 522, 524 and 525. Both the first bit stream 510 and the second bit stream 520 also comprise S-frames 513 (also denoted S₁) and 523 (also denoted S₂) at corresponding positions. It is assumed that the two bit streams 510 and 520 correspond to the same sequence encoded at different bit rates, for example by using different frame rates, different spatial resolutions or different quantization parameters. It is further assumed that the first bit stream 510 is being transmitted from the transmission server to a decoder 600, 1200, 1300 (FIGS. 6, 12 and 13, respectively) via the transmission network, and that the transmission server receives from the transmission network a request to change the bit rate of the video stream being transmitted.

As described above, S-frames are placed in the bit stream during the encoding process at those positions within the video sequence where switching from one bit stream to another is allowed. As can be seen from FIG. 5, in a preferred embodiment of the invention a further S-frame 550 (also denoted S₁₂) is associated with the S-frames S₁ and S₂. This S-frame is called the second representation of the S-frame (or simply the second S-frame) and is transmitted only during bit-stream switching. The second S-frame S₁₂ is generated by spatial encoding of the n-th frame of the video sequence, using the encoding parameters of the corresponding S-frame 523 (S₂) of the second bit stream 520 and motion-compensated prediction from the reference frames of the n-th frame of the first bit stream 510. Note that in the case shown in FIG. 5, the S-frame S₂ uses frames previously reconstructed from the second bit stream 520 as reference frames, whereas the second S-frame S₁₂ uses frames previously reconstructed from the first bit stream 510. The reconstructed pixel values of S₂ and S₁₂ are nevertheless identical. The S-frame S₁₂ is transmitted only when switching from the first bit stream 510 to the second bit stream 520 is actually performed. The second S-frames therefore need to be formed only when switching is performed, not during the encoding phase. On the other hand, to reduce the computational load during transmission, it may be useful to form at least some second S-frames in advance, at the time the other bit streams are formed.
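
How two frames predicted from different references can decode to the same picture may be easier to see with numbers. The following is a minimal sketch, not the encoder of the invention: it assumes that the residual is formed in the quantized-coefficient domain, as set out for S-frame encoding further on in this description, and that the residual of S₁₂ is simply chosen so that it lands on the levels to which S₂ reconstructs. The function names and all coefficient values are hypothetical.

```python
from typing import List


def residual_for_target(i_target: List[int], i_pred: List[int]) -> List[int]:
    """Residual levels that move a quantized prediction onto the target levels."""
    return [t - p for t, p in zip(i_target, i_pred)]


def reconstruct(i_pred: List[int], i_err: List[int]) -> List[int]:
    """Decoder-side recombination: I_rec = I_pred + I_err."""
    return [p + e for p, e in zip(i_pred, i_err)]


# Hypothetical quantized coefficients for one block of frame n.
i_rec_s2 = [13, -2, 1, 0]             # levels that S2 reconstructs to
i_pred_from_stream2 = [12, -3, 0, 0]  # prediction from frames 521, 522
i_pred_from_stream1 = [10, -1, 2, 1]  # prediction from frames 511, 512

# S2 and S12 carry different residuals because their predictions differ.
i_err_s2 = residual_for_target(i_rec_s2, i_pred_from_stream2)
i_err_s12 = residual_for_target(i_rec_s2, i_pred_from_stream1)

rec_via_s2 = reconstruct(i_pred_from_stream2, i_err_s2)
rec_via_s12 = reconstruct(i_pred_from_stream1, i_err_s12)
```

In this sketch the two residuals differ, yet both paths end on the same levels, so frames 524, 525 and so on can be predicted from either reconstruction without drift.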

When the transmission server reaches the frame of the video sequence that is encoded as S-frame 513 (S₁) in the first bit stream 510, it can begin the operations needed to continue transmission of the video stream using the encoded frames of the second bit stream 520. At that point the transmission server has already transmitted P-frames 511 and 512 from the first bit stream 510, and the decoder 600, 1200, 1300 has received and decoded those P-frames 511, 512. These frames are therefore already stored in the frame memory 640, 1250, 1360 of the decoder 600, 1200, 1300. The frame memory 640, 1250, 1360 comprises sufficient memory to store the necessary information for all the frames needed to reconstruct a P-frame or a B-frame, i.e. all the reference frames required by the frame currently being reconstructed.

The transmission server performs the following operations to continue transmission of the video stream using the encoded frames of the second bit stream 520. The transmission server notices, for example by examining the frame type information, that the current frame to be transmitted is an S-frame and that it is therefore possible to switch between bit streams. Switching is, of course, performed only if a request to do so has been received or if there is some other reason to switch. The transmission server takes the corresponding S-frame 523 of the second bit stream as input, uses it to form the second S-frame 550 (S₁₂), and transmits the second S-frame S₁₂ to the decoder 600, 1200, 1300. The transmission server does not transmit the S-frame S₂ of the second bit stream; it transmits the second S-frame S₁₂ in its place. The second S-frame is formed in such a way that, by decoding it, the decoder 600 can reconstruct an image identical to the one that would be produced if the decoder had used the frames 521, 522 and the S-frame 523 of the second bit stream 520. After transmitting the second S-frame, the transmission server continues by transmitting the encoded frames of the second bit stream 520, i.e. 524, 525 and so on.
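
The server-side behaviour described above can be sketched as a small generator. This is only an illustration under stated assumptions: the `Frame` class, the `transmit` function and the `make_second_s_frame` callback are hypothetical names, both streams are assumed to have the same length with aligned frame positions, and how S₁₂ is actually formed is left to the callback.

```python
from dataclasses import dataclass
from typing import Callable, Iterator, List, Optional


@dataclass(frozen=True)
class Frame:
    label: str  # reference numeral as in FIG. 5, e.g. "513"
    kind: str   # "I", "P" or "S"


def transmit(
    first: List[Frame],
    second: List[Frame],
    switch_requested_at: Optional[int],
    make_second_s_frame: Callable[[Frame, Frame], Frame],
) -> Iterator[Frame]:
    """Yield the frames to send, switching streams at the first S-frame
    position reached once a switch has been requested."""
    current = first
    switched = False
    for n in range(len(first)):
        frame = current[n]
        may_switch = (
            not switched
            and switch_requested_at is not None
            and n >= switch_requested_at
            and frame.kind == "S"
        )
        if may_switch:
            # Send S12 in place of both S1 and S2, then continue with
            # the encoded frames of the second bit stream.
            yield make_second_s_frame(frame, second[n])
            current = second
            switched = True
        else:
            yield frame


first = [Frame("511", "P"), Frame("512", "P"), Frame("513", "S"),
         Frame("514", "P"), Frame("515", "P")]
second = [Frame("521", "P"), Frame("522", "P"), Frame("523", "S"),
          Frame("524", "P"), Frame("525", "P")]

# A switch request arrives while frame 512 is being sent (position 1).
sent = [f.label for f in
        transmit(first, second, 1, lambda s1, s2: Frame("550", "S"))]
# Without a request the first bit stream is sent unchanged.
unswitched = [f.label for f in
              transmit(first, second, None, lambda s1, s2: Frame("550", "S"))]
```

With the request at position 1, the sketch sends 511, 512, 550, 524, 525, which matches the sequence described for FIG. 5.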

The S-frames 513, 523, 550 may comprise blocks encoded using only spatial correlation among the pixels (intra blocks) and blocks encoded using both spatial and temporal correlation (inter blocks). For each inter block, the prediction P(x,y) of the block is formed in the decoder 600, 1200, 1300 using the received motion vectors and the reference frame. The transform coefficients c_pred of P(x,y) corresponding to the basis functions f_ij(x,y) are calculated and quantized. The quantized values of the transform coefficients c_pred are denoted I_pred, and the dequantized values of the quantized transform coefficients I_pred are denoted d_pred. The quantized coefficients I_err of the prediction error are received from the encoder; their dequantized values are denoted d_err. The value of each pixel S(x,y) in the inter block is decoded as a weighted sum of the basis functions f_ij(x,y), where the weights d_rec are referred to as the dequantized reconstructed image coefficients. The values d_rec must be such that coefficients c_rec exist from which d_rec can be obtained by quantization and dequantization. In addition, the values d_rec must satisfy one of the following conditions:

d_rec = d_pred + d_err, or

c_rec = c_pred + d_err

The values S(x,y) can be further normalized and filtered.
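
The two conditions on d_rec can be compared with a small numerical sketch. It assumes a single uniform scalar quantizer with step size 8 that rounds to the nearest level; the description does not prescribe a particular quantizer, and the function names and coefficient values are hypothetical.

```python
import math
from typing import List


def quantize(coeffs: List[float], step: float) -> List[int]:
    """Assumed uniform quantizer: index of the nearest reconstruction level."""
    return [math.floor(c / step + 0.5) for c in coeffs]


def dequantize(levels: List[int], step: float) -> List[float]:
    return [level * step for level in levels]


def d_rec_first_condition(c_pred, d_err, step):
    """d_rec = d_pred + d_err, with d_pred the dequantized prediction."""
    d_pred = dequantize(quantize(c_pred, step), step)
    return [p + e for p, e in zip(d_pred, d_err)]


def d_rec_second_condition(c_pred, d_err, step):
    """c_rec = c_pred + d_err, then d_rec by quantizing and dequantizing c_rec."""
    c_rec = [c + e for c, e in zip(c_pred, d_err)]
    return dequantize(quantize(c_rec, step), step)


STEP = 8.0
c_pred = [96.0, -21.0, 2.0, 0.0]        # hypothetical prediction coefficients
d_err = dequantize([1, 1, 1, 0], STEP)  # dequantized prediction error

d_rec_a = d_rec_first_condition(c_pred, d_err, STEP)
d_rec_b = d_rec_second_condition(c_pred, d_err, STEP)
```

For this particular quantizer, with d_err a multiple of the step size, both conditions yield the same coefficients, and each result is itself a reconstruction level, so a suitable c_rec exists. The two need not coincide when the prediction and the prediction error are quantized differently.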

Next, the encoding of S-frames placed within a bit stream, such as S-frames 513 (S₁) and 523 (S₂), is described.

In general, an S-frame according to the invention, such as frames 513 and 523 of FIG. 5, is constructed on a block-by-block basis. As described above, each block may be encoded either in a way that exploits the spatial correlations among the pixels of the image being encoded (intra or SI-blocks) or in a way that exploits the temporal correlation between blocks of pixels in successive frames of the video sequence (inter or SP-blocks).

The encoding of S-frames according to the invention is described with reference to FIG. 11, which is a block diagram of an S-frame encoder 1100 according to a first embodiment of the invention.

A video frame to be encoded in S-frame format is first divided into blocks, and each block is then encoded as an SP-block, an SI-block or an intra block. The intra block is known as such from the prior art. A switch 1190 is operated as appropriate to switch between the SI and SP encoding modes. The switch 1190 is a construct used in describing the invention and is not necessarily a physical device. In SP-encoding mode the switch 1190 is operated to obtain a motion-compensated prediction for the current block from the motion-compensated prediction block 1170. The motion-compensated prediction block 1170 forms a prediction P(x,y) for the current block of the frame being encoded in a manner similar to that used in motion-compensated prediction known from the prior art. More specifically, the motion-compensated prediction block 1170 forms the prediction P(x,y) for the current block by determining a motion vector that describes the relationship between the pixels of the current block and the pixel values of a reconstructed reference frame held in the frame memory 1146.

In SI-encoding mode the switch 1190 is operated to obtain the prediction for the current block of the frame being encoded from the intra prediction block 1180. The intra prediction block 1180 forms the prediction P(x,y) for the current block in a manner similar to that used in intra prediction known from the prior art. More specifically, the intra prediction block 1180 forms the prediction P(x,y) for the current block using spatial prediction from already encoded neighbouring pixels within the frame being encoded.

In both the SP- and SI-encoding modes the prediction P(x,y) takes the form of a block of pixel values. In block 1160 a forward transform, for example a discrete cosine transform (DCT), is applied to the predicted block of pixel values P(x,y), and the resulting transform coefficients, referred to as c_pred, are then quantized in quantization block 1150 to form quantized transform coefficients I_pred. Corresponding operations are also performed on the original image data. More specifically, the current block of pixel values of the original image being encoded is applied to transform block 1110, where a forward transform (e.g. a DCT) is applied to the pixel values of the original image block to form transform coefficients c_orig. These transform coefficients are passed to quantization block 1120, where they are quantized to form quantized transform coefficients I_orig. The summing element 1130 receives the two sets of quantized transform coefficients, I_pred and I_orig, from the respective quantization blocks 1150 and 1120 and generates a set of quantized prediction error coefficients I_err according to the relationship:

I_err = I_orig - I_pred
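
As a worked instance of this relationship, the sketch below quantizes a set of original and predicted transform coefficients and subtracts the resulting levels. The uniform quantizer with step size 8 and the coefficient values are assumptions made for illustration only; the forward transform of blocks 1110 and 1160 is taken as already applied.

```python
import math
from typing import List, Tuple


def quantize(coeffs: List[float], step: float) -> List[int]:
    """Assumed uniform quantizer: index of the nearest reconstruction level."""
    return [math.floor(c / step + 0.5) for c in coeffs]


def prediction_error_levels(
    c_orig: List[float], c_pred: List[float], step: float
) -> Tuple[List[int], List[int], List[int]]:
    """Quantize both coefficient sets, then form I_err = I_orig - I_pred."""
    i_orig = quantize(c_orig, step)  # role of quantization block 1120
    i_pred = quantize(c_pred, step)  # role of quantization block 1150
    i_err = [o - p for o, p in zip(i_orig, i_pred)]  # summing element 1130
    return i_orig, i_pred, i_err


# Hypothetical transform coefficients of one block.
c_orig = [103.0, -18.5, 7.2, 0.4]
c_pred = [96.0, -21.0, 2.0, 0.0]
i_orig, i_pred, i_err = prediction_error_levels(c_orig, c_pred, step=8.0)
```

Note that the subtraction operates on quantized levels rather than on pixel values, which is what distinguishes this arrangement from a conventional P-frame residual.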

The quantized prediction error coefficients I_err are passed to a multiplexer 1135. If the current block is encoded in SP-format/mode, the multiplexer 1135 also receives the motion vectors for the SP-coded block. If the current block is encoded in SI-format/mode, information about the intra prediction mode used in the intra prediction block 1180 to form the prediction for the SI-coded block is passed to the multiplexer. Preferably, variable length coding is applied in the multiplexer 1135 to the quantized prediction error coefficients I_err and to the motion vector or intra prediction mode information, a bit-stream is formed by multiplexing the various kinds of information together, and the bit-stream thus formed is transmitted to a corresponding decoder 1200, 1300 (see FIGS. 12 and 13).

The S-frame encoder 1100 according to the invention also comprises local decoding functionality. The quantized prediction transform coefficients I_pred formed in quantization block 1150 are supplied to a summing element 1140, which also receives the quantized prediction error coefficients I_err. The summing element 1140 recombines the quantized prediction transform coefficients I_pred and the quantized prediction error coefficients I_err to form a set of reconstructed quantized transform coefficients I_rec according to the relationship:

I_rec = I_pred + I_err

The reconstructed quantized transform coefficients are passed to inverse quantization block 1142, which dequantizes them to form dequantized reconstructed transform coefficients d_rec. The dequantized reconstructed transform coefficients are passed on to inverse transform block 1144, where they are subjected to, for example, an inverse discrete cosine transform (IDCT) or any other inverse transform corresponding to the transform performed in block 1160. As a result, a block of reconstructed pixel values is formed for the image block in question and is stored in the frame memory 1146. As subsequent blocks of the frame being encoded in S-frame format undergo the encoding and local decoding operations described above, a decoded version of the current frame is progressively assembled in the frame memory, from where it can be accessed and used in the intra prediction of subsequent blocks of the same frame or in the inter (motion-compensated) prediction of subsequent frames in the video sequence.
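
The local decoding path up to block 1142 can be sketched as follows. The uniform step size of 8 and the level values are hypothetical, and the inverse transform of block 1144 is omitted for brevity.

```python
from typing import List, Tuple


def local_decode(
    i_pred: List[int], i_err: List[int], step: float
) -> Tuple[List[int], List[float]]:
    """Recombine levels (summing element 1140), then dequantize (block 1142)."""
    i_rec = [p + e for p, e in zip(i_pred, i_err)]
    d_rec = [level * step for level in i_rec]  # assumed uniform dequantizer
    return i_rec, d_rec


# Hypothetical quantized coefficients of one block.
i_orig = [13, -2, 1, 0]
i_pred = [12, -3, 0, 0]
i_err = [o - p for o, p in zip(i_orig, i_pred)]  # I_err = I_orig - I_pred

i_rec, d_rec = local_decode(i_pred, i_err, step=8.0)
```

Because I_err = I_orig - I_pred, the recombined levels I_rec in this sketch equal I_orig regardless of which prediction was used; the reconstruction of an S-frame block is therefore determined by the quantized original alone.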

The operation of a generic S-frame decoder according to the first embodiment of the invention is now described with reference to FIG. 12.

The bit-stream generated by the S-frame encoder described above in connection with FIG. 11 is received by a decoder 1200 and is demultiplexed into its constituent parts by a demultiplexer 1210. The decoder reconstructs a decoded version of the S-frame on a block-by-block basis. As described above, an S-frame may comprise intra blocks, SP-coded image blocks and SI-coded image blocks. For SP-format image blocks, the information in the received bit-stream comprises VLC-coded motion coefficient information and VLC-coded quantized prediction error coefficients I_err. For image blocks encoded in SI-format, the information in the received bit-stream comprises VLC-coded information about the intra prediction mode used to form the intra prediction for the SI-coded block, together with VLC-coded quantized prediction error coefficients I_err.

When an SP-coded block is decoded, the demultiplexer 1210 first applies appropriate variable length decoding (VLD) to the received bit-stream to recover the motion vector information and the quantized prediction error coefficients I_err. It then separates the motion vector information from the quantized prediction error coefficients I_err. The motion vector information is supplied to the motion-compensated prediction block 1260, and the quantized prediction error coefficients recovered from the bit-stream are applied to one input of a summing element 1220. In the motion-compensated prediction block 1260 the motion vector information is used together with pixel values of a previously reconstructed frame held in the frame memory 1250 to form a prediction P(x,y) in a manner analogous to that employed in the encoder 1100.

When an SI-coded block is decoded, the demultiplexer 1210 applies appropriate variable length decoding to the received intra prediction mode information and the quantized prediction error coefficients I_err. The intra prediction mode information is then separated from the quantized prediction error coefficients and supplied to the intra prediction block 1270. The quantized prediction error coefficients I_err are supplied to one input of the summing element 1220. In the intra prediction block 1270 the intra prediction mode information is used together with previously decoded pixel values of the current frame held in the frame memory 1250 to form a prediction P(x,y) for the current block being decoded. Again, the intra prediction process performed in the decoder 1200 is analogous to that performed in the encoder 1100 described above.

Once a prediction has been formed for the current block of the frame being decoded, the switch 1280 is operated so that the prediction P(x,y), comprising the predicted pixel values, is supplied to the transform block 1290. Again, the switch 1280 is not necessarily a physical device but an abstract construct used in the description of the invention. For an SP-coded block, the switch 1280 is operated to connect the motion compensated prediction block 1260 to the transform block 1290, whereas for an SI-coded block it is operated to connect the intra prediction block 1270 to the transform block 1290.

In block 1290, a forward transform, for example a discrete cosine transform (DCT), is applied to the predicted block of pixel values P(x,y), and the resulting transform coefficients c_pred are supplied to the quantization block 1295, where they are quantized to form quantized transform coefficients I_pred. The quantized transform coefficients I_pred are then supplied to the second input of the summing element 1220, where they are added to the prediction error coefficients I_err to form reconstructed quantized transform coefficients I_rec according to the following relationship:

I_rec = I_pred + I_err

The reconstructed quantized transform coefficients I_rec are further supplied to the inverse quantization block 1230, where they are inverse quantized to form inverse quantized reconstructed transform coefficients d_rec. The inverse quantized reconstructed transform coefficients d_rec are then passed to the inverse transform block 1240, where they are subjected to, for example, an inverse discrete cosine transform (IDCT) or any other inverse transform corresponding to the transform performed in block 1290. In this way, a block of reconstructed pixel values is formed for the image block in question. The reconstructed pixel values are supplied to the video output and to the frame memory 1250. As subsequent blocks of the S-frame being decoded undergo the decoding operations described above, a decoded version of the current frame is progressively assembled in the frame memory 1250, from where it can be accessed and used in the intra prediction of subsequent blocks of the same frame or in the inter (motion compensated) prediction of subsequent frames in the video sequence.
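The per-block decoding path described above can be sketched as follows. This is only an illustrative sketch, not the implementation of the decoder 1200: it assumes an orthonormal DCT-II as the forward transform and a uniform scalar quantizer with a single step size `step`, neither of which is prescribed by the description, and all function names are hypothetical.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II basis as an n x n list of rows (assumed transform)."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def forward_transform(block):
    d = dct_matrix(len(block))
    return matmul(matmul(d, block), transpose(d))

def inverse_transform(coeffs):
    d = dct_matrix(len(coeffs))
    return matmul(matmul(transpose(d), coeffs), d)

def quantize(coeffs, step):
    # Uniform scalar quantizer: an assumption made for illustration only.
    return [[int(round(c / step)) for c in row] for row in coeffs]

def dequantize(levels, step):
    return [[lv * step for lv in row] for row in levels]

def decode_s_block(prediction, i_err, step):
    """Return (reconstructed pixels, I_rec) for one SP- or SI-coded block."""
    c_pred = forward_transform(prediction)            # transform block 1290
    i_pred = quantize(c_pred, step)                   # quantization block 1295
    i_rec = [[p + e for p, e in zip(row_p, row_e)]    # summing element 1220
             for row_p, row_e in zip(i_pred, i_err)]
    d_rec = dequantize(i_rec, step)                   # inverse quantization block 1230
    return inverse_transform(d_rec), i_rec            # inverse transform block 1240
```

The prediction passed in would come from either the motion compensated prediction block or the intra prediction block, depending on the position of the switch 1280; the sketch treats both cases identically from the transform onward, as the description does.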

Having reviewed the structure and function of the S-frame encoder and decoder according to the first embodiment of the invention, it is now possible to understand how S-frames according to the invention enable switching between bit-streams without mismatch errors such as those encountered in earlier video encoding/decoding systems. Referring once more to the bit-stream switching example shown in FIG. 5, switching from the first bit-stream 510 to the second bit-stream 520 occurs at the location of the S-frames S₁ (513) and S₂ (523) in the respective bit-streams. As described above, when switching is performed, a second S-frame, denoted S₁₂ (550), is encoded and transmitted. The second S-frame S₁₂ is encoded using the encoding parameters of the second bit-stream 520 and reference frames from the first bit-stream 510, such that when the second S-frame S₁₂ is decoded, its reconstructed pixel values are identical to those that would have resulted from transmission of the frame S₂ in the second bit-stream.

Let I²_err and I²_pred denote the quantized coefficients of the prediction error and of the prediction frame, respectively, obtained when encoding the SP-frame S₂ with the procedure described above, and let I²_rec denote the quantized reconstructed image coefficients of the S-frame S₂. The encoding of the second S-frame 550 (S₁₂) follows the same procedure as the encoding of the S-frame 523 (S₂), with the following exceptions: 1) The reference frame(s) used in the prediction of each block of the second S-frame S₁₂ are the reconstructed frames obtained by decoding the first bit-stream 510 up to the current n-th frame in the video sequence. 2) The quantized prediction error coefficients are calculated as I¹²_err = I²_rec - I¹²_pred, where I¹²_pred denotes the quantized prediction transform coefficients. The quantized prediction error coefficients I¹²_err and the motion vectors are transmitted to the decoder 1200.

When the second S-frame S₁₂ is decoded in the decoder 1200 using, as reference frames, frames reconstructed from the first bit-stream 510 before the switch, the coefficients I¹²_pred of the second S-frame are constructed and added to the received quantized prediction error coefficients I¹²_err as described above, that is, I¹²_rec = I¹²_err + I¹²_pred = I²_rec - I¹²_pred + I¹²_pred = I²_rec. It is evident from this equation that I¹²_rec and I²_rec are identical. Therefore, although the second S-frame S₁₂ and the S-frame S₂ of the second bit-stream have different reference frames, decoding S₁₂ produces an image whose reconstructed pixel values are identical to those that would result from decoding the S-frame S₂.
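The cancellation in this identity can be checked numerically. The short sketch below uses made-up integer coefficient levels and hypothetical function names; it only illustrates that the sum is exact because it is carried out on integer quantized levels, and it assumes that the decoder forms the same I¹²_pred as the encoder, which holds when both use the same reconstructed frames of the first bit-stream.

```python
def encode_switching_levels(i2_rec, i12_pred):
    """Encoder side: I12_err = I2_rec - I12_pred, coefficient by coefficient."""
    return [a - b for a, b in zip(i2_rec, i12_pred)]

def decode_switching_levels(i12_err, i12_pred):
    """Decoder side: I12_rec = I12_err + I12_pred."""
    return [e + p for e, p in zip(i12_err, i12_pred)]

# Made-up quantized levels, for illustration only.
i2_rec = [12, -3, 0, 5]     # reconstructed levels of S2 in the second bit-stream
i12_pred = [9, 1, -2, 5]    # levels predicted from first-bit-stream references

i12_err = encode_switching_levels(i2_rec, i12_pred)
i12_rec = decode_switching_levels(i12_err, i12_pred)
print(i12_err, i12_rec == i2_rec)   # prints: [3, -4, 2, 0] True
```

Because no rounding takes place between the subtraction at the encoder and the addition at the decoder, the equality holds for any prediction, however poor; a poor prediction only makes I¹²_err larger and therefore more expensive to transmit.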

From the foregoing description of the encoding and decoding of S-frames according to the invention, it will be appreciated that there is a significant difference compared with the encoding and decoding of P-frames and I-frames according to the prior art. Specifically, when an image block is encoded or decoded in SP or SI format, the prediction P(x,y) for the block is transformed into the transform coefficient domain by applying a transform such as a discrete cosine transform. The transform coefficients thus produced are then quantized, and the prediction error is determined in the quantized coefficient domain. This contrasts with predictive coding according to the prior art, in which the prediction error is determined in the spatial (pixel value) domain.

The operation of the decoder 1200 during a switch between the bit-streams 510 and 520 is now described in detail. At the position in the video sequence where switching from the first bit-stream to the second bit-stream 520 takes place, the decoder 1200 has already received and decoded the preceding P-frames 511 and 512 of the first bit-stream 510. The decoded frames are stored in the frame memory 1250 and are thus available for use as reference frames. When switching from the first bit-stream 510 to the second bit-stream 520 takes place, the encoder 1100 (FIG. 11) constructs and encodes the second S-frame S₁₂ (550) and transmits the encoded video information representing S₁₂ to the decoder 1200.

As described above, encoding is performed on a block-by-block basis. Specifically, the second S-frame S₁₂ is encoded as an assembly of image blocks, and in general each image block is encoded as an SP-coded block, an SI-coded block, or an intra-block. For the SP-coded blocks of the second S-frame S₁₂, the compressed video information transmitted from the encoder to the decoder takes the form of quantized prediction error transform coefficients I¹²_err and motion vector information. For the SI-coded blocks of the second S-frame S₁₂, the compressed video information comprises quantized prediction error transform coefficients I¹²_err and information about the intra prediction mode used to form the prediction for the SI-coded block in the encoder. As described above, before transmission from the encoder, the compressed video information is subjected to appropriate variable length coding (VLC) to further reduce the number of bits required for its representation.

The compressed video information for a given image block is received at the decoder 1200, is first subjected to appropriate variable length decoding (VLD), and is separated into its constituent parts by the demultiplexer 1210. The quantized prediction error coefficients I¹²_err extracted from the received bit-stream are applied to the first input of the adder 1220, and a block of predicted pixel values P(x,y) is formed for each image block according to its coding mode (SP or SI). For an SP-coded block, the block of predicted pixel values P(x,y) is formed in the motion compensated prediction block 1260 using a reference frame from the first bit-stream (for example P-frame 511 or 512) available in the frame memory 1250 and the motion vector information extracted by the demultiplexer 1210 from the encoded video information of the second S-frame S₁₂. For an SI-coded block, the block of predicted pixel values P(x,y) is formed in the intra prediction block 1270 using previously decoded pixels of the second S-frame S₁₂, which are also stored in the frame memory 1250. Intra prediction is performed according to the intra prediction mode information extracted by the demultiplexer 1210 from the received video information for the second S-frame S₁₂.

Once a prediction has been formed for the current block of the second S-frame, the predicted pixel values P(x,y) are passed to the transform block 1290. There, a forward transform (for example a discrete cosine transform (DCT)) is applied to the predicted pixel values P(x,y) to form a set of transform coefficients c_pred. These transform coefficients are then passed to the quantization block 1295, where they are quantized to form quantized transform coefficients I¹²_pred. The quantized transform coefficients I¹²_pred are then applied to the second input of the adder 1220. The adder 1220 combines the quantized transform coefficients I¹²_pred with the quantized prediction error transform coefficients I¹²_err to form reconstructed quantized transform coefficients I¹²_rec according to the following relationship:

I¹²_rec = I¹²_pred + I¹²_err

The reconstructed quantized transform coefficients I¹²_rec are then supplied to the inverse quantization block 1230, where they are inverse quantized to form inverse quantized reconstructed transform coefficients d¹²_rec. The inverse quantized reconstructed transform coefficients d¹²_rec are then passed to the inverse transform block 1240, where an inverse transform operation (for example an inverse discrete cosine transform (IDCT)) is performed. As a result, a block of reconstructed pixel values is formed for the current block of the second S-frame S₁₂. The reconstructed pixel values I_c(x,y) are supplied to the video output and to the frame memory 1250. As subsequent blocks of the second S-frame S₁₂ are encoded, transmitted from the encoder 1100 to the decoder 1200 and then decoded, a decoded version of the second S-frame is progressively accumulated in the frame memory 1250. From the frame memory, already decoded blocks of the second S-frame can be retrieved and used by the intra prediction block 1270 to form predicted pixel values P(x,y) for subsequent blocks of the second S-frame S₁₂. It should be recalled here that the quantized prediction error transform coefficients for each image block of the second S-frame S₁₂ are generated in the encoder 1100 according to the following relationship:

I¹²_err = I²_rec - I¹²_pred

Here, I²_rec are the quantized reconstructed transform coefficient values produced by encoding and subsequently decoding the S-frame S₂ in the second bit-stream. This means that the reconstructed transform coefficients I¹²_rec produced by decoding the compressed video information for the second S-frame S₁₂ are identical to those that would have been produced if the S-frame S₂ from the second bit-stream had been transmitted and decoded. As explained above, this follows because:

I¹²_rec = I¹²_pred + I¹²_err

= I¹²_pred + I²_rec - I¹²_pred = I²_rec

Thus, I¹²_rec = I²_rec.

It can therefore be seen that by constructing the second S-frame S₁₂ according to the method of the invention, transmitting it from the encoder to the decoder and subsequently decoding it, switching between the first and second bit-streams can be achieved without mismatch.

Consider the case in which the second S-frame is an SI-frame while the S-frame within the bit-stream is an SP-frame. In this case, a frame that uses motion-compensated prediction is represented by a frame that uses only spatial prediction. This special case is relevant to random access and error resilience, which are described later.

It should be noted that in the encoder 1100 and decoder 1200 according to the first embodiment of the invention described above, the quantization applied to the transform coefficients c_pred produced in the transform blocks 1160 (encoder) and 1290 (decoder) to generate the quantized transform coefficients I_pred is the same as that used to generate the quantized prediction error transform coefficients I_err. More specifically, in the first embodiment of the invention, when a block of predicted pixel values P(x,y) is produced for an image block of an S-frame being encoded or decoded, the quantization parameter QP used to quantize the transform coefficients c_pred corresponding to the predicted block of pixel values P(x,y) must be the same as the quantization parameter used to generate the quantized prediction error transform coefficients I_err. This is advantageous because the summation performed to produce the reconstructed transform coefficients I_rec is carried out in the quantized transform coefficient domain, that is:

I_rec = I_pred + I_err

Consequently, failing to use identical quantization parameters in the construction of I_pred and I_err would lead to errors in the reconstructed quantized transform coefficients I_rec.

FIG. 15 shows a block diagram of an S-frame encoder 1500 according to a second embodiment of the invention, which provides greater flexibility in the choice of quantization parameters used to generate the quantized transform coefficients I_pred and I_err. As can be seen by comparing FIG. 15 with FIG. 11, the principal difference between the S-frame encoder 1500 according to the second embodiment of the invention and the S-frame encoder 1100 according to the first embodiment concerns the location of the quantization blocks 1525 and 1550. The operation of the S-frame encoder 1500 according to the second embodiment of the invention is described in detail below with reference to FIG. 15.

According to the second embodiment of the invention, a video frame to be encoded in S-frame format is first partitioned into blocks, and each block is then encoded as either an SP-block or an SI-block. The switch 1585 is operated as appropriate to switch between the SP and SI coding modes. In SP coding mode, the switch 1585 is operated to obtain a motion compensated prediction for the current block of the frame being encoded from the motion compensated prediction block 1575. The motion compensated prediction block 1575 forms a block of predicted pixel values P(x,y) for the current block of the frame being encoded by determining a motion vector describing the relationship between the pixels of the current block and pixel values of a reconstructed reference frame held in the frame memory 1570.

In SI coding mode, the switch 1585 is operated to obtain a prediction for the current block of the frame being encoded from the intra prediction block 1580. The intra prediction block 1580 operates in a manner analogous to that described in connection with the first embodiment of the invention, forming a block of predicted pixel values P(x,y) for the current block of the frame being encoded using spatial prediction from already encoded neighboring pixels within the frame being encoded.

In both SP and SI coding modes, a forward transform, for example a discrete cosine transform (DCT), is applied to the predicted block of pixel values P(x,y) in the transform block 1590. The resulting transform coefficients c_pred are supplied to the adders 1520 and 1540. The original image data, comprising the actual pixel values of the image block currently being encoded, are passed to the transform block 1510, where they are also subjected to a forward transform (for example DCT). The resulting transform coefficients c_orig are then passed to the adder 1520, which forms the difference between c_orig and c_pred to generate prediction error transform coefficients c_err according to the following relationship:

c_err = c_orig - c_pred

The prediction error transform coefficients are supplied to the quantization block 1525, where they are quantized using the quantization parameter PQP to form quantized prediction error transform coefficients I_err, which are then passed to the multiplexer 1540.

If the current block is being encoded in SP format, the multiplexer 1540 also receives information about the motion vectors used in forming the motion compensated prediction P(x,y) for the SP-coded block. If the current block is being encoded in SI format, information about the intra prediction mode used to form the prediction P(x,y) for the SI-coded block is also passed to the multiplexer. Preferably, the multiplexer 1540 applies appropriate variable length coding (VLC) to the quantized prediction error transform coefficients I_err and to the motion vector or intra prediction mode information, and forms a bit-stream for transmission to a corresponding decoder by multiplexing the various kinds of information together.
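The description does not say which variable length code the multiplexer applies. Purely as an illustration of how small quantized levels can be mapped to short code words, the sketch below uses signed Exp-Golomb codes as a stand-in; this choice, and every function name in it, is an assumption and not part of the described encoder.

```python
def ue_encode(value):
    """Unsigned Exp-Golomb code word (as a bit string) for a non-negative integer."""
    binary = bin(value + 1)[2:]
    return "0" * (len(binary) - 1) + binary

def se_encode(level):
    """Signed mapping 0, 1, -1, 2, -2, ... -> 0, 1, 2, 3, 4, ... then ue_encode."""
    return ue_encode(2 * level - 1 if level > 0 else -2 * level)

def se_decode_all(bits):
    """Decode a concatenation of signed Exp-Golomb code words."""
    levels, pos = [], 0
    while pos < len(bits):
        zeros = 0
        while bits[pos + zeros] == "0":      # count the leading-zero prefix
            zeros += 1
        code = int(bits[pos + zeros:pos + 2 * zeros + 1], 2) - 1
        levels.append((code + 1) // 2 if code % 2 else -(code // 2))
        pos += 2 * zeros + 1
    return levels

# Hypothetical quantized prediction error levels for one block.
i_err = [3, -4, 2, 0]
bitstring = "".join(se_encode(level) for level in i_err)
```

Levels close to zero receive the shortest code words, which is why a good prediction, and hence a small I_err, reduces the number of bits to be transmitted.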

The quantized prediction error transform coefficients I_err are passed from the quantization block 1525 to the inverse quantization block 1530, where they are inverse quantized using the quantization parameter PQP to form inverse quantized prediction error transform coefficients d_err. The inverse quantized prediction error transform coefficients d_err are then passed to the adder 1540, where they are combined with the transform coefficients c_pred produced from the predicted pixel values P(x,y) for the current block. More specifically, the adder 1540 sums the transform coefficients c_pred and the inverse quantized prediction error transform coefficients d_err to form reconstructed transform coefficients c_rec according to the following relationship:

c_rec = c_pred + d_err

The reconstructed transform coefficients c_rec are then passed to the quantization block 1550, where they are quantized using the quantization parameter SPQP to produce quantized reconstructed transform coefficients I_rec. It should be noted that the quantization parameter SPQP used to quantize the reconstructed transform coefficients is not necessarily the same as the quantization parameter PQP used to quantize the prediction error transform coefficients c_err in the quantization block 1525. In particular, a finer quantization can be applied to the reconstructed transform coefficients c_rec and a coarser quantization to the prediction error coefficients c_err. This ultimately results in a smaller reconstruction error (distortion) when the decoded image is formed in the decoder.

The quantized reconstructed transform coefficients I_rec are then supplied to the inverse quantization block 1560, where they are inverse quantized using the quantization parameter SPQP to form inverse quantized reconstructed transform coefficients d_rec. The inverse quantized reconstructed transform coefficients d_rec are then passed to the inverse transform block 1565, where an inverse transform operation, for example an inverse discrete cosine transform (IDCT), is performed. As a result of this operation, a block of reconstructed pixel values I_c(x,y) is formed for the image block in question. The block of reconstructed pixel values I_c(x,y) is subsequently stored in the frame memory 1570. As subsequent blocks of the frame being encoded in S-frame format undergo the encoding and local decoding operations described above, a decoded version of the current frame is progressively assembled in the frame memory 1570, from where it can be accessed and used in the intra prediction of subsequent blocks of the same frame or in the inter (motion compensated) prediction of subsequent frames in the video sequence.
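The chain of operations in the second-embodiment encoder, from the adder 1520 to the inverse quantization block 1560, can be sketched as below. The sketch works directly on transform coefficients and leaves out the forward and inverse transforms. It assumes uniform scalar quantizers whose step sizes `pqp_step` and `spqp_step` stand in for the quantization parameters PQP and SPQP; the mapping from a quantization parameter to a step size is not given in the description, and the function names are hypothetical.

```python
def quantize(coeffs, step):
    # Uniform scalar quantizer: an assumption made for illustration only.
    return [int(round(c / step)) for c in coeffs]

def dequantize(levels, step):
    return [lv * step for lv in levels]

def encode_block_second_embodiment(c_orig, c_pred, pqp_step, spqp_step):
    """Return (I_err, I_rec, d_rec) for one block of transform coefficients."""
    c_err = [o - p for o, p in zip(c_orig, c_pred)]    # adder 1520
    i_err = quantize(c_err, pqp_step)                  # quantization block 1525 (PQP)
    d_err = dequantize(i_err, pqp_step)                # inverse quantization block 1530
    c_rec = [p + e for p, e in zip(c_pred, d_err)]     # adder 1540
    i_rec = quantize(c_rec, spqp_step)                 # quantization block 1550 (SPQP)
    d_rec = dequantize(i_rec, spqp_step)               # inverse quantization block 1560
    return i_err, i_rec, d_rec

# Made-up coefficients; a coarser step for the prediction error than for the
# reconstruction, as the text suggests is possible.
i_err, i_rec, d_rec = encode_block_second_embodiment(
    c_orig=[100.0, -37.0, 8.0, 0.0],
    c_pred=[91.0, -31.0, 0.0, 0.0],
    pqp_step=8.0,
    spqp_step=4.0,
)
```

Only `i_err` would be transmitted; `i_rec` and `d_rec` are computed locally so that the encoder's frame memory tracks what the decoder will reconstruct.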

The operation of an S-frame decoder 1300 according to the second embodiment of the invention will now be described with reference to FIG. 13. The bit-stream generated by the S-frame encoder 1500 according to the second embodiment of the invention, described above in connection with FIG. 15, is received by the decoder 1300 and demultiplexed into its constituent parts. The decoder reconstructs a decoded version of the S-frame on a block-by-block basis. As described above, an S-frame generally comprises SP-coded image blocks and SI-coded image blocks. For SP-coded image blocks, the information in the received bit-stream comprises VLC-coded motion vector information and VLC-coded quantized prediction error transform coefficients I_err. For image blocks encoded in SI format, the information in the received bit-stream comprises VLC-coded quantized prediction error transform coefficients I_err as well as VLC-coded information about the intra prediction mode used to form the intra prediction for the SI-coded block.

When decoding an SP-coded image block, the demultiplexer 1310 first applies appropriate variable length decoding (VLD) to the received bit-stream to recover the motion vector information and the quantized prediction error coefficients I_err. The demultiplexer 1310 then separates the motion vector information from the quantized prediction error coefficients I_err. The motion vector information is supplied to the motion compensated prediction block 1370, and the quantized prediction error coefficients I_err recovered from the received bit-stream are applied to the inverse quantization block 1320. In the motion compensated prediction block 1370, the motion vector information recovered from the received bit-stream is used together with pixel values of a previously reconstructed frame held in the frame memory 1360 to form a prediction P(x,y) for the current block being decoded, in a manner analogous to that employed in the encoder 1500.

When an SI-coded image block is decoded, the demultiplexer 1310 applies appropriate variable length decoding to the received intra prediction mode information and the quantized prediction error transform coefficients I_err. The intra prediction mode information is then separated from the quantized prediction error transform coefficients I_err and supplied to the intra prediction block 1380. The quantized prediction error transform coefficients I_err are supplied to the inverse quantization block 1320. In the intra prediction block 1380, the intra prediction mode information recovered from the received bit-stream is used together with previously decoded pixel values of the current frame held in the frame memory 1360 to form a prediction P(x,y) for the current block being decoded. Again, the intra prediction process performed in the decoder 1200 is analogous to that performed in the corresponding encoder 1500 described above.

For both SP- and SI-coded image blocks, the quantized prediction error transform coefficients I_err recovered from the received bit-stream are inverse quantized in the inverse quantization block 1320 using the quantization parameter PQP to form inverse quantized prediction error transform coefficients d_err. The inverse quantized prediction error transform coefficients d_err are applied to one input of the adder 1325.

Once a prediction P(x,y) has been formed for the current block of the frame being decoded, either by motion compensated prediction in the motion compensated prediction block 1370 or by intra prediction in the intra prediction block 1380, the switch 1385 is operated as appropriate to supply the predicted pixel values P(x,y) to the transform block 1390. There a forward transform, for example a discrete cosine transform (DCT), is applied to the predicted block of pixel values P(x,y) to form transform coefficients c_pred. The transform coefficients c_pred are then supplied to the second input of the adder 1325, where they are combined with the inverse quantized prediction error transform coefficients received from the inverse quantization block 1320 to form reconstructed transform coefficients c_rec. More specifically, the reconstructed transform coefficients are determined by adding the transform coefficients c_pred and the inverse quantized prediction error transform coefficients d_err according to the following relationship:

c_rec = c_pred + d_err

The reconstructed transform coefficients c_rec are then passed to the quantization block 1330, where they are quantized using the quantization parameter SPQP to produce quantized reconstructed transform coefficients I_rec. The quantized reconstructed transform coefficients I_rec are then supplied to the inverse quantization block 1340, where they are inverse quantized using the quantization parameter SPQP to form inverse quantized reconstructed transform coefficients d_rec. The inverse quantized reconstructed transform coefficients d_rec are then passed to the inverse transform block 1350, where an inverse transform operation, for example an inverse discrete cosine transform (IDCT), is performed. As a result of the inverse transform applied in block 1350, a block of reconstructed image pixels I_c(x,y) is formed for the image block in question. The block of reconstructed pixels I_c(x,y) is supplied to the video output of the decoder and to the frame memory 1360, where the pixels are stored. As subsequent blocks of the S-frame undergo the decoding operations described above, a decoded version of the current frame is progressively assembled in the frame memory 1360. From there it can be accessed and used in the intra prediction of subsequent blocks of the same frame or in the inter (motion compensated) prediction of subsequent frames in the video sequence.
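The decoding chain described above (blocks 1320, 1390, 1325, 1330, 1340 and 1350) can be sketched for a single block as follows. This is only an illustrative sketch under several assumptions that are not in the text: the block is one-dimensional with four samples, the transform is an orthonormal DCT-II, the quantizers are plain uniform quantizers, and PQP and SPQP are treated directly as step sizes (`pqp_step`, `spqp_step`). All function names are hypothetical.

```python
import math

N = 4  # assumed block length; a 1-D block keeps the sketch short


def dct(block):
    """Orthonormal DCT-II, standing in for transform block 1390."""
    out = []
    for k in range(N):
        scale = math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
        out.append(scale * sum(x * math.cos(math.pi * (n + 0.5) * k / N)
                               for n, x in enumerate(block)))
    return out


def idct(coeffs):
    """Inverse of dct(), standing in for inverse transform block 1350."""
    out = []
    for n in range(N):
        total = 0.0
        for k, c in enumerate(coeffs):
            scale = math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
            total += scale * c * math.cos(math.pi * (n + 0.5) * k / N)
        out.append(total)
    return out


def quantize(coeffs, step):
    """Uniform quantizer (assumed); returns integer levels."""
    return [int(round(c / step)) for c in coeffs]


def dequantize(levels, step):
    """Uniform inverse quantizer (assumed)."""
    return [level * step for level in levels]


def decode_block(prediction, l_err, pqp_step, spqp_step):
    """Decode one SP- or SI-coded block from its prediction P and levels I_err."""
    d_err = dequantize(l_err, pqp_step)              # block 1320, uses PQP
    c_pred = dct(prediction)                         # block 1390
    c_rec = [p + e for p, e in zip(c_pred, d_err)]   # adder 1325: c_rec = c_pred + d_err
    l_rec = quantize(c_rec, spqp_step)               # block 1330, uses SPQP
    d_rec = dequantize(l_rec, spqp_step)             # block 1340, uses SPQP
    return idct(d_rec), l_rec                        # block 1350


# Example: a smooth prediction plus a small DC correction.
prediction = [100, 102, 104, 106]
pixels, l_rec = decode_block(prediction, l_err=[1, 0, 0, 0],
                             pqp_step=6, spqp_step=4)
```

In this sketch the reconstructed pixels depend on the prediction and on I_err only through the levels I_rec, so transforming and quantizing the output again with the same step returns the same levels.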

FIG. 16 shows an encoder according to a third embodiment of the invention. In this embodiment, the transform coefficients c_pred are quantized and inverse quantized using the same quantization parameter SPQP both in the encoder section (blocks 1625 and 1630) and in the decoder section (blocks 1692 and 1694). The encoder therefore introduces no additional quantization error into the prediction loop, and error build-up in the prediction loop is thus effectively prevented. Blocks 1610, 1620, 1625, 1630, 1640, 1650, 1660, 1665, 1670, 1675, 1680, 1685 and 1690 have functions similar to those of blocks 1510, 1520, 1525, 1530, 1540, 1550, 1560, 1565, 1570, 1575, 1580, 1585 and 1590 shown in FIG. 15, respectively.

FIG. 6 shows a decoder 600 according to a preferred embodiment of the invention. Most of the elements of the decoder 600 are the same as those of the decoder 1200 shown in FIG. 12. The operational blocks of the decoder 600 are arranged to decode the predicted blocks of frames, and no switching means is shown in FIG. 6. The other blocks 610, 615, 620, 630, 640, 650, 660 and 670 have functions similar to those of blocks 1210, 1220, 1230, 1240, 1250, 1260, 1290 and 1295 shown in FIG. 12, respectively.

FIG. 9 shows a decoder 600 according to another preferred embodiment of the invention. The decoder 600 shown in FIG. 9 is a modification of the decoder 600 shown in FIG. 6. The difference between the decoder shown in FIG. 9 and the decoder shown in FIG. 12 is that a normalization block 680 is inserted between the demultiplexer 610 and one input of the summing element 615. The other blocks 610, 615, 620, 630, 640, 650, 660 and 670 have functions similar to those of blocks 1210, 1220, 1230, 1240, 1250, 1260, 1290 and 1295 shown in FIG. 12, respectively.

FIG. 10 shows a decoder 600 according to yet another preferred embodiment of the invention. Most of the elements of the decoder 600 are the same as those of the decoder 1300 shown in FIG. 13. The operational blocks of the decoder 600 are arranged to decode the predicted blocks of frames, and no switching means is shown in FIG. 10. A further difference between the decoder shown in FIG. 10 and the decoder shown in FIG. 13 is that a normalization block 680 is used in place of the inverse quantization block 1230. The other blocks 610, 615, 620, 630, 640, 650, 660 and 670 have functions similar to those of blocks 1310, 1325, 1330, 1340, 1350, 1360, 1370 and 1390 shown in FIG. 13, respectively.

A video frame can be encoded block by block, so that differently encoded areas may exist within the same encoded video frame. For example, some parts of a frame may be inter coded while other parts of the same frame are intra coded. The procedures described above are applied to each part of the frame as appropriate, according to the encoding procedure used for that part.

In addition to the transmission network, a request to change the bit-stream transmission properties may also originate from other parts of the transmission system. For example, the receiver may for some reason request the transmitting server to change the parameters. Such a request is delivered to the transmitting server, for example, via the transmission network.

Although H.26L is used as an example of a standard, embodiments of the invention and any variations and modifications thereof are considered to be within the scope of the invention.

Bit-stream switching is not the only application in which the invention can be used. If one of the bit-streams has a low temporal resolution, for example 1 frame/sec, that bit-stream can be used to provide fast-forward functionality. Specifically, decoding from the bit-stream with the low temporal resolution and then switching to the bit-stream with the normal frame rate provides such functionality. FIG. 8 shows two bit-streams. The second bit-stream comprises only S-frames predicted from each other at intervals greater than the frame repetition interval of the first bit-stream. Furthermore, "fast forward" can start and stop at any position in the bit-stream. Some other applications of the invention are described below.

Splicing and Random Access

The bit-stream switching example described above considered bit-streams belonging to the same image sequence. However, this is not necessarily the case in every situation where bit-stream switching is needed. Examples include: switching between bit-streams arriving from different cameras that capture the same event from different viewpoints, or from cameras placed around a building for surveillance; and switching to local/national programming or inserting commercials in television broadcasts, video bridging, and the like. The general term for the process of concatenating encoded bit-streams is splicing.

When switching takes place between bit-streams that belong to different image sequences, this affects only the encoding of the S-frames used for switching between the bit-streams, i.e. the second S-frame S₁₂ in FIG. 5. Specifically, motion-compensated prediction of frames in one image sequence using reference frames from a different image sequence is not as efficient as when both bit-streams belong to the same image sequence. In this case, spatial prediction of the second S-frames is likely to be more efficient. This is illustrated in FIG. 7, where the switching frame is an SI-frame that uses only spatial prediction and reconstructs the corresponding SP-frame S₂ identically. This method can be used as a random access mechanism into a bit-stream, and it also has implications for error recovery and resilience, as described below.
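One way to see how a switching frame can reconstruct S₂ identically from a different prediction is the quantized-domain relation I_rec = I_pred + I_err given in the claims. The sketch below assumes, as an illustration only, that the encoder already knows the target levels I_rec of S₂ and derives the error levels of the switching frame as I_rec minus the quantized prediction obtained from the other reference (or from spatial prediction for an SI-frame). The uniform quantizer, the step size and the function names are hypothetical.

```python
def quantize(coeffs, step):
    """Uniform quantizer (assumed); returns integer levels."""
    return [int(round(c / step)) for c in coeffs]


def reconstruct_levels(c_pred, l_err, step):
    """Decoder side: I_rec = I_pred + I_err, with I_pred = Q(c_pred)."""
    return [p + e for p, e in zip(quantize(c_pred, step), l_err)]


def switching_error_levels(l_rec_target, c_pred_switch, step):
    """Encoder side (assumed): choose I_err so that the switching frame
    reaches the same levels I_rec from a different prediction."""
    l_pred = quantize(c_pred_switch, step)
    return [t - p for t, p in zip(l_rec_target, l_pred)]


STEP = 4

# Frame S2 as coded within the second bit-stream.
c_pred_s2 = [212.0, -4.5, 0.3, 0.0]
l_err_s2 = [1, 0, -1, 0]
l_rec_s2 = reconstruct_levels(c_pred_s2, l_err_s2, STEP)

# Switching frame: the prediction comes from another reference (or from
# spatial prediction), so the transformed prediction is different.
c_pred_switch = [180.0, 11.0, -7.0, 3.0]
l_err_switch = switching_error_levels(l_rec_s2, c_pred_switch, STEP)
l_rec_switch = reconstruct_levels(c_pred_switch, l_err_switch, STEP)
```

Because both representations arrive at the same integer levels, everything computed from those levels afterwards (inverse quantization, inverse transform, later predictions) is also the same, which is why no drift is introduced by the switch in this sketch.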

Error recovery

Multiple representations of a single frame in the form of S-frames predicted from different reference frames, for example from the immediately preceding reconstructed frame and from a reconstructed frame further back in time, can be used to increase the error resilience of an encoded video sequence and/or to improve recovery from errors in the bit-stream. This is illustrated in FIG. 14. If a packet loss occurs during streaming of a pre-encoded bit-stream and a frame or slice is lost, the receiver informs the transmitter of the lost frame/slice, and the transmitter responds by sending one of the alternative representations of the next S-frame. The alternative representation, for example frame S₁₂ in FIG. 14, uses reference frames that have already been received correctly by the receiver. With slice-based packetization and delivery, the transmitter can further estimate which slices would be affected by such a slice/frame loss and update only those slices in the next S-frame with their alternative representations.

Similarly, as mentioned in the discussion of splicing, a second representation of the S-frame can be generated without using any reference frames, i.e. as the SI₂-frame shown in FIG. 14. In this case, the transmitter sends the second SI-frame, i.e. SI₂, instead of S₂ to stop error propagation. This approach can also be extended directly to slice-based encoding/packetization. More specifically, the server sends those slices of the next S-frame that would be affected by the packet loss from the SI-frame.
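The transmitter's choice among the representations of the next S-frame can be sketched as a small decision function. This is an assumed illustration, not a procedure stated in the text: frames are identified by integer indices, the receiver's feedback is modelled as a set of correctly received frame indices, and the labels "S", "S-alt" and "SI" are hypothetical names for the primary representation (S₂), an alternative representation such as S₁₂, and the SI representation (SI₂).

```python
def choose_representation(primary_ref, alternative_refs, correctly_received):
    """Pick which representation of the next S-frame to send.

    primary_ref: index of the reference frame used by the primary S-frame.
    alternative_refs: reference frame indices of the alternative S-frame
        representations, in order of preference.
    correctly_received: set of frame indices the receiver reports as intact.
    """
    if primary_ref in correctly_received:
        return ("S", primary_ref)        # no loss affects the normal prediction
    for ref in alternative_refs:
        if ref in correctly_received:
            return ("S-alt", ref)        # predict from an older, intact frame
    return ("SI", None)                  # no usable reference: send the SI-frame


# Frame 4 was lost, but frame 2 arrived, so the alternative predicted from
# frame 2 can be sent.
decision = choose_representation(primary_ref=4, alternative_refs=[2],
                                 correctly_received={1, 2, 3})
```

The same selection could be made per slice rather than per frame when slice-based packetization is used.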

Error resilience

A video frame can be encoded block by block, so that differently encoded areas may exist within the same encoded video frame. For example, some parts of a frame may be inter coded while other parts of the same frame are intra coded. As discussed above, because intra-block coding does not use any temporal correlation, it stops any error propagation that may have been started by transmission impairments.

In lossy transport networks, an intra macroblock refresh strategy can provide significant error resilience/recovery performance. In an interactive client/server scenario, the encoder on the server side decides how to encode the frames/macroblocks either on the basis of specific feedback received from the client, for example the exact location of a lost/corrupted frame/slice/macroblock, or on the basis of expected network conditions calculated through negotiation or of measured network conditions. This kind of intra-macroblock update strategy improves the quality of the received video by providing error resilience and error recovery. The optimal intra-macroblock refresh rate, i.e. the frequency at which macroblocks are intra-coded, depends on the transport channel conditions, for example the packet loss and/or bit error rate. However, when already encoded bit-streams are sent, as is the case in typical streaming applications, this strategy cannot be applied directly. Either the sequence has to be encoded for the worst-case expected network conditions, or additional error resilience/recovery mechanisms are required.

From the earlier discussion of the use of S-frames in error recovery and splicing applications, it can be noted that S-frames, or slices within them, can readily be represented as SI-frames/slices that do not use any reference frames while still leading to identical reconstruction of the S-frame. This feature can be exploited in the adaptive intra refresh mechanism discussed above. First, the image sequence is encoded with some predefined ratio of S-macroblocks. Then, during transmission, some of the S-macroblocks are sent in their second representation, that is, as SI-macroblocks. The number of S-macroblocks to be sent in the SI representation can be calculated in a manner similar to the method used in the real-time encoding/delivery approach described above.
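The text does not give a formula for how many S-macroblocks to send in the SI representation, so the following is only a hypothetical sketch of the transmission-time step. It assumes that a target refresh ratio between 0 and 1 has already been derived from the channel conditions by some external means, and it rotates the selection from frame to frame so that every S-macroblock position is refreshed in turn. The function name, its parameters and the rotation rule are all assumptions.

```python
import math


def select_si_macroblocks(s_macroblocks, refresh_ratio, frame_index):
    """Choose which pre-encoded S-macroblocks to send as SI-macroblocks.

    s_macroblocks: positions of the S-macroblocks within the frame.
    refresh_ratio: assumed fraction (0..1) to send in the SI representation.
    frame_index: used to rotate the selection across successive frames.
    """
    total = len(s_macroblocks)
    count = min(total, math.ceil(refresh_ratio * total))
    if count == 0:
        return []
    start = (frame_index * count) % total
    return sorted(s_macroblocks[(start + i) % total] for i in range(count))


positions = [0, 3, 5, 8, 9, 12, 14, 15]
first_frame = select_si_macroblocks(positions, 0.25, frame_index=0)
second_frame = select_si_macroblocks(positions, 0.25, frame_index=1)
```

Since each S-macroblock and its SI representation reconstruct identically, the choice made here changes only the bit rate and the robustness, not the decoded picture.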

Video Redundancy Coding

S-frames have other uses in applications in which they do not act as replacements for I-frames. Video Redundancy Coding (VRC) can be given as an example. The principle of the VRC method is to divide a sequence of pictures into two or more threads in such a way that all pictures in the sequence are assigned to one of the threads in a round-robin fashion. At regular intervals, all threads converge into a so-called sync frame. From this sync frame, a new thread series is started. If one of the threads is damaged, for example because of a packet loss, the remaining threads typically stay intact and can be used to predict the next sync frame. It is possible either to continue decoding the damaged thread, which leads to a slight degradation in picture quality, or to stop decoding it, which leads to a drop in frame rate. Sync frames are always predicted from one of the undamaged threads. This means that the number of transmitted I-frames can be kept small, because there is no need for complete re-synchronization. When more than one representation (P-frame) is sent for a sync frame, each uses a reference frame from a different thread. Because P-frames are used, these representations are not identical. Mismatch is therefore introduced when some of the representations cannot be decoded and their counterparts are used in decoding the following threads. Using S-frames as sync frames eliminates this problem.
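The round-robin thread structure of VRC can be sketched as follows. The sketch makes assumptions beyond the text: pictures are indexed from zero, every `sync_interval`-th picture is treated as a sync frame that belongs to no thread, and the round-robin assignment restarts after each sync frame. The function names are hypothetical.

```python
def assign_threads(num_pictures, num_threads, sync_interval):
    """Label each picture as 'sync' or with its round-robin thread index."""
    labels = []
    next_thread = 0
    for i in range(num_pictures):
        if i % sync_interval == 0:
            labels.append("sync")      # all threads converge here
            next_thread = 0            # a new thread series starts
        else:
            labels.append(next_thread)
            next_thread = (next_thread + 1) % num_threads
    return labels


def sync_reference_thread(damaged_threads, num_threads):
    """Pick an undamaged thread from which to predict the next sync frame."""
    for thread in range(num_threads):
        if thread not in damaged_threads:
            return thread
    return None                        # every thread is damaged


labels = assign_threads(num_pictures=7, num_threads=2, sync_interval=5)
reference = sync_reference_thread(damaged_threads={0}, num_threads=2)
```

With P-frames as sync frames, the picture reconstructed at a sync position depends on which thread `sync_reference_thread` selects; with S-frames, each representation reconstructs identically, so the choice does not introduce mismatch.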

It is obvious that the present invention is not limited to the embodiments described above, but can be modified within the scope of the appended claims.

Claims

A method for transmitting video information, in which at least a first bit-stream (510) and a second bit-stream (520) are formed from the video information, the first bit-stream (510) comprising a first set of frames comprising at least one video frame, and the second bit-stream (520) comprising a second set of frames comprising at least one predictive video frame (524),

at least partly different encoding parameters are used for encoding the frames of the first bit-stream (510) and the second bit-stream (520),

at least one frame of the first bit-stream (510) is transmitted,

and the transmission of the video information is switched from the first bit-stream (510) to the second bit-stream (520),

wherein, when the transmission is switched from the first bit-stream (510) to the second bit-stream (520), the second bit-stream (520) comprises at least one first switching frame (523) in the second bit-stream, a second switching frame (550) is transmitted, the second switching frame (550) is encoded using at least one reference frame from the first bit-stream (510) and the encoding parameters of the second bit-stream (520), and the second switching frame (550) is used in place of the first switching frame (523) in the second bit-stream as a reference frame in the reconstruction of the at least one predictive video frame (524) of the second set of video frames.

2. The method of claim 1, wherein the first bit-stream (510) comprises at least one first switching frame (513) in the first bit-stream.

The method of claim 1, wherein the first bit-stream (510) comprises one intra frame and one second switching frame (550) for performing a transition from one position of the video information to another position of the video information.

3. The method according to claim 1 or 2, wherein the first bit-stream (510) comprises only intra frames and first switching frames (513), for performing a fast-forward operation on the video information.

The method according to claim 1 or 2, wherein the first switching frame (523) in the second bit-stream or the second switching frame (550) is a predictive video frame whose prediction information comprises only intra prediction information.

The method according to claim 2, wherein the first switching frame (513) in the first bit-stream is formed such that transform coefficients c_pred are calculated and quantized to form quantized values I_pred of the transform coefficients, quantized prediction error coefficients I_err are defined, and reconstructed quantized transform coefficients I_rec are defined such that there exist transform coefficients c_rec from which the reconstructed quantized transform coefficients I_rec can be obtained by quantization, the reconstructed quantized transform coefficients I_rec satisfying one of the following conditions:

I_rec = I_pred + I_err, or

c_rec = c_pred + d_err,

where d_err denotes the dequantized values of the prediction error.

7. The method of claim 6, wherein the same quantization parameters are used for the quantized prediction error coefficients (I_err).

7. The method of claim 6, wherein different quantization parameters are used for quantization of the transform coefficients c_rec and for quantization of the prediction error.

A method for transmitting video information, in which at least one bit-stream (510) is formed from the video information, the bit-stream (510) comprising a first set of frames comprising at least one video frame, and a second set of frames comprising at least one predictive video frame (524),

at least partly different encoding parameters are used for encoding the frames of the first set of frames and the frames of the second set of frames,

at least one frame of the bit-stream (510) is transmitted,

and the transmission is switched from the first set of frames to the second set of frames,

wherein, when the transmission is switched from the first set of frames to the second set of frames, the second set of frames comprises at least one first switching frame (523) in the second bit-stream, a second switching frame (550) is transmitted, the second switching frame (550) is encoded using at least one reference frame from the first set of frames and the encoding parameters of the second set of frames, and the second switching frame (550) is used in place of the first switching frame (523) in the second bit-stream as a reference frame in the reconstruction of the at least one predictive video frame (524) of the second set of video frames,

and wherein the second switching frame (550) is used for recovering from transmission errors, the second switching frame (550) is a predictive video frame, and its prediction information comprises prediction information from video frames earlier than the frame immediately preceding the predictive video frame.

At least one frame of the bit-stream 510 is transmitted,

The second switching frame (550) is used for recovering from transmission errors, the second switching frame (550) is a predictive video frame, and its prediction information comprises only intra prediction information.

11. The method of claim 10, wherein the first switching frame (523) in the at least one second bit-stream and the second switching frame (550) both produce the same reconstruction result for the at least one predictive video frame (524).

11. The method of claim 10, wherein the first switching frame (523) in the at least one second bit-stream and the second switching frame (550) both have the same reconstructed values.

An encoder comprising:

means for forming at least a first bit-stream (510) and a second bit-stream (520) from video information, the first bit-stream (510) comprising a first set of frames comprising at least one video frame, and the second bit-stream (520) comprising a second set of frames comprising at least one predictive video frame (524);

means for using at least partly different encoding parameters for encoding the frames of the first bit-stream (510) and the second bit-stream (520);

means for transmitting at least one frame of the first bit-stream (510); and

means for switching the transmission from the first bit-stream (510) to the second bit-stream (520),

wherein the means for switching the transmission comprises means for encoding a second switching frame (550) using reference frames from the first bit-stream (510) and the encoding parameters of the second bit-stream (520), so that the transmission can be switched from the first bit-stream (510) to the second bit-stream (520).

14. An encoder according to claim 13, comprising means (1670, 1675) for generating prediction information using the reference frames, and means (1692, 1694) for performing quantization and inverse quantization on the prediction information.

14. An encoder according to claim 13, comprising means (1670, 1675) for generating prediction information using the reference frames, and means (1690) for transforming the prediction information.

A decoder for decoding video information from a signal,

the signal comprising frames from at least a first bit-stream (510) and a second bit-stream (520) formed from the video information, the first bit-stream (510) comprising a first set of frames comprising at least one video frame, and the second bit-stream (520) comprising a second set of frames comprising at least one predictive video frame (524),

at least partly different encoding parameters being used for encoding the frames of the first bit-stream (510) and the second bit-stream (520),

wherein the decoder comprises means for decoding a second switching frame (550),

the second switching frame (550) having been encoded using at least one reference frame from the first bit-stream (510) and the encoding parameters of the second bit-stream (520), and having been added to the signal, in place of a first switching frame (523) in the second bit-stream (520), as a reference frame to be used in the reconstruction of the at least one predictive video frame (524) of the second set of video frames,

and the means for decoding the second switching frame (550) comprises means for using reference frames from the first bit-stream (510) and decoding parameters of the second bit-stream (520).

17. The decoder of claim 16, wherein the first switching frame (513) in the first bit-stream comprises regions encoded by intra prediction using only spatial correlation and regions encoded by inter prediction using motion compensation, and the decoder comprises:

means for using motion compensation information in the reconstruction;

means for using spatial correlation information in the reconstruction; and

switching means for performing the reconstruction of each region either by the means for using motion compensation information or by the means for using spatial correlation information, depending on the prediction method used in the encoding of that region.

A computer-readable recording medium storing signal data that represents encoded video information and includes frames from at least a first bit-stream 510 and a second bit-stream 520 formed from the video information,

the first bit-stream comprising a first set of frames including at least one video frame, and the second bit-stream 520 comprising a second set of frames including at least one predictive video frame 524,

wherein at least partially different encoding parameters are used for encoding the frames of the first bit-stream 510 and the second bit-stream 520,

characterized in that, when the transmission is exchanged from the first bit-stream 510 to the second bit-stream 520, the second bit-stream 520 comprises at least one first exchange frame 523,

the signal data comprises a second exchange frame 550 encoded using at least one reference frame from the first bit-stream 510 and the encoding parameters of the second bit-stream 520, and

the second exchange frame 550 is used, in place of the first exchange frame 523 in the second bit-stream 520, as a reference frame for the reconstruction of the at least one predictive video frame 524 of the second set of video frames.
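
The substitution of frame 550 for frame 523 at the switching point can be sketched as a simple frame schedule. This is only an illustration: the function name `transmission_order`, the list representation of a bit-stream, and every frame label other than 513, 523, 524 and 550 are hypothetical placeholders.

```python
from typing import List


def transmission_order(
    stream1: List[str],
    stream2: List[str],
    switch_index: int,
    second_exchange_frame: str,
) -> List[str]:
    """Frames sent when switching from stream1 to stream2 at switch_index.

    Before the switching point the frames come from stream1. At the switching
    point the second exchange frame is sent in place of the first exchange
    frame of stream2. After it, the frames come from stream2.
    """
    return (
        stream1[:switch_index]
        + [second_exchange_frame]
        + stream2[switch_index + 1:]
    )


# "1a", "1b", "1d", "2a", "2b" are placeholder labels for ordinary frames.
first_stream = ["1a", "1b", "513", "1d"]
second_stream = ["2a", "2b", "523", "524"]
print(transmission_order(first_stream, second_stream, 2, "550"))
```

In this schedule frame 524 is decoded with frame 550 as its reference, although frame 550 was itself predicted from frames of the first bit-stream.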

A decoder for decoding video information from a signal,

the signal comprising at least one video block and at least one predictive video block, the predictive video block being predicted from the at least one video block,

the decoder comprising a memory for storing reference information about previously decoded blocks, and a predictor for forming a predicted block using the reference information stored in the memory,

characterized in that the decoder further comprises a transformer for transforming the predicted block to form a transformed predicted block, and

an adder for adding the transformed predicted block to information representing a current block, to obtain summed information for use in decoding the current block.

20. The decoder of claim 19, further comprising an inverse quantizer and an inverse transformer for inversely quantizing and inversely transforming the current block after the addition.

21. The decoder of claim 19 or 20, comprising a quantizer for quantizing the transformed predicted block prior to the addition.

22. The decoder of claim 21, wherein the information indicative of the current block is obtained by transforming and quantizing at least a video block, wherein the transform and quantization are the same as the transform and quantization used for the predicted block.
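
The decoding path of claims 19 to 22 (transform the predicted block, quantize it, add it to the received information, then inverse quantize and inverse transform) can be sketched as follows. The 4-point Hadamard transform, the step size `STEP`, the rounding rule, and the encoder counterpart `encode_block` are assumptions chosen to keep the example small; the claims do not prescribe a particular transform or quantizer.

```python
from typing import List

# Symmetric 4-point Hadamard matrix; H4 * H4 = 4 * I (assumed transform).
H4 = [
    [1, 1, 1, 1],
    [1, 1, -1, -1],
    [1, -1, -1, 1],
    [1, -1, 1, -1],
]
STEP = 8  # hypothetical quantizer step size


def transform(block: List[int]) -> List[int]:
    return [sum(h * x for h, x in zip(row, block)) for row in H4]


def inverse_transform(coeffs: List[int]) -> List[int]:
    # Divide by 4 with rounding, because H4 * H4 = 4 * I.
    return [(sum(h * c for h, c in zip(row, coeffs)) + 2) // 4 for row in H4]


def quantize(coeffs: List[int]) -> List[int]:
    return [(c + STEP // 2) // STEP for c in coeffs]


def dequantize(levels: List[int]) -> List[int]:
    return [level * STEP for level in levels]


def encode_block(original: List[int], predicted: List[int]) -> List[int]:
    """Assumed encoder: difference of quantized levels in the transform domain."""
    orig_levels = quantize(transform(original))
    pred_levels = quantize(transform(predicted))
    return [o - p for o, p in zip(orig_levels, pred_levels)]


def decode_block(received: List[int], predicted: List[int]) -> List[int]:
    """Transform and quantize the prediction, add, then dequantize and invert."""
    pred_levels = quantize(transform(predicted))
    summed = [r + p for r, p in zip(received, pred_levels)]
    return inverse_transform(dequantize(summed))


original = [52, 55, 61, 66]
prediction_a = [50, 54, 60, 70]  # e.g. predicted from a first-stream reference
prediction_b = [40, 60, 58, 63]  # e.g. predicted from a second-stream reference
rec_a = decode_block(encode_block(original, prediction_a), prediction_a)
rec_b = decode_block(encode_block(original, prediction_b), prediction_b)
print(rec_a, rec_b)
```

Because the addition takes place on quantized transform coefficients, both predictions lead to the same reconstructed block in this sketch, which is the property that lets an exchange frame predicted from another bit-stream serve as a drift-free reference.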

23. The decoder of claim 19 or 20, wherein the information indicative of the current block is obtained by transforming and quantizing at least one video block, and the decoder includes an inverse quantizer for dequantizing the information indicative of the current block prior to the addition.

24. The decoder of claim 19 or 20, wherein the information representing the current block is obtained by transforming and quantizing at least one video block, and the decoder includes a normalization block that scales the information representing the current block prior to the addition.

25. The decoder of claim 23, comprising a quantizer, an inverse quantizer, and an inverse transformer for quantizing, inverse quantizing, and inverse transforming the current block after the addition.

26. The decoder of claim 19, wherein transform basis functions are used for the transform.