KR20050031460A

KR20050031460A - Method and apparatus for performing multiple description motion compensation using hybrid predictive codes

Info

Publication number: KR20050031460A
Application number: KR1020057001444A
Authority: KR
Inventors: Mihaela van der Schaar; Deepak S. Turaga
Original assignee: Koninklijke Philips Electronics N.V.
Priority date: 2002-07-31
Filing date: 2003-07-24
Publication date: 2005-04-06
Also published as: EP1527607A1; WO2004014083A1; AU2003249461A1; JP2005535219A; CN1672421A

Abstract

An improved multiple description coding (MDC) method and apparatus is provided which extends multi-description motion compensation (MDMC) by allowing for multi-frame prediction and is not limited to only I and P frames. Further, the coding method of the invention extends MDMC for use with any conventional predictive codec, such as, for example, MPEG2/4 and H.26L. The improved MDC permits the use of any conventional predictive coder for use as a top and bottom predictive encoder. Further, the top and bottom predictive coders can advantageously include B-frames and multiple prediction motion compensation. Still further, any of the top, middle and bottom predictive encoders can be a scalable encoder (e.g., FGS-like or data-partitioning like where the motion vectors (MVs) are sent first, temporal scalability etc.).

Description

METHOD AND APPARATUS FOR PERFORMING MULTIPLE DESCRIPTION MOTION COMPENSATION USING HYBRID PREDICTIVE CODES

The present invention relates generally to multiple description coding (MDC) of data, speech, audio, images, video and other types of signals for transmission over a network or other type of communication medium.

Much of the information flowing through today's networks remains useful even in a degraded condition. Examples include speech, audio, still images and video. When such information is subject to packet loss, retransmission may be impossible because of real-time constraints. Good performance in terms of total transmission rate, distortion and delay can often be achieved by adding redundancy to the bit stream rather than by retransmitting lost packets.

One way to add redundancy to the bit stream is multiple description coding (MDC), in which the data is split into several streams with some redundancy among them. When all streams are received, low distortion can be guaranteed at the cost of a slightly higher bit rate than a system designed purely for compression. When only some of the streams are received, on the other hand, the quality of the reconstruction degrades gracefully, which is very unlikely to happen with a system designed purely for compression. Unlike multi-resolution or layered source coding, there is no hierarchy among the descriptions, so multiple description coding is well suited to erasure channels and to packet networks that make no provision for prioritization.

Multiple description coding can be implemented in a number of ways. One approach is to split the incoming video stream into sub-streams for separate channels by separately gathering the odd and even frames at the encoder and independently coding the resulting temporally sub-sampled sequences. Upon receiving one of the sub-sampled sequences, the decoder can decode the video stream at half the frame rate. Because of the correlated nature of video, receiving only one of the sub-sampled sequences still allows the intermediate frames to be recovered using motion-compensated error concealment techniques. This technique is described in considerable detail in Wenger et al., "Error Resilience Support in H.263+," IEEE Transactions on Circuits and Systems for Video Technology, pp. 867-877, November 1998.

To achieve error resilience, Wang and Lin, "Error Resilient Video Coding Using Multiple Description Motion Compensation," IEEE Transactions on Circuits and Systems for Video Technology, vol. 12, pp. 4348-4352, June 2002, describes one way of implementing multiple description coding. In this approach, the temporal predictor lets the encoder use both past even and past odd frames during encoding, which creates a mismatch between the encoder and the decoder when only one description is received at the decoder. To overcome this problem, the mismatch error is explicitly encoded. The main advantage of letting the encoder use both odd and even frames for prediction is coding efficiency. The amount of redundancy can be controlled by changing the temporal filter taps. The disclosed method provides reasonable flexibility in trading off the amount of redundancy against error resilience.
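By way of illustration only, the role of the temporal filter taps can be sketched as follows. The sketch predicts a frame from the two previous frames with a weight a1 on frame n-1 and (1 - a1) on frame n-2, and compares that with a prediction from frame n-2 alone, which is all a decoder holding a single description would have. Motion compensation is omitted, frames are flat lists of pixel values, and the two-tap form and all function names are assumptions made for this example rather than the exact formulation of Wang and Lin.

```python
def central_prediction(prev1, prev2, a1):
    """Two-tap temporal prediction: a1 * f(n-1) + (1 - a1) * f(n-2)."""
    return [a1 * p1 + (1.0 - a1) * p2 for p1, p2 in zip(prev1, prev2)]


def side_prediction(prev2):
    """Prediction from f(n-2) only (hypothetical single-description case)."""
    return list(prev2)


def mismatch(prev1, prev2, a1):
    """Pixel-wise difference between the central and the side prediction."""
    central = central_prediction(prev1, prev2, a1)
    side = side_prediction(prev2)
    return [c - s for c, s in zip(central, side)]


if __name__ == "__main__":
    f_n1 = [10, 20]   # frame n-1 (toy pixel values)
    f_n2 = [6, 12]    # frame n-2
    print(mismatch(f_n1, f_n2, 0.5))
    print(mismatch(f_n1, f_n2, 0.0))
```

In this toy model, setting a1 to 0 makes the two predictions identical, so there is no mismatch to encode, whereas a larger a1 leans on the most recent frame and produces a larger mismatch signal.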

One drawback of the Wang and Lin approach is that it is limited to I and P frames only (no B-frames). Another drawback is that it does not allow multi-frame prediction such as that used in H.26L. These drawbacks limit the coding efficiency of MDMC and also require a fully proprietary implementation instead of allowing available codec modules to be used.

FIG. 1 illustrates an MDMC encoder in accordance with an embodiment of the present invention.

The present invention provides an improved multiple description coding (MDC) method and apparatus that overcome the aforementioned drawbacks. In particular, the coding method of the invention extends multiple description motion compensation (MDMC) by allowing multi-frame prediction, and is not limited to I and P frames. The coding method of the invention also extends MDMC for use with any conventional predictive codec, such as MPEG2/4 and H.26L.

According to a first aspect of the invention, there is provided an improved MDMC encoder comprising three predictive coders: a top, a middle (central) and a bottom coder. The input frames are supplied to the encoder as three separate inputs. The input frames are supplied to the central encoder. The input frames are also split into two sub-streams of frames, the first sub-stream containing only the odd frames and the second sub-stream containing only the even frames. The first sub-stream is provided as input to the top encoder, which encodes it to produce an encoded odd frame sequence, and the second sub-stream is provided as input to the bottom encoder, which encodes it to produce an encoded even frame sequence. It is noted that other embodiments may split the frames according to different criteria, such as an unbalanced split in which two out of every three frames are encoded by the top encoder and every third frame is encoded by the bottom encoder. The original, unsplit input stream of frames is applied to the central encoder, which computes a prediction of the odd frames from the even frames. In addition, the central encoder separately computes a prediction of the even frames from the odd frames. Prediction remainders (residuals) are then computed between the central encoder and the first and second side encoders, respectively. The MDMC encoder of the invention outputs the first computed prediction remainder, corresponding to the prediction of the even frames, together with the output of the top encoder, and outputs the second computed prediction remainder, corresponding to the prediction of the odd frames, together with the output of the bottom encoder.
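The balanced (odd/even) split and the unbalanced two-of-three split mentioned above can be sketched with one small helper. This is a minimal illustration under stated assumptions: the function name and the idea of describing a split by a repeating assignment pattern are hypothetical and are not taken from the disclosure.

```python
def split_frames(frames, pattern=(0, 1)):
    """Assign frame i to sub-stream pattern[i % len(pattern)].

    pattern (0, 1)    -> alternate frames between two sub-streams
    pattern (0, 0, 1) -> two of every three frames go to sub-stream 0
    """
    n_streams = max(pattern) + 1
    streams = [[] for _ in range(n_streams)]
    for i, frame in enumerate(frames):
        streams[pattern[i % len(pattern)]].append(frame)
    return streams


if __name__ == "__main__":
    frames = [1, 2, 3, 4, 5, 6]               # frame numbers, counted from 1
    print(split_frames(frames))               # odd frames / even frames
    print(split_frames(frames, (0, 0, 1)))    # unbalanced two-of-three split
```

With frames numbered from 1, the default pattern sends the odd frames to the first sub-stream (top encoder) and the even frames to the second (bottom encoder).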

According to a second aspect of the invention, there is provided a method of encoding a video signal representing a sequence of frames, the method comprising: splitting the sequence of frames into a first sub-sequence and a second sub-sequence; applying the first sub-sequence to a first side encoder; applying the second sub-sequence to a second side encoder; applying the original, unsplit sequence of frames to a central encoder; computing a first prediction remainder between the output of the first side encoder and the central encoder; computing a second prediction remainder between the output of the second side encoder and the central encoder; combining the first prediction remainder and the output of the first side encoder into a first data sub-stream; combining the second prediction remainder and the output of the second side encoder into a second data sub-stream; and transmitting the first and second data sub-streams separately.

Advantages of the present invention include the following:

(1) Any conventional predictive coder can be used for the top and bottom encoders. Moreover, the top and bottom predictive coders can advantageously include B-frames and multiple-prediction motion compensation.

(2) Any of the top, middle and bottom predictive encoders can be a scalable encoder (e.g., FGS-like, or data-partitioning-like where the motion vectors (MVs) are sent first, temporal scalability, etc.). For example, if only the middle encoder is scalable, the middle encoder sends only as much information as the channel allows. In the extreme case where the available bandwidth is determined to be very low, only the information encoded by the side coders is transmitted. As additional bandwidth becomes available, as much of the mismatch signal as the channel allows is transmitted using the scalable middle encoder.
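The bandwidth-dependent behaviour described in item (2) can be sketched as a simple budget allocation. The sketch assumes, purely for illustration, that the mismatch signal from the scalable middle encoder is available as an ordered list of layers with known sizes in bits; the function name, the layer model and the numbers are hypothetical.

```python
def select_payload(side_bits, mismatch_layers, budget_bits):
    """Return (bits_sent, layers_sent) for one transmission interval.

    The side-coder information is always sent. Mismatch layers from the
    scalable middle encoder are added in order while they still fit in
    the channel budget.
    """
    sent = side_bits
    layers = 0
    for layer_bits in mismatch_layers:
        if sent + layer_bits > budget_bits:
            break
        sent += layer_bits
        layers += 1
    return sent, layers


if __name__ == "__main__":
    print(select_payload(1000, [300, 300, 300], 1700))  # two layers fit
    print(select_payload(1000, [300], 900))             # side information only
```

When the budget is very low, only the side-coder information is transmitted, which corresponds to the extreme case mentioned above.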

(3) To limit the complexity of the system, the prediction of the current even/odd frame from the odd/even frame sequence, used to determine the mismatch signal, can be made as a B-frame.

(4) Instead of computing and coding the side prediction error (i.e., the error between even frames and odd frames for the side coder) and additionally computing and coding the mismatch between the side prediction error and the central error (i.e., the error between the current frame and its prediction from the two previous frames), as in the conventional scheme, the central error is alternatively computed.

Reference is now made to the drawings, in which like reference numerals denote corresponding parts throughout.

Multiple description coding (MDC) refers to a form of compression in which the goal is to code an input signal into a number of separate bit streams, often referred to as multiple descriptions. These separate bit streams have the property that each is decodable independently of the others. In particular, if a decoder receives any single bit stream, it can decode that bit stream to produce a useful signal (without requiring access to any of the remaining bit streams). MDC has the additional property that the quality of the decoded signal improves as more bit streams are correctly received. For example, assume that a video is coded with MDC into a total of N streams. As long as the decoder receives any one of these N streams, it can decode a useful version of the video. If the decoder receives two streams, it can decode an improved version of the video compared with the case of receiving only one stream. This improvement in quality continues until the receiver receives all N streams, at which point it can reconstruct the maximum quality.

There are many different approaches to achieving MDC coding of video. One is to code different frames independently into different streams. For example, each frame of a video sequence may be coded as a single frame (independently of the other frames) using only intra-frame coding, e.g., JPEG or JPEG-2000, or using any video coding standard (e.g., MPEG-1/2/4, H.261/3) restricted to I-frame encoding. Different frames can then be sent in different streams. For example, all even frames can be sent in stream 1 and all odd frames in stream 2. Since each frame is decodable independently of the other frames, each bit stream is also decodable independently of the other bit streams. This simple form of MDC video coding has the properties described above, but it is not very efficient in terms of compression because it lacks inter-frame coding.
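The simple frame-interleaved scheme above, and the way quality improves as more streams arrive, can be illustrated with the following sketch. Frames are treated as opaque, independently decodable items, and a missing frame is concealed by repeating the most recent received frame. The frame-repetition concealment and the function names are assumptions chosen for brevity, not part of the disclosure.

```python
def split_round_robin(frames, n):
    """Send frame i to stream i % n."""
    return [frames[k::n] for k in range(n)]


def reconstruct(streams, received, total):
    """Rebuild `total` frames from the streams whose indices are in `received`.

    A frame belonging to a lost stream is replaced by the most recent
    received frame (None if nothing has been received yet).
    """
    n = len(streams)
    out = []
    last = None
    for i in range(total):
        k = i % n
        if k in received:
            last = streams[k][i // n]
        out.append(last)
    return out


if __name__ == "__main__":
    frames = list("abcdef")
    streams = split_round_robin(frames, 2)
    print(reconstruct(streams, {0, 1}, 6))  # both streams: full sequence
    print(reconstruct(streams, {0}, 6))     # one stream: half frame rate
```

Receiving a single stream still yields a usable sequence at a reduced effective frame rate, while receiving every stream restores the original sequence.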

Before describing FIG. 1 in detail, some definitions are recalled concerning the hierarchical arrangement of pixels in a digitized picture and the prediction strategies used in the MPEG2 standard. Luminance and chrominance samples (pixels) are each grouped into blocks consisting of an 8×8 matrix (eight rows of eight pixels each); a certain number of luminance and chrominance blocks (e.g., four blocks of luminance data and two corresponding blocks of chrominance data) form a macroblock; a digitized picture then comprises a matrix of macroblocks whose size depends on the selected profile (i.e., on the resolution) and on the power supply frequency: for example, for a 50 Hz power supply, the size can range from a minimum of 18×32 macroblocks to a maximum of 72×120 macroblocks. A picture can have either a frame structure (pixels in successive rows belong to different fields) or a field structure (all pixels belong to the same field). As a result, macroblocks can also have a frame or a field structure. Pictures are in turn organized into groups of pictures, in which the first picture is always an I picture, followed by a number of B pictures (bidirectionally interpolated pictures, which undergo forward prediction, backward prediction, or both, where "forward" means that the prediction is based on a previous reference picture and "backward" means that the prediction is based on a future reference picture); the P picture used for the prediction of those B pictures is then encoded immediately after the I picture.
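The block and macroblock arithmetic above can be checked with a few lines of code. The sketch assumes that the four 8×8 luminance blocks of a macroblock are arranged 2×2, so that a macroblock covers 16×16 luminance pixels; that arrangement and the helper names are assumptions made for the example.

```python
BLOCK = 8      # a block is an 8x8 matrix of samples
MB_SIZE = 16   # assumed: 2x2 luminance blocks -> 16x16 luminance pixels


def picture_size(mb_rows, mb_cols):
    """Luminance picture size (rows, columns) covered by a macroblock matrix."""
    return mb_rows * MB_SIZE, mb_cols * MB_SIZE


def samples_per_macroblock(luma_blocks=4, chroma_blocks=2):
    """Total samples in a macroblock made of 8x8 blocks."""
    return (luma_blocks + chroma_blocks) * BLOCK * BLOCK


if __name__ == "__main__":
    print(picture_size(18, 32))      # smallest macroblock matrix in the text
    print(picture_size(72, 120))     # largest macroblock matrix in the text
    print(samples_per_macroblock())  # 4 luminance + 2 chrominance blocks
```

Under these assumptions the 18×32 and 72×120 macroblock matrices correspond to 288×512 and 1152×1920 luminance pixels (rows × columns), respectively.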

Referring now to FIG. 1, a source (not shown) supplies an encoder 200 with a sequence of frames 201 (i.e., having a frame structure) already arranged in coding order, that is, the order in which reference pictures are made available before the pictures that use them for prediction. The full frame sequence 201 is received by a motion estimation unit (not shown), which computes and outputs one or more motion vectors for each macroblock in the picture being coded, together with a cost or error associated with each vector. The encoder 200 comprises a first side encoder 202 (side encoder 1), a central encoder 204 and a second side encoder 206. The full frame sequence 201 is applied in its entirety to the central encoder 204. A first subset 210 of the full frame sequence 201, which in this embodiment consists of the odd frames of the full frame sequence 201, is applied to the first side encoder 202. A second subset 220 of the full frame sequence 201, which in this embodiment consists of the even frames of the full frame sequence 201, is applied to the second side encoder 206.
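The notion of coding order, in which a reference picture precedes the pictures predicted from it, can be illustrated by reordering a group of pictures from display order. The sketch assumes the common arrangement in which each run of B pictures is predicted from the reference pictures immediately before and after it; the function name and the picture labels are hypothetical.

```python
def to_coding_order(display):
    """Reorder pictures so each I/P reference precedes the B pictures
    that come before it in display order.

    `display` is a list of labels such as "I0", "B1", "P3", where the
    first character is the picture type.
    """
    coded = []
    pending_b = []
    for pic in display:
        if pic[0] == "B":
            pending_b.append(pic)     # wait for the next reference picture
        else:
            coded.append(pic)         # reference picture goes first
            coded.extend(pending_b)   # then the B pictures that depend on it
            pending_b = []
    coded.extend(pending_b)           # trailing B pictures, if any
    return coded


if __name__ == "__main__":
    gop = ["I0", "B1", "B2", "P3", "B4", "B5", "P6"]
    print(to_coding_order(gop))
```

For the display order I0 B1 B2 P3 B4 B5 P6 the sketch yields I0 P3 B1 B2 P6 B4 B5, so the P picture needed by B1 and B2 is coded immediately after the I picture.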

The predictive encoding operation is now summarized.

A. First Side Encoder 202

The odd frame sub-sequence 210, which comprises a subset of the input sequence 201, is applied to the first side encoder 202. It should be noted that the first side encoder 202 can advantageously be implemented as any conventional predictive codec (e.g., MPEG-1/2/4, H.261/3). The odd frame sub-sequence 210 is encoded by the first side encoder 202, which outputs an encoded odd frame sub-sequence 211. The encoded odd frame sub-sequence 211 is included as one component of the first data sub-stream 245 to be output. The encoded odd frame sub-sequence 211 may also be supplied as an input to central encoder sub-module 230, described below.

B. Second Side Encoder 206

The even frame sub-sequence 220, which comprises a subset of the input sequence 201, is applied to the second side encoder 206. It should be noted that the second side encoder 206, like the first side encoder 202, can advantageously be implemented as any conventional predictive codec (e.g., MPEG-1/2/4, H.261/3). The even frame sub-sequence 220 is encoded by the second side encoder 206, which outputs an encoded even frame sub-sequence 212. The encoded even frame sub-sequence 212 is included as one component of the second data sub-stream 255 to be output. The encoded even frame sub-sequence 212 is also supplied as an input to central encoder sub-module 232, described below.

C. Central Encoder 204

The full frame sequence 201 is applied to the central encoder 204.

Central encoder sub-module 250 computes a first set of motion vectors 214 and also computes and encodes an even frame prediction sequence 215, which constitutes a prediction of the even frames from the odd frames of the input sequence 201. Central encoder sub-module 250 outputs the even frame prediction sequence 215 and the first set of motion vectors 214, both of which are supplied as inputs to central encoder sub-module 230.

Central encoder sub-module 260 computes a second set of motion vectors 216 and also computes and encodes an odd frame prediction sequence 217, which constitutes a prediction of the odd frames from the even frames of the input sequence 201. Central encoder sub-module 260 outputs the odd frame prediction sequence 217 and the second set of motion vectors 216, both of which are supplied as inputs to central encoder sub-module 232.
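The disclosure does not specify how sub-modules 250 and 260 compute their motion vectors. One widely used technique is exhaustive block matching with a sum-of-absolute-differences (SAD) cost, sketched below for a single block. The choice of block matching, the function names and the tiny search range are assumptions made for illustration only.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between the block at (bx, by) in `cur`
    and the block displaced by (dx, dy) in `ref`."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - ref[by + y + dy][bx + x + dx])
    return total


def best_motion_vector(cur, ref, bx, by, size, search):
    """Exhaustive search over displacements in [-search, search].

    Returns (cost, dx, dy) for the lowest-cost displacement that keeps
    the reference block inside the picture.
    """
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            inside = (0 <= by + dy and by + dy + size <= height
                      and 0 <= bx + dx and bx + dx + size <= width)
            if not inside:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, size)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best


if __name__ == "__main__":
    ref = [[y * 10 + x for x in range(6)] for y in range(6)]
    cur = [[0] * 6 for _ in range(6)]
    for y in (1, 2):
        for x in (1, 2):
            cur[y][x] = ref[y + 1][x + 1]   # block content moved by (1, 1)
    print(best_motion_vector(cur, ref, 1, 1, 2, 1))
```

The returned cost plays the role of the cost or error associated with each vector, and the displacement is the motion vector for that block.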

Central encoder sub-module 230 performs two functions or processes. The first is to encode the first set of motion vectors 214 received from sub-module 250 so as to output a first set of encoded motion vectors 218. The second is to compute a first prediction remainder 221, which can be computed as follows:

First prediction remainder = e_c - e_s

where e_c = the even frame prediction sequence 215, and

e_s = the encoded odd frame sub-sequence 211.

The output of central encoder sub-module 230 includes the encoded first prediction remainder 221 together with the first set of coded motion vectors 218. These outputs are combined with the encoded odd frame sub-sequence 211 (at point A) and output collectively as the first data sub-stream 245.

Similarly, a second prediction remainder can be computed as follows for inclusion in the second data sub-stream 255:

Second prediction remainder = e_c - e_s

where e_c = the odd frame prediction sequence 217, and

e_s = the encoded even frame sub-sequence 212.

The output of central encoder sub-module 232 includes the encoded second prediction remainder 222 together with the second set of coded motion vectors 219. These outputs are combined with the encoded even frame sub-sequence 212 (at point B) and output as the second data sub-stream 255.
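The prediction remainder formula and the assembly of a data sub-stream can be sketched as follows. Frames are modelled as flat lists of numbers, the two sequences are paired frame by frame in order, and a sub-stream is modelled as a plain dictionary; all of these modelling choices and the function names are assumptions made for the example, and the entropy coding of the remainder and the motion vectors is left out.

```python
def frame_diff(a, b):
    """Pixel-wise difference a - b between two frames."""
    return [x - y for x, y in zip(a, b)]


def prediction_remainder(e_c, e_s):
    """e_c - e_s, computed frame by frame.

    e_c: prediction frame sequence from the central encoder
    e_s: encoded sub-sequence from the corresponding side encoder
    """
    return [frame_diff(c, s) for c, s in zip(e_c, e_s)]


def build_substream(encoded_side, remainder, motion_vectors):
    """Group the three components carried by one data sub-stream."""
    return {
        "side": encoded_side,
        "remainder": remainder,
        "motion_vectors": motion_vectors,
    }


if __name__ == "__main__":
    e_c = [[10, 12], [20, 22]]   # toy prediction sequence (two frames)
    e_s = [[9, 12], [21, 20]]    # toy encoded side sub-sequence
    remainder = prediction_remainder(e_c, e_s)
    print(build_substream(e_s, remainder, [(1, 0), (0, -1)]))
```

The same helper serves both sub-streams: it is called once with sequences 215 and 211 for the first data sub-stream 245, and once with sequences 217 and 212 for the second data sub-stream 255.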

The foregoing description of the preferred embodiments of the invention has been presented for purposes of illustration and description. It is not intended to be exhaustive or to limit the invention to the precise form disclosed, and obviously many modifications and variations are possible in light of the above teaching. Such modifications and variations that may be apparent to a person skilled in the art are intended to be included within the scope of the invention as defined by the accompanying claims.

The present invention is generally applicable to multiple description coding (MDC) of data, speech, audio, images, video and other types of signals for transmission over a network or other type of communication medium.

Claims

1. An encoding method for encoding an input frame sequence (201), the method comprising:

a) encoding a first sub-sequence 210 of the frame from the input frame sequence 201 to produce an encoded first sub-sequence 211 of the frame;

b) encoding a second sub-sequence 220 of the frame from the input frame sequence 201 to produce an encoded second sub-sequence 212 of the frame;

c) calculating a first prediction frame sequence (215) from the second sub-sequence (220) of the frame;

d) calculating a second prediction frame sequence (217) from the first sub-sequence (210) of the frame;

e) calculating a first set (214) of motion vectors from the first prediction frame sequence (215);

f) calculating a second set (216) of motion vectors from said second prediction frame sequence (217);

g) calculating a first prediction remainder as an error difference between the first prediction frame sequence (215) and the encoded first sub-sequence (211) of the frame;

h) calculating a second prediction remainder as an error difference between the second prediction frame sequence (217) and the encoded second sub-sequence (212) of a frame;

i) encoding said first prediction remainder, said second prediction remainder, said first set of motion vectors (214), and said second set of motion vectors (216);

j) determining network conditions;

k) scalably combining, according to the determined network conditions, the encoded first prediction remainder (218), the encoded first set (221) of motion vectors and the encoded first sub-sequence (211) of the frame as a first data sub-stream (245);

l) scalably combining, according to the determined network conditions, the encoded second prediction remainder (219), the encoded second set (222) of motion vectors and the encoded second sub-sequence (212) of the frame as a second data sub-stream (255); and

m) independently transmitting the first and second data sub-streams (245, 255).

2. The method of claim 1, wherein the determined network condition is a channel bandwidth determination.

3. The method of claim 1, comprising a preliminary step of placing the input frame sequence (201) in a predetermined coding order prior to step (a).

4. The method of claim 1, wherein the first sub-sequence (210) of the frame comprises only odd frames from the input frame sequence (201).

5. The method of claim 1, wherein the second sub-sequence (220) of the frame comprises only even frames from the input frame sequence (201).

6. The method of claim 1, wherein the second sub-sequence (220) of the frame comprises only frames from the input frame sequence (201) that are not included in the first sub-sequence (210) of the frame.

7. The method of claim 1, wherein the first and second sub-sequences of frames (210, 220) are selected according to user preferences.

8. The method of claim 1, wherein the input frame sequence comprises an intra-frame (I), a prediction frame (P), and a bidirectional frame (B).

9. An encoder (200) that encodes an input sequence (201) of frames by:

a) encoding a first sub-sequence 210 of a frame from the input frame sequence 201 at a first side encoder 202;

b) encoding a second sub-sequence 220 of the frame from the input frame sequence 201 at a second side encoder 206;

c) calculating a first predictive frame sequence (215) from the second sequence (220) of frames at the central encoder (204);

d) calculating a second predicted frame sequence (217) from the first sub-sequence (210) of the frame at the central encoder (204);

e) calculating a first set of motion vectors 214 from the first prediction frame sequence 215 at the central encoder 204;

f) calculating a second set of motion vectors 216 from the second prediction frame sequence 217 at the central encoder 204;

g) calculating, at the central encoder (204), a first prediction remainder as an error difference between the first prediction frame sequence (215) and the encoded first sub-sequence (211) of the frame;

h) calculating a second prediction remainder as an error difference between the second prediction frame sequence (217) and the encoded second sub-sequence (212) of a frame at the central encoder (204);

i) encoding, at the central encoder (204), the first prediction remainder, the second prediction remainder, the first set of motion vectors (214), and the second set of motion vectors (216);

j) determining network conditions;

k) scalably combining, according to the determined network conditions, the encoded first prediction remainder (218), the encoded first set (221) of motion vectors and the encoded first sub-sequence (211) of the frame as a first data sub-stream (245);

l) scalably combining, according to the determined network condition, the encoded second prediction remainder (219), the encoded second set (222) of motion vectors and the encoded second sub-sequence (212) of the frame as a second data sub-stream (255); and

m) independently transmitting the first and second data sub-streams (245, 255) from the encoder (200).

10. The encoder (200) of claim 9, wherein the first side encoder (202), the second side encoder (206) and the central encoder (204) are conventional predictive encoders.

11. The encoder (200) of claim 10, wherein the first side encoder (202), the second side encoder (206) and the central encoder (204) are scalable encoders.

12. The encoder (200) of claim 10, wherein the conventional predictive encoders are encoders selected from the group of encoders including MPEG1, MPEG2, MPEG4, MPEG7, H.261, H.262, H.263, H.263+, H.263++ and H.26L encoders.

13. The encoder (200) of claim 9, wherein the encoder (200) is included in a telecommunications transmitter of a wireless network.

14. A system for encoding an input sequence (201) of frames, comprising:

Means for encoding a first sub-sequence (210) of a frame from the input frame sequence (201) to produce an encoded first sub-sequence (211) of a frame;

Means for encoding a second sub-sequence (220) of the frame from the input frame sequence (201) to produce an encoded second sub-sequence (212) of the frame;

Means for calculating a first predictive frame sequence (215) from the second sequence (220) of frames;

Means for calculating a second prediction frame sequence (217) from the first sub-sequence (210) of the frame;

Means for calculating a first set of motion vectors (214) from the first prediction frame sequence (215);

Means for calculating a second set of motion vectors (216) from the second prediction frame sequence (217);

Means for calculating a first prediction remainder as an error difference between the first prediction frame sequence (215) and the encoded first sub-sequence (211) of a frame;

Means for calculating a second prediction remainder as an error difference between the second prediction frame sequence (217) and the encoded second sub-sequence (212) of a frame;

Means for encoding the first prediction remainder, the second prediction remainder, the first set of motion vectors (214), and the second set of motion vectors (216);

Means for determining network conditions;

Means for scalably combining, according to the determined network condition, the encoded first prediction remainder (218), the encoded first set (221) of motion vectors and the encoded first sub-sequence (211) of a frame as a first data sub-stream (245);

Means for scalably combining, according to the determined network condition, the encoded second prediction remainder (219), the encoded second set (222) of motion vectors and the encoded second sub-sequence (212) of a frame as a second data sub-stream (255); and

Means for independently transmitting said first and second data sub-streams (245, 255).

15. The system of claim 14, further comprising means for placing the input frame sequence (201) in a predetermined coding order.