CA2318272A1

CA2318272A1 - Method and apparatus for advanced television signal encoding and decoding

Info

Publication number: CA2318272A1
Application number: CA002318272A
Authority: CA
Inventors: Yendo Hu
Original assignee: Individual
Current assignee: Tiernan Communications Inc
Priority date: 1998-01-26
Filing date: 1999-01-21
Publication date: 1999-07-29
Also published as: EP1051839A2; WO1999038316A3; JP2002502159A; WO1999038316A2; AU2337099A

Abstract

The specification discloses a method and apparatus for encoding and decoding advanced television signals using standard MPEG-2 compression engines while maintaining the compression efficiency of such compression engines. The architecture provides parallel processing using standard MPEG-2 compression engines in an overlapping arrangement that does not sacrifice compression performance. A video encoder includes plural regional processors for encoding an input stream of video images. Each video image is divided into regions that have overlapping portions, with each processor encoding a particular region of a current video image in the stream. The regional processors each store a reference frame in a local memory based on a prior video image in the stream for use in the motion compensation of the encoding process. A reference frame processor coupled to the plural local memories updates each reference frame with information from reference frames stored in adjacent local memories. The encoded video images are made up of macroblocks and each regional processor includes means for removing certain macroblocks from the encoded video images that correspond to the overlap portions and concatenating the resulting encoded video images with that of other regional processors to provide an output video stream.

Description

METHOD AND APPARATUS FOR ADVANCED TELEVISION SIGNAL
ENCODING AND DECODING
BACKGROUND OF THE INVENTION
The Federal Communications Commission (FCC) has adopted major elements of the Advanced Television Systems Committee (ATSC) Digital Television standard for use by terrestrial broadcasters. The ATSC Digital Television (DTV) standard addresses five key components of a model system for delivering multimedia information to users. A block diagram of such a model system as defined by the International Telecommunications Union, Radio Communication Sector (ITU-R), Task Group 11/3 is shown in FIG. 1 and includes video, audio, transport, RF/transmission and receiver components. The video subsystem 100 compresses raw video into a digital video data elementary stream in accordance with the MPEG-2 standard defined by the Moving Picture Experts Group in ISO/IEC IS 13818-2 International Standard (1994), MPEG-2 Video. The audio subsystem 102 compresses raw audio into a digital audio data elementary stream in accordance with the Digital Audio Compression 3 (DAC-3) standard defined by the Audio Specialist Group within ITU.
The service multiplex and transport component 104 multiplexes the video data elementary stream, the audio data elementary stream, ancillary and control data elementary streams into a single bit stream using the transport stream syntax defined by ISO/IEC IS 13818-1 International Standard (1994), MPEG-2 Systems. The RF/transmission component 106 includes a channel coder and a modulator. The channel coder introduces additional information into the transport stream to allow the receiver 108 to reconstruct partially corrupted bit streams. The modulator encodes the digital data into RF signals using vestigial sideband transmission.
The MPEG-2 standard applies five compression techniques to achieve a high compression ratio: discrete cosine transform (DCT), difference encoding, quantization, entropy encoding and motion compensation.
A DCT is applied to blocks of 8 x 8 pixels to provide 64 coefficients that represent spatial frequencies. For blocks without much detail, the high frequency coefficients have small values that can be set to zero.
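As a rough illustration of why low-detail blocks compress well, the following Python sketch evaluates an orthonormal 8 x 8 DCT-II directly from its definition. The function name dct_8x8 and the sample block are illustrative assumptions and do not come from the specification; a practical encoder would use a fast factorized transform instead of this naive form.

```python
import math

N = 8  # block size used by MPEG-2

def dct_8x8(block):
    """Naive 2-D DCT-II of an 8x8 block given as 8 rows of 8 samples."""
    def c(k):
        # Normalization factor for the zero-frequency basis function.
        return math.sqrt(0.5) if k == 0 else 1.0

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for x in range(N):
                for y in range(N):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = 0.25 * c(u) * c(v) * s
    return out

# A completely flat block: all energy ends up in the single DC coefficient.
flat = [[100] * N for _ in range(N)]
coeffs = dct_8x8(flat)
```

For the flat block, only coeffs[0][0] is non-zero (up to floating-point rounding), which is the extreme case of the behavior described above.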
Video frames are encoded into intra frames (I frames) which do not rely on information from other frames to reconstruct the current frame, and inter frames, P and B, which rely on information from other frames to reconstruct the current frame. P frames rely on the previous P or I frame while B frames rely on the previous I or P and the future I or P to construct the current frame. These previous or future I and P frames are referred to as reference frames. The P and B frames include only the differences between the current frame and the adjacent frames. For low motion video sequences, the P and B frames will have very little information content.
The MPEG-2 compression algorithm performs motion estimation between adjacent frames to improve the prediction capability between frames. The compression algorithm searches for a motion vector for every four blocks, known as a macroblock, that provides the distance and direction of motion for the current macroblock.
The DCT coefficients of each block are weighted and quantized based on a quantization matrix that matches the response of the human eye. The results are combined with the motion vectors and then encoded using variable length encoding to provide a stream for transmission.
The computational demand required to carry out the video compression specified in the MPEG-2 standard is significant. For applications in which real-time compression is necessary, such as live broadcast, the approach taken to achieve such video compression becomes critical. There are two known approaches for implementing MPEG-2 compression: slice-based and macroblock-based. In the slice-based approach shown in FIG. 2, a video frame 120 is divided into several contiguous regions 120A. Each region is assigned to a different processor (P1, P2, P3, P4, P5) for processing.
A dedicated central processor 122 manages the overall compression operation. In the macroblock-based approach shown in FIG. 3, each macroblock 124 is completely processed and delivered to an output buffer 126 before processing the next macroblock.
The MPEG-2 standard defines algorithmic tools known as profiles and sets of constraints on parameter values (e.g., picture size, bit rate) known as levels. The known MPEG-2 compression engines noted above have been designed to meet the main profile @ main level portion of the standard for conventional broadcast television signals such as NTSC and PAL. The main level is specified as 720 pixels by 480 active lines at 30 frames per second. In contrast, the DTV signal is specified as 1920 pixels by 1080 active lines at 30 frames per second.
This is known as the MPEG-2 high level. The computational demand needed for the DTV signal specified as main profile @ high level is approximately six times that needed for existing standard television signals specified as main profile @ main level.
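The factor of six follows from the raw pixel rates of the two formats. The short Python check below uses only the picture sizes and frame rate quoted above; treating pixel rate as a proxy for computational demand is a simplifying assumption, and the names are illustrative.

```python
# (pixels per line, active lines, frames per second) as quoted in the text
MAIN_LEVEL = (720, 480, 30)
HIGH_LEVEL = (1920, 1080, 30)

def pixel_rate(width, height, fps):
    """Pixels that must be processed per second for a given format."""
    return width * height * fps

ratio = pixel_rate(*HIGH_LEVEL) / pixel_rate(*MAIN_LEVEL)
print(ratio)  # 6.0
```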
SUMMARY OF THE INVENTION
It would be desirable to take advantage of existing MPEG-2 compression engines to encode higher definition video signals while maintaining the compression efficiency of such compression engines.
The method and apparatus of the present invention provides an architecture capable of addressing the computational demand required for high-definition video signals, such as a DTV signal compliant with MPEG-2 main profile @ high level, using standard MPEG-2 compression engines operating in the main profile @ main level mode.
The invention provides parallel processing using such standard MPEG-2 compression engines in an overlapping arrangement that does not sacrifice compression performance.
Accordingly, a video encoder of the present invention comprises plural regional processors for encoding an input stream of video images. Each video image is divided into regions that have overlapping portions, with each processor encoding a particular region of a current video image in the stream according to an encoding process that includes motion compensation such as MPEG-2 main profile @ main level. The regional processors each store a reference frame in a local memory based on a prior video image in the stream for use in the motion compensation of the encoding process. A reference frame processor coupled to the plural local memories updates each reference frame with information from reference frames stored in adjacent local memories. The encoded video images are made up of macroblocks and each regional processor includes means for removing certain macroblocks from the encoded video images that correspond to the overlap portions and concatenating the resulting encoded video images with that of other regional processors to provide an output video stream.
In an embodiment, the regional processors each include an image selection unit for selecting a particular image region from each of the video images. A
compression engine compresses the selected image region to provide a compressed image region stream of macroblocks. A macroblock remover removes certain macroblocks from the compressed image region stream that correspond to the overlapping portions. A stream concatenation unit concatenates the compressed image region stream with such streams from each regional processor to provide an output video stream.
While the preferred embodiment includes multiple regional processors for processing the overlapping regions, the present invention encompasses single processor embodiments in which each region is processed successively.
According to another aspect of the invention, a video decoder includes a demultiplexer, multiple regional decoders, a reference frame memory and a multiplexer.
The demultiplexer demultiplexes a compressed stream of video images to plural region streams. Each video image is divided into contiguous regions, each region stream being associated with a particular region. The regional decoders each decode a particular region stream according to a decoding process that includes motion compensation such as MPEG-2 main profile at main level. The reference frame memory stores reference frames associated with each regional decoder. The regional decoders retrieve reference frames of adjacent regions for use in the motion compensation process. The multiplexer multiplexes the decoded region streams to a decoded output stream.
BRIEF DESCRIPTION OF THE DRAWINGS
The foregoing and other objects, features and advantages of the invention will be apparent from the following more particular description of preferred embodiments of the invention, as illustrated in the accompanying drawings in which like reference characters refer to the same parts throughout the different views.
FIG. 1 is a block diagram of a model advanced television system.
FIG. 2 is a block diagram illustrating a slice-based compression approach for MPEG-2 main profile at main level.

FIG. 3 is a block diagram illustrating a macroblock-based compression approach for MPEG-2 main profile at main level.

FIG. 4 is a diagram illustrating a first processor arrangement in accordance with the present invention.

FIG. 5 is a diagram illustrating a preferred processor arrangement in accordance with the present invention.

FIG. 6 is a block diagram of a video encoding subsystem of the present invention.

FIG. 7 (includes FIGS. 7A-7C) is a schematic block diagram of a video compression engine of the video subsystem of FIG. 6.

FIG. 8 is a block diagram illustrating a synchronization configuration for the compression engine of FIG. 7.

FIG. 9 is a diagram illustrating local image selection from a global image for the engine of FIG. 7.

FIG. 10 is a diagram illustrating raw and active regions of the global image of FIG. 9.

FIG. 11 is a diagram illustrating an active region within a raw region of the image of FIG. 10.

FIG. 12 is a block diagram of a token passing arrangement for a 1080I video processing configuration.

FIG. 13 is a block diagram of a token passing arrangement for a 720p video processing configuration.

FIG. 14 is a block diagram illustrating allocation of reference images in reference buffers of a local memory of the system of FIG. 7.
FIG. 15 is a diagram illustrating the reference image updating arrangement of the present invention.
FIG. 16 is a diagram illustrating regions of a reference image in accordance with the invention.
FIG. 17 is a block diagram of a reference image manager of the system of FIG. 7.
FIG. 18 is a block diagram of a local manager of the reference image manager of FIG. 17.
FIG. 19 is a diagram illustrating the decoding arrangement of the present invention.
FIG. 20 is a block diagram of a decoder system of the present invention.
FIG. 21 is a diagram illustrating motion compensation in the decoder system of FIG. 20.
FIG. 22 is a block diagram of the reference frame store of the decoder system of FIG. 20.
DETAILED DESCRIPTION OF THE INVENTION
The present invention employs a parallel processing arrangement that takes advantage of known MPEG-2 main profile at main level (mp/ml) compression engines to provide a highly efficient compression engine for encoding high definition television signals such as the DTV signal that is compliant with MPEG-2 main profile at high level.
A first approach to using MPEG-2 compression engines in a parallel arrangement is shown in FIG. 4. In this arrangement, a total of nine MPEG-2 mp/ml compression engines are configured to process contiguous regions encompassing an ATSC DTV video image 142 (1920 pixels by 1080 lines). Each MPEG-2 mp/ml engine is capable of processing a region 144 equivalent to an NTSC video image (720 pixels by 480 lines). As shown in FIG. 4, engines 3, 6, 7, 8 and 9 encode regions smaller than NTSC images.
The compression provided by this first approach is less than optimal. The motion compensation performed within each engine is naturally constrained to not search beyond its NTSC image format boundaries. As a result, macroblocks along the boundaries between assigned engine areas may not necessarily benefit from motion compensation.
The preferred approach of the present invention shown in FIG. 5 provides a parallel arrangement of MPEG-2 compression engines in which the engines are configured to process overlapping regions 146, 148, 150, 152 of an ATSC DTV video image 142. With the preferred approach, motion compensation performed by a particular engine for its particular region is extended into adjacent regions.
As noted in the background, motion compensation uses a reference image (I or P frame) for predicting the current frame in the frame encoding process. The preferred approach extends motion compensation into adjacent regions by updating the reference images at the end of the frame encoding process with information from reference frames of adjacent engines.
As described further herein, each engine stores at most two reference frames in memory. If, at the end of a frame encoding process, either of the two reference frames has been updated, then that reference frame is further updated to reflect the frame encoding results from adjacent engines.
A preferred embodiment of a video encoder 100A of the present invention is shown in FIG. 6. The video encoder 100A includes a digital video receiver 160 and a compression engine subsystem 162. The digital video receiver 160 receives uncompressed video from external sources in either of two different digital input formats:
Panasonic 720p (progressive scan) parallel format and 1080I serial format. The 1080I serial format provides uncompressed 1080 line interlaced (1080I) video at a rate of 1.484 Gbps following the SMPTE292M standard. The digital receiver 160 converts the input signals into a common internal format referred to as TCI422-40 format in which 20 bits carry two Y components with 10 bit resolution, and 20 bits carry the chrominance components with 10 bit resolution.
The preferred embodiment of the video compression engine subsystem 162 shown in FIG. 7 includes a video input connector 200, a system manager 202, a bit allocation processor 204, several regional processors 206 and a PES header generator 208. There are nine regional processors 206 shown in the arrangement of FIG. 7, though other arrangements are possible, e.g., an arrangement of 12 regional processors can be implemented to provide a greater range of motion compensation. Each regional processor 206 includes a local image selection unit 210, an MPEG-2 compression unit 212, a macroblock remover and stream concatenation unit 214/216, and a local memory 218. The compression subsystem 162 also includes one or more reference image managers (RIMs) 220. In the arrangement of FIG. 7, there are four RIMs 220. The RIM
220 is described further herein.
The MPEG-2 compression unit 212 is preferably an IBM model MM30 single package, three chip unit, though any standard MPEG-2 compression engine capable of main profile @ main level operation can be used.
The video input connector 200 terminates a system control bus 222 and a video data bus 224 referred to as the TCI422-40 bus. The control bus 222 carries control data from a system control processor (not shown) to the system manager 202. The TCI422-40 bus 224 carries video data from the digital receiver 160 (FIG. 6).
The system manager 202 initializes the regional processors 206, interacts with the outside world, monitors video processing status, performs processor synchronization, and updates Frame PTS. An AM29k chip manufactured by Advanced Micro Devices is used to implement this function.
The system manager 202 holds all execution FPGA
files in an internal FLASH memory. Upon startup, the system manager initializes all processors and FPGAs with preassigned files from FLASH memory. The system manager 202 configures the following parameters for MPEG-2 compression units 212:
- The GOP structure
- The frame rate
- Progressive encoding for 720p video, interlaced encoding for 1080I video
- The encoded frame size
The following table gives the frame size of each MPEG-2 compression unit 212 for 1080I encoding.

Processor          1    2    3    4    5    6    7    8    9
Vertical Pixels    720  720  720  720  720  720  720  720  720
Horizontal Lines   480  384  480  480  384  480  480  384  480

The following table gives the frame size of each MPEG-2 compression unit for 720p encoding.

Processor          1    2    3    4    5    6    7    8    9
Vertical Pixels    720  720  720  720  720  720  720  720  NA
Horizontal Lines   240  240  240  240  240  240  240  240  NA

The system manager monitors the video compression process. It polls the health status registers in the local image selection unit, the MPEG-2 unit, the macroblock remover unit and the stream concatenate unit of each regional processor 206 at a rate of once per second.
The system manager 202 synchronizes the frame encoding process over the nine regional processors 206.
The following presents the motivations behind the need to synchronize. Next, the tasks required by the system manager to synchronize the parallel processors are described.
A scalable MPEG-2 architecture requires each regional processor 206 to finish the current frame encoding process before starting the next frame. This requirement exists because of the need to update the reference images across the adjacent parallel processors.
Each MPEG-2 engine uses internal reference images to compress the current image. The internal reference images are derived from the results of the compression process for the previous frames. In the scalable MPEG-2 architecture of the present invention, sections of the reference image are updated using reference images from adjacent processors.
The following drives the need for synchronization:
1. The reference images are updated after each encoding process.
2. Each MPEG-2 compression unit must update the internal reference image using information from the reference image in the adjacent processors before it can properly encode the next image.
Referring now to FIG. 8, each MPEG-2 compression unit generates a current image compression complete (CICC) signal 250 after each encoding process. When all CICC signals are detected, the system manager 202 triggers the reference image manager 220 to update the internal reference images of each MPEG-2 compression unit using a common reference image update (RIU) signal 252.
The system manager must respond promptly when all CICC
signals are active, since any delay will cut into the MPEG-2 engine encoding time.
Each reference image manager activates a reference image update complete (RIUC) signal 254 when complete.
When all RIUC signals are detected, the system manager triggers all local image selection units 210 to start loading the next frame into the compression units 212 through a common start image loading (SIL) signal 256.
The delay between the time when the RIU is activated and when the RIUC is activated may be as short as one cycle.
The system manager must respond promptly when all RIUC
signals are activated.
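The handshake described above can be sketched as a two-stage barrier. The Python below is a minimal model, assuming each unit is represented by a callable that returns True once it signals completion; the function and variable names are hypothetical, and real hardware would wait on signal lines rather than call functions.

```python
def run_frame_cycle(encode_fns, update_fns, load_fns):
    """One frame period of the CICC -> RIU -> RIUC -> SIL handshake.

    Each callable returns True when its unit signals completion.
    Returns the ordered list of common signals the manager issued.
    """
    issued = []
    cicc = [encode() for encode in encode_fns]   # one CICC per compression unit
    if not all(cicc):
        return issued                            # manager keeps waiting
    issued.append("RIU")                         # common reference image update
    riuc = [update() for update in update_fns]   # one RIUC per reference image manager
    if not all(riuc):
        return issued
    issued.append("SIL")                         # common start image loading
    for load in load_fns:
        load()
    return issued

events = []
encoders = [lambda i=i: events.append(f"encode{i}") or True for i in range(9)]
rims = [lambda i=i: events.append(f"update{i}") or True for i in range(4)]
loaders = [lambda i=i: events.append(f"load{i}") or True for i in range(9)]
signals = run_frame_cycle(encoders, rims, loaders)
```

In this model no reference image update starts before every compression unit has finished, and no image load starts before every update has finished, which is the ordering the CICC and RIUC signals are meant to enforce.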
The system manager updates the PTS in the PES header generator. The system manager receives an interrupt each time regional processor #1 receives a new picture header. It then computes a new PTS value from the latched STC value at processor #1's video input port and the frame type from processor #1's compressed output port.
The bit allocation processor 204 is responsible for ensuring that the cumulative compressed bit rate from all of the regional processors meets the target compressed video bit rate defined externally. The bit allocation processor dynamically changes the compression quality setting of each MPEG-2 engine to maintain optimal video quality.
The local image selection unit (LISU) 210 extracts a local image from the uncompressed input data on the TCI422-40 data bus 224. It outputs the local image in a format that complies with the input format specified by the MPEG-2 unit 212. The LISU supports the following programmable registers:
1. input video format register: This register defines the video format that the data on the TCI422-40 bus represents. 0 = 1080I, 1 = 720p
2. local image location registers: These registers specify the location of a local field image within a global field image 300 (FIG. 9).
The registers specify points within the field image, not the reconstructed progressive image. Keep in mind that the 720p video has only one field image per frame, whereas the 1080I video has two field images per frame.
Four registers specify the corner locations of a local image 302 within a global image 300 as shown in FIG. 9. The four registers are defined below:
Hstart register: Pixel index of the first active pixel in local image 302. First pixel in global image 300 will have an index value of 1.
Hstop register: Pixel index of the first non-active pixel after the local image.
Vstart register: Line index of the first active line in local image. First line in global image will have an index value of 1.
Vstop register: Line index of the first non-active line after the local image.
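The register semantics (1-based start index, stop index pointing at the first non-active pixel or line) amount to a half-open crop. The Python sketch below illustrates this with the register values listed for processor #6 in the 720p configuration; the function name and the list-of-rows image representation are assumptions made for illustration only.

```python
def select_local_image(global_image, hstart, hstop, vstart, vstop):
    """Crop a local image using 1-based start and exclusive 1-based stop indices."""
    rows = global_image[vstart - 1:vstop - 1]
    return [row[hstart - 1:hstop - 1] for row in rows]

# 1280 x 720 global image in which every sample records its own 1-based (line, pixel) index.
global_image = [[(line, px) for px in range(1, 1281)] for line in range(1, 721)]

# Hstart = 561, Hstop = 1281, Vstart = 1, Vstop = 241
local = select_local_image(global_image, hstart=561, hstop=1281, vstart=1, vstop=241)
```

With these values the local image is 720 pixels wide and 240 lines high, matching the 720p frame size given for each compression unit.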
The following table gives the values for the registers for the different formats supported by each MPEG-2 unit 212.

P#   Format   Input Format   Hstart     Hstop      Vstart     Vstop
              Register       Register   Register   Register   Register
1    720p     1              1          721        401        561
2    720p     1              1          721        241        401
3    720p     1              1          721        1          241
4    720p     1              561        1281       401        561
5    720p     1              561        1281       241        401
6    720p     1              561        1281       1          241
7    720p     1              1          721        561        721
8    720p     1              561        1281       561        721
9    720p     0              0          0          0          0

The macroblock remover and bit concatenation (MRBC) units 214/216 are responsible for converting the MPEG-2 main profile @ main level bit streams received from the MPEG-2 units 212 to ATSC compliant bit streams. Each MRBC unit performs two tasks: macroblock removal and bit stream concatenation by scanning and editing the bit streams from the MPEG-2 unit 212 and by communicating with other MRBC units.
The scalable MPEG-2 architecture (FIG. 7) employs nine MPEG-2 compression units 212 for 1080I video format encoding and 8 MPEG-2 compression units for 720P video format encoding.
Each MPEG-2 compression unit is responsible for compressing a specific region of the target image 300 called an active-region 310. The target picture 300 is covered by the active regions 310 without overlapping.
Figure 10 shows raw-regions 310B and active-regions 310 for 1080I.
Each MPEG-2 compression unit actually compresses a larger region (raw-region) of the target picture 300 than its active-region. An active-region 310 is a sub-region of the corresponding raw-region 310B. The target picture is therefore also covered by the raw-regions, but with overlap between adjacent raw-regions. Every raw-region 310B, active-region 310 and overlapped region 310A has dimensions that are multiples of 16 (or 32) so that the active-region can be obtained by removing some macroblocks from the raw-region.
The macroblock remover 214 removes the macroblocks which are in the overlap region 310A but not in the active-region 310.
The size of active regions is derived from the following:
Hact1 = 592. Vact1 = Vact3 = 352.
Hact2 = 480. Vact2 = 128.
Hact3 = 608. Vol = 128.
Hol12 = 128.
Hol23 = 112.
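These values can be cross-checked against the picture and raw-region sizes. The Python sketch below reads the H and V "act" values as the portions of each region not shared with a neighbor and the "ol" values as the overlaps between neighbors; that reading, the reuse of Vol for both vertical overlaps, and all names are assumptions made for illustration.

```python
H_ACT = [592, 480, 608]   # Hact1, Hact2, Hact3
H_OL = [128, 112]         # Hol12, Hol23
V_ACT = [352, 128, 352]   # Vact1, Vact2, Vact3 (Vact1 = Vact3)
V_OL = [128, 128]         # Vol, assumed to apply to both vertical overlaps

def span(act, ol):
    """Total extent covered by the exclusive parts plus the overlaps."""
    return sum(act) + sum(ol)

def raw_sizes(act, ol):
    """Raw size of each region: its exclusive part plus the adjoining overlaps."""
    sizes = []
    for i, a in enumerate(act):
        before = ol[i - 1] if i > 0 else 0
        after = ol[i] if i < len(ol) else 0
        sizes.append(before + a + after)
    return sizes

width = span(H_ACT, H_OL)
height = span(V_ACT, V_OL)
raw_w = raw_sizes(H_ACT, H_OL)
raw_h = raw_sizes(V_ACT, V_OL)
```

Under these assumptions the horizontal extent comes to 1920 pixels, the vertical extent to 1088 lines (1080 rounded up to a whole number of macroblock rows), and every raw-region is 720 pixels wide and 480 or 384 lines high.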
Referring now to FIG. 11, for each MPEG-2 compression unit 212, the following integer parameters are defined with respect to the macroblock positions:
raw_height: the height of the raw-region 310B.
raw_width: the width of the raw-region 310B.
left_alignment: the mark where the active-region 310 macroblocks 320 start horizontally; macroblocks to the left of this mark in the raw-region need to be removed.
right_alignment: the mark where the active-region macroblocks end horizontally; macroblocks to the right of this mark in the raw-region need to be removed.
top_alignment: the mark where the active-region macroblocks start vertically; macroblocks above this mark in the raw-region need to be removed.
bottom_alignment: the mark where the active-region macroblocks end vertically; macroblocks below this mark in the raw-region need to be removed.
The following definitions of two other parameters are specific to 1080I.
head_mb_skip: the number of macroblocks skipped between left_alignment and the first non-skipped macroblock in the active-region.

tail_mb_skip: the number of macroblocks skipped from the last non-skipped macroblock in the active-region to right_alignment.
For the convenience of expression, the following convention is used to denote these parameters:
((raw_width, raw_height), (left_alignment, right_alignment), (top_alignment, bottom_alignment)) and is called the configuration vector for the MRBC unit.
Note that this configuration vector defines the boundaries of the raw-region 310B and the active-region 310 of the current MPEG-2 compression unit and hence which macroblocks need to be removed and which macroblocks need to be kept.
The values of the configuration vectors for each MRBC unit for 1080I are as follows:
#1: ((45,30), (0,41), (0,26))
#2: ((45,30), (4,41), (0,26))
#3: ((45,30), (3,45), (0,26))
#4: ((45,24), (0,41), (4,20))
#5: ((45,24), (4,41), (4,20))
#6: ((45,24), (3,45), (4,20))
#7: ((45,30), (0,41), (4,30))
#8: ((45,30), (4,41), (4,30))
#9: ((45,30), (3,45), (4,30))
The MRBC unit 214/216 scans and edits the coded bit streams for slices on a row basis. Entire rows of macroblocks between the top of the raw-region and top_alignment, and between bottom_alignment and the bottom of the raw-region, are removed. For each remaining row in the raw-region, macroblocks between the left start of the raw-region and left_alignment, and between right_alignment and the right end of the raw-region, are removed. The resulting bit stream is called an mr-processed row. Since each MPEG-2 unit uses a single slice for each row, an mr-processed row is also called an mr-processed slice in this context.
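As a sanity check on the configuration vectors, the Python sketch below derives each unit's active-region size and tests whether a given macroblock is kept. It assumes that the alignment marks are half-open macroblock indices (left inclusive, right exclusive) and that units #1 to #3 form a row and units #1, #4, #7 form a column; the names are illustrative only.

```python
MB = 16  # macroblock size in pixels/lines

# ((raw_width, raw_height), (left, right), (top, bottom)) in macroblock units
CONFIG = {
    1: ((45, 30), (0, 41), (0, 26)),
    2: ((45, 30), (4, 41), (0, 26)),
    3: ((45, 30), (3, 45), (0, 26)),
    4: ((45, 24), (0, 41), (4, 20)),
    5: ((45, 24), (4, 41), (4, 20)),
    6: ((45, 24), (3, 45), (4, 20)),
    7: ((45, 30), (0, 41), (4, 30)),
    8: ((45, 30), (4, 41), (4, 30)),
    9: ((45, 30), (3, 45), (4, 30)),
}

def active_size(cfg):
    """Width and height of the active-region in pixels and lines."""
    _, (left, right), (top, bottom) = cfg
    return (right - left) * MB, (bottom - top) * MB

def keep_macroblock(cfg, col, row):
    """True if raw-region macroblock (col, row) lies inside the active-region."""
    _, (left, right), (top, bottom) = cfg
    return left <= col < right and top <= row < bottom

row_width = sum(active_size(CONFIG[p])[0] for p in (1, 2, 3))
col_height = sum(active_size(CONFIG[p])[1] for p in (1, 4, 7))
```

Under these assumptions the three active-regions in a row span 1920 pixels and the three in a column span 1088 lines, so the active-regions tile the picture without overlap.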
For 1080I, besides producing mr-processed rows, some of the macroblock removers need to produce values for head_mb_skip and/or tail_mb_skip. Furthermore, a local variable quant_trace is used to record the value of quantiser_scale_code, which is initially set to the quantiser_scale_code in the slice header and is updated every time a quantiser_scale_code is encountered in the following macroblocks until left_alignment.
Starting from a given slice, the macroblock remover scans the coded bit streams and performs the following procedures:
- Computes head_mb_skip if required (specific to 1080I).
- Updates quant_trace until left_alignment. A check is made that the macroblock_quant flag is set in the first non-skipped macroblock in the active-region. If not, the macroblock_quant flag is set, the value of quantiser_scale_code is set to the value of quant_trace, and a macroblock header is rebuilt accordingly (specific to MRBC units in the 2nd and 3rd columns for 1080I encoding).
- Forms the mr-processed slice by only preserving macroblocks in the active-region in the process of scanning.
- Computes the value of tail_mb_skip if required (specific to 1080I).
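The scanning procedure can be sketched on a simplified slice model. In the Python below a slice is a list with one entry per macroblock position, where None stands for a skipped macroblock and a dict with an optional "quant" entry stands for a coded one. This data model, the function name, and the handling of an all-skipped active-region are assumptions for illustration; an actual MRBC unit parses and rewrites variable-length coded bit streams.

```python
def mr_process_slice(slice_quant, mbs, left, right):
    """Return (kept macroblocks, head skip count, tail skip count) for one slice.

    slice_quant: quantiser scale code from the slice header.
    mbs: per-position list; None = skipped, dict = coded macroblock.
    left, right: alignment marks as half-open macroblock indices.
    """
    # Track the quantiser scale in force when the active-region begins.
    quant_trace = slice_quant
    for mb in mbs[:left]:
        if mb is not None and mb.get("quant") is not None:
            quant_trace = mb["quant"]

    active = mbs[left:right]
    coded = [i for i, mb in enumerate(active) if mb is not None]
    if not coded:
        # Arbitrary choice for this sketch: report everything as head skip.
        return [], len(active), 0

    head_mb_skip = coded[0]
    tail_mb_skip = len(active) - 1 - coded[-1]

    # Copy coded macroblocks so the input slice is left untouched.
    kept = [dict(active[i]) for i in coded]
    # The first kept macroblock must carry an explicit quantiser scale.
    if kept[0].get("quant") is None:
        kept[0]["quant"] = quant_trace
    return kept, head_mb_skip, tail_mb_skip

mbs = [{"quant": None}, {"quant": 12}, None, None,
       {"quant": None}, {"quant": 7}, None, {"quant": None}]
kept, head, tail = mr_process_slice(10, mbs, left=2, right=7)
```

In this example the quantiser scale 12 set in the removed part of the slice is carried into the first kept macroblock, which would otherwise inherit a value the downstream decoder never sees.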
Bit streams from each local MPEG-2 compression unit 212 have to be concatenated to form uniform ATSC
compliant DTV bit streams. Every MRBC unit has to put its local mr-processed slice into the output buffer 208 (FIG. 7) at the right time. In other words, the behavior of the MRBC units 214/216 has to be synchronized. Thus, a token mechanism is used to synchronize MRBC units.
For 1080I video format encoding, the communication model is as shown in FIG. 12. For 720p video format encoding, processors #2, #5, #8 are removed as shown in FIG. 13. An extra row is added to the bottom.
A token is an indication that the MRBC unit holding the token can send its bit stream to the output buffer along output bus 228. When an MRBC unit receives a completion signal from another MRBC unit, it has the token. The MRBC unit #1 is responsible for initiating new tokens. The MRBC unit #1 has a time-out variable.
When the time-out is reached, a fault will be generated and the system manager will reset. Tokens are sent through a designated line 270 between the MRBC units.
Only one active token is allowed at any given time.
For 1080I video format encoding, each DTV slice is obtained by concatenating three local slices in the three MRBC units of the same row. Since each macroblock header contains information about the number of skipped macroblocks between the previous non-skipped macroblock and the current macroblock, this information needs to be updated when the local mr-processed slice is integrated into a DTV slice. To be more specific, the first non-skipped macroblock in the second and the third local processed slices should have its header updated.
Proper header information is inserted into DTV bit streams by the MRBC units. The header information is obtained by scanning the bit streams from the output buffer of the local MPEG-2 unit 212. For 1080I video format encoding, MRBC units #4 and #7 are responsible for only inserting slice header information.
The macroblock skipping information, tail_mb_skip, from the preceding MRBC unit is received and combined with the local head_mb_skip. The total macroblock skipping information is then inserted into the macroblock header of the first non-skipped macroblock in the mr-processed slice and the slice bit stream is then put into the DTV output buffer. Then the local tail_mb_skip is sent to the next MRBC unit via the dedicated 8-pin data bus 228.
MRBC units (#1, #4, #7) in the first column only send tail_mb_skip information; MRBC units (#2, #5, #8) in the second column both receive and send tail_mb_skip information; MRBC units (#3, #6, #9) in the third column only receive tail_mb_skip information.
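The hand-off of skip counts along a row can be sketched as a running carry. The Python below assumes each local slice contains at least one non-skipped macroblock and represents it only by its head and tail skip counts; the function name and this representation are hypothetical.

```python
def combined_skips(local_slices):
    """Skip count written into the first non-skipped macroblock of each local slice.

    local_slices: (head skip, tail skip) pairs for the MRBC units of one row,
    ordered left to right. Each unit adds the tail count received from the
    unit on its left to its own head count, then passes its own tail count on.
    """
    result = []
    carried = 0  # the first-column unit receives nothing
    for head, tail in local_slices:
        result.append(carried + head)
        carried = tail
    return result

# Units in columns 1, 2, 3 of one row.
skips = combined_skips([(0, 2), (3, 0), (1, 4)])
```

Here the second unit writes 2 + 3 = 5 skipped macroblocks into its first header, while the third unit receives a tail count of zero and writes only its own head count.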
Upon receiving a token signal, the MRBC unit updates the mr-processed slice and outputs it to the DTV output buffer, then turns the token over to next MRBC unit by activating the token line.

The next step for sending the token for 1080I video format encoding is determined by the following rules:
- If there is a next MRBC unit in the same row, send the token to the next MRBC unit.
- If the current MRBC unit is at the end of the row of MRBC units, it sends the token to the first MRBC unit of the same row if the slice sent to the output buffer is not the last slice in the active-region; it sends the token to the first MRBC unit of the next row if the slice sent is the last slice in the active region.
One exception is that MRBC unit #9 sends a token to MRBC unit #1 after the last mr-processed slice.
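The token rules above can be expressed compactly. The sketch below assumes the nine MRBC units are numbered row by row in a 3 x 3 arrangement, which is consistent with the column groupings given earlier; the function name next_token_unit and its boolean parameter are hypothetical.

```python
ROWS = 3
COLS = 3


def next_token_unit(unit, last_slice_in_active_region):
    """Return the MRBC unit number (1..9) that receives the token next.

    `unit` is the current token holder; `last_slice_in_active_region` says
    whether the slice just written to the output buffer was the last slice
    of the current unit's active region.
    """
    row, col = divmod(unit - 1, COLS)
    if col < COLS - 1:
        # There is a next unit in the same row.
        return unit + 1
    if not last_slice_in_active_region:
        # End of row, more slices to come: back to the first unit of the row.
        return row * COLS + 1
    # End of row and of the active region: first unit of the next row,
    # wrapping from unit #9 back to unit #1.
    return ((row + 1) % ROWS) * COLS + 1


print(next_token_unit(3, False), next_token_unit(3, True), next_token_unit(9, True))
```

The modulo operation covers the stated exception, since the row after the last row is taken to be the first row.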
As noted above, the reference image managers (RIMs) 220 (FIG. 7) manage the updating of reference images used by each of the MPEG-2 compression units 212. Each RIM 220 transfers information from the local memory 218 within one regional processor 206 to the local memory of adjacent processors. The reference images within each MPEG-2 unit are updated by the compression engine during the frame encoding process. There are two reference frames stored in each local memory 218. Only one reference frame is updated during each frame encoding period. The reference frames are updated only when encoding the I or P frames. The following example shows the order in which the two reference frames are updated as shown in FIG. 14.
Consider two reference images stored in reference buffer A and reference buffer B of local memory 218 and a compressed output sequence IBBPBBPBBIBB. The compression process for the first I frame creates a reference image. This image is stored in reference buffer A. The compression process for the next two B frames does not create any reference images. The compression process for the next P frame creates a reference image. This image is stored in reference buffer B. No new reference images are created until the next P frame. The reference image of this P frame is stored in reference buffer A. The previous reference image created when compressing the I frame is then lost.
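The update order in this example can be traced with a few lines of code. The sketch assumes that the two buffers simply alternate on every I or P frame; the text confirms this for the first three reference frames, and the continuation to later frames is an extrapolation. The function name reference_buffer_updates is hypothetical.

```python
def reference_buffer_updates(frame_types):
    """Return (frame_type, buffer) pairs; buffer is None for B frames.

    Assumes strict A/B alternation on each I or P frame, starting with A.
    """
    updates = []
    target = "A"
    for frame_type in frame_types:
        if frame_type in ("I", "P"):
            updates.append((frame_type, target))
            target = "B" if target == "A" else "A"
        else:
            # B frames do not create a reference image.
            updates.append((frame_type, None))
    return updates


for frame_type, buffer in reference_buffer_updates("IBBPBBPBBIBB"):
    print(frame_type, buffer or "-")
```

For the sequence IBBPBBPBBIBB the first I frame goes to buffer A, the first P frame to buffer B, and the second P frame overwrites buffer A, as stated in the example.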
The reference buffer need only be updated by the RIM 220 if the reference buffer is modified by the frame compression process. A reference image 400A is updated by the RIM 220 using information from the reference images 400 of adjacent processors. FIG. 15 shows the regions 400B within the reference images 400 of the adjacent processors used by the RIM to update regions 400C of the reference image 400A in the center processor.
For those side and corner processors that do not have adjacent processors on all sides, the regions bordering those empty neighbors in the reference image will not require updating.
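Which borders need updating for a given processor can be enumerated as below. The sketch assumes a 3 x 3 arrangement of regional processors numbered row by row, and it assumes that diagonal neighbors count as adjacent; both points are assumptions made for illustration, as is the function name adjacent_processors.

```python
def adjacent_processors(unit, rows=3, cols=3):
    """Map (row_offset, col_offset) -> neighboring processor number.

    Offsets with no processor (outside the grid) are omitted, so the
    corresponding border regions need no update.
    """
    row, col = divmod(unit - 1, cols)
    neighbors = {}
    for d_row in (-1, 0, 1):
        for d_col in (-1, 0, 1):
            if d_row == 0 and d_col == 0:
                continue
            r, c = row + d_row, col + d_col
            if 0 <= r < rows and 0 <= c < cols:
                neighbors[(d_row, d_col)] = r * cols + c + 1
    return neighbors


for unit in (1, 2, 5):
    print(unit, sorted(adjacent_processors(unit).values()))
```

Under these assumptions the center processor has eight neighbors, a side processor five, and a corner processor three.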
The RIM 220 keeps track of the relationship between the frame type and the reference image update based on the guidelines noted previously. The RIM identifies which one of the two reference buffers A and B, if any, was updated by the MPEG-2 units at the end of each frame encoding process.

The RIM 220 determines when the MPEG-2 units have completed encoding of the current frame by monitoring the Henc_int signal from the IBM encoder, the output time of the picture information block, and the vertical sync signal at the video input port.
The RIM computes the begin addresses within the local memory for the chroma and luma reference images using information from the picture information block extracted from the MPEG-2 compression unit, the Henc_int signal, and the update status of reference buffers A and B. One can assume that the luma and chroma reference begin addresses will not change between frame encoding processes. The begin addresses are defined by the compression configuration.
The RIM 220 updates all modified reference images at the end of each encoding process. The RIM updates each reference image according to the table below and as shown in FIG. 16:
Table columns: Pr, BRt, BRb, BRl, BRr, ARt, ARb, ARl, ARr, where:
- BRl specifies the first pixel just after region Rtln and Rbln;
- ARl specifies the first pixel just after region Atln and Abn;
- BRt specifies the line just after region Rtn;
- ARt specifies the line just after region Atn;
- ARb specifies the line just after region Aln and Arn;
- BRb specifies the line just after region Rln and Rrn;
- BRr specifies the first pixel just after region Rbn and Rtn;
- ARr specifies the first pixel just after region Abn and Atn.
Four RIM processors 220 perform the functions required in the preferred embodiment of FIG. 7. The components within each RIM processor are shown in the block diagram of FIG. 17. A single RIM processor 220 manages the local memory for four video processors designated here as Ptl, Ptr, Pbl, and Pbr. The RIM has access to the 64 bit data, 9 bit address, and associated DRAM control signals. Within the RIM, 12 local managers 220A handle the different border regions 400B (FIG. 15) around each reference image 400.
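One reading that is consistent with the count of 12 local managers, offered here only as an assumption and not as a statement of the actual design, is that each local manager serves one ordered pair of distinct processors among Ptl, Ptr, Pbl and Pbr, carrying border data from the first to the second. The function name border_links is hypothetical.

```python
from itertools import permutations

PROCESSORS = ("Ptl", "Ptr", "Pbl", "Pbr")


def border_links():
    """Ordered (source, destination) pairs of distinct processors.

    Assumption: one local manager per directed link within a single RIM.
    """
    return list(permutations(PROCESSORS, 2))


links = border_links()
print(len(links))
for source, destination in links:
    print(source, "->", destination)
```

Four processors with three possible destinations each give the 12 links, so each processor would act as the source for three border regions under this reading.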
The diagram of FIG. 18 shows the components within each local manager 220A. The local manager holds four buffers 220B, 220C, two to hold the border image for reference image A, and two to hold the border image for reference image B. During each frame encoding process, the MPEG-2 unit reads and writes into the local memory 218 (FIG. 7) when manipulating data within the AR region. Simultaneously, the local manager will update one of the AR/BR buffers 220B, 220C. For this example, buffer A holds data that mirrors the AR region within the local memory of the MPEG-2 unit. At the end of the frame encoding process, a controller 232 within the local manager will re-map AR/BR buffer A into the BR region of the adjacent MPEG-2 unit. Buffer B, which was mapped as the BR region for the adjacent MPEG-2 unit, is re-mapped back into the AR region of the center processor. It is the responsibility of the controller to re-map the buffers every time the reference image is updated through a frame encoding process.
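The re-mapping performed by the controller resembles a double-buffer swap, which can be sketched as follows. The model covers only one pair of AR/BR buffers for one reference image, uses a tiny buffer size, and invents the class and method names; it is a simplified illustration under those assumptions rather than the hardware behavior.

```python
class LocalManagerModel:
    """Toy model of one AR/BR buffer pair inside a local manager."""

    def __init__(self, size=4):
        self.buffers = {"A": bytearray(size), "B": bytearray(size)}
        self.ar = "A"  # buffer mirroring the AR region of the center unit
        self.br = "B"  # buffer mapped as the BR region of the adjacent unit

    def mirror_write(self, offset, data):
        """Mirror a write made by the MPEG-2 unit into its AR region."""
        buf = self.buffers[self.ar]
        buf[offset:offset + len(data)] = data

    def read_border(self):
        """Data currently visible to the adjacent unit as its BR region."""
        return bytes(self.buffers[self.br])

    def end_of_frame(self, reference_updated):
        """Swap the mappings only if the reference image was updated."""
        if reference_updated:
            self.ar, self.br = self.br, self.ar


manager = LocalManagerModel()
manager.mirror_write(0, b"\x01\x02")
before = manager.read_border()
manager.end_of_frame(reference_updated=True)
after = manager.read_border()
print(before, after)
```

Swapping the mappings instead of copying the data means the adjacent unit sees the new border image as soon as the frame encoding process ends, while the other buffer becomes available to mirror the next update.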
The PES header generator 208 inserts the PES header in the video elementary stream. The PES header generator extracts the required information directly from the compressed stream. For the picture type information, the generator extracts this information from the picture header within the compressed bit stream. For the PTS value, the PES header generator computes the PTS value from the following information: picture type pt, GOP structure gops, and the input video STC timestamp, STCvi. The PES header generator latches the STCvi value using the vertical sync signal going into the MPEG-2 unit of regional processor #1.
The computational demand required to decode an ATSC compliant video stream is also significant. Thus, the common sequential decoding architecture used for standard MPEG-2 MP@ML decoding will not meet the demand. The scalable architecture of the present invention uses existing MPEG-2 decoding engines to decode ATSC DTV video streams.
An embodiment of a decoder system comprises four parallel regional decoders. The system requires that each decoder be capable of decoding a video frame size that is 2.5 times the NTSC format. This requirement is not unreasonable, since the known decoding algorithm is not demanding. As shown in FIG. 19, each decoder decodes a local region 500A, 500B, 500C, 500D of 1920 pixels by 270 lines within an ATSC frame 500 that is 1920 pixels by 1080 lines.
The block diagram of FIG. 20 shows the components of a decoder system 520 that includes a compressed stream demultiplexer 522, four parallel regional decoders 524A, 524B, 524C, 524D, reference frame stores 526A, 526B, 526C and a multiplexer 528.
The compressed stream demultiplexer 522 demultiplexes a compressed video stream 521 on the basis of slice header information to provide region streams 523A, 523B, 523C, 523D. Each regional decoder 524A, 524B, 524C, 524D decodes a compressed bit stream 523A, 523B, 523C, 523D that is fully compliant with MPEG-2, with the following two exceptions: the stream defines a frame that is 1920 pixels by 270 lines; the motion vectors extend beyond the vertical dimension by a maximum of 270 lines. In the preferred embodiment, an existing MPEG-2 decoder is modified so as to fulfill the noted exceptions spelled out for the regional decoder 524A, 524B, 524C, 524D.
The regional decoder must also address the decoding dependencies between adjacent regions. There exists one dependency that is critical to the decoding process: motion vector compensation. The motion vector compensation procedure utilizes pixel information from a reference image to create the pixels within the current macroblock. The reference image is an image created from previous I or P frames. Thus, as shown in FIG. 21, through motion compensation, the procedure reaches into regions beyond the local region to create the pixels within the current block. The maximum depth the procedure will reach is governed by the maximum length defined by the motion vectors.
Each regional decoder makes the reference image available for other decoders in order for the other decoders to correctly carry out the motion vector compensation procedure. The reference image is shared between adjacent regions. The assumption is that the maximum motion vector will not exceed the height of each region, which is 270 lines. This is a reasonable assumption, since no realistic compressed video sequences will generate motion vectors greater than 270 lines.
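The consequence of the 270-line limit can be checked with a short calculation: a block in one region, displaced vertically by at most 270 lines, can only touch that region and its immediate neighbors. The sketch below numbers the four regions 0 to 3 from top to bottom and clamps references to the frame; the numbering, the clamping and the function name regions_referenced are assumptions made for the example.

```python
REGION_LINES = 270
NUM_REGIONS = 4
FRAME_LINES = REGION_LINES * NUM_REGIONS  # 1080


def regions_referenced(region, first_line, block_height, mv_y):
    """Regions touched by the reference block of a motion-compensated block.

    `first_line` is the top line of the block within its own region and
    `mv_y` the vertical motion vector in lines (negative means upward).
    """
    if abs(mv_y) > REGION_LINES:
        raise ValueError("motion vector exceeds the assumed 270-line limit")
    top = region * REGION_LINES + first_line + mv_y
    bottom = top + block_height - 1
    # Assumption: references are clamped to the frame boundaries.
    top = max(top, 0)
    bottom = min(bottom, FRAME_LINES - 1)
    return list(range(top // REGION_LINES, bottom // REGION_LINES + 1))


print(regions_referenced(1, 0, 16, -8))
print(regions_referenced(1, 100, 16, 0))
print(regions_referenced(2, 254, 16, 270))
```

Because the result never extends beyond the adjacent regions, sharing the reference image only between neighboring decoders is sufficient under the stated assumption.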
The reference images are shared through the reference frame store as shown in FIG. 22. The regional decoders simultaneously write into two memory locations 530, 532, 534, 536: one (530, 536) for future access by the current decoder, and one (532, 534) for future access by the adjacent decoders. The embodiment resolves simultaneous reading by two decoders when performing the motion vector compensation routine.
The multiplexer 528 multiplexes the uncompressed frame regions back into a full frame. The multiplexer constructs an 8 bit digital data stream following the SMPTE 274M standard.

EQUIVALENTS
While this invention has been particularly shown and described with reference to preferred embodiments thereof, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims. Those skilled in the art will recognize or be able to ascertain, using no more than routine experimentation, many equivalents to the specific embodiments of the invention described specifically herein. Such equivalents are intended to be encompassed in the scope of the claims.

Claims

What is claimed is:

1. A video encoder comprising:
a processor for encoding an input stream of video images, each video image divided into regions that have overlapping portions, the processor encoding each region of a current video image in the stream according to an encoding process that includes motion compensation and storing a reference frame for each region in a memory based on a prior video image in the stream for use in the motion compensation of the encoding process; and a reference frame processor coupled to the memory for updating each reference frame with information from reference frames of adjacent regions to extend motion compensation into the adjacent regions;
wherein the encoded video images comprise macroblocks and the processor further comprises means for removing certain macroblocks from the encoded video images that correspond to the overlapping portions to provide an output video stream.

2. The video encoder of Claim 1 wherein the processor comprises plural regional processors for encoding the input stream of video images, each video image divided into regions that have overlapping portions, each processor encoding a particular region of the current video image in the stream according to an encoding process that includes motion compensation and storing a reference frame for each region in a local memory based on a prior video image in the stream for use in the motion compensation of the encoding process;
wherein the reference frame processor is coupled to the plural local memories for updating each reference frame with information from reference frames of adjacent regions.

3. The video encoder of Claim 2 wherein each regional processor further comprises means for concatenating the macroblock processed encoded video images with that of other regional processors to provide an output video stream.

4. The video encoder of Claim 2 wherein the encoding process for each region comprises MPEG-2 encoding with main profile at main level.

5. The video encoder of Claim 2 wherein the input stream is an ATSC compliant digital video stream.

6. The video encoder of Claim 2 wherein the input stream is a digital video stream compliant with MPEG-2 main profile at high level.

7. The video encoder of Claim 1 wherein the processor comprises plural regional processors for encoding the input stream of video images, each video image divided into regions that have overlapping portions, each regional processor comprising:
an image selection unit for selecting a particular image region from each of the video images;
a compression engine for compressing the selected image regions to provide a compressed image region stream comprising macroblocks according to an encoding process that includes motion compensation;
a local memory for storing a reference frame based on a prior compressed image region for use in the motion compensation of the encoding process;
a macroblock remover for removing certain macroblocks from the compressed image region stream that correspond to the overlapping portions;
a stream concatenation unit for concatenating the compressed image region stream with such streams from each regional processor to provide an output video stream;
wherein the reference frame processor is coupled to the plural local memories for updating each reference frame with information from reference frames of adjacent regions.

8. The video encoder of Claim 7 wherein the compression engine is an MPEG-2 main profile at main level engine.

9. A method of video encoding comprising the steps of:
providing an input stream of video images, each video image divided into regions that have overlapping portions, for each region:
encoding a particular region of a current video image in the stream according to an encoding process that includes motion compensation wherein the encoded video images include macroblocks;
storing a reference frame for each region in a local memory based on a prior video image in the stream for use in the motion compensation of the encoding process;
updating each reference frame with information from reference frames of adjacent regions to extend motion compensation into the adjacent regions; and removing certain macroblocks from the encoded video images that correspond to the overlapping portions to provide an output video stream.

10. The method of Claim 9 further comprising for each region:
selecting a particular image region from each of the video images;
compressing the selected image regions to provide a compressed image region stream comprising macroblocks according to an encoding process that includes motion compensation; and concatenating the compressed image region stream with other such streams to provide an output video stream.