WO2009047697A2 - System and method for error concealment - Google Patents

System and method for error concealment Download PDF

Info

Publication number
WO2009047697A2
Authority
WO
WIPO (PCT)
Prior art keywords
bitstream
decoder
current
data
memory
Prior art date
Application number
PCT/IB2008/054088
Other languages
French (fr)
Other versions
WO2009047697A3 (en)
Inventor
Stephane E. Valente
Hugues J. M. De Perthuis
Original Assignee
Nxp B.V.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nxp B.V. filed Critical Nxp B.V.
Publication of WO2009047697A2 publication Critical patent/WO2009047697A2/en
Publication of WO2009047697A3 publication Critical patent/WO2009047697A3/en

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/85 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression
    • H04N19/89 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression involving methods or arrangements for detection of transmission errors at the decoder
    • H04N19/895 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression involving methods or arrangements for detection of transmission errors at the decoder in combination with error concealment
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/44 Decoders specially adapted therefor, e.g. video decoders which are asymmetric with respect to the encoder
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/60 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
    • H04N19/61 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding in combination with predictive coding

Abstract

A method and a system conceal errors in video data transmission. An encoded current bitstream is received from a demodulator at a first decoder, which decodes the current bitstream. An encoded previous bitstream is received from a memory at a second decoder, which decodes the previous bitstream. When it is determined that the current bitstream does not include an error, the decoded current bitstream output from the first decoder is selected, and when it is determined that the current bitstream does include an error, the decoded previous bitstream output from the second decoder is selected. The second decoder may operate in parallel with the first decoder.

Description

System and method for error concealment
BACKGROUND OF THE INVENTION
Video display devices, such as computer monitors and televisions, receive video data through digital interfaces, such as a Digital Visual Interface (DVI) or a High-Definition Multimedia Interface (HDMI), typically through wired connections. However, the availability of ultra-wideband (UWB) wireless networks, such as those developed through the WiMedia Alliance, has made wireless transmission of video signals to display devices more practical. For example, uncompressed video for a 1920 x 1200 @ 60 Hz computer screen corresponds to a raw rate of about 5.0 Gbps, whereas a UWB wireless communications link operating over a few meters (e.g., less than about 10 meters) offers a bandwidth limited to a few hundred Mbps (e.g., about 50 to 500 Mbps). Compression is therefore used to reduce the amount of data, as well as the corresponding bandwidth needed for transmitting video. Generally, an image includes a frame that is divided into macroblocks (e.g., 16 x 16 pixels), on which compression may be performed, resulting in a bitstream of compressed data representing the original image. Macroblock compression schemes include, for example, the ITU-T H.264 standard, by the International Telecommunication Union, and the SMPTE 421M video codec (VC-1) standard, by Microsoft Corporation.
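As a rough check of the 5.0 Gbps figure above (a back-of-the-envelope sketch assuming three color components of 12 bits each, consistent with the 12-bit components discussed later in the description):

$$1920 \times 1200\ \text{pixels} \times 60\ \text{Hz} \times 3 \times 12\ \text{bits} \approx 4.98\ \text{Gbps} \approx 5.0\ \text{Gbps}.$$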
The H.264 standard, in particular, allows an image to be segmented into "slices," which are processed independently of one another. Fig. 1 depicts a representative image, divided into five slices 101-105. Each slice can include any number of macroblocks, as indicated by the differing sizes of the slices 101-105. Typically, a slice may include several lines of macroblocks. In accordance with the H.264 standard, each slice is a self-contained unit, which can be encoded and decoded separately. Therefore, if a transmission error occurs with respect to data representing one slice (e.g., slice 101), the remainder of the data representing the remainder of the slices (e.g., slices 102-105) can still be decoded successfully. In other words, data are lost only from the point at which the transmission error occurs to the beginning of the next slice. Therefore, the greater the number of slices into which an image is divided, the less data are lost when a transmission error occurs. In contrast, when a transmission error occurs in data representing an image frame that has not been subdivided into slices, the entire frame must be retransmitted instead of just a single slice (i.e., the slice in which the error occurs). However, because a slice is a self-contained unit, it cannot reference data, such as macroblocks, in other slices of the same frame. This may impact compression efficiency, since spatial prediction, e.g., based on the value of the upper and left rows of macroblocks, will not occur when these reference macroblocks belong to a different slice. Therefore, the number of image slices is a trade-off between error resilience and efficiency.
Error handling and compensating for lost data are particularly relevant with respect to wireless networks, which generally tend to be less reliable than wired communications networks, especially when transporting high bandwidth video data. Therefore, a video decoder that receives video data must be able to handle transmission packet errors. For example, when a data packet is lost, the decoder receives no data to decode at least a portion of the image, and must therefore interpolate pixels, for example, to fill in the missing macroblocks.
One error concealment scheme consists in copying pixels of a previously decoded picture, and substituting the copied pixels for the lost data. However, if the pixels of the last decoded picture have not been stored in a memory accessible to the decoder, e.g., due to memory restrictions, this error concealment scheme is not available. Accordingly, the lost packets cannot be adequately concealed, resulting in reduced quality of the video.
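For illustration, the pixel-copy concealment scheme described above amounts to the following minimal sketch; the array-based frame representation, the lost-pixel mask, and the function name are illustrative assumptions rather than details from the patent.

```python
import numpy as np

def conceal_by_copy(previous_frame: np.ndarray, damaged_frame: np.ndarray,
                    lost_mask: np.ndarray) -> np.ndarray:
    """Replace lost pixels with co-located pixels from the previously decoded
    frame; this only works if that decoded frame is still held in memory."""
    concealed = damaged_frame.copy()
    concealed[lost_mask] = previous_frame[lost_mask]
    return concealed
```

A decoder without a frame-sized pixel buffer has no previous frame to copy from, which is the limitation noted above.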
What is needed, therefore, is a system and method for error concealment that overcomes at least the shortcomings of known systems and methods described above.
SUMMARY OF THE INVENTION
In a representative embodiment, a system for concealing errors in data transmission includes a first decoder for receiving and decoding a current bitstream of video data and a second decoder for decoding data in at least one previously received bitstream of video data. The system also includes a controller for selecting one of the first decoder and the second decoder to output decoded data. Illustratively, the controller selects the first decoder to output the decoded data when the current bitstream has no errors, and the controller selects the second decoder to output the decoded data when the current bitstream has an error.
In another representative embodiment, a method is provided for concealing errors in transmission of data representing at least one image. The method includes receiving an encoded current bitstream from a demodulator at a first decoder; decoding the current bitstream; receiving an encoded previous bitstream from a memory at a second decoder; decoding the previous bitstream; determining whether the current bitstream includes an error; selecting the decoded current bitstream output from the first decoder when the current bitstream does not have an error; and selecting the decoded previous bitstream output from the second decoder when the current bitstream does have an error.
BRIEF DESCRIPTION OF THE DRAWINGS
The present teachings are best understood from the following detailed description when read with the accompanying drawing figures. The features are not necessarily drawn to scale. Wherever practical, like reference numerals refer to like features.
Fig. 1 is a diagram illustrating an image frame divided into slices.
Fig. 2 is a block diagram illustrating a system for concealing errors in data transmission, according to a representative embodiment.
Fig. 3 is a flowchart illustrating a method for concealing errors in data transmission, according to a representative embodiment.
DETAILED DESCRIPTION OF THE INVENTION
In the following detailed description, for purposes of explanation and not limitation, representative embodiments disclosing specific details are set forth in order to provide a thorough understanding of the present teachings. Descriptions of well-known devices, hardware, software, firmware, methods and systems may be omitted so as to avoid obscuring the description of the example embodiments. Nonetheless, such hardware, software, firmware, devices, methods and systems that are within the purview of one of ordinary skill in the art may be used in accordance with the representative embodiments.
Fig. 2 is a block diagram illustrating a system for concealing errors in video data transmission, according to a representative embodiment. The system includes a transmitting side system 250 configured to send video data over a wireless communications link 260 to a receiving side system 270. The wireless communications link 260 may be any type of wireless network capable of transmitting video data, such as a UWB wireless link according to WiMedia Alliance standards.
The transmitting side system 250 receives a video signal from a video source 205. The video source 205 may be an external source of images, originating for example in a cable modem, a digital television receiver, a satellite receiver, a digital video disk (DVD) player or the like. Likewise, the video source 205 may be an internal source of images, such as a personal computer generating high definition graphics. Regardless of the video source, the system transmits the received images over the wireless network 260 to the receiving side system 270, ultimately to be displayed on a display device 230, such as a digital television display or a computer monitor.
As discussed above, the image is segmented into slices (e.g., shown in Fig. 1), each of which includes a number of macroblocks (e.g., 16 x 16 pixels). The macroblock pixels are input to an encoder 212, which performs compression processing on the image data. In an illustrative embodiment, the encoder 212 includes a transform circuit, such as a Discrete Cosine Transform (DCT) circuit, for determining transform coefficients of the image data, a quantizer for quantizing the transform coefficients, and an encoder, such as an entropy encoder, for encoding the quantized coefficients. The encoding process may also include spatial or temporal prediction of the current signal. Other compression (and corresponding decompression) circuits may be incorporated, without departing from the spirit and scope of the present teachings.
In an embodiment, the encoder 212 compresses the video data on-the-fly for communication over the wireless communication link 260 using a macroblock compression scheme, such as the ITU-T H.264 standard, by the International Telecommunication Union, or the SMPTE 421M video codec (VC-1) standard, by Microsoft Corporation. Accordingly, the video data are processed at the macroblock level. Further, as discussed above, each image frame input to the encoder 212 is segmented into slices, which include macroblocks. Each slice is self-contained, so data from other slices or other image frames (e.g., for prediction purposes) does not have to be used in the compression/decompression processing of a slice.
Compression schemes typically involve various picture types, including intra (I) picture prediction, predicted (P) picture prediction and bi-directional (B) picture prediction. The P-picture and B-picture prediction types may also be referred to as inter-picture or temporal prediction because they use macroblocks from other images (e.g., previous images) to encode and decode data. Accordingly, P-picture and B-picture prediction types require large buffers to store the previous image data. For example, for a 1920 x 1200 pixel image with 12-bit components, a buffer would require 1920 x 1200 x 3 x 16 bits (assuming the 12-bit components are aligned on 16-bit memory locations), which is approximately 110 Mb or 13.8 MB. Therefore, the bandwidth required to store a current image and retrieve a previous image would be up to 1.658 GB/s (2 x 13.8 MB x 60). In fact, the bandwidth may actually be higher, as a practical matter, because of inherent inefficiencies of memories, especially when non-aligned prediction is used. Such a large bandwidth requires a large memory with a wide bus, such as a DDR2 random access memory with a 32-bit bus, which is too expensive for the consumer market.
In comparison, an I-picture is coded without reference to any picture other than itself, so storage of previous image data is not necessary for compression and decompression processing. I-picture prediction is intra-frame or spatial prediction, and works on the assumption that adjacent macroblocks within a frame have similar properties. In an embodiment, e.g., using the H.264 standard, only I-picture prediction is used for compressing the video data, eliminating the need for the large memory required by P-picture or B-picture prediction, and thus enabling a low cost implementation. Using only I-picture prediction, a macroblock of 16 x 16 pixels of a single image slice may be encoded and/or decoded by storing only the lines of pixels located along the left and top boundaries of the current macroblock. For example, when pixels arrive in scan line order, only one line of macroblocks needs to be in memory, plus one line of pixels. This relatively small amount of memory can be reasonably accommodated by an internal buffer.
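The reference-frame buffer and bandwidth figures quoted above for temporal prediction follow directly from the frame geometry (a sketch of the arithmetic, with each 12-bit component padded to a 16-bit memory location and a 60 Hz refresh rate assumed):

$$1920 \times 1200 \times 3 \times 16\ \text{bits} \approx 110.6\ \text{Mb} \approx 13.8\ \text{MB}, \qquad 2 \times 13.8\ \text{MB} \times 60\ \text{Hz} \approx 1.66\ \text{GB/s}.$$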
Furthermore, I-picture prediction enables a short latency period. For example, as soon as the lines of a macroblock are received (e.g., 16 lines of 16 pixels), compression begins. Also, I-picture prediction enables a constant quality to be provided by using a constant quantization parameter. For example, if there is enough bandwidth, variable bit rate (VBR) may be used rather than constant bit rate (CBR). In contrast, using temporal prediction might result in flickering on the display 230, and is overly cumbersome, especially when transmitting a mostly fixed image, with no (or few) changing features, from a computer, for example. Quality would differ between successive images and each time a new reference picture is used.
Referring again to Fig. 2, the output of the encoder 212 is connected to modulator 214, which modulates the encoded bitstream onto a carrier. For example, the encoded signal may be modulated according to a multi-band OFDM scheme for transmission over a UWB network. The modulated signal is amplified and sent to an antenna 218 for transmission over the wireless network 260. In an embodiment, the transmitting side system 250 may also include a controller or central processing unit (CPU) (not shown), which may determine whether current image data, including the current slice and/or macroblock, should be encoded and sent to the receiving side system 270, for example, based on whether the current image data simply repeats the previous image data. Likewise, the CPU may be able to identify portions of the current image data which change, and determine to send only the changing portions, thus increasing efficiency.
The signal is received by an antenna 228 of the receiving side system 270, and demodulated by the demodulator 224. The demodulated bitstream is sent to a first decoder 222, which is configured to decode current data, which is the video data bitstream received in real time. Decoding the bitstream includes, for example, performing entropy decoding (and demultiplexing), inverse quantization and inverse transform processing to obtain the corresponding video data (i.e., pixel data). Notably, in H.264, an approximation of the DCT may be used instead of the DCT.
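As a rough illustration of these inverse steps, the sketch below inverse-quantizes and inverse-transforms a single block. It uses a plain 8 x 8 floating-point DCT and a flat quantization step as stand-ins; H.264 itself specifies a small integer approximation of the DCT and context-adaptive entropy coding, so this is not the standard's exact arithmetic.

```python
import numpy as np
from scipy.fft import idctn  # separable type-II inverse DCT

def decode_block(quantized_coeffs: np.ndarray, qstep: float) -> np.ndarray:
    """Inverse-quantize and inverse-transform one block of coefficients."""
    dequantized = quantized_coeffs * qstep            # inverse quantization
    pixels = idctn(dequantized, norm="ortho")         # inverse transform
    return np.clip(np.rint(pixels), 0, 255).astype(np.uint8)

# Example: an 8 x 8 block of (stand-in) entropy-decoded coefficients.
coeffs = np.zeros((8, 8))
coeffs[0, 0] = 12.0                                   # DC term only: flat block
print(decode_block(coeffs, qstep=10.0))
```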
If the first decoder 222 encounters an error when attempting to decode the current bitstream, such as a dropped packet, the first decoder 222 is unable to decode the remaining portion of the current bitstream (e.g., the data representing the remaining macroblocks of the current slice). Accordingly, a second decoder 223, operating in parallel with the first decoder 222, decodes a reference bitstream, which is previously received encoded video data stored in the memory 221. The memory 221 may be a first-in-first-out (FIFO) memory, for example. Because the memory 221 stores encoded data, i.e., which is still compressed, the memory 221 may be relatively small, as compared to a memory capable of storing the corresponding decoded image data. The macroblocks from the reference bitstream, decoded by the second decoder 223, are then substituted for the macroblocks decoded from the erroneous bitstream received by the first decoder 222, as described below.
As stated above, the first decoder 222 and the second decoder 223 may operate in parallel. This is necessary, for example, for timely decoding of the video data, whether the current or reference bitstreams are used. However, there may be situations in which only one or the other decoder is actively decoding (current or stored) image data. For example, if the first decoder 222 does not receive any data for a predetermined period of time, e.g., when the decoder 222 is in FIFO underflow (typically several milliseconds (ms)), and the image is not yet complete, the second decoder 223 will function on its own to decode reference bitstreams to cover the period during which the first decoder 222 receives no data.
A controller 226, including a central processing unit (CPU), operates a switch 227 to select the output from either the first decoder 222 or the second decoder 223 as the decoded bitstream. The switch 227 may be a multiplexer, for example. Further, although the controller 226 is depicted as a separate element of the receiving side system 270, in various embodiments, the controller 226 and the associated functionality may be implemented as a CPU included in the first decoder 222, the second decoder 223 or both.
When the first decoder 222 successfully decodes the current bitstream, the switch 227 is set to select the output of the first decoder 222. When the first decoder 222 is not able to successfully decode the current video data, e.g., based on transmission errors, then the switch 227 is set to select the output of the second decoder 223. The selected video data are stored in a memory 229, which may be a FIFO memory, and output to a display 230. The display 230 may be a computer monitor, a television screen, or the like.
When the current demodulated data contains no errors, the bitstream may also be sent to the memory 221, where it is stored as the reference bitstream for subsequent decoding by the second decoder 223, when needed. As described above, the memory 221 provides previously received video data to the second decoder 223, which may be decoded by the second decoder 223 (e.g., in parallel with the first decoder 222 decoding the current bitstream) and substituted for the current bitstream when the current bitstream has errors. Accordingly, the stored video data should include relatively recent reference bitstreams, since more recent data are likely to be more similar to the current bitstream they are intended to replace. Therefore, the memory 221 may be periodically updated.
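A minimal sketch of the selection behavior of the controller 226 and switch 227 follows; the class, attribute, and function names are illustrative assumptions, not taken from the patent.

```python
from dataclasses import dataclass

@dataclass
class DecoderOutput:
    pixels: bytes        # decoded macroblock data
    had_error: bool      # whether the source bitstream carried an error

def switch_227(current: DecoderOutput, reference: DecoderOutput) -> bytes:
    """Forward the first decoder's output unless its bitstream had an error,
    in which case the second decoder's (reference) output is forwarded."""
    return reference.pixels if current.had_error else current.pixels
```

The error-free encoded bitstream would, in parallel, be written into a reference store of the kind sketched below.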
Because the current image has been segmented into slices, as discussed above, the previous image data stored in the memory 221 likewise represents image slices, generally corresponding to slices of the current image. Also, because the slices are self-contained, data from each slice stored in the memory 221 may be updated independently of the other slice data.
For example, in an embodiment, the controller 226 may monitor the incoming current bitstreams and determine when a current bitstream (or set of bitstreams) corresponding to a particular slice location is good. The current bitstream may then be stored (as a reference bitstream) in the memory 221, replacing an older, previously stored bitstream of the same slice. Accordingly, the image data stored in the memory 221 may include slice data from different consecutive images. Of course, an entire frame of the image data stored in the memory 221 may be replaced at one time, which is particularly efficient when the currently transmitted image has changed significantly from the previously transmitted image(s). The memory 221 may store additional reference bitstreams beyond what is necessary for representing a single frame, to account for variations in the structures of the frames.
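The slice-wise update policy described above can be pictured as a small keyed store, sketched below; the dictionary keyed by slice position and the method names are illustrative assumptions.

```python
from typing import Dict, Optional

class ReferenceBitstreamStore:
    """Holds still-compressed slice bitstreams (memory 221 in Fig. 2)."""

    def __init__(self) -> None:
        self._slices: Dict[int, bytes] = {}   # slice position -> encoded slice

    def update(self, slice_index: int, encoded_slice: bytes) -> None:
        # Called only for error-free slices; an older slice at the same
        # position is replaced, so the store may mix consecutive frames.
        self._slices[slice_index] = encoded_slice

    def reference_for(self, slice_index: int) -> Optional[bytes]:
        # Data stays compressed; the second decoder decodes it on demand.
        return self._slices.get(slice_index)
```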
Also, the sizes of the slices in different images may differ. For example, the mapping of the first half of an image may be changed from one slice to three slices, collectively including the same number of macroblocks, in the next consecutive image.
Therefore, the switching may be implemented at the level of the macroblocks, and updating previous image data stored in the memory 221 may include updating mapping determinations.
Fig. 3 is a flowchart illustrating a method for concealing errors in video data transmission, according to a representative embodiment. At step S310, the receiving side system 270 receives and demodulates the compressed video data bitstream from the wireless network 260 at the demodulator 224. The first decoder 222 receives the demodulated bitstream at step S312 and determines whether the bitstream includes errors, such as dropped packets. In alternative embodiments, errors in the received video data may be detected by various means. For example, the first decoder 222 may identify errors while it is decoding the bitstreams, sending a flag upon detection of an error. Alternatively, missing data may be identified, for example, in the transport layer, based on coherency bits of the bitstream, e.g., included in packet headers or headers indicating the beginning of each slice. The defective bitstream is associated with a slice by the controller 226, which controls the switch 227.
When the current demodulated bitstream does not have errors (S314 - No), the (encoded) bitstream may be used to replace a corresponding portion of the previous image data stored in the memory 221, discussed above, to be used for subsequent error concealment processing. Regardless of whether the bitstream is stored in the memory 221, the switch 227 selects the output of the first decoder 222 at step S316, and the decoded bitstream is stored in the FIFO memory 229 at step S318.
When the current demodulated bitstream does have errors (S314 - Yes), a reference bitstream, which is a correctly received bitstream of encoded data previously stored in the memory 221, is retrieved from the memory 221 at step S320 and sent to the second decoder 223 for decoding at step S322. In an embodiment, the memory 221 stores previously received encoded data representing image slices. In order to retrieve an appropriate substitution for the current macroblock bitstream that contains the error, a corresponding macroblock bitstream of the previously stored slices is selected for decoding, e.g., under the direction of the controller 226. Alternatively, depending on the speed of the decoder 222 and the real-time constraint, the second decoder 223 runs in parallel from the start and does not wait for an error to be detected before starting. This embodiment is useful in applications where the time needed to reach the macroblock at which the error occurred would otherwise be too long to ensure that the memory 229 has substantially no underflow. At step S324, the switch 227 selects the output of the second decoder 223 as the decoded bitstream. The decoded bitstream, which includes the decoded, previous image data, is stored in the FIFO memory 229 at step S326.
It is then determined whether a resynchronization point of the current bitstream has been reached. As stated above, each image slice is self-contained and otherwise independent of all other image slices. Therefore, when an error is detected in a macroblock of the current bitstream, the entire slice, from the point of error detection forward, must be replaced by previous slice data stored in the memory 221. Once the end of the current slice is detected, e.g., by the controller 226, the error detection process may begin again, on the next image slice, which is a resynchronization point. If a resynchronization point in the bitstream has not been reached (S328 - No), the receiving side system 270 continues to decode the previously stored data, corresponding to the macroblocks remaining in the current slice, through the second decoder 223, thereby repeating steps S320 through S326. When the next resynchronization point is detected (S328 - Yes), the process returns to step S310 to begin demodulating the next image slice, beginning with a current coded bitstream, i.e., corresponding to the first macroblock of the next slice.
At step S330, the controller 226 determines whether the transmission of image data from the transmitting side system 250 is complete. When the transmission is not complete, and there is additional data to be received (step S330 - No), the process returns to step S310 to demodulate the next current coded bitstream, i.e., corresponding to a next macroblock. The next macroblock may or may not be included in the same slice as the immediately preceding decoded bitstream. Steps S310 through S330 are repeated until the transmission is complete (step S330 - Yes). The decoded image data stored in the memory 229 is displayed on the display 230 as decoded image frames are completed. Alternatively, depending on the size of the memory 229, the image may be provided directly on the display 230.
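The overall control flow of Fig. 3 can be summarized as in the sketch below; the demodulator, decoder, and store objects, their method names, and the per-portion attributes are placeholders assumed for illustration rather than elements of the patent.

```python
def receive_and_conceal(demodulator, decoder_222, decoder_223,
                        reference_store, display_fifo_229):
    """Follow steps S310-S330: decode the current bitstream when it is clean,
    otherwise conceal with the stored reference until the next resync point."""
    concealing = False
    while demodulator.has_data():                              # S330
        portion = demodulator.demodulate_next()                # S310
        if concealing and portion.is_resync_point:             # S328: next slice
            concealing = False
        if concealing or portion.has_error:                    # S314 - Yes
            concealing = True
            reference = reference_store.reference_for(portion.slice_index)  # S320
            display_fifo_229.append(decoder_223.decode(reference))  # S322-S326
        else:                                                  # S314 - No
            reference_store.update(portion.slice_index, portion.payload)
            display_fifo_229.append(decoder_222.decode(portion))    # S312, S316-S318
```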
Generally, according to representative embodiments, the concealment of errors due to lost data packets is performed in the compressed domain. In particular, the first and second decoders 222 and 223 operate in parallel, such that the first decoder 222 decodes a currently received data stream, while the second decoder 223 decodes a previously encoded (and received) data stream.
When the current bitstream is received by the first decoder 222 without errors, the current macroblock corresponding to the received bitstream is reconstructed entirely by the first decoder 222. However, if the first decoder 222 detects an error in the current bitstream with respect to the current macroblock, the decoding process switches to the second decoder 223, which decodes the last correctly received bitstream, in place of the current bitstream, and continues to do so until the next resynchronization point. At that point, the first decoder 222 resumes the decoding process with respect to the current bitstream (which now includes data of the next slice).
Although the present teachings have been described in detail with reference to particular embodiments, persons possessing ordinary skill in the art to which the present teachings pertain will appreciate that various modifications and enhancements may be made without departing from the spirit and scope of the claims that follow. Also, the various devices and methods described herein are included by way of example only and not in any limiting sense.

Claims

CLAIMS:
1. A system for concealing errors in data transmission, comprising: a first decoder for receiving and decoding a current bitstream of video data; and a second decoder for decoding data in at least one previously received bitstream of video data; and a controller for selecting one of the first decoder and the second decoder to output decoded data, wherein the controller selects the first decoder to output the decoded data when the current bitstream has no errors, and the controller selects the second decoder to output the decoded data stream when the current data stream has an error.
2. The system according to claim 1, wherein the first decoder and the second decoder operate in parallel.
3. The system according to claim 2, wherein when the controller selects the second decoder, the second decoder continues to output the decoded data until a resynchronization point of the current bitstream is received.
4. The system according to claim 3, wherein the controller selects the first decoder to output the decoded data after the resynchronization point of the current bitstream is received.
5. The system according to claim 1, wherein the at least one previous bitstream comprises a reference bitstream.
6. The system according to claim 5, further comprising: a memory for storing the reference bitstream.
7. The system according to claim 6, wherein the controller periodically updates the reference bitstream in the memory using at least a portion of the data from the current bitstream.
8. The system according to claim 7, wherein the controller updates the reference bitstream in the memory after the portion of the current bitstream has been successfully received.
9. The system according to claim 7, wherein the controller updates the reference bitstream in the memory after an entire image, at least a portion of which is represented by the current bitstream, has been successfully received.
10. A method for concealing errors in transmission of data representing at least one image, the method comprising: - receiving an encoded current bitstream from a demodulator at a first decoder; decoding the current bitstream; receiving an encoded previous bit stream from a memory at a second decoder; decoding the previous bitstream; determining whether the current bitstream includes an error; - selecting the decoded current bitstream output from the first decoder when the current bitstream does not have an error; and selecting the decoded previous bitstream output from the second decoder when the current bitstream does have an error.
11. The method according to claim 10, wherein decoding the current bitstream and decoding the previous bitstream occur substantially simultaneously.
12. The method according to claim 10, wherein decoding the previous bitstream occurs after the current bitstream is determined to include the error.
13. The method according to claim 10, further comprising: storing at least a portion of the current bitstream in the memory when the current bitstream is determined not to include an error, the portion of the current bitstream replacing a corresponding portion of the previous bitstream.
14. The method according to claim 10, wherein the at least one image comprises a plurality of slices, the current bitstream representing data from a portion of a first slice of the plurality of slices.
15. The method according to claim 14, further comprising: continuing to select the decoded previous bitstream output from the second decoder when the current bitstream does have the error, until a resynchronization point in the current bitstream is detected.
16. The method according to claim 15, further comprising: selecting the decoded current bitstream output from the first decoder after the resynchronization point is detected.
17. The method according to claim 16, wherein the resynchronization point comprises an indication of a beginning of a second slice of the plurality of slices.
18. The method according to claim 10, wherein the memory comprises a first-in- first-out memory.
19. A method for concealing errors in transmission of video data, comprising: decoding a current bitstream, representing a portion of an image, received by a first decoder from a transmitter via a wireless communications link; decoding a previously stored bitstream, substantially corresponding to the portion of the image, received by a second decoder from a memory; and substituting at least a portion of the decoded previously stored bitstream for at least a portion of the current bitstream when the current bitstream is determined to have an error.
20. The method according to claim 19, wherein the current bitstream received via the wireless communications link is encoded according to an H.264 standard.
PCT/IB2008/054088 2007-10-08 2008-10-06 System and method for error concealment WO2009047697A2 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
EP07291226 2007-10-08
EP07291226.4 2007-10-08

Publications (2)

Publication Number Publication Date
WO2009047697A2 true WO2009047697A2 (en) 2009-04-16
WO2009047697A3 WO2009047697A3 (en) 2009-06-11

Family

ID=40481714

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IB2008/054088 WO2009047697A2 (en) 2007-10-08 2008-10-06 System and method for error concealment

Country Status (1)

Country Link
WO (1) WO2009047697A2 (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0731422A2 (en) * 1995-03-09 1996-09-11 Mitsubishi Denki Kabushiki Kaisha Moving picture decoding circuit
US5724369A (en) * 1995-10-26 1998-03-03 Motorola Inc. Method and device for concealment and containment of errors in a macroblock-based video codec
WO2004066706A2 (en) * 2003-01-28 2004-08-12 Thomson Licensing S.A. Robust mode staggercasting storing content
US20070217519A1 (en) * 2006-03-01 2007-09-20 Mitsubishi Electric Corporation Moving image decoding apparatus and moving image coding apparatus

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130034151A1 (en) * 2011-08-01 2013-02-07 Apple Inc. Flexible codec switching
US9635374B2 (en) * 2011-08-01 2017-04-25 Apple Inc. Systems and methods for coding video data using switchable encoders and decoders
GB2515362A (en) * 2013-12-16 2014-12-24 Imagination Tech Ltd Decoding frames
GB2515362B (en) * 2013-12-16 2015-12-09 Imagination Tech Ltd Decoding frames
US9681146B2 (en) 2013-12-16 2017-06-13 Imagination Technologies Limited Decoding frames
US10142644B2 (en) 2013-12-16 2018-11-27 Imagination Technologies Limited Decoding frames

Also Published As

Publication number Publication date
WO2009047697A3 (en) 2009-06-11

Similar Documents

Publication Publication Date Title
ES2267498T3 (en) VIDEO CODING
US7215712B2 (en) Systems and methods for decoding of partially corrupted reversible variable length code (RVLC) intra-coded macroblocks and partial block decoding of corrupted macroblocks in a video decoder
KR100240176B1 (en) Video signal decompression apparatus for independently compressed even and odd field data
US5231486A (en) Data separation processing in a dual channel digital high definition television system
JP3801984B2 (en) Multicast transmission system including bandwidth scaler
JP3630474B2 (en) Moving picture transmission system and moving picture transmission apparatus
CN1717935B (en) I-picture insertion on request
JP4820559B2 (en) Video data encoding and decoding method and apparatus
TW201347516A (en) Transmission of video utilizing static content information from video source
JPH06505378A (en) hdtv compression system
EP1127467A1 (en) Error concealment in a video signal
US20060120449A1 (en) Method of coding and decoding moving picture
US6983016B2 (en) Method for detecting errors in video information
US20110299605A1 (en) Method and apparatus for video resolution adaptation
Ducla-Soares et al. Error resilience and concealment performance for MPEG-4 frame-based video coding
JP5784823B2 (en) On-demand intra-refresh for end-to-end coded video transmission systems
WO2009047697A2 (en) System and method for error concealment
US8774273B2 (en) Method and system for decoding digital video content involving arbitrarily accessing an encoded bitstream
KR100363550B1 (en) Encoder and decoder in a wireless terminal for retransmitting a moving picture
CN110312159B (en) Signal receiving device, streaming media transmission system and streaming media transmission method
JP2003284064A (en) Image receiving equipment and image display control method
WO2009047692A2 (en) Method and system for communicating compressed video data

Legal Events

Date Code Title Description
NENP Non-entry into the national phase in:

Ref country code: DE

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 08807900

Country of ref document: EP

Kind code of ref document: A2

122 Ep: pct application non-entry in european phase

Ref document number: 08807900

Country of ref document: EP

Kind code of ref document: A2