WO2003063495A2

WO2003063495A2 - Scalable video communication

Info

Publication number: WO2003063495A2
Application number: PCT/EP2003/000523
Authority: WO
Inventors: Catherine Mary Dolbear; Paola Marcella Hobson
Original assignee: Motorola Inc
Priority date: 2002-01-24
Filing date: 2003-01-20
Publication date: 2003-07-31
Also published as: AU2003206758A1; GB2384638B; GB0201589D0; GB2384638A; WO2003063495A3

Abstract

A method for improving a quality of a scalable video sequence communicated over an error-prone network (500). The video sequence includes at least one base layer and at least one enhancement layer comprising a number of inter-coded macroblocks. The method includes the step of replacing (560) at least one inter-coded macroblock with an intra-coded macroblock in one or more enhancement layers of the video sequence. Replacing selected inter-coded macroblocks with intra-coded macroblocks in one or more enhancement layers provides improved video picture quality in an error-prone communication environment such as mobile or wireless communication systems or non-guaranteed quality of service environments such as the internet. The technique limits propagation of errors within the enhancement layer(s) of the video sequence.

Description

Scalable Video Communication

Field of the Invention

This invention relates to video transmission systems and video encoding/decoding techniques. The invention is applicable to, but not limited to, a method for protecting enhancement layer predicted (EP) pictures in a scalable video compression system that may be subject to errors.

Background of the Invention

In the field of video technology, it is known that video is transmitted as a series of still images/pictures. Since the quality of a video signal can be affected during coding or compression of the video signal, it is known to include additional information or 'layers', based on a difference between the video signal and the encoded video bit stream. The inclusion of additional layers enables the quality of the received signal, following decoding and/or decompression, to be enhanced. Hence, a hierarchy of base pictures and enhancement pictures, partitioned into one or more layers, is used to produce a layered video bit stream.

A scalable video bit-stream refers to the ability to transmit and receive video signals of more than one resolution and/or quality simultaneously. A scalable video bit-stream is one that may be decoded at different rates, according to the bandwidth available at the decoder. This enables the user with access to a higher bandwidth channel to decode high quality video, whilst a lower bandwidth user is still able to view the same video, albeit at a lower quality. The main application for scalable video transmission is where multiple decoders, with access to differing bandwidths, are receiving images from a single encoder.

In a layered (scalable) video bit stream, enhancements to the video signal may be added to a base layer either by:

(i) Increasing the resolution of the picture (spatial scalability);

(ii) Including error information to improve the Signal to Noise Ratio of the picture (SNR scalability);

(iii) Including extra pictures to increase the frame rate (temporal scalability); or

(iv) Providing a continuous enhancement that may be truncated at any chosen bit rate (Fine Granular Scalability).

Such enhancements may be applied to the whole picture or to an arbitrarily shaped object within the picture, which is termed object-based scalability. In order to preserve the disposable nature of the temporal enhancement layer, the ITU H.263+ standard [ITU-T Recommendation, H.263, "Video Coding for Low Bit Rate Communication"] states that pictures included in the temporal scalability mode should be bi-directionally predicted (B) pictures, as shown in the video stream of FIG. 1.

FIG. 1 shows a schematic illustration of a scalable video arrangement 100 illustrating B picture prediction dependencies, as known in the field of video coding techniques. An initial intra-coded frame (I₁) 110 is followed by a bi-directionally predicted frame (B₂) 120. This, in turn, is followed by a (uni-directional) predicted frame (P₃) 130, and again followed by a second bi-directionally predicted frame (B₄) 140. This again, in turn, is followed by a (uni-directional) predicted frame (P₅) 150, and so on.
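The prediction dependencies of FIG. 1 can be written down as a small graph. The following Python sketch is illustrative only: the picture labels follow FIG. 1, whereas the dictionary layout and the helper names are assumptions introduced here. It also assumes that each B picture is predicted from the nearest earlier and later I or P picture.

```python
from graphlib import TopologicalSorter

# Reference pictures for each picture of FIG. 1 (assumed: each B picture
# uses the nearest earlier and later I/P picture as its references).
REFERENCES = {
    "I1": [],
    "B2": ["I1", "P3"],
    "P3": ["I1"],
    "B4": ["P3", "P5"],
    "P5": ["P3"],
}

def decode_order(references):
    """Return one order in which every picture follows its references."""
    return list(TopologicalSorter(references).static_order())

def is_disposable(picture, references):
    """A picture is disposable if no other picture is predicted from it."""
    return all(picture not in refs for refs in references.values())

order = decode_order(REFERENCES)
disposable = [p for p in REFERENCES if is_disposable(p, REFERENCES)]
print(order)       # any order in which P3 precedes B2 and P5 precedes B4
print(disposable)  # the B pictures
```

Under these assumptions only the B pictures are disposable, since no other picture is predicted from them, which is why they suit the temporal enhancement layer.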

As an enhancement to the arrangement of FIG. 1, a layered video bit stream may be used. FIG. 2 is a schematic illustration of a layered video arrangement, known in the field of video coding techniques. A layered video bit stream includes a base layer 205 and one or more enhancement layers 235.

The base layer (layer-1) includes one or more intra-coded pictures (I pictures) 210 sampled, coded and/or compressed from the original video signal pictures. Furthermore, the base layer will include a plurality of subsequent predicted inter-coded pictures (P pictures) 220, 230 predicted from the intra-coded picture(s) 210. Inter-coded pictures are encoded to include the changes between the current picture and a previous picture.

In the enhancement layers (layer-2 or layer-3 or higher layer(s)) 235, three types of picture may be used:

(i) Bi-directionally predicted (B) pictures (not shown);

(ii) A first enhanced intra-coded (EI) picture 240 based on the intra-coded picture(s) 210 of the base layer 205, or the current lower layer enhancement picture if more than one enhancement layer is used; and

(iii) Enhanced predicted (EP) pictures 250, 260, based on the inter-coded predicted pictures 220, 230 of the base layer 205.

EP pictures 250, 260 contain macroblocks that are predicted from either the current lower layer picture or the previous picture within the same enhancement layer.

In video coding systems using ITU H.263 [ITU-T Recommendation, H.263, "Video Coding for Low Bit Rate Communication"] video compression technology, the image data are compressed in macroblocks (MBs). These MBs comprise four luma and two chroma blocks of 8x8 pixels. This definition of macroblock is also used in the MPEG family of standards.
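As a worked illustration of this macroblock definition, the short Python sketch below counts the macroblocks in a frame. It assumes the four luma blocks are arranged 2x2, so that one macroblock covers 16x16 luma pixels, and uses the QCIF frame size of 176x144 pixels; the function names are hypothetical.

```python
BLOCK_SIZE = 8             # each block is 8x8 pixels
LUMA_BLOCKS_PER_MB = 4     # assumed 2x2 layout: one MB spans 16x16 luma pixels
CHROMA_BLOCKS_PER_MB = 2   # one block for each of the two chroma components

MB_LUMA_SIDE = 2 * BLOCK_SIZE  # 16

def macroblocks_per_frame(width, height):
    """Number of macroblocks covering a frame of the given luma size."""
    if width % MB_LUMA_SIDE or height % MB_LUMA_SIDE:
        raise ValueError("frame dimensions must be multiples of 16")
    return (width // MB_LUMA_SIDE) * (height // MB_LUMA_SIDE)

def samples_per_macroblock():
    """Total luma plus chroma samples carried by one macroblock."""
    blocks = LUMA_BLOCKS_PER_MB + CHROMA_BLOCKS_PER_MB
    return blocks * BLOCK_SIZE * BLOCK_SIZE

print(macroblocks_per_frame(176, 144))  # QCIF: 11 x 9 = 99 macroblocks
print(samples_per_macroblock())         # 6 blocks x 64 samples = 384
```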

The vertical arrows from the lower, base layer illustrate that the picture in the enhancement layer is predicted from a reconstructed approximation of that picture in the reference (lower) layer.

If prediction is only formed from the lower layer, then the enhancement layer picture is referred to as an EI picture. It is possible, however, to create a modified bi-directionally predicted picture using both a prior enhancement layer picture and a temporally simultaneous lower layer reference picture. This type of picture is referred to as an EP picture or "Enhancement" P-picture. The prediction flow for EI and EP pictures is shown in FIG. 2. Although not specifically shown in FIG. 2, an EI picture in an enhancement layer may have a P picture as its lower layer reference picture, and an EP picture may have an intra-coded picture as its lower-layer reference picture.

For both EI and EP pictures, the prediction from the reference layer uses no motion vectors. However, as with normal P pictures, EP pictures use motion vectors when predicting from their temporally prior reference picture in the same layer.

Current standards incorporating the aforementioned scalability techniques include MPEG-4 and H.263. These standards create highly compressed bit-streams, which represent the coded video. However, due to the use of high compression, the bit-streams are very prone to corruption by network errors during the transmission process. For example, in the case of streaming video over an error prone network, even with existing network level error resilience tools employed, it is inevitable that some bit-level corruption will occur in the bit-stream. Hence, errors are passed on to the decoder.

When transmitting packets containing scalable video data over an error-prone network, it is desirable that the packets containing the base layer data are given a higher priority and that they are encoded with more error-protection than packets containing enhancement layer data. This lower level of error-protection for enhancement layer data means that enhancement layer data is more likely to be affected by errors. Apart from the first enhancement picture, all SNR or spatial enhancement pictures (EP pictures) are predicted, either from the current lower layer picture or from the previous picture in the same enhancement layer. Clearly, if an error affects the picture that they have been predicted from, they too will contain errors. Hence, enhancement layer data is more susceptible to error propagation.
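The propagation described above can be traced on the layered arrangement of FIG. 2. The Python sketch below is a picture-level, worst-case illustration: the keys are the reference numerals of FIG. 2, the assumption is made that P picture 230 is predicted from P picture 220, and the function name is hypothetical.

```python
# Reference pictures for each picture of FIG. 2 (reference numerals as keys).
REFERENCES = {
    210: [],          # base layer I picture
    220: [210],       # base layer P picture
    230: [220],       # base layer P picture (assumed predicted from 220)
    240: [210],       # enhancement layer EI picture
    250: [220, 240],  # EP picture: lower layer and previous enhancement picture
    260: [230, 250],  # EP picture
}

def affected_pictures(corrupted, references):
    """Return every picture that is corrupted or predicted, directly or
    indirectly, from a corrupted picture."""
    affected = set(corrupted)
    changed = True
    while changed:
        changed = False
        for picture, refs in references.items():
            if picture not in affected and any(r in affected for r in refs):
                affected.add(picture)
                changed = True
    return affected

print(sorted(affected_pictures({240}, REFERENCES)))  # [240, 250, 260]
print(sorted(affected_pictures({220}, REFERENCES)))  # [220, 230, 250, 260]
```

In this model an error in the EI picture 240 reaches every later EP picture, and an error in the base layer picture 220 reaches both layers.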

In the base layer, macroblocks in the P pictures can be intra-coded at regular intervals (for example, H.263 requires every macroblock to be intra-coded at least once every one hundred and thirty-two frames), to reduce error propagation. In the enhancement layer, certain macroblocks in EP pictures can be predicted solely from the base layer picture, to reduce the impact of error propagation.

Error resilience methods described in the standards allow re-synchronisation markers to be added to the bit stream, so that if an error corrupts the bit stream, earlier recovery of error-free data can be achieved. Data partitioning, whereby the header, motion vector and texture information are separated, also contributes to error resilience. The MPEG4 standard allows for reversible Variable Length Codes, so that if an error corrupts the start of the code, it can be decoded from the end instead of the start, thereby increasing the recovery of error-free data that would ordinarily be discarded.

However, the aforementioned approaches have the disadvantage that once an error has corrupted an enhancement layer, there is still no known method of preventing its propagation into subsequent pictures, if the base layer is also affected by errors. Furthermore, a current known technique, of introducing an EI macroblock, will not solve this problem, as it is possible that the macroblock has been predicted from a base layer macroblock also in error.

A need therefore exists for improved error resilience in a scalable video transmission system, wherein the abovementioned disadvantages may be alleviated. Prior art arrangements are known for non-scalable video systems. See for example EP-A-0798938, EP-A-0633699, EP-A-0536630 and WO-A-9811502.

Statement of Invention

The present invention provides a method for improving a quality of a scalable video sequence communicated over an error-prone network, see claim 1, a video communication system, as claimed in claim 10, a video communication unit, as claimed in claim 18, a video encoder, as claimed in claim 19, a video decoder, as claimed in claim 20, a mobile radio device, as claimed in claim 21, and a storage medium storing processor-implementable instructions, as claimed in claim 23. Further aspects of the present invention are as claimed in the dependent claims.

In summary, an apparatus and a method for improving the quality of scalable video enhancement layers transmitted over an error-prone network are described. The apparatus and method use periodic replacement of inter-coded macroblocks (MBs) in each EP or EI picture within an enhancement layer of a video sequence with intra-coded MBs. In this manner, propagation of errors within the enhancement layer(s) can be reduced.

Brief Description of the Drawings

FIG. 1 is a schematic illustration of a video coding arrangement showing picture prediction dependencies, as known in the field of video coding techniques.

FIG. 2 is a schematic illustration of a known layered video coding arrangement.

Exemplary embodiments of the present invention will now be described, with reference to the accompanying drawings, in which:

FIG. 3 is a block diagram of a video communication unit, adapted to introduce substantially periodic intra-coded MBs in an enhancement layer of a video sequence, in accordance with the preferred embodiment of the present invention.

FIG. 4 is a schematic representation of a scalable video communication system adapted to introduce substantially periodic intra-coded MBs in an enhancement layer of a video sequence in accordance with the preferred embodiment of the present invention.

FIG. 5 is a flowchart illustrating the preferred method and parameters used when introducing substantially periodic intra-coded MBs in an enhancement layer of a video sequence.

Description of Preferred Embodiments

Referring now to FIG. 3, a block diagram of a video subscriber unit 300, adapted to support the inventive concepts of the preferred embodiments of the present invention, is shown.

The preferred embodiment of the present invention is described with respect to a wireless video communication unit, for example one capable of operating in the 3rd Generation Partnership Project (3GPP) standard for future cellular wireless communication systems. However, it is within the contemplation of the invention that the inventive concepts herein described are equally applicable for other wireless or fixed communication units capable of video transmissions. The video subscriber unit 300 contains an antenna 302 preferably coupled to a duplex filter, antenna switch or circulator 304 that provides isolation between receive and transmit chains within the video subscriber unit 300.

The receiver chain includes receiver front-end circuitry 306, effectively providing reception, filtering and intermediate or base-band frequency conversion. The front-end circuit 306 receives video signal transmissions from another communication unit, for example its associated Node B or base transceiver station (BTS). The front-end circuit 306 is serially coupled to a signal processing function 308, generally realised by a digital signal processor (DSP). The signal processing function 308 performs signal demodulation, error correction and formatting, and recovers the video data bit-stream. Recovered data from the signal processing function 308 is serially coupled to a video codec function 309, the operation of which is further described with respect to FIG. 4.

As known in the art, received signals that have been decoded by the video codec function 309 are typically input to a baseband processing device 310, which takes the video decoded information received from the video codec function 309 and formats it in a suitable manner to send to a video display 311.

In different embodiments of the invention, the signal processing function 308, video codec function 309 and baseband processing function 310 may be provided within the same physical device. A controller 314 is configured to control the information flow and operational state of the elements of the subscriber unit 300.

In accordance with the preferred embodiment of the invention, in the video subscriber unit 300, the video codec function 309 in the receive path (i.e. the decoder) has been adapted to interpret information transmitted from the encoding unit. The video codec function also identifies within the bit-stream of the received video signal which macroblocks within the enhancement layer have been periodically intra-coded. The decoder has been further adapted by providing parameter information back to the encoder, wherein the parameter information relates to a received video sequence, such as the received signal level, received bit error rate, etc.

As regards the transmit chain, this essentially includes a video input device 320 coupled in series through a baseband processor 310, video codec function 309, signal processing function 308, transmitter/modulation circuitry 322 and a power amplifier 324. The processor 308, transmitter/modulation circuitry 322 and the power amplifier 324 are operationally responsive to the controller, with an output from the power amplifier coupled to the duplex filter, antenna switch or circulator 304, as known in the art.

The transmit chain in the video subscriber unit 300 takes the video input bit-stream from the video input device 320 and the video codec function 309 encodes the video bit-stream into scalable layers as described in greater detail with respect to FIG. 4. The video encoded information is then passed to the signal processor where it is formatted to include, for example, error protection, for subsequent modulation and transmission by transmit/modulation circuitry 322 and power amplifier 324.

In the preferred embodiment of the present invention, the timer/counter 318 for encoding functions is also adapted to count numbers of macroblocks in a video transmission sequence such that intra-coded macroblocks can be periodically introduced. Furthermore, the encoder syntax has been changed so that the replacing of an inter-coded MB with an intra-coded MB is signalled to the decoder.

The scalable video encoder is adapted such that it interprets feedback information received from the video decoder, for example via a return channel or other feedback device which may support the Real-Time Control Protocol (RTCP) about the received signal strength and/or received error rate. The scalable video encoder is preferably also adapted such that it deduces likely error conditions on the communication link, based on such parameter information.

In response to these feedback conditions, and other parameters that will be subsequently described, the encoder will replace certain inter-coded MBs with intra-coded MBs, which are calculated directly from the enhancement layer data. Appearance of these MBs will be signalled in the bit-stream by a change to the MB header, such that the decoder may recognise when an intra MB occurs.

The signal processor function 308, video codec function 309 and baseband processing function 310 in the transmit chain may be implemented as distinct from the corresponding functions in the receive chain. Alternatively, a single processor may be used to implement corresponding processing operations of both transmit and receive signals, as shown in FIG. 3. In a yet further alternative embodiment, the signal processor function 308, video codec function 309 and baseband processing function 310, for both transmit and receive chains, may be combined into a single processor, for example a digital signal processor (DSP).

Of course, the various components within the video subscriber unit 300 can be realised in discrete or integrated component form, with an ultimate structure therefore being merely an arbitrary selection.

Referring next to FIG. 4, a schematic representation of the primary functions of a video communication system 400 is shown. The video communication system includes a video encoder 415 and video decoder 425, adapted to incorporate the preferred embodiment of the present invention. In FIG. 4, a video picture F₀ is compressed 410 in a video encoder 415 to produce the base layer bit stream signal to be transmitted at a rate r₁ kilobits per second (kbps). This signal is decompressed 420 at a video decoder 425 to produce the reconstructed base layer picture F₀'.

The compressed base layer bit stream is also decompressed at 430 in the video encoder 415 and compared with the original picture F₀ at 440 to produce a difference signal 450. This difference signal is compressed at 460 and transmitted as the enhancement layer bit stream at a rate r₂ kbps. This enhancement layer bit stream is decompressed at 470 in the video decoder 425 to produce the enhancement layer picture F₀'' which is added to the reconstructed base layer picture F₀' at 480 to produce the final reconstructed picture F₀'''.
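The two-layer signal flow of FIG. 4 can be illustrated numerically. The Python sketch below is a minimal model under stated assumptions: uniform quantisation stands in for the compression functions 410 and 460, a picture is reduced to a short list of sample values, and the step sizes and function names are hypothetical. The comments give the corresponding reference numerals of FIG. 4.

```python
def compress(samples, step):
    """Stand-in for a lossy codec: uniform quantisation to integer levels."""
    return [round(s / step) for s in samples]

def decompress(levels, step):
    """Inverse of compress(), up to the quantisation error."""
    return [level * step for level in levels]

def encode(f0, base_step, enh_step):
    base_bits = compress(f0, base_step)                 # 410
    f0_base = decompress(base_bits, base_step)          # 430
    difference = [a - b for a, b in zip(f0, f0_base)]   # 440 -> 450
    enh_bits = compress(difference, enh_step)           # 460
    return base_bits, enh_bits

def decode(base_bits, enh_bits, base_step, enh_step):
    f0_base = decompress(base_bits, base_step)          # 420 -> F0'
    f0_enh = decompress(enh_bits, enh_step)             # 470 -> F0''
    f0_final = [a + b for a, b in zip(f0_base, f0_enh)] # 480 -> F0'''
    return f0_base, f0_final

f0 = [12, 47, 130, 201]
base_bits, enh_bits = encode(f0, base_step=16, enh_step=2)
f0_base, f0_final = decode(base_bits, enh_bits, base_step=16, enh_step=2)
print("base layer only:   ", f0_base)
print("base + enhancement:", f0_final)
```

A decoder that receives only the base layer stops at F₀', whereas one that also receives the enhancement layer obtains the closer approximation F₀'''.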

In accordance with the preferred embodiment of the present invention, the compression function 460 in the video encoder 415 has been adapted to enable one or more intra-coded macroblocks to replace inter-coded macroblocks in one or more of the enhancement layer bit-streams.

Furthermore, the decompression function 470 in the video decoder 425 has been adapted to recognise the location(s) of the incorporated intra-coded macroblocks in the enhancement layer bit-stream, when it would ordinarily expect inter-coded macroblocks. The decompression function 470 is able to do this in response to signalling information within the bit-stream of the video signal received from the video encoder 415, preferably in the MB header.

In accordance with the preferred embodiment of the present invention, the decision on which macroblocks are selected for periodic intra-coding is preferably made in response to a number of parameters, for example radio signal strength indication (RSSI) measurements in the area of the video communication unit. The video encoder has been adapted to interpret feedback information, such as RSSI information or received error rates, transmitted from the video decoder 425 on a return channel 490. The return channel 490 may comprise any feedback device, for example one supporting RTCP.

The video encoder 415 then utilises this information in determining which enhancement layer inter-coded MBs to replace with intra-coded MBs in the encoded video sequence. The use of such incorporated intra-coded macroblocks in the enhancement layer bit-stream is further described with regard to the flowchart of FIG. 5.

Referring now to FIG. 5, a flowchart 500 illustrates the preferred method for deciding which macroblock (MB) frames are to be replaced with intra-coded frames in a substantially periodic manner.

A 'FrequencyThreshold' parameter is used to control the periodicity of the forced intra-coded MB frames. It is within the contemplation of the present invention that the FrequencyThreshold parameter may be user defined, via an input port of the communication device. Alternatively, the FrequencyThreshold parameter may be fixed, for example by setting the parameter value to be '132' during manufacture, similar to the H.263 standard.

When each video frame is processed 510 in the compressor function 460, a counter is compared to the stored FrequencyThreshold parameter, as in step 520. If the threshold has not been exceeded in step 520, a frequency counter is increased, as shown in step 515. In this manner, a pre-defined distance can periodically separate the forced intra-coded MBs. The minimum distance between forced intra-coded MBs is determined by the allocated FrequencyThreshold parameter. In the preferred embodiment of the invention, it is envisaged that information on whether any macroblocks in the same spatial location in previous frames had been encoded as forced intra MBs is also taken into account. If the threshold has been exceeded in step 520, a determination is preferably made, in step 530, as to whether any corresponding MB, located at the same position within a preceding frame, is a forced enhancement intra-coded MB. If one or more previous MBs at the same position is a forced enhancement intra-coded MB(s), then it is not required to force an intra MB in this position, and the counter is incremented, as shown in step 515. The processing then continues with the next MB in sequence.

However, if one or more previous MBs at the same position within a preceding frame is not a forced enhancement intra-coded MB, in step 530, then an RSSI measurement (or equivalent error metric) of the received video transmission is compared against a SignalThreshold value, as shown in step 540. It is envisaged that the SignalThreshold value would be radio network dependent. If the RSSI measurement for the MB or frame of the received video transmission is less than the SignalThreshold value in step 540, then it may be required to force an intra MB, so the processing proceeds to step 550. Otherwise, a forced MB is not required, and the counter is incremented, as in step 515. The processing then continues with the next MB in sequence.

In accordance with an improved embodiment of the present invention, it is envisaged that the peak signal to noise ratio (PSNR) of the reconstructed macroblock at the same spatial location in the one or more previous frames can also be used in the decision process. Hence, if the RSSI measurement for the MB or the frame of the received video transmission is less than the SignalThreshold value, in step 540, then a PSNR measurement of the previous received video frame is compared against a PSNR threshold, as shown in step 550. If the PSNR measurement for the previous received video frame is less than the PSNR threshold, then it would be advantageous to force an intra-coded MB. Hence, instead of the expected inter-coded MB, the MB is forced to be encoded as an intra MB, as shown in step 560. Otherwise, a forced MB is not required, and the counter is incremented, as shown in step 515. The processing then continues with the next MB in sequence.

In the context of the preferred embodiment of the present invention, PSNR refers to the picture quality, whereas RSSI refers to the radio link quality. A low PSNR means lots of compression, and consequently more susceptibility to errors. Thus, the use of a forced intra-coded MB benefits low PSNR scenarios.
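For reference, PSNR is conventionally computed from the mean squared error between the original and reconstructed samples as 10·log10(peak² / MSE). The Python sketch below is illustrative; it assumes 8-bit samples with a peak value of 255, and the function name is hypothetical.

```python
import math

def psnr(original, reconstructed, peak=255):
    """Peak signal to noise ratio in dB between two equal-length sample lists.

    Assumes 8-bit samples by default (peak value 255).
    """
    if len(original) != len(reconstructed) or not original:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)
    if mse == 0:
        return math.inf  # identical pictures: no distortion
    return 10 * math.log10(peak ** 2 / mse)

# Every sample off by one gives an MSE of 1, i.e. about 48.13 dB.
print(round(psnr([10, 20, 30, 40], [11, 19, 31, 39]), 2))
```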

It is envisaged that the PSNRThreshold is made dependent upon the bit rates available in the communication unit to encode the base layer and the enhancement layer, and thereby the PSNR value that should be targeted. For example, when encoding at less than 32 kbps, a PSNRThreshold of 33 dB is selected in the preferred embodiment. When encoding at between 32 kbps and 64 kbps, a PSNRThreshold of 36 dB could be selected. When encoding at over 64 kbps, a PSNRThreshold of 40 dB could be selected. In general, the preferred embodiment would select for the PSNRThreshold value the target PSNR used in the rate control algorithm of the encoder.
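The bit-rate bands given above can be expressed as a simple lookup. In the Python sketch below the function name is hypothetical, and the treatment of the boundary rates of exactly 32 kbps and 64 kbps (both placed in the middle band) is an assumption, since the text does not state it.

```python
def psnr_threshold_db(bit_rate_kbps):
    """Return the example PSNRThreshold (dB) for a given encoding bit rate."""
    if bit_rate_kbps < 32:
        return 33.0
    if bit_rate_kbps <= 64:   # assumed: 32 and 64 kbps fall in the middle band
        return 36.0
    return 40.0

for rate in (24, 48, 128):
    print(rate, "kbps ->", psnr_threshold_db(rate), "dB")
```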

If the PSNR measurement for the previous received video frame is less than the PSNR threshold, then the current MB is modified to be an intra-coded MB, as shown in step 560. In this manner, when the preferred criteria of periodicity, signal to noise ratio, received signal strength level and location of previous forced intra-coded MBs have been satisfied, then a suitable location of a forced intra-coded MB has been determined.

The intra-coded MB count is then set to a numerical '0' value, as in step 570, and the process repeats for that particular video transmission.
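The decision sequence of steps 520 to 570 can be sketched for a single macroblock position as follows. This Python fragment is a minimal illustration of the flowchart as described in the text, not a definitive implementation: the `Thresholds` container, the function name and the default SignalThreshold of -95 dBm are hypothetical, the last being a placeholder because the text states only that this value is radio network dependent.

```python
from dataclasses import dataclass

@dataclass
class Thresholds:
    frequency: int = 132      # FrequencyThreshold (example value from the text)
    signal: float = -95.0     # SignalThreshold in dBm (hypothetical placeholder)
    psnr: float = 33.0        # PSNRThreshold in dB (example value from the text)

def process_macroblock(count, previously_forced, rssi, previous_psnr, th):
    """Return (force_intra, new_count) for one macroblock position.

    count             -- current value of the frequency counter
    previously_forced -- True if a preceding frame forced intra at this position
    rssi              -- reported radio signal strength for the MB or frame
    previous_psnr     -- PSNR of the previous received frame at this position
    """
    if count <= th.frequency:        # step 520: threshold not exceeded
        return False, count + 1      # step 515
    if previously_forced:            # step 530: recently refreshed here
        return False, count + 1      # step 515
    if rssi >= th.signal:            # step 540: radio link is good enough
        return False, count + 1      # step 515
    if previous_psnr >= th.psnr:     # step 550: picture quality is good enough
        return False, count + 1      # step 515
    return True, 0                   # step 560 (force intra), step 570 (reset)

th = Thresholds()
print(process_macroblock(133, False, -100.0, 30.0, th))  # all criteria met
print(process_macroblock(10, False, -100.0, 30.0, th))   # too soon to refresh
```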

Advantageously, the use of such additional information, as provided by the above-preferred embodiment, ensures that the selection of appropriate macroblocks will be more accurate. This negates the need to perform a complete intra-coded macroblock operation on an enhancement video stream, shortly following an EI macroblock. As more bits are needed to encode an intra-coded macroblock, only certain macroblocks are intra-coded per frame.

It is within the contemplation of the present invention that the aforementioned parameters are examples of any number of parameters that can be used to effect the forced intra-coded refresh operation. Furthermore, although the preferred embodiment has been described with respect to four parameters, it is envisaged that any combination of these parameters could be used to improve the performance of enhancement layer transmissions. Additionally, it is envisaged that the configuration to effect the forced introduction of intra-coded MBs can be based upon the parameters falling below, or exceeding, particular thresholds, as appropriate to the particular implementation. In the preferred embodiment of the present invention, the compressor function 460 has been adapted to store and run the following algorithm/pseudo code:

Preferred algorithm:

If (MB_Intra_Count[i] > FrequencyThreshold)
{
    If (MBTYPE[i]_t-1 AND MBTYPE[i]_t-2 AND MBTYPE[i]_t-3 != ENHANCEMENT_INTRA)
    {
        If (Radio_Signal_Strength < SignalThreshold AND PSNR[i]_t-1 < PSNRThreshold)
            MBTYPE[i]_t = INTRA
    }
    MB_Intra_Count[i] = 0
}
else
    MB_Intra_Count[i]++

END

In the preferred algorithm, as detailed above, the following definitions are used: MBTYPE[i]_t is the macroblock type for the i-th macroblock in frame t; and MB_Intra_Count[i] counts the number of macroblocks over previous and current frames that have not been intra-coded.
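A runnable transcription of the pseudo code, applied to every macroblock position of one frame, might look like the Python sketch below. It follows the pseudo code as printed, including the counter reset whenever FrequencyThreshold is exceeded. The function name, the list-based data layout and the reading of the type test as "none of the three preceding frames was ENHANCEMENT_INTRA at this position" are assumptions made for illustration.

```python
ENHANCEMENT_INTRA = "ENHANCEMENT_INTRA"
INTRA = "INTRA"
INTER = "INTER"

def select_mb_types(mb_intra_count, mb_type_history, psnr_prev,
                    radio_signal_strength, frequency_threshold,
                    signal_threshold, psnr_threshold):
    """Return the macroblock types for frame t and update the counters in place.

    mb_intra_count[i]  -- MB_Intra_Count[i]
    mb_type_history[i] -- types of macroblock i in frames t-1, t-2 and t-3
    psnr_prev[i]       -- PSNR[i]_t-1
    """
    mb_type = []
    for i in range(len(mb_intra_count)):
        current = INTER  # default: the expected inter-coded macroblock
        if mb_intra_count[i] > frequency_threshold:
            recent = mb_type_history[i][:3]
            if all(t != ENHANCEMENT_INTRA for t in recent):
                if (radio_signal_strength < signal_threshold
                        and psnr_prev[i] < psnr_threshold):
                    current = INTRA
            mb_intra_count[i] = 0   # reset, as in the printed pseudo code
        else:
            mb_intra_count[i] += 1
        mb_type.append(current)
    return mb_type

counts = [133, 5, 133]
history = [[INTER] * 3, [INTER] * 3, [ENHANCEMENT_INTRA, INTER, INTER]]
types = select_mb_types(counts, history, [30.0, 30.0, 30.0],
                        radio_signal_strength=-100.0, frequency_threshold=132,
                        signal_threshold=-95.0, psnr_threshold=33.0)
print(types)
print(counts)
```

The signal strength and threshold values in the usage lines are placeholders chosen only to exercise each branch.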

This algorithm requires additional bits to be encoded, as MBs must be intra-coded rather than inter-coded. However, the inventors of the present invention have determined that the decrease in coding efficiency in the enhancement layer would be less than 4%, as only one or two macroblocks per frame would need to be intra-coded. Advantageously, the gain in visual quality far outweighs the additional bit expenditure, especially in poor error conditions where the base layer was being corrupted.

To avoid an excessive decrease in coding efficiency, the number of intra-coded MBs should be limited to a maximum number per frame. This limit can be user definable, and will depend on the level of decrease in coding efficiency that is acceptable to the user. Typically, this may be between zero and four MBs per quarter common intermediate format (QCIF) sized frame, and pro-rata for other image sizes. One benefit in using an extra criterion, such as setting a maximum limit of intra-coded MBs per frame, is that the bit rate can be maintained at a low level. For example, if all other conditions are met but the user only wants a maximum of four forced intra-coded MBs per frame, then the user or user's video unit may be allowed to choose not to intra-code any more MBs when that maximum is reached.
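The per-frame limit and its pro-rata scaling can be illustrated as follows. The Python sketch assumes 16x16 macroblocks, so that a QCIF frame of 176x144 pixels holds 99 macroblocks, and rounds the scaled limit down; the function names and the choice to keep the first candidates in scan order are hypothetical.

```python
QCIF_MACROBLOCKS = 99  # 176x144 luma pixels in 16x16 macroblocks

def max_forced_intra(width, height, max_per_qcif=4):
    """Scale the per-QCIF limit pro rata to another frame size (rounded down)."""
    macroblocks = (width // 16) * (height // 16)
    return (max_per_qcif * macroblocks) // QCIF_MACROBLOCKS

def apply_limit(candidates, limit):
    """Keep at most `limit` candidate macroblock indices (assumed: first in
    scan order); the remaining candidates stay inter-coded in this frame."""
    return candidates[:limit]

print(max_forced_intra(176, 144))   # QCIF -> 4
print(max_forced_intra(352, 288))   # CIF has four times the area -> 16
print(apply_limit([3, 17, 42, 58, 77], max_forced_intra(176, 144)))
```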

It is within the contemplation of the invention that alternative encoding and decoding configurations could be adapted to use such periodically incorporated intra-coded macroblocks within the enhancement layer bit-stream(s). As a result, the inventive concepts described should not be viewed as being limited to the example configuration provided in FIG. 4, or the flowchart of FIG. 5.

It is further envisaged that any layered scalable video codec can be readily adapted to incorporate the preferred embodiment of the present invention, of periodically incorporating intra-coded macroblocks within the enhancement layer bit-stream(s). More generally, the adaptation or programming of code or threshold levels may be implemented in the respective video communication unit in any suitable manner. For example, new apparatus may be added to a conventional video communication unit, or alternatively existing parts of a conventional video communication unit may be adapted, for example by reprogramming one or more processors therein. As such, the required adaptation may be implemented in the form of processor-implementable instructions stored on a storage medium, such as a floppy disk, hard disk, PROM, RAM or any combination of these or other storage media.

It is also within the contemplation of the invention that such adaptation of a video encoding or video decoding operation may be facilitated by any video communication unit operating in a video communication system, for example user/subscriber equipment, such as mobile or portable radios or telephones.

It will be understood that the proposed error resilience technique using forced intra refresh in a scalable enhancement layer provides the advantage of improved video picture quality in an error-prone communication environment, such as mobile or wireless communication systems, or in non-guaranteed quality of service environments, such as the internet.

One could imagine multimedia services where the base layer of the video stream is free of charge, for example as a preview, and it is the additional quality of the enhancement layer that must be paid for by the customer. In this case, an invention improving the error resilience of the enhancement layer, as described above, would be of significant benefit. In this context, users will only pay for the enhancement layer if it has good resilience to errors in the communication link.

Other error resilience or error concealment methods have been shown to improve visual quality in many instances. However, in the case where errors have corrupted both the base layer and the enhancement layer, the inventive concepts hereinbefore described provide a particular improvement in the video picture quality. The present invention has been described with reference to replacing macroblocks in a scalable video sequence, where the macroblock is consistent with the definition in the MPEG-4 standard. However, a skilled artisan would recognise that the inventive concepts hereinbefore described could be applied to any format or number of video/image data bits contained within a frame, and not therefore limited to the particular macroblock configuration described.

It is within the contemplation of the present invention that the aforementioned inventive concepts may be applied to any video communication unit and/or video communication system including, inter alia, arbitrary-shaped object encoding, as described in MPEG-4. In particular, the inventive concepts find particular use in wireless (radio) devices such as mobile telephones/mobile radio units and associated wireless communication systems. Such wireless communication units may include a portable or mobile PMR radio, a personal digital assistant, a laptop computer or a wirelessly networked PC. Although the preferred embodiment of the present invention has been described with reference to the MPEG-4 standard, scalable video system technology may be implemented in the 3rd generation (3G) of digital cellular telephones, commonly referred to as the Universal Mobile Telecommunications System (UMTS). Scalable video system technology may also find applicability in the packet data variants of both the current 2nd generation of cellular telephones, commonly referred to as the General Packet Radio Service (GPRS), and the TErrestrial Trunked RAdio (TETRA) standard for digital private and public mobile radio systems. Furthermore, scalable video system technology may also be utilised in the Internet. The aforementioned inventive concepts will therefore find applicability in, and thereby benefit, all these emerging technologies.

In summary, a method for improving a quality of a scalable video sequence communicated over an error-prone network has been described. The video sequence includes at least one base layer and at least one enhancement layer comprising a number of inter-coded macroblocks. The method includes the step of replacing at least one inter-coded macroblock with an intra-coded macroblock in one or more enhancement layers of the video sequence.
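By way of illustration only, the replacing step may be sketched for the case in which replacement is triggered by a count of macroblocks since the previous intra-coded macroblock. The following Python sketch is not taken from any codec; the function name force_periodic_intra, the mode labels and the threshold value are hypothetical, and the sketch further assumes that a macroblock the encoder had already intra-coded also restarts the count.

```python
INTER = "inter"
INTRA = "intra"


def force_periodic_intra(modes, threshold):
    """Return enhancement-layer macroblock modes with forced intra refresh.

    modes: the mode originally chosen for each macroblock, in coding order.
    threshold: hypothetical count that sets the periodicity of the refresh.
    """
    refreshed = []
    since_last_intra = 0  # macroblocks counted since the last intra-coded one
    for mode in modes:
        since_last_intra += 1
        if mode == INTRA:
            # Assumption: a naturally intra-coded macroblock restarts the count.
            refreshed.append(INTRA)
            since_last_intra = 0
        elif since_last_intra > threshold:
            # Count exceeds the threshold: replace inter-coded with intra-coded.
            refreshed.append(INTRA)
            since_last_intra = 0
        else:
            refreshed.append(INTER)
    return refreshed


if __name__ == "__main__":
    print(force_periodic_intra([INTER] * 8, threshold=3))
```

With eight inter-coded macroblocks and a threshold of three, the sketch forces the fourth and eighth macroblocks to be intra-coded. A lower threshold gives more frequent refresh and hence greater error resilience, at the cost of bit efficiency.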

A video communication system has also been described. The video communication system includes a video encoder comprising a processor for encoding a video sequence into a scalable video sequence having at least one enhancement layer. Macroblock replacement means are operably coupled to the processor to replace at least one inter-coded macroblock with an intra-coded macroblock in one or more enhancement layers of the video sequence in response to the determined parameter. A transmitter transmits the scalable video sequence with said at least one replaced macroblock. The video communication system also includes a video decoder comprising a receiver for receiving the macroblock replaced scalable video sequence from the video encoder and macroblock interpreting means operably coupled to the receiver to interpret whether a received macroblock is a replaced intra-coded macroblock in one or more enhancement layers of the video sequence.
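The parameter that drives the replacement is left open in the foregoing summary; the claims name a PSNR metric, a received signal level and a total number of macroblocks to intra-code per frame. A minimal Python sketch of one way such parameters might be combined is given below. The function names, the argument names and the rule of combining the tests with a logical OR are assumptions made for illustration; only the PSNR formula itself is standard.

```python
import math


def psnr(original, decoded, peak=255):
    """Peak signal-to-noise ratio in dB between two equal-length sample lists."""
    if not original or len(original) != len(decoded):
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((a - b) ** 2 for a, b in zip(original, decoded)) / len(original)
    if mse == 0:
        return math.inf  # identical samples: no distortion
    return 10 * math.log10(peak ** 2 / mse)


def should_force_intra(psnr_db, psnr_threshold_db,
                       signal_level, signal_threshold,
                       forced_so_far, max_per_frame):
    """Decide whether to replace the next inter-coded macroblock.

    All thresholds are hypothetical tuning values, for example user-defined.
    """
    if forced_so_far >= max_per_frame:
        # Per-frame budget reached: replacement is inhibited.
        return False
    # Assumption: either a low PSNR or a weak received signal triggers refresh.
    return psnr_db < psnr_threshold_db or signal_level < signal_threshold
```

In such an arrangement the received signal level would be reported by the decoder over the feedback path, while the PSNR would be computed at the encoder, which has access to both the original and the reconstructed picture.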

A video communication unit, an adapted video encoder, an adapted video decoder, and a mobile radio device incorporating any one of these units, have also been described.

Generally, the inventive concepts contained herein are equally applicable to any suitable video or image transmission system. Whilst specific, and preferred, implementations of the present invention are described above, it is clear that one skilled in the art could readily apply variations and modifications of such inventive concepts.

Thus, an apparatus and a method for improving error resilience using forced intra refresh in a scalable enhancement layer have been provided, whereby the aforementioned disadvantages with prior art arrangements have been substantially alleviated.

Claims

1. A method for improving a quality of a scalable video sequence communicated over an error-prone network (500), wherein the scalable video sequence includes at least one base layer and at least one enhancement layer comprising a number of inter-coded macroblocks, the method characterised by the step of: replacing (560) at least one inter-coded macroblock with at least one intra-coded macroblock in one or more enhancement layers of the video sequence.

2. The method for improving a quality of a scalable video sequence communicated over an error-prone network according to Claim 1, further characterised by the step of: determining a parameter of the video sequence and replacing a selected at least one inter-coded macroblock with an intra-coded macroblock in response to said parameter.

3. The method for improving a quality of a scalable video sequence communicated over an error-prone network according to Claim 2, wherein the step of determining includes determining how many inter-coded macroblocks to replace with intra-coded macroblocks and/or under what conditions said replacement is to be performed.

4. The method for improving a quality of a scalable video sequence communicated over an error-prone network according to Claim 2 or Claim 3, wherein the step of determining includes counting a number of macroblocks since a previous replacement with an intra-coded macroblock, and the step of replacing is performed if said number of macroblocks exceeds a threshold value that sets a periodicity of said replaced intra-coded macroblocks.

5. The method for improving a quality of a scalable video sequence communicated over an error-prone network according to any of preceding Claims 2 to 4, wherein the step of determining includes determining a received signal level of the received video bit sequence and the step of replacing is performed if said received signal level is below a signal threshold value, for example a signal threshold value that is radio network dependent and sets a threshold level where errors are likely to occur within the error prone network.

6. The method for improving a quality of a scalable video sequence communicated over an error-prone network according to any of preceding Claims 2 to 5, wherein the step of determining includes determining an acceptable PSNR metric for the encoded video bit sequence, and the step of replacing is performed if said PSNR is below a PSNR threshold value.

7. The method for improving a quality of a scalable video sequence communicated over an error-prone network according to any of preceding Claims 2 to 6, wherein the step of determining includes determining a total number of macroblocks to intra-code per frame, and the step of replacing is performed in response to said determination.

8. The method for improving a quality of a scalable video sequence communicated over an error-prone network according to any of preceding Claims 2 to 7, wherein one or more of said parameters is user definable or pre-determined, for example to provide a trade-off between error resiliency and bit efficiency.

9. The method for improving a quality of a scalable video sequence communicated over an error-prone network according to any of preceding Claims 1 to 8, wherein the one or more video enhancement layers comprise part of an H.263 or MPEG-4 scalable video bit-stream.

10. A video communication system (400) comprising: a video encoder (415) comprising: a processor (309) for encoding a video sequence into a scalable video sequence having at least one enhancement layer comprising a number of inter-coded macroblocks; macroblock replacement means (460) operably coupled to said processor to replace at least one inter-coded macroblock by an intra-coded macroblock in one or more enhancement layers of the video sequence; and a transmitter, operably coupled to said processor, for transmitting said macroblock replaced scalable video sequence; and a video decoder (425) comprising: a receiver for receiving said scalable video sequence with said at least one replaced macroblock from said video encoder; and macroblock interpreting means (470) operably coupled to said receiver to interpret whether a received macroblock is a replaced intra-coded macroblock in one or more enhancement layers of the video sequence.

11. The video communication system (400) according to Claim 10, wherein said video encoder further comprises: parameter determining means, operably coupled to a processor, to determine a parameter of the scalable video sequence; and said macroblock replacement means (460) replaces an inter-coded macroblock with an intra-coded macroblock in one or more enhancement layers of the video sequence in response to said determined parameter.

12. The video communication system (400) according to Claim 10 or Claim 11 further comprising a feedback path from said video decoder to said video encoder, wherein said video decoder informs said video encoder of a parameter relating to a received video sequence via said feedback path.

13. The video communication system (400) according to Claim 11, wherein said parameter determining means determines a number of macroblocks since a previous replacement with an intra-coded macroblock; and said macroblock replacement means (460) replaces at least one inter-coded macroblock with an intra-coded macroblock if said number of macroblocks exceeds a threshold value that sets a periodicity of said replaced intra-coded macroblocks.

14. The video communication system (400) according to Claim 12, wherein said video decoder receiver determines a received signal level of the received video bit sequence and informs said parameter determining means in said video encoder of said received signal level via said feedback path; and said macroblock replacement means (460) replaces at least one inter-coded macroblock with an intra-coded macroblock if said received signal level is above or below a signal threshold value, for example a signal threshold value that is radio network dependent and sets a threshold level where errors are likely to occur within the error prone network.

15. The video communication system (400) according to Claim 12, wherein said video encoder determines an acceptable PSNR value for encoding the video bit sequence; wherein said macroblock replacement means (460) replaces at least one inter-coded macroblock with an intra-coded macroblock if said PSNR value is below a PSNR threshold value.

16. The video communication system (400) according to Claim 11, wherein said parameter determining means determines a total number of macroblocks to intra code per frame; and wherein said macroblock replacement means (460) replaces at least one inter-coded macroblock with an intra-coded macroblock in response to said determination or wherein said macroblock replacement means (460) is inhibited if said maximum number of macroblocks to intra code per frame is exceeded.

17. The video communication system (400) according to any of preceding Claims 10 to 16, wherein said video communication system (400) supports H.263 or MPEG-4 scalable video bit-streams.

18. A video communication unit (300) adapted for use in the method of any of Claims 1 to 9 or adapted for use in the communication system of any of Claims 10 to 17.

19. A video encoder (415) adapted for use in the method of any of Claims 1 to 9 or adapted for use in the communication system of any of Claims 10 to 17.

20. A video decoder (425) adapted for use in the method of any of Claims 1 to 9 or adapted for use in the communication system of any of Claims 10 to 17.

21. A mobile radio device comprising a video communication unit in accordance with claim 18 or a video encoder in accordance with claim 19 or a video decoder in accordance with claim 20.

22. The mobile radio device of claim 21, wherein the mobile radio device is a mobile phone, a portable or mobile PMR radio, a personal digital assistant, a lap-top computer or a wirelessly networked PC.

23. A storage medium storing processor-implementable instructions for controlling a processor to carry out the method of any of claims 1 to 9.