WO2003063495A2 - Scalable video communication - Google Patents

Info

Publication number
WO2003063495A2
Authority
WO
WIPO (PCT)
Prior art keywords
video
intra
coded
macroblock
error
Prior art date
Application number
PCT/EP2003/000523
Other languages
French (fr)
Other versions
WO2003063495A3 (en
Inventor
Catherine Mary Dolbear
Paola Marcella Hobson
Original Assignee
Motorola Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Motorola Inc
Priority to AU2003206758A
Publication of WO2003063495A2
Publication of WO2003063495A3

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/187 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a scalable video layer
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/103 Selection of coding mode or of prediction mode
    • H04N19/107 Selection of coding mode or of prediction mode between spatial and temporal predictive coding, e.g. picture refresh
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/30 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/30 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability
    • H04N19/37 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability with arrangements for assigning different transmission priorities to video input data or to video coded data
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/60 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
    • H04N19/61 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding in combination with predictive coding

Definitions

  • This invention relates to video transmission systems and video encoding/decoding techniques.
  • the invention is applicable to, but not limited to, a method for protecting enhancement layer predicted (EP) pictures in a scalable video compression system that may be subject to errors.
  • EP enhancement layer predicted
  • video is transmitted as a series of still images/pictures. Since the quality of a video signal can be affected during coding or compression of the video signal, it is known to include additional information or 'layers', based on a difference between the video signal and the encoded video bit stream. The inclusion of additional layers enables the quality of the received signal, following decoding and/or decompression, to be enhanced. Hence, a hierarchy of base pictures and enhancement pictures, partitioned into one or more layers, is used to produce a layered video bit stream.
  • a scalable video bit-stream refers to the ability to transmit and receive video signals of more than one resolution and/or quality simultaneously.
  • a scalable video bit-stream is one that may be decoded at different rates, according to the bandwidth available at the decoder. This enables the user with access to a higher bandwidth channel to decode high quality video, whilst a lower bandwidth user is still able to view the same video, albeit at a lower quality.
  • the main application for scalable video transmission is where multiple decoders, with access to differing bandwidths, are receiving images from a single encoder.
  • enhancements to the video signal may be added to a base layer either by:
  • Such enhancements may be applied to the whole picture or to an arbitrarily shaped object within the picture, which is termed object-based scalability.
  • In order to preserve the disposable nature of the temporal enhancement layer, the H.263+ standard [ITU-T Recommendation H.263, "Video Coding for Low Bit Rate Communication"] states that pictures included in the temporal scalability mode should be bi-directionally predicted (B) pictures, as shown in the video stream of FIG. 1.
  • FIG. 1 shows a schematic illustration of a scalable video arrangement 100 illustrating B picture prediction dependencies, as known in the field of video coding techniques.
  • An initial intra-coded frame (I1) 110 is followed by a bi-directionally predicted frame (B2) 120. This, in turn, is followed by a (uni-directional) predicted frame (P3) 130, and again followed by a second bi-directionally predicted frame (B4) 140. This again, in turn, is followed by a (uni-directional) predicted frame (P5) 150, and so on.
  • FIG. 2 is a schematic illustration of a layered video arrangement, known in the field of video coding techniques.
  • a layered video bit stream includes a base layer 205 and one or more enhancement layers 235.
  • the base layer (layer-1) includes one or more intra-coded pictures (I pictures) 210 sampled, coded and/or compressed from the original video signal pictures. Furthermore, the base layer will include a plurality of subsequent predicted inter-coded pictures (P pictures) 220, 230 predicted from the intra-coded picture(s) 210. Inter-coded pictures are encoded to include the changes between the current picture and a previous picture.
  • I pictures intra-coded pictures
  • P pictures predicted inter-coded pictures
  • enhancement layers layer-2 or layer-3 or higher layer(s) 235
  • three types of picture may be used:
  • EP pictures 250, 260 contain macroblocks that are predicted from either the current lower layer picture or the previous picture within the same enhancement layer.
  • MBs macroblocks
  • These MBs comprise four luma and two chroma blocks of 8x8 pixels. This definition of macroblock is also used in the MPEG family of standards.
  • the vertical arrows from the lower, base layer illustrate that the picture in the enhancement layer is predicted from a reconstructed approximation of that picture in the reference (lower) layer.
  • the enhancement layer picture is referred to as an EI picture. It is possible, however, to create a modified bi-directionally predicted picture using both a prior enhancement layer picture and a temporally simultaneous lower layer reference picture. This type of picture is referred to as an EP picture or "Enhancement" P-picture.
  • the prediction flow for EI and EP pictures is shown in FIG. 2.
  • an EI picture in an enhancement layer may have a P picture as its lower layer reference picture, and an EP picture may have an intra-coded picture as its lower-layer reference picture.
  • the packets containing the base layer data are given a higher priority and that they are encoded with more error- protection than packets containing enhancement layer data.
  • This lower level of error-protection for enhancement layer data means that enhancement layer data is more likely to be affected by errors.
  • all SNR or spatial enhancement pictures (EP pictures) are predicted, either from the current lower layer picture or from the previous picture in the same enhancement layer. Clearly, if an error affects the picture that they have been predicted from, they too will contain errors. Hence, enhancement layer data is more susceptible to error propagation.
  • macroblocks in the P pictures can be intra-coded at regular intervals (for example, H.263 requires every macroblock to be intra-coded at least once every one hundred and thirty two frames), to reduce error propagation.
  • certain macroblocks in EP pictures can be predicted solely from the base layer picture, to reduce the impact of error propagation.
  • Error resilience methods described in the standards allow re-synchronisation markers to be added to the bit stream, so that if an error corrupts the bit stream, earlier recovery of error-free data can be achieved.
  • Data partitioning whereby the header, motion vector and texture information are separated, also contributes to error resilience.
  • the MPEG4 standard allows for reversible Variable Length Codes, so that if an error corrupts the start of the code, it can be decoded from the end instead of the start, thereby increasing the recovery of error-free data that would ordinarily be discarded.
  • the present invention provides a method for improving a quality of a scalable video sequence communicated over an error-prone network, see claim 1, a video communication system, as claimed in claim 10, a video communication unit, as claimed in claim 18, a video encoder, as claimed in claim 19, a video decoder, as claimed in claim 20, a mobile radio device, as claimed in claim 21, and a storage medium storing processor-implementable instructions, as claimed in claim 23. Further aspects of the present invention are as claimed in the dependent claims.
  • an apparatus and a method for improving the quality of scalable video enhancement layers transmitted over an error-prone network are described.
  • the apparatus and method use periodic replacement of inter-coded macroblocks (MBs) in each EP or EI picture within an enhancement layer of a video sequence with intra-coded MBs. In this manner, propagation of errors within the enhancement layer(s) can be reduced.
  • MBs inter-coded macroblocks
  • FIG. 1 is a schematic illustration of a video coding arrangement showing picture prediction dependencies, as known in the field of video coding techniques.
  • FIG. 2 is a schematic illustration of a known layered video coding arrangement.
  • Exemplary embodiments of the present invention will now be described, with reference to the accompanying drawings, in which:
  • FIG. 3 is a block diagram of a video communication unit, adapted to introduce substantially periodic intra-coded MBs in an enhancement layer of a video sequence, in accordance with the preferred embodiment of the present invention.
  • FIG. 4 is a schematic representation of a scalable video communication system adapted to introduce substantially periodic intra-coded MBs in an enhancement layer of a video sequence in accordance with the preferred embodiment of the present invention.
  • FIG. 5 is a flowchart illustrating the preferred method and parameters used when introducing substantially periodic intra-coded MBs in an enhancement layer of a video sequence.
  • Referring to FIG. 3, a block diagram of a video subscriber unit 300, adapted to support the inventive concepts of the preferred embodiments of the present invention, is shown.
  • the preferred embodiment of the present invention is described with respect to a wireless video communication unit, for example one capable of operating in the 3rd Generation Partnership Project (3GPP) standard for future cellular wireless communication systems.
  • 3GPP 3rd Generation Partnership Project
  • the video subscriber unit 300 contains an antenna 302 preferably coupled to a duplex filter, antenna switch or circulator 304 that provides isolation between receive and transmit chains within the video subscriber unit 300.
  • the receiver chain includes receiver front-end circuitry 306, effectively providing reception, filtering and intermediate or base-band frequency conversion.
  • the front-end circuit 306 receives video signal transmissions from another communication unit, for example its associated Node B or base transceiver station (BTS).
  • the front-end circuit 306 is serially coupled to a signal processing function 308, generally realised by a digital signal processor (DSP).
  • DSP digital signal processor
  • the signal processing function 308 performs signal demodulation, error correction and formatting, and recovers the video data bit-stream. Recovered data from the signal processing function 308 is serially coupled to a video codec function 309, the operation of which is further described with respect to FIG. 4.
  • received signals that have been decrypted by the video codec function 309 are typically input to a baseband processing device 310, which takes the video decoded information received from the video codec function 309 and formats it in a suitable manner to send to a video display 311.
  • the signal processing function 308, video codec function 309 and baseband processing function 310 may be provided within the same physical device.
  • a controller 314 is configured to control the information flow and operational state of the elements of the subscriber unit 300.
  • the video codec function 309 in the receive path (i.e. the decoder) has been adapted to interpret information transmitted from the encoding unit.
  • the video codec function also identifies within the bit-stream of the received video signal which macroblocks within the enhancement layer have been periodically intra-coded.
  • the decoder has been further adapted by providing parameter information back to the encoder, wherein the parameter information relates to a received video sequence, such as the received signal level, received bit error rate, etc.
  • the transmit chain essentially includes a video input device 320 coupled in series through a baseband processor 310, video codec function 309, signal processing function 308, transmitter/modulation circuitry 322 and a power amplifier 324.
  • the processor 308, transmitter/modulation circuitry 322 and the power amplifier 324 are operationally responsive to the controller, with an output from the power amplifier coupled to the duplex filter, antenna switch or circulator 304, as known in the art.
  • the transmit chain in the video subscriber unit 300 takes the video input bit-stream from the video input device 320 and the video codec function 309 encodes the video bit-stream into scalable layers as described in greater detail with respect to FIG. 4.
  • the video encoded information is then passed to the signal processor where it is formatted to include, for example, error protection, for subsequent modulation and transmission by transmit/modulation circuitry 322 and power amplifier 324.
  • the timer/counter 318 for encoding functions is also adapted to count numbers of macroblocks in a video transmission sequence such that intra-coded macroblocks can be periodically introduced. Furthermore, the encoder syntax has been changed so that the replacing of an inter-coded MB with an intra-coded MB is signalled to the decoder.
  • the scalable video encoder is adapted such that it interprets feedback information received from the video decoder, for example via a return channel or other feedback device which may support the Real-Time Control Protocol (RTCP) about the received signal strength and/or received error rate.
  • RTCP Real-Time Control Protocol
  • the scalable video encoder is preferably also adapted such that it deduces likely error conditions on the communication link, based on such parameter information.
  • the encoder will replace certain inter-coded MBs with intra-coded MBs, which are calculated directly from the enhancement layer data. Appearance of these MBs will be signalled in the bit-stream by a change to the MB header, such that the decoder may recognise when an intra MB occurs.
  • the signal processor function 308, video codec function 309 and baseband processing function 310 in the transmit chain may be implemented as distinct from the corresponding functions in the receive chain.
  • a single processor may be used to implement corresponding processing operations of both transmit and receive signals, as shown in FIG. 3.
  • the signal processor function 308, video codec function 309 and baseband processing function 310, for both transmit and receive chains, may be combined into a single processor, for example a digital signal processor (DSP).
  • the various components within the video subscriber unit 300 can be realised in discrete or integrated component form, with an ultimate structure therefore being merely an arbitrary selection.
  • Referring to FIG. 4, a schematic representation of the primary functions of a video communication system 400 is shown.
  • the video communication system includes a video encoder 415 and video decoder 425, adapted to incorporate the preferred embodiment of the present invention.
  • a video picture F0 is compressed 410 in a video encoder 415 to produce the base layer bit stream signal to be transmitted at a rate r1 kilobits per second (kbps).
  • This signal is decompressed 420 at a video decoder 425 to produce the reconstructed base layer picture F0'.
  • the compressed base layer bit stream is also decompressed at 430 in the video encoder 415 and compared with the original picture F0 at 440 to produce a difference signal 450.
  • This difference signal is compressed at 460 and transmitted as the enhancement layer bit stream at a rate r2 kbps.
  • This enhancement layer bit stream is decompressed at 470 in the video decoder 425 to produce the enhancement layer picture F0'', which is added to the reconstructed base layer picture F0' at 480 to produce the final reconstructed picture F0'''.
  • the compression function 460 in the video encoder 415 has been adapted to enable one or more intra-coded macroblocks to replace inter-coded macroblocks in one or more of the enhancement layer bit-streams.
  • the decompression function 470 in the video decoder 425 has been adapted to recognise the location(s) of the incorporated intra-coded macroblocks in the enhancement layer bit-stream, when it would ordinarily expect inter-coded macroblocks.
  • the decompression function 470 is able to do this in response to signalling information within the bit-stream of the video signal received from the video encoder 415, preferably in the MB header.
  • the decision on which macroblocks are selected for periodic intra-coding is preferably made in response to a number of parameters, for example radio signal strength indication (RSSI) measurements in the area of the video communication unit.
  • the video encoder has been adapted to interpret feedback information, such as RSSI information or received error rates, transmitted from the video decoder 425 on a return channel 490.
  • the return channel 490 may comprise any feedback device, for example one supporting RTCP.
  • the video encoder 415 then utilises this information in determining which enhancement layer inter-coded MBs to replace with intra-coded MBs in the encoded video sequence.
  • the use of such incorporated intra-coded macroblocks in the enhancement layer bit-stream is further described with regard to the flowchart of FIG. 5.
  • a flowchart 500 illustrates the preferred method for deciding which macroblock (MB) frames are to be replaced with intra-coded frames in a substantially periodic manner.
  • a 'FrequencyThreshold' parameter is used to control the periodicity of the forced intra-coded MB frames. It is within the contemplation of the present invention that the FrequencyThreshold parameter may be user defined, via an input port of the communication device. Alternatively, the FrequencyThreshold parameter may be fixed, for example by setting the parameter value to be '132' during manufacture, similar to the H.263 standard.
  • a counter is compared to the stored FrequencyThreshold parameter, as in step 520. If the threshold has not been exceeded in step 520, a frequency counter is increased, as shown in step 515. In this manner, a pre-defined distance can periodically separate the forced intra-coded MBs. The minimum distance between forced intra-coded MBs is determined by the allocated FrequencyThreshold parameter. In the preferred embodiment of the invention, it is envisaged that information on whether any macroblocks in the same spatial location in previous frames had been encoded as forced intra MBs is also taken into account.
  • In step 530, an RSSI measurement (or equivalent error metric) of the received video transmission is compared against a SignalThreshold value, as shown in step 540. It is envisaged that the SignalThreshold value would be radio network dependent. If the RSSI measurement for the MB or frame of the received video transmission is less than the SignalThreshold value in step 540, then it may be required to force an intra MB, so the processing proceeds to step 550. Otherwise, a forced MB is not required, and the counter is incremented, as in step 515. The processing then continues with the next MB in sequence.
  • the peak signal to noise ratio (PSNR) of the reconstructed macroblock at the same spatial location in the one or more previous frames can also be used in the decision process.
  • PSNR peak signal to noise ratio
  • a PSNR measurement of the previous received video frame is compared against a PSNR threshold, as shown in step 550. If the PSNR measurement for the previous received video frame is less than the PSNR threshold, then it would be advantageous to force an intra-coded MB.
  • the MB is forced to be encoded as an intra MB, as shown in step 560. Otherwise, a forced MB is not required, and the counter is incremented, as shown in step 515. The processing then continues with the next MB in sequence.
  • PSNR refers to the picture quality
  • RSSI refers to the radio link quality
  • the PSNRThreshold is made dependent upon the bit rates available in the communication unit to encode the base layer and the enhancement layer, and thereby the PSNR value that should be targeted.
  • a PSNRThreshold of 33dB is selected in the preferred embodiment.
  • a PSNRThreshold of 36dB could be selected in the preferred embodiment.
  • a PSNRThreshold of 40dB could be selected in the preferred embodiment.
  • the preferred embodiment would select for the PSNRThreshold value the target PSNR used in the rate control algorithm of the encoder.
  • the current MB is modified to be an intra-coded MB, as shown in step 560.
  • the preferred criteria of periodicity, signal to noise ratio, received signal strength level and location of previous forced intra-coded MBs have been satisfied, then a suitable location of a forced intra-coded MB has been determined.
  • the intra-coded MB count is then set to a numerical '0' value, as in step 570, and the process repeats for that particular video transmission.
  • the use of such additional information ensures that the selection of appropriate macroblocks will be more accurate. This negates the need to perform a complete intra-coded macroblock operation on an enhancement video stream, shortly following an EI macroblock. As more bits are needed to encode an intra-coded macroblock, only certain macroblocks are intra-coded per frame.
  • the aforementioned parameters are examples of any number of parameters that can be used to effect the forced intra-coded refresh operation.
  • the preferred embodiment has been described with respect to four parameters, it is envisaged that any combination of these parameters could be used to improve the performance of enhancement layer transmissions.
  • the configuration to effect the forced introduction of intra-coded MBs can be based upon the parameters falling below, or exceeding, particular thresholds, as appropriate to the particular implementation.
  • the compressor function 460 has been adapted to store and run the following algorithm/pseudo code:
  • MBTYPE[i]_t is the macroblock type for the i-th macroblock in frame t; and MB_Intra_Count[i] counts the number of macroblocks over previous and current frames that have not been intra-coded.
  • This algorithm requires additional bits to be encoded, as MBs must be intra-coded rather than inter-coded.
  • the inventors of the present invention have determined that the decrease in coding efficiency in the enhancement layer would be less than 4%, as only one or two macroblocks per frame would need to be intra-coded.
  • the gain in visual quality far outweighs the additional bit expenditure, especially in poor error conditions where the base layer was being corrupted.
  • the number of intra-coded MBs should be limited to a maximum number per frame. This limit can be user definable, and will depend on the level of decrease in coding efficiency that is acceptable to the user. Typically, this may be between zero and four MBs per quarter common intermediate format (QCIF) sized frame, and pro-rata for other image sizes.
  • QCIF quarter common intermediate format
  • One benefit in using an extra criterion, such as setting a maximum limit of intra-coded MBs per frame, is that the bit rate can be maintained at a low level. For example, if all other conditions are met but the user only wants a maximum of four forced intra-coded MBs per frame, then the user or user's video unit may be allowed to choose not to intra-code any more MBs when that maximum is reached.
  • any layered scalable video codec can be readily adapted to incorporate the preferred embodiment of the present invention, of periodically incorporating intra-coded macroblocks within the enhancement layer bit-stream(s).
  • the adaptation or programming of code or threshold levels may be implemented in the respective video communication unit in any suitable manner.
  • new apparatus may be added to a conventional video communication unit, or alternatively existing parts of a conventional video communication unit may be adapted, for example by reprogramming one or more processors therein.
  • the required adaptation may be implemented in the form of processor-implementable instructions stored on a storage medium, such as a floppy disk, hard disk, PROM, RAM or any combination of these or other storage multimedia.
  • any video communication unit operating in a video communication system, for example user/subscriber equipment, such as mobile or portable radios or telephones.
  • the proposed error resilience technique using forced intra refresh in a scalable enhancement layer provides the advantage of improved video picture quality in an error-prone communication environment, such as mobile or wireless communication systems, or in non-guaranteed quality of service environments, such as the internet.
  • inventive concepts hereinbefore described provide a particular improvement in the video picture quality.
  • the present invention has been described with reference to replacing macroblocks in a scalable video sequence, where the macroblock is consistent with the definition in the MPEG-4 standard.
  • inventive concepts hereinbefore described could be applied to any format or number of video/image data bits contained within a frame, and not therefore limited to the particular macroblock configuration described.
  • inventive concepts may be applied to any video communication unit and/or video communication system including, inter alia, arbitrary-shaped object encoding, as described in MPEG4.
  • inventive concepts find particular use in wireless (radio) devices such as mobile telephones/mobile radio units and associated wireless communication systems.
  • wireless communication units may include a portable or mobile PMR radio, a personal digital assistant, a laptop computer or a wirelessly networked PC.
  • scalable video system technology may be implemented in the 3rd generation (3G) of digital cellular telephones, commonly referred to as the Universal Mobile Telecommunications
  • Scalable video system technology may also find applicability in the packet data variants of both the current 2nd generation of cellular telephones, commonly referred to as the general packet-data radio system (GPRS), and the TErrestrial Trunked RAdio (TETRA) standard for digital private and public mobile radio systems.
  • GPRS general packet-data radio system
  • TETRA TErrestrial Trunked RAdio
  • scalable video system technology may also be utilised in the Internet. The aforementioned inventive concepts will therefore find applicability in, and thereby benefit, all these emerging technologies.
  • the video sequence includes at least one base layer and at least one enhancement layer comprising a number of inter-coded macroblocks.
  • the method includes the step of replacing at least one inter-coded macroblock with an intra-coded macroblock in one or more enhancement layers of the video sequence.
  • the video communication system includes a video encoder comprising a processor for encoding a video sequence into a scalable video sequence having at least one enhancement layer.
  • Macroblock replacement means are operably coupled to the processor to replace at least one inter-coded macroblock with an intra-coded macroblock in one or more enhancement layers of the video sequence in response to the determined parameter.
  • a transmitter transmits the scalable video sequence with said at least one replaced macroblock.
  • the video communication system also includes a video decoder comprising a receiver for receiving the macroblock replaced scalable video sequence from the video encoder and macroblock interpreting means operably coupled to the receiver to interpret whether a received macroblock is a replaced intra-coded macroblock in one or more enhancement layers of the video sequence.
  • a video communication unit, an adapted video encoder, an adapted video decoder, and a mobile radio device incorporating any one of these units, have also been described.
  • inventive concepts contained herein are equally applicable to any suitable video or image transmission system. Whilst specific, and preferred, implementations of the present invention are described above, it is clear that one skilled in the art could readily apply variations and modifications of such inventive concepts.
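
The decision flow described above for FIG. 5 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the pseudo code stored in the compressor function 460: the names RefreshConfig and choose_forced_intra are hypothetical, the counter is assumed to count frames since a macroblock position was last intra-coded, the four criteria are assumed to be combined with a logical AND, and the SignalThreshold default of -95 dBm is an invented placeholder (the text only says the value is radio network dependent). The values 132, 33 dB and four MBs per frame are the examples given in the text.

```python
from dataclasses import dataclass


@dataclass
class RefreshConfig:
    """Thresholds for forced intra refresh (names and layout are assumptions)."""
    frequency_threshold: int = 132        # FrequencyThreshold, example value from the text
    signal_threshold_dbm: float = -95.0   # SignalThreshold, placeholder value (assumption)
    psnr_threshold_db: float = 33.0       # PSNRThreshold, example value from the text
    max_forced_per_frame: int = 4         # per-frame cap for a QCIF frame


def choose_forced_intra(counts, rssi_dbm, prev_psnr_db, cfg):
    """Pick enhancement-layer MBs to force to intra for one frame.

    counts[i]       frames since MB position i was last intra-coded
                    (assumed meaning of MB_Intra_Count[i]); updated in place
    rssi_dbm        RSSI fed back by the decoder for this frame
    prev_psnr_db[i] PSNR of the reconstructed co-located MB in the previous frame
    Returns the list of MB indices to intra-code.
    """
    forced = []
    for i in range(len(counts)):
        due = counts[i] > cfg.frequency_threshold                # step 520
        weak_link = rssi_dbm < cfg.signal_threshold_dbm          # step 540
        poor_picture = prev_psnr_db[i] < cfg.psnr_threshold_db   # step 550
        under_cap = len(forced) < cfg.max_forced_per_frame
        if due and weak_link and poor_picture and under_cap:
            forced.append(i)   # step 560: force an intra-coded MB
            counts[i] = 0      # step 570: reset the count
        else:
            counts[i] += 1     # step 515: increment the count
    return forced
```

For example, with frequency_threshold=2, counts=[3, 3, 0], an RSSI of -100 dBm and previous PSNR values of [30, 35, 30] dB, only the first macroblock satisfies all criteria: the second has acceptable PSNR and the third was refreshed too recently.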

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

A method for improving a quality of a scalable video sequence communicated over an error-prone network (500). The video sequence includes at least one base layer and at least one enhancement layer comprising a number of inter-coded macroblocks. The method includes the step of replacing (560) at least one inter-coded macroblock with an intra-coded macroblock in one or more enhancement layers of the video sequence. Replacing selected inter-coded macroblocks with intra-coded macroblocks in one or more enhancement layers provides improved video picture quality in an error-prone communication environment such as mobile or wireless communication systems or non-guaranteed quality of service environments such as the internet. The technique limits propagation of errors within the enhancement layer(s) of the video sequence.

Description

Scalable Video Communication
Field of the Invention
This invention relates to video transmission systems and video encoding/decoding techniques. The invention is applicable to, but not limited to, a method for protecting enhancement layer predicted (EP) pictures in a scalable video compression system that may be subject to errors.
Background of the Invention
In the field of video technology, it is known that video is transmitted as a series of still images/pictures. Since the quality of a video signal can be affected during coding or compression of the video signal, it is known to include additional information or 'layers', based on a difference between the video signal and the encoded video bit stream. The inclusion of additional layers enables the quality of the received signal, following decoding and/or decompression, to be enhanced. Hence, a hierarchy of base pictures and enhancement pictures, partitioned into one or more layers, is used to produce a layered video bit stream.
Scalability refers to the ability to transmit and receive video signals of more than one resolution and/or quality simultaneously. A scalable video bit-stream is one that may be decoded at different rates, according to the bandwidth available at the decoder. This enables the user with access to a higher bandwidth channel to decode high quality video, whilst a lower bandwidth user is still able to view the same video, albeit at a lower quality. The main application for scalable video transmission is where multiple decoders, with access to differing bandwidths, are receiving images from a single encoder.
In a layered (scalable) video bit stream, enhancements to the video signal may be added to a base layer either by:
(i) Increasing the resolution of the picture (spatial scalability);
(ii) Including error information to improve the Signal to Noise Ratio of the picture (SNR scalability);
(iii) Including extra pictures to increase the frame rate (temporal scalability); or
(iv) Providing a continuous enhancement that may be truncated at any chosen bit rate (Fine Granular Scalability).
Such enhancements may be applied to the whole picture or to an arbitrarily shaped object within the picture, which is termed object-based scalability. In order to preserve the disposable nature of the temporal enhancement layer, the ITU H.263+ standard [ITU-T Recommendation H.263, "Video Coding for Low Bit Rate Communication"] states that pictures included in the temporal scalability mode should be bi-directionally predicted (B) pictures, as shown in the video stream of FIG. 1.
FIG. 1 shows a schematic illustration of a scalable video arrangement 100 illustrating B picture prediction dependencies, as known in the field of video coding techniques. An initial intra-coded frame (I1) 110 is followed by a bi-directionally predicted frame (B2) 120. This, in turn, is followed by a (uni-directional) predicted frame (P3) 130, and again followed by a second bi-directionally predicted frame (B4) 140. This again, in turn, is followed by a (uni-directional) predicted frame (P5) 150, and so on.
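The disposable nature of B pictures can be illustrated with a small Python sketch. The reference lists below are an assumption inferred from the description of FIG. 1 (each B picture is predicted from its neighbouring I or P pictures, each P picture from the preceding I or P picture), and the function name is hypothetical. A picture is treated as disposable when no other picture is predicted from it.

```python
# Reference pictures for each picture of FIG. 1. These lists are assumed
# from the textual description, not read from the figure itself.
references = {
    "I1": [],
    "B2": ["I1", "P3"],
    "P3": ["I1"],
    "B4": ["P3", "P5"],
    "P5": ["P3"],
}


def disposable(refs):
    """Return the pictures that no other picture uses as a reference."""
    used = {ref for deps in refs.values() for ref in deps}
    return sorted(pic for pic in refs if pic not in used)


print(disposable(references))
```

Under these assumed dependencies only B2 and B4 are reported, which is why dropping the temporal enhancement pictures does not disturb the decoding of the remaining pictures.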
As an enhancement to the arrangement of FIG. 1, a layered video bit stream may be used. FIG. 2 is a schematic illustration of a layered video arrangement, known in the field of video coding techniques. A layered video bit stream includes a base layer 205 and one or more enhancement layers 235.
The base layer (layer-1) includes one or more intra-coded pictures (I pictures) 210 sampled, coded and/or compressed from the original video signal pictures. Furthermore, the base layer will include a plurality of subsequent predicted inter-coded pictures (P pictures) 220, 230 predicted from the intra-coded picture(s) 210. Inter-coded pictures are encoded to include the changes between the current picture and a previous picture.
In the enhancement layers (layer-2 or layer-3 or higher layer(s)) 235, three types of picture may be used:
(i) Bi-directionally predicted (B) pictures (not shown);
(ii) A first enhanced intra-coded (EI) picture 240 based on the intra-coded picture(s) 210 of the base layer 205, or the current lower layer enhancement picture if more than one enhancement layer is used; and
(iii) Enhanced predicted (EP) pictures 250, 260, based on the inter-coded predicted pictures 220, 230 of the base layer 205. EP pictures 250, 260 contain macroblocks that are predicted from either the current lower layer picture or the previous picture within the same enhancement layer.
In video coding systems using ITU H.263 [ITU-T Recommendation H.263, "Video Coding for Low Bit Rate Communication"] video compression technology, the image data are compressed in macroblocks (MBs). These MBs comprise four luma and two chroma blocks of 8x8 pixels. This definition of macroblock is also used in the MPEG family of standards.
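As a rough illustration of the macroblock structure just described, the following Python sketch counts the macroblocks and 8x8 blocks in a frame. The 16x16 luma footprint follows from four 8x8 luma blocks in a 2x2 arrangement. The QCIF luma size of 176x144 pixels and the function name are assumptions added for this example.

```python
# Illustrative only: the names and the QCIF example size are assumptions.
BLOCK = 8                     # each block is 8x8 pixels
LUMA_BLOCKS_PER_MB = 4        # 2x2 arrangement covering 16x16 luma pixels
CHROMA_BLOCKS_PER_MB = 2      # one block for each chroma component
MB_SIZE = 2 * BLOCK           # 16 luma pixels per macroblock side


def macroblock_grid(width, height):
    """Return (columns, rows, total) of macroblocks for a luma frame size."""
    if width % MB_SIZE or height % MB_SIZE:
        raise ValueError("frame dimensions must be multiples of 16")
    cols, rows = width // MB_SIZE, height // MB_SIZE
    return cols, rows, cols * rows


cols, rows, total = macroblock_grid(176, 144)    # QCIF luma resolution
blocks = total * (LUMA_BLOCKS_PER_MB + CHROMA_BLOCKS_PER_MB)
print(cols, rows, total, blocks)
```

For the assumed QCIF size this gives an 11 by 9 grid, that is 99 macroblocks and 594 blocks per frame.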
The vertical arrows from the lower, base layer illustrate that the picture in the enhancement layer is predicted from a reconstructed approximation of that picture in the reference (lower) layer.
If prediction is only formed from the lower layer, then the enhancement layer picture is referred to as an EI picture. It is possible, however, to create a modified bi-directionally predicted picture using both a prior enhancement layer picture and a temporally simultaneous lower layer reference picture. This type of picture is referred to as an EP picture or "Enhancement" P-picture. The prediction flow for EI and EP pictures is shown in FIG. 2. Although not specifically shown in FIG. 2, an EI picture in an enhancement layer may have a P picture as its lower layer reference picture, and an EP picture may have an intra-coded picture as its lower-layer reference picture.
For both EI and EP pictures, the prediction from the reference layer uses no motion vectors. However, as with normal P pictures, EP pictures use motion vectors when predicting from their temporally prior reference picture in the same layer.
Current standards incorporating the aforementioned scalability techniques include MPEG-4 and H.263. These standards create highly compressed bit-streams, which represent the coded video. However, due to the use of high compression, the bit-streams are very prone to corruption by network errors during the transmission process. For example, in the case of streaming video over an error prone network, even with existing network level error resilience tools employed, it is inevitable that some bit-level corruption will occur in the bit-stream. Hence, errors are passed on to the decoder.
When transmitting packets containing scalable video data over an error-prone network, it is desirable that the packets containing the base layer data are given a higher priority and that they are encoded with more error-protection than packets containing enhancement layer data. This lower level of error-protection for enhancement layer data means that enhancement layer data is more likely to be affected by errors. Apart from the first enhancement picture, all SNR or spatial enhancement pictures (EP pictures) are predicted, either from the current lower layer picture or from the previous picture in the same enhancement layer. Clearly, if an error affects the picture that they have been predicted from, they too will contain errors. Hence, enhancement layer data is more susceptible to error propagation.
In the base layer, macroblocks in the P pictures can be intra-coded at regular intervals (for example, H.263 requires every macroblock to be intra-coded at least once every one hundred and thirty two frames), to reduce error propagation. In the enhancement layer, certain macroblocks in EP pictures can be predicted solely from the base layer picture, to reduce the impact of error propagation.
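The effect of an intra refresh on error propagation can be shown with a deliberately simplified Python model. It tracks one macroblock position only and assumes each inter-coded macroblock is predicted from the same position in the previous frame. Motion compensation from neighbouring positions is ignored, and all names are hypothetical.

```python
def propagate(num_frames, error_frame, refresh_frame=None):
    """Track whether one macroblock position is corrupted in each frame.

    An inter-coded macroblock inherits corruption from its reference (the
    same position in the previous frame); an intra-coded one does not.
    """
    corrupted = False
    history = []
    for t in range(num_frames):
        if t == refresh_frame:
            corrupted = False      # intra-coded: no dependence on the past
        if t == error_frame:
            corrupted = True       # a transmission error hits this frame
        history.append(corrupted)
    return history


print(propagate(6, error_frame=1))                    # no refresh
print(propagate(6, error_frame=1, refresh_frame=4))   # refresh at frame 4
```

Without a refresh the corruption introduced in frame 1 persists to the end of the sequence, whereas with an intra-coded macroblock in frame 4 it is cleared from that frame onwards.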
Error resilience methods described in the standards allow re-synchronisation markers to be added to the bit stream, so that if an error corrupts the bit stream, earlier recovery of error-free data can be achieved. Data partitioning, whereby the header, motion vector and texture information are separated, also contributes to error resilience. The MPEG4 standard allows for reversible Variable Length Codes, so that if an error corrupts the start of the code, it can be decoded from the end instead of the start, thereby increasing the recovery of error-free data that would ordinarily be discarded.
However, the aforementioned approaches have the disadvantage that once an error has corrupted an enhancement layer, there is still no known method of preventing its propagation into subsequent pictures, if the base layer is also affected by errors. Furthermore, a current known technique, of introducing an EI macroblock, will not solve this problem, as it is possible that the macroblock has been predicted from a base layer macroblock also in error.
A need therefore exists for improved error resilience in a scalable video transmission system, wherein the abovementioned disadvantages may be alleviated. Prior art arrangements are known for non-scalable video systems. See for example EP-A-0798938, EP-A-0633699, EP-A-0536630 and WO-A-9811502.
Statement of Invention
The present invention provides a method for improving a quality of a scalable video sequence communicated over an error-prone network, see claim 1, a video communication system, as claimed in claim 10, a video communication unit, as claimed in claim 18, a video encoder, as claimed in claim 19, a video decoder, as claimed in claim 20, a mobile radio device, as claimed in claim 21, and a storage medium storing processor-implementable instructions, as claimed in claim 23. Further aspects of the present invention are as claimed in the dependent claims.
In summary, an apparatus and a method for improving the quality of scalable video enhancement layers transmitted over an error-prone network are described. The apparatus and method use periodic replacement of inter-coded macroblocks (MBs) in each EP or EI picture within an enhancement layer of a video sequence with intra-coded MBs. In this manner, propagation of errors within the enhancement layer(s) can be reduced.
Brief Description of the Drawings
FIG. 1 is a schematic illustration of a video coding arrangement showing picture prediction dependencies, as known in the field of video coding techniques.
FIG. 2 is a schematic illustration of a known layered video coding arrangement.
Exemplary embodiments of the present invention will now be described, with reference to the accompanying drawings, in which:
FIG. 3 is a block diagram of a video communication unit, adapted to introduce substantially periodic intra-coded MBs in an enhancement layer of a video sequence, in accordance with the preferred embodiment of the present invention.
FIG. 4 is a schematic representation of a scalable video communication system adapted to introduce substantially periodic intra-coded MBs in an enhancement layer of a video sequence in accordance with the preferred embodiment of the present invention.
FIG. 5 is a flowchart illustrating the preferred method and parameters used when introducing substantially periodic intra-coded MBs in an enhancement layer of a video sequence.
Description of Preferred Embodiments
Referring now to FIG. 3, a block diagram of a video subscriber unit 300, adapted to support the inventive concepts of the preferred embodiments of the present invention, is shown.
The preferred embodiment of the present invention is described with respect to a wireless video communication unit, for example one capable of operating in the 3rd generation partnership project (3GPP) standard for future cellular wireless communication systems. However, it is within the contemplation of the invention that the inventive concepts herein described are equally applicable for other wireless or fixed communication units capable of video transmissions. The video subscriber unit 300 contains an antenna 302 preferably coupled to a duplex filter, antenna switch or circulator 304 that provides isolation between receive and transmit chains within the video subscriber unit 300.
The receiver chain includes receiver front-end circuitry 306, effectively providing reception, filtering and intermediate or base-band frequency conversion. The front-end circuit 306 receives video signal transmissions from another communication unit, for example its associated Node B or base transceiver station (BTS). The front-end circuit 306 is serially coupled to a signal processing function 308, generally realised by a digital signal processor (DSP). The signal processing function 308 performs signal demodulation, error correction and formatting, and recovers the video data bit-stream. Recovered data from the signal processing function 308 is serially coupled to a video codec function 309, the operation of which is further described with respect to FIG. 4.
As known in the art, received signals that have been decrypted by the video codec function 309 are typically input to a baseband processing device 310, which takes the video decoded information received from the video codec function 309 and formats it in a suitable manner to send to a video display 311.
In different embodiments of the invention, the signal processing function 308, video codec function 309 and baseband processing function 310 may be provided within the same physical device. A controller 314 is configured to control the information flow and operational state of the elements of the subscriber unit 300.
In accordance with the preferred embodiment of the invention, in the video subscriber unit 300, the video codec function 309 in the receive path (i.e. the decoder) has been adapted to interpret information transmitted from the encoding unit. The video codec function also identifies within the bit-stream of the received video signal which macroblocks within the enhancement layer have been periodically intra-coded. The decoder has been further adapted by providing parameter information back to the encoder, wherein the parameter information relates to a received video sequence, such as the received signal level, received bit error rate, etc.
As regards the transmit chain, this essentially includes a video input device 320 coupled in series through a baseband processor 310, video codec function 309, signal processing function 308, transmitter/modulation circuitry 322 and a power amplifier 324. The processor 308, transmitter/modulation circuitry 322 and the power amplifier 324 are operationally responsive to the controller, with an output from the power amplifier coupled to the duplex filter, antenna switch or circulator 304, as known in the art.
The transmit chain in the video subscriber unit 300 takes the video input bit-stream from the video input device 320 and the video codec function 309 encodes the video bit-stream into scalable layers as described in greater detail with respect to FIG. 4. The video encoded information is then passed to the signal processor where it is formatted to include, for example, error protection, for subsequent modulation and transmission by transmit/modulation circuitry 322 and power amplifier 324.
In the preferred embodiment of the present invention, the timer/counter 318 for encoding functions is also adapted to count numbers of macroblocks in a video transmission sequence such that intra-coded macroblocks can be periodically introduced. Furthermore, the encoder syntax has been changed so that the replacing of an inter-coded MB with an intra-coded MB is signalled to the decoder.
The scalable video encoder is adapted such that it interprets feedback information received from the video decoder, for example via a return channel or other feedback device which may support the Real-Time Control Protocol (RTCP) about the received signal strength and/or received error rate. The scalable video encoder is preferably also adapted such that it deduces likely error conditions on the communication link, based on such parameter information.
In response to these feedback conditions, and other parameters that will be subsequently described, the encoder will replace certain inter-coded MBs with intra-coded MBs, which are calculated directly from the enhancement layer data. Appearance of these MBs will be signalled in the bit-stream by a change to the MB header, such that the decoder may recognise when an intra MB occurs.
The signal processor function 308, video codec function 309 and baseband processing function 310 in the transmit chain may be implemented as distinct from the corresponding functions in the receive chain. Alternatively, a single processor may be used to implement corresponding processing operations of both transmit and receive signals, as shown in FIG. 3. In a yet further alternative embodiment, the signal processor function 308, video codec function 309 and baseband processing function 310, for both transmit and receive chains, may be combined into a single processor, for example a digital signal processor (DSP).
Of course, the various components within the video subscriber unit 300 can be realised in discrete or integrated component form, with an ultimate structure therefore being merely an arbitrary selection.
Referring next to FIG. 4, a schematic representation of the primary functions of a video communication system 400 is shown. The video communication system includes a video encoder 415 and video decoder 425, adapted to incorporate the preferred embodiment of the present invention. In FIG. 4, a video picture F0 is compressed 410 in a video encoder 415 to produce the base layer bit stream signal to be transmitted at a rate r1 kilobits per second (kbps). This signal is decompressed 420 at a video decoder 425 to produce the reconstructed base layer picture F0'.
The compressed base layer bit stream is also decompressed at 430 in the video encoder 415 and compared with the original picture F0 at 440 to produce a difference signal 450. This difference signal is compressed at 460 and transmitted as the enhancement layer bit stream at a rate r2 kbps. This enhancement layer bit stream is decompressed at 470 in the video decoder 425 to produce the enhancement layer picture F0'' which is added to the reconstructed base layer picture F0' at 480 to produce the final reconstructed picture F0'''.
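The two-layer data flow of FIG. 4 can be sketched in Python as follows. This is a minimal illustration, not the codec of the preferred embodiment: uniform quantisation stands in for the compression functions, a picture is represented as a flat list of sample values, and the function names and step sizes are assumptions. The reference numerals in the comments map each line to the corresponding function in FIG. 4.

```python
def compress(samples, step):
    """Stand-in 'compression': uniform quantisation to integer levels."""
    return [round(s / step) for s in samples]


def decompress(levels, step):
    """Inverse of compress(), up to the quantisation error."""
    return [q * step for q in levels]


def encode_layers(picture, base_step, enh_step):
    base = compress(picture, base_step)                         # 410
    base_recon = decompress(base, base_step)                    # 430
    difference = [p - b for p, b in zip(picture, base_recon)]   # 440, 450
    enh = compress(difference, enh_step)                        # 460
    return base, enh


def decode_layers(base, enh, base_step, enh_step):
    base_recon = decompress(base, base_step)                    # 420: F0'
    enh_recon = decompress(enh, enh_step)                       # 470: F0''
    final = [b + e for b, e in zip(base_recon, enh_recon)]      # 480: F0'''
    return base_recon, final


picture = [10, 37, 52, 200]                                     # F0
base, enh = encode_layers(picture, base_step=16, enh_step=2)
base_recon, final = decode_layers(base, enh, base_step=16, enh_step=2)
print(base_recon, final)
```

A decoder that receives only the base layer reconstructs `base_recon`, while a decoder that also receives the enhancement layer obtains `final`, whose error is bounded by the finer enhancement step rather than by the coarse base step.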
In accordance with the preferred embodiment of the present invention, the compression function 460 in the video encoder 415 has been adapted to enable one or more intra-coded macroblocks to replace inter-coded macroblocks in one or more of the enhancement layer bit-streams.
Furthermore, the decompression function 470 in the video decoder 425 has been adapted to recognise the location(s) of the incorporated intra-coded macroblocks in the enhancement layer bit-stream, when it would ordinarily expect inter-coded macroblocks. The decompression function 470 is able to do this in response to signalling information within the bit-stream of the video signal received from the video encoder 415, preferably in the MB header.
In accordance with the preferred embodiment of the present invention, the decision on which macroblocks are selected for periodic intra-coding is preferably made in response to a number of parameters, for example radio signal strength indication (RSSI) measurements in the area of the video communication unit. The video encoder has been adapted to interpret feedback information, such as RSSI information or received error rates, transmitted from the video decoder 425 on a return channel 490. The return channel 490 may comprise any feedback device, for example one supporting RTCP.
The video encoder 415 then utilises this information in determining which enhancement layer inter-coded MBs to replace with intra-coded MBs in the encoded video sequence. The use of such incorporated intra-coded macroblocks in the enhancement layer bit-stream is further described with regard to the flowchart of FIG. 5.
Referring now to FIG. 5, a flowchart 500 illustrates the preferred method for deciding which macroblock (MB) frames are to be replaced with intra-coded frames in a substantially periodic manner.
A 'FrequencyThreshold' parameter is used to control the periodicity of the forced intra-coded MB frames. It is within the contemplation of the present invention that the FrequencyThreshold parameter may be user defined, via an input port of the communication device. Alternatively, the FrequencyThreshold parameter may be fixed, for example by setting the parameter value to be '132' during manufacture, similar to the H.263 standard.
When each video frame is processed 510 in the compressor function 460, a counter is compared to the stored FrequencyThreshold parameter, as in step 520. If the threshold has not been exceeded in step 520, a frequency counter is increased, as shown in step 515. In this manner, a pre-defined distance can periodically separate the forced intra-coded MBs. The minimum distance between forced intra-coded MBs is determined by the allocated FrequencyThreshold parameter. In the preferred embodiment of the invention, it is envisaged that information on whether any macroblocks in the same spatial location in previous frames had been encoded as forced intra MBs is also taken into account. If the threshold has been exceeded in step 520, a determination is preferably made, in step 530, as to whether any corresponding MB, located at the same position within a preceding frame, is a forced enhancement intra-coded MB. If one or more previous MBs at the same position is a forced enhancement intra-coded MB, then it is not required to force an intra MB in this position, and the counter is incremented, as shown in step 515. The processing then continues with the next MB in sequence.
However, if none of the previous MBs at the same position within the preceding frames is a forced enhancement intra-coded MB, in step 530, then an RSSI measurement (or equivalent error metric) of the received video transmission is compared against a SignalThreshold value, as shown in step 540. It is envisaged that the SignalThreshold value would be radio network dependent. If the RSSI measurement for the MB or frame of the received video transmission is less than the SignalThreshold value in step 540, then it may be required to force an intra MB, so the processing proceeds to step 550. Otherwise, a forced MB is not required, and the counter is incremented, as in step 515. The processing then continues with the next MB in sequence.
In accordance with an improved embodiment of the present invention, it is envisaged that the peak signal to noise ratio (PSNR) of the reconstructed macroblock at the same spatial location in the one or more previous frames can also be used in the decision process. Hence, if the RSSI measurement for the MB or the frame of the received video transmission is less than the SignalThreshold value, in step 540, then a PSNR measurement of the previous received video frame is compared against a PSNR threshold, as shown in step 550. If the PSNR measurement for the previous received video frame is less than the PSNR threshold, then it would be advantageous to force an intra-coded MB. Hence, instead of the expected inter-coded MB, the MB is forced to be encoded as an intra MB, as shown in step 560. Otherwise, a forced MB is not required, and the counter is incremented, as shown in step 515. The processing then continues with the next MB in sequence.
In the context of the preferred embodiment of the present invention, PSNR refers to the picture quality, whereas RSSI refers to the radio link quality. A low PSNR means lots of compression, and consequently more susceptibility to errors. Thus, the use of a forced intra-coded MB benefits low PSNR scenarios.
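For reference, PSNR is conventionally computed from the mean squared error between an original and a reconstructed block. The Python sketch below uses that standard definition for 8-bit samples with a peak value of 255. The definition is not stated in the text, so it should be read as an assumption about how the PSNR measurement would be obtained, and the function name is hypothetical.

```python
import math


def psnr(original, reconstructed, peak=255):
    """Peak signal to noise ratio in dB between two equal-length sample lists."""
    mse = sum((o - r) ** 2 for o, r in zip(original, reconstructed)) / len(original)
    if mse == 0:
        return math.inf            # identical blocks: no noise at all
    return 10 * math.log10(peak ** 2 / mse)


block = [100, 120, 140, 160]
print(psnr(block, [101, 119, 141, 159]))   # small error, higher PSNR
print(psnr(block, [110, 110, 150, 150]))   # larger error, lower PSNR
```

A coarser reconstruction gives a larger mean squared error and therefore a lower PSNR, which is the condition under which the forced intra-coded MB is considered beneficial.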
It is envisaged that the PSNRThreshold is made dependent upon the bit rates available in the communication unit to encode the base layer and the enhancement layer, and thereby the PSNR value that should be targeted. However, when encoding at say, less than 32kbps, a PSNRThreshold of 33dB is selected in the preferred embodiment. When encoding at say, between 32kbps and 64kbps a PSNRThreshold of 36dB could be selected in the preferred embodiment. When encoding at say, over 64kbps a PSNRThreshold of 40dB could be selected in the preferred embodiment. In general, the preferred embodiment would select for the PSNRThreshold value the target PSNR used in the rate control algorithm of the encoder.
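The example threshold values above can be written as a small lookup, sketched here in Python. The function name is hypothetical, and the treatment of the exact boundary rates (32 kbps and 64 kbps are placed in the middle band) is an assumption, since the text does not specify it.

```python
def psnr_threshold_db(bit_rate_kbps):
    """Return the example PSNRThreshold (dB) for an encoding bit rate."""
    if bit_rate_kbps < 32:
        return 33
    if bit_rate_kbps <= 64:        # boundary handling is an assumption
        return 36
    return 40


for rate in (24, 48, 128):
    print(rate, psnr_threshold_db(rate))
```

In an encoder that already runs a rate control algorithm, this lookup would simply be replaced by the target PSNR of that algorithm, as the text suggests.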
If the PSNR measurement for the previous received video frame is less than the PSNR threshold, then the current MB is modified to be an intra-coded MB, as shown in step 560. In this manner, when the preferred criteria of periodicity, signal to noise ratio, received signal strength level and location of previous forced intra-coded MBs have been satisfied, then a suitable location of a forced intra-coded MB has been determined.
The intra-coded MB count is then set to a numerical '0' value, as in step 570, and the process repeats for that particular video transmission.
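The decision sequence of steps 520 to 570 can be summarised in a short Python sketch for a single enhancement layer macroblock. It follows the flowchart as described in the text, in which the counter is incremented on every path that does not force an intra MB. The names, the default signal threshold and its unit (dBm) are assumptions, since the text only says that the SignalThreshold is radio network dependent.

```python
from dataclasses import dataclass


@dataclass
class Thresholds:
    frequency: int = 132       # FrequencyThreshold (example value from the text)
    signal: float = -90.0      # SignalThreshold in dBm (assumed value and unit)
    psnr_db: float = 36.0      # PSNRThreshold in dB


def decide(counter, recently_forced, rssi, prev_psnr_db, th):
    """Return (force_intra, new_counter) for one enhancement layer MB."""
    if (counter > th.frequency               # step 520
            and not recently_forced          # step 530
            and rssi < th.signal             # step 540
            and prev_psnr_db < th.psnr_db):  # step 550
        return True, 0                       # steps 560 and 570
    return False, counter + 1                # step 515


th = Thresholds()
print(decide(133, False, -100.0, 30.0, th))   # all criteria met
print(decide(10, False, -100.0, 30.0, th))    # threshold not yet exceeded
```

Chaining the four tests with `and` mirrors the flowchart order, so the later measurements are only consulted once the earlier criteria have been satisfied.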
Advantageously, the use of such additional information, as provided by the above-preferred embodiment, ensures that the selection of appropriate macroblocks will be more accurate. This negates the need to perform a complete intra-coded macroblock operation on an enhancement video stream, shortly following an EI macroblock. As more bits are needed to encode an intra-coded macroblock, only certain macroblocks are intra-coded per frame.
It is within the contemplation of the present invention that the aforementioned parameters are examples of any number of parameters that can be used to effect the forced intra-coded refresh operation. Furthermore, although the preferred embodiment has been described with respect to four parameters, it is envisaged that any combination of these parameters could be used to improve the performance of enhancement layer transmissions. Additionally, it is envisaged that the configuration to effect the forced introduction of intra-coded MBs can be based upon the parameters falling below, or exceeding, particular thresholds, as appropriate to the particular implementation. In the preferred embodiment of the present invention, the compressor function 460 has been adapted to store and run the following algorithm/pseudo code:
Preferred algorithm:
If (MB_Intra_Count[i] > FrequencyThreshold)
{
    If (MBTYPE[i][t-1] AND MBTYPE[i][t-2] AND MBTYPE[i][t-3] != ENHANCEMENT_INTRA)
    {
        If (Radio_Signal_Strength < SignalThreshold AND PSNR[i][t-1] < PSNRThreshold)
            MBTYPE[i][t] = INTRA
    }
    MB_Intra_Count[i] = 0
}
else
    MB_Intra_Count[i]++
END
In the preferred algorithm, as detailed above, the following definitions are used: MBTYPE[i][t] is the macroblock type for the i-th macroblock in frame t; and MB_Intra_Count[i] counts the number of macroblocks over previous and current frames that have not been intra-coded.
This algorithm requires additional bits to be encoded, as MBs must be intra-coded rather than inter-coded. However, the inventors of the present invention have determined that the decrease in coding efficiency in the enhancement layer would be less than 4%, as only one or two macroblocks per frame would need to be intra-coded. Advantageously, the gain in visual quality far outweighs the additional bit expenditure, especially in poor error conditions where the base layer was being corrupted.
To avoid an excessive decrease in coding efficiency, the number of intra-coded MBs should be limited to a maximum number per frame. This limit can be user definable, and will depend on the level of decrease in coding efficiency that is acceptable to the user. Typically, this may be between zero and four MBs per quarter common intermediate format (QCIF) sized frame, and pro-rata for other image sizes. One benefit in using an extra criterion, such as setting a maximum limit of intra-coded MBs per frame, is that the bit rate can be maintained at a low level. For example, if all other conditions are met but the user only wants a maximum of four forced intra-coded MBs per frame, then the user or user's video unit may be allowed to choose not to intra-code any more MBs when that maximum is reached.
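A per-frame limit of this kind might be applied as in the following Python sketch. It assumes that candidate macroblocks are considered in scan order and that forcing simply stops once the limit is reached, which is one reading of the example above. The function names, the QCIF luma size of 176x144 pixels and the integer pro-rata scaling are assumptions.

```python
def apply_frame_cap(candidates, max_per_frame=4):
    """Keep at most max_per_frame forced intra MBs, taken in scan order."""
    forced = []
    for mb_index in candidates:
        if len(forced) >= max_per_frame:
            break                  # limit reached: intra-code no more MBs
        forced.append(mb_index)
    return forced


def scaled_cap(width, height, cap_qcif=4):
    """Scale a QCIF limit pro rata with the number of macroblocks."""
    qcif_mbs = (176 // 16) * (144 // 16)     # 99 macroblocks per QCIF frame
    mbs = (width // 16) * (height // 16)
    return cap_qcif * mbs // qcif_mbs


print(apply_frame_cap([3, 17, 42, 58, 71, 90]))
print(scaled_cap(352, 288))                  # CIF has four times the MBs
```

With a limit of four, only the first four qualifying macroblocks of the frame are forced, and a CIF frame would be allowed sixteen under the assumed pro-rata rule.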
It is within the contemplation of the invention that alternative encoding and decoding configurations could be adapted to use such periodically incorporated intra-coded macroblocks within the enhancement layer bit-stream(s). As a result, the inventive concepts described should not be viewed as being limited to the example configuration provided in FIG. 4, or the flowchart of FIG. 5.
It is further envisaged that any layered scalable video codec can be readily adapted to incorporate the preferred embodiment of the present invention, of periodically incorporating intra-coded macroblocks within the enhancement layer bit-stream(s). More generally, the adaptation or programming of code or threshold levels may be implemented in the respective video communication unit in any suitable manner. For example, new apparatus may be added to a conventional video communication unit, or alternatively existing parts of a conventional video communication unit may be adapted, for example by reprogramming one or more processors therein. As such, the required adaptation may be implemented in the form of processor-implementable instructions stored on a storage medium, such as a floppy disk, hard disk, PROM, RAM or any combination of these or other storage media.
It is also within the contemplation of the invention that such adaptation of a video encoding or video decoding operation may be facilitated by any video communication unit operating in a video communication system, for example user/subscriber equipment, such as mobile or portable radios or telephones.
It will be understood that the proposed error resilience technique using forced intra refresh in a scalable enhancement layer provides the advantage of improved video picture quality in an error-prone communication environment, such as mobile or wireless communication systems, or in non-guaranteed quality of service environments, such as the internet.
One could imagine multimedia services where the base layer of the video stream is free of charge, for example as a preview, and it is the additional quality of the enhancement layer that must be paid for by the customer. In this case, an invention improving the error resilience of the enhancement layer, as described above, would be of significant benefit. In this context, users will only pay for the enhancement layer if it has good resilience to errors in the communication link.
Other error resilience or error concealment methods have been shown to improve visual quality in many instances. However, in the case where errors have corrupted both the base layer and the enhancement layer, the inventive concepts hereinbefore described provide a particular improvement in the video picture quality.
The present invention has been described with reference to replacing macroblocks in a scalable video sequence, where the macroblock is consistent with the definition in the MPEG-4 standard. However, a skilled artisan would recognise that the inventive concepts hereinbefore described could be applied to any format or number of video/image data bits contained within a frame, and are therefore not limited to the particular macroblock configuration described.
It is within the contemplation of the present invention that the aforementioned inventive concepts may be applied to any video communication unit and/or video communication system including, inter alia, arbitrary-shaped object encoding, as described in MPEG-4. In particular, the inventive concepts find particular use in wireless (radio) devices such as mobile telephones/mobile radio units and associated wireless communication systems. Such wireless communication units may include a portable or mobile PMR radio, a personal digital assistant, a laptop computer or a wirelessly networked PC.
Although the preferred embodiment of the present invention has been described with reference to the MPEG-4 standard, scalable video system technology may be implemented in the 3rd generation (3G) of digital cellular telephones, commonly referred to as the Universal Mobile Telecommunications System (UMTS). Scalable video system technology may also find applicability in the packet data variants of both the current 2nd generation of cellular telephones, commonly referred to as the General Packet Radio Service (GPRS), and the TErrestrial Trunked RAdio (TETRA) standard for digital private and public mobile radio systems. Furthermore, scalable video system technology may also be utilised in the Internet. The aforementioned inventive concepts will therefore find applicability in, and thereby benefit, all these emerging technologies.
In summary, a method for improving a quality of a scalable video sequence communicated over an error-prone network has been described. The video sequence includes at least one base layer and at least one enhancement layer comprising a number of inter-coded macroblocks. The method includes the step of replacing at least one inter-coded macroblock with an intra-coded macroblock in one or more enhancement layers of the video sequence.
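By way of illustration only, the replacing step summarised above might be sketched as follows in Python. The sketch assumes that replacement is periodic and driven by a simple count of enhancement-layer macroblocks; the `Macroblock` class, the `force_intra_refresh` function and the `period` parameter are hypothetical names chosen for the example, and a real encoder would re-encode the selected macroblock rather than merely relabel it.

```python
from dataclasses import dataclass
from typing import List

INTER = "inter"
INTRA = "intra"


@dataclass
class Macroblock:
    index: int          # position of the macroblock within its frame
    mode: str = INTER   # enhancement-layer macroblocks start out inter-coded


def force_intra_refresh(frames: List[List[Macroblock]], period: int) -> int:
    """Mark every `period`-th enhancement-layer macroblock as intra-coded.

    The count runs across frame boundaries, so the refreshed position
    shifts from frame to frame unless `period` divides the frame size.
    Returns the number of macroblocks that were replaced.
    """
    if period < 1:
        raise ValueError("period must be at least 1")
    since_last = 0
    replaced = 0
    for frame in frames:
        for macroblock in frame:
            since_last += 1
            if since_last >= period:
                macroblock.mode = INTRA   # stand-in for intra re-encoding
                since_last = 0
                replaced += 1
    return replaced


# Two frames of five macroblocks each, refreshing every fourth macroblock.
frames = [[Macroblock(i) for i in range(5)] for _ in range(2)]
replaced_count = force_intra_refresh(frames, period=4)
```

A smaller `period` gives more frequent refresh and hence greater error resilience, at the cost of the extra bits that intra-coded macroblocks typically require.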
A video communication system has also been described. The video communication system includes a video encoder comprising a processor for encoding a video sequence into a scalable video sequence having at least one enhancement layer. Macroblock replacement means are operably coupled to the processor to replace at least one inter-coded macroblock with an intra-coded macroblock in one or more enhancement layers of the video sequence in response to a determined parameter. A transmitter transmits the scalable video sequence with said at least one replaced macroblock. The video communication system also includes a video decoder comprising a receiver for receiving the macroblock-replaced scalable video sequence from the video encoder and macroblock interpreting means operably coupled to the receiver to interpret whether a received macroblock is a replaced intra-coded macroblock in one or more enhancement layers of the video sequence.
A video communication unit, an adapted video encoder, an adapted video decoder, and a mobile radio device incorporating any one of these units, have also been described.
Generally, the inventive concepts contained herein are equally applicable to any suitable video or image transmission system. Whilst specific, and preferred, implementations of the present invention are described above, it is clear that one skilled in the art could readily apply variations and modifications of such inventive concepts.
Thus, an apparatus and a method for improving error resilience using forced intra refresh in a scalable enhancement layer have been provided, whereby the aforementioned disadvantages with prior art arrangements have been substantially alleviated.

Claims
1. A method for improving a quality of a scalable video sequence communicated over an error-prone network (500), wherein the scalable video sequence includes at least one base layer and at least one enhancement layer comprising a number of inter-coded macroblocks, the method characterised by the step of: replacing (560) at least one inter-coded macroblock with at least one intra-coded macroblock in one or more enhancement layers of the video sequence.
2. The method for improving a quality of a scalable video sequence communicated over an error-prone network according to Claim 1, further characterised by the step of: determining a parameter of the video sequence and replacing a selected at least one inter-coded macroblock with an intra-coded macroblock in response to said parameter.
3. The method for improving a quality of a scalable video sequence communicated over an error-prone network according to Claim 2, wherein the step of determining includes determining how many inter-coded macroblocks to replace with intra-coded macroblocks and/or under what conditions said replacement is to be performed.
4. The method for improving a quality of a scalable video sequence communicated over an error-prone network according to Claim 2 or Claim 3, wherein the step of determining includes counting a number of macroblocks since a previous replacement with an intra-coded macroblock, and the step of replacing is performed if said number of macroblocks exceeds a threshold value that sets a periodicity of said replaced intra-coded macroblocks.
5. The method for improving a quality of a scalable video sequence communicated over an error-prone network according to any of preceding Claims 2 to 4, wherein the step of determining includes determining a received signal level of the received video bit sequence and the step of replacing is performed if said received signal level is below a signal threshold value, for example a signal threshold value that is radio network dependent and sets a threshold level where errors are likely to occur within the error prone network.
6. The method for improving a quality of a scalable video sequence communicated over an error-prone network according to any of preceding Claims 2 to 5, wherein the step of determining includes determining an acceptable PSNR metric for the encoded video bit sequence, and the step of replacing is performed if said PSNR is below a PSNR threshold value.
7. The method for improving a quality of a scalable video sequence communicated over an error-prone network according to any of preceding Claims 2 to 6, wherein the step of determining includes determining a total number of macroblocks to intra-code per frame, and the step of replacing is performed in response to said determination.
8. The method for improving a quality of a scalable video sequence communicated over an error-prone network according to any of preceding Claims 2 to 7, wherein one or more of said parameters is user definable or pre-determined, for example to provide a trade off between error resiliency and bit efficiency.
9. The method for improving a quality of a scalable video sequence communicated over an error-prone network according to any of preceding Claims 1 to 8, wherein the one or more video enhancement layers comprise part of an H.263 or MPEG-4 scalable video bit-stream.
10. A video communication system (400) comprising: a video encoder (415) comprising: a processor (309) for encoding a video sequence into a scalable video sequence having at least one enhancement layer comprising a number of inter-coded macroblocks; macroblock replacement means (460) operably coupled to said processor to replace at least one inter-coded macroblock by an intra-coded macroblock in one or more enhancement layers of the video sequence; and a transmitter, operably coupled to said processor, for transmitting said macroblock replaced scalable video sequence; and a video decoder (425) comprising: a receiver for receiving said scalable video sequence with said at least one replaced macroblock from said video encoder; and macroblock interpreting means (470) operably coupled to said receiver to interpret whether a received macroblock is a replaced intra-coded macroblock in one or more enhancement layers of the video sequence.
11. The video communication system (400) according to Claim 10, wherein said video encoder further comprises: parameter determining means, operably coupled to a processor, to determine a parameter of the scalable video sequence; and said macroblock replacement means (460) replaces an inter-coded macroblock with an intra-coded macroblock in one or more enhancement layers of the video sequence in response to said determined parameter.
12. The video communication system (400) according to Claim 10 or Claim 11 further comprising a feedback path from said video decoder to said video encoder, wherein said video decoder informs said video encoder of a parameter relating to a received video sequence via said feedback path.
13. The video communication system (400) according to Claim 11, wherein said parameter determining means determines a number of macroblocks since a previous replacement with an intra-coded macroblock; and said macroblock replacement means (460) replaces at least one inter-coded macroblock with an intra-coded macroblock if said number of macroblocks exceeds a threshold value that sets a periodicity of said replaced intra-coded macroblocks.
14. The video communication system (400) according to Claim 12, wherein said video decoder receiver determines a received signal level of the received video bit sequence and informs said parameter determining means in said video encoder of said received signal level via said feedback path; and said macroblock replacement means (460) replaces at least one inter-coded macroblock with an intra-coded macroblock if said received signal level is above or below a signal threshold value, for example a signal threshold value that is radio network dependent and sets a threshold level where errors are likely to occur within the error prone network.
15. The video communication system (400) according to Claim 12, wherein said video encoder determines an acceptable PSNR value for encoding the video bit sequence; wherein said macroblock replacement means (460) replaces at least one inter-coded macroblock with an intra-coded macroblock if said PSNR value is below a PSNR threshold value.
16. The video communication system (400) according to Claim 11, wherein said parameter determining means determines a total number of macroblocks to intra code per frame; and wherein said macroblock replacement means (460) replaces at least one inter-coded macroblock with an intra-coded macroblock in response to said determination or wherein said macroblock replacement means (460) is inhibited if said maximum number of macroblocks to intra code per frame is exceeded.
17. The video communication system (400) according to any of preceding Claims 10 to 16, wherein said video communication system (400) supports H.263 or MPEG-4 scalable video bit-streams.
18. A video communication unit (300) adapted for use in the method of any of Claims 1 to 9 or adapted for use in the communication system of any of Claims 10 to 17.
19. A video encoder (415) adapted for use in the method of any of Claims 1 to 9 or adapted for use in the communication system of any of Claims 10 to 17.
20. A video decoder (425) adapted for use in the method of any of Claims 1 to 9 or adapted for use in the communication system of any of Claims 10 to 17.
21. A mobile radio device comprising a video communication unit in accordance with claim 18 or a video encoder in accordance with claim 19 or a video decoder in accordance with claim 20.
22. The mobile radio device of claim 21, wherein the mobile radio device is a mobile phone, a portable or mobile PMR radio, a personal digital assistant, a lap-top computer or a wirelessly networked PC.
23. A storage medium storing processor-implementable instructions for controlling a processor to carry out the method of any of claims 1 to 9.
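The parameters recited in Claims 4 to 7 and Claim 16 could, for example, feed a single replacement decision. The Python sketch below is one possible reading and not a definitive implementation: the `RefreshPolicy` class, the `should_replace` function, the threshold units (dBm for signal level, dB for PSNR) and the choice to combine the individual conditions with a logical OR are all assumptions made for illustration. The claims recite each condition separately and do not prescribe how they are combined.

```python
from dataclasses import dataclass


@dataclass
class RefreshPolicy:
    period_threshold: int     # macroblocks since last refresh (cf. Claim 4)
    signal_threshold: float   # received signal level, assumed dBm (cf. Claim 5)
    psnr_threshold: float     # acceptable PSNR, assumed dB (cf. Claim 6)
    max_intra_per_frame: int  # intra macroblocks allowed per frame (cf. Claims 7, 16)


def should_replace(policy: RefreshPolicy,
                   mbs_since_refresh: int,
                   signal_level: float,
                   psnr: float,
                   intra_in_frame: int) -> bool:
    """Decide whether the next inter-coded macroblock should be intra-coded."""
    # Replacement is inhibited once the per-frame budget has been used up.
    if intra_in_frame >= policy.max_intra_per_frame:
        return False
    # Assumption: any single triggering condition is sufficient.
    return (mbs_since_refresh > policy.period_threshold
            or signal_level < policy.signal_threshold
            or psnr < policy.psnr_threshold)


# Hypothetical thresholds, chosen only to exercise the function.
policy = RefreshPolicy(period_threshold=10, signal_threshold=-95.0,
                       psnr_threshold=30.0, max_intra_per_frame=3)
```

Making the thresholds fields of a policy object reflects the statement in Claim 8 that the parameters may be user definable or pre-determined, so that error resiliency can be traded against bit efficiency.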
PCT/EP2003/000523 2002-01-24 2003-01-20 Scalable video communication WO2003063495A2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU2003206758A AU2003206758A1 (en) 2002-01-24 2003-01-20 Scalable video communication

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
GB0201589A GB2384638B (en) 2002-01-24 2002-01-24 Scalable video communication
GB0201589.9 2002-01-24

Publications (2)

Publication Number Publication Date
WO2003063495A2 true WO2003063495A2 (en) 2003-07-31
WO2003063495A3 WO2003063495A3 (en) 2003-12-24

Family

ID=9929652

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2003/000523 WO2003063495A2 (en) 2002-01-24 2003-01-20 Scalable video communication

Country Status (3)

Country Link
AU (1) AU2003206758A1 (en)
GB (1) GB2384638B (en)
WO (1) WO2003063495A2 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1847200B1 (en) 2004-06-04 2009-07-22 Hill-Rom Services, Inc. Mattress with heel pressure relief portion

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6025888A (en) * 1997-11-03 2000-02-15 Lucent Technologies Inc. Method and apparatus for improved error recovery in video transmission over wireless channels
WO2000008861A1 (en) * 1998-08-07 2000-02-17 Nokia Mobile Phones Limited Adaptive digital video codec for wireless transmission
US6304295B1 (en) * 1998-09-18 2001-10-16 Sarnoff Corporation Region-based refresh strategy for video compression

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE69233538T2 (en) * 1991-09-30 2006-06-29 Kabushiki Kaisha Toshiba, Kawasaki Device for processing band-compressed signals for recording / playback
KR0166725B1 (en) * 1993-06-30 1999-03-20 김광호 Forced intra-frame coding method
US5909513A (en) * 1995-11-09 1999-06-01 Utah State University Bit allocation for sequence image compression
GB9606641D0 (en) * 1996-03-29 1996-06-05 Digi Media Vision Ltd Method and system for the compression and decompression of digital television signals encoded in the PALplus system

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
"Recommendation H.263: Video coding for low bit rate communication" ITU-T DRAFT RECOMMENDATION H.263, February 1998 (1998-02), pages 1-167, XP002176560 *
HARTUNG J ET AL: "A REAL-TIME SCALABLE SOFTWARE VIDEO CODEC FOR COLLABORATIVE APPLICATIONS OVER PACKET NETWORKS" PROCEEDINGS OF THE ACM MULTIMEDIA 98. MM '98. BRISTOL, SEPT. 12 - 16, 1998, ACM INTERNATIONAL MULTIMEDIA CONFERENCE, NEW YORK, NY: ACM, US, vol. CONF. 6, 12 September 1998 (1998-09-12), pages 419-426, XP000977531 ISBN: 1-58113-036-8 *
LEE J-Y ET AL: "MOTION-COMPENSATED LAYERED VIDEO CODING FOR PLAYBACK SCALABILITY" IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, IEEE INC. NEW YORK, US, vol. 11, no. 5, May 2001 (2001-05), pages 619-628, XP001096941 ISSN: 1051-8215 *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1782652B1 (en) * 2004-07-27 2021-01-27 Telecom Italia S.p.A. Video-communication in mobile networks
US7995656B2 (en) 2005-03-10 2011-08-09 Qualcomm Incorporated Scalable video coding with two layer encoding and single layer decoding
FR2895172A1 (en) * 2005-12-20 2007-06-22 Canon Kk METHOD AND DEVICE FOR ENCODING A VIDEO STREAM CODE FOLLOWING HIERARCHICAL CODING, DATA STREAM, METHOD AND DECODING DEVICE THEREOF
WO2007072228A2 (en) * 2005-12-20 2007-06-28 Canon Kabushiki Kaisha A method and device for coding a scalable video stream, and an associated decoding method and device
WO2007072228A3 (en) * 2005-12-20 2007-10-25 Canon Kk A method and device for coding a scalable video stream, and an associated decoding method and device
US8542735B2 (en) 2005-12-20 2013-09-24 Canon Kabushiki Kaisha Method and device for coding a scalable video stream, a data stream, and an associated decoding method and device

Also Published As

Publication number Publication date
AU2003206758A1 (en) 2003-09-02
GB2384638B (en) 2004-04-28
GB0201589D0 (en) 2002-03-13
GB2384638A (en) 2003-07-30
WO2003063495A3 (en) 2003-12-24

Similar Documents

Publication Publication Date Title
US10484719B2 (en) Method, electronic device, system, computer program product and circuit assembly for reducing error in video coding
KR101005682B1 (en) Video coding with fine granularity spatial scalability
EP1157562B1 (en) Video coding
EP1994757B1 (en) Method and apparatus for error resilience algorithms in wireless video communication
US20050163211A1 (en) Scalable video transmission
US9319700B2 (en) Refinement coefficient coding based on history of corresponding transform coefficient values
US20050249285A1 (en) Method and apparatus for frame prediction in hybrid video compression to enable temporal scalability
EP1769643A2 (en) Method, apparatus, and system for enhancing robustness of predictive video codecs using a side-channel based on distributed source coding techniques
US20040062304A1 (en) Spatial quality of coded pictures using layered scalable video bit streams
Worrall et al. Prioritisation of data partitioned MPEG—4 video over mobile networks
WO2003063495A2 (en) Scalable video communication
GB2381980A (en) Error concealment in scalable video transmissions
WO2002019709A1 (en) Dual priority video transmission for mobile applications
You et al. Modified rate distortion optimization using inter-block dependence for H. 264/AVC intra coding
Stockhammer Is fine-granular scalable video coding beneficial for wireless video applications?
Gang et al. Error resilient multiple reference selection for wireless video transmission
Le Léannec et al. Packet loss resisilent H. 263+ compliant video coding
Stockhammer et al. H. 264/AVC for wireless applications
KR100690710B1 (en) Method for transmitting moving picture
Tian et al. Error resilient video coding techniques using spare pictures
Gong Rate-distortion-based mode selection for H. 264/AVC in wireless environments
Fang et al. Robust group-of-picture architecture for video transmission over error-prone channels
WO2001015458A2 (en) Dual priority video transmission for mobile applications
Castellà TREBALL DE FI DE CARRERA
Lee et al. Residual motion coding method for error resilient transcoding system

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ OM PH PL PT RO RU SC SD SE SG SI SK SL TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR IE IT LU MC NL PT SE SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
121 Ep: the epo has been informed by wipo that ep was designated in this application
122 Ep: pct application non-entry in european phase
NENP Non-entry into the national phase in:

Ref country code: JP

WWW Wipo information: withdrawn in national office

Country of ref document: JP