WO2005022918A1 - System and method for encoding and decoding enhancement layer data using descriptive model parameters - Google Patents


Info

Publication number
WO2005022918A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
image
encoder
sub
encoded
Prior art date
Application number
PCT/IB2004/002770
Other languages
English (en)
French (fr)
Inventor
Dzevdet Burazerovic
Wilhelmus Bruls
Stijn De Waele
Original Assignee
Koninklijke Philips Electronics N.V.
Priority date
Filing date
Publication date
Application filed by Koninklijke Philips Electronics N.V. filed Critical Koninklijke Philips Electronics N.V.
Priority to US10/569,126 priority Critical patent/US7953156B2/en
Priority to EP04769188A priority patent/EP1661405B1/en
Priority to AT04769188T priority patent/ATE435567T1/de
Priority to JP2006524459A priority patent/JP4949836B2/ja
Priority to DE602004021818T priority patent/DE602004021818D1/de
Priority to CN2004800248162A priority patent/CN1843039B/zh
Publication of WO2005022918A1 publication Critical patent/WO2005022918A1/en
Priority to KR1020067004230A priority patent/KR101073535B1/ko


Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/59Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving spatial sub-sampling or interpolation, e.g. alteration of picture size or resolution
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/30Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/12Selection from among a plurality of transforms or standards, e.g. selection between discrete cosine transform [DCT] and sub-band transform or selection between H.263 and H.264
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/132Sampling, masking or truncation of coding units, e.g. adaptive resampling, frame skipping, frame interpolation or high-frequency transform coefficient masking
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/136Incoming video signal characteristics or properties
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/176Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/20Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using video object coding
    • H04N19/29Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using video object coding involving scalability at the object level, e.g. video object layer [VOL]
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/30Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability
    • H04N19/33Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability in the spatial domain
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/30Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability
    • H04N19/39Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability involving multiple description coding [MDC], i.e. with separate layers being structured as independently decodable descriptions of input picture data
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/44Decoders specially adapted therefor, e.g. video decoders which are asymmetric with respect to the encoder
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/60Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/60Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
    • H04N19/61Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding in combination with predictive coding
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/60Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
    • H04N19/625Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding using discrete cosine transform [DCT]
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/70Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/85Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression
    • H04N19/86Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression involving reduction of coding artifacts, e.g. of blockiness
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/90Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using coding techniques not provided for in groups H04N19/10-H04N19/85, e.g. fractals

Definitions

  • the present invention generally relates to signal processing.
  • More particularly, the present invention relates to image encoding systems, for example video encoding systems, and to corresponding image decoding systems, wherein during encoding image information is translated into a corresponding spatially layered format to which parametric modelling is applied.
  • the present invention also relates to a method of image encoding utilized within the systems. Furthermore, the present invention additionally relates to a method of image decoding utilized within the aforesaid systems. Additionally, the invention relates to methods of identification of optimal solutions where parametric modelling is applied; such methods of identification are also susceptible to being utilized within the aforesaid systems.
  • VCEG: Video Coding Experts Group
  • ITU: International Telecommunication Union
  • JVT: Joint Video Team
  • Principal objectives of the H.264/AVC standardization have been to significantly improve video compression efficiency and also to provide a "network-friendly" video representation addressing both conversational and non-conversational applications; conversational applications relate to telephony, whereas non-conversational applications relate to storage, broadcast and streaming of communication data.
  • the standard H.264/AVC is broadly recognized as being able to achieve these objectives; moreover, the standard H.264/AVC is also being considered for adoption by several other technical and standardization bodies dealing with video applications, for example the DVB Forum and the DVD Forum.
  • US 5,917,609 is especially pertinent to compression of medical X-ray angiographic images, where loss of noise leads a cardiologist or radiologist to conclude that the corresponding images are distorted.
  • the encoder and corresponding decoder described are to be regarded as specialist implementations not necessarily complying with any established or emerging image encoding and corresponding decoding standards.
  • this standard utilizes similar principles of spatial scalability known from existing standards such as MPEG-2.
  • Application of the principles means that it is possible to encode a video sequence in two or more layers arranged in sequence from a highest layer to a lowest layer, each layer using a spatial resolution which is equal to or less than the spatial resolution of its next highest layer.
  • the layers are mutually related in such a manner that a higher layer, often referred to as an "enhancement layer", represents a difference between original images in the video sequence and a lower encoded layer after the latter has been locally decoded and scaled up to a spatial resolution corresponding to the original images.
  • in order to elucidate generation of such an enhancement layer, there is shown a scheme for generating data corresponding to such an enhancement layer.
  • in FIG. 1, there is shown a known composite encoder indicated generally by 10.
  • the encoder 10 comprises a scaling-down function 20, a first H.264 encoder 30, a local H.264 decoder 40, a scaling-up function 50, a difference function 60 and a second H.264 encoder 70.
  • a video signal input IP is provided for inputting pixel image data.
  • the input IP is coupled to a non-inverting input (+) of the difference function 60 and to an input of the scaling-down function 20.
  • a scaled-down output of the scaling-down function 20 is coupled to an input of the first encoder 30.
  • a first principal encoded output of the first encoder 30 is arranged to provide a base layer output BLOP.
  • a second local encoded output of the first encoder 30 is coupled to an input of a local H.264 decoder whose corresponding decoded output is coupled to an input of the scaling-up function 50.
  • a scaled-up output of the scaling-up function 50 is coupled to an inverting input (-) of the difference function 60.
  • a difference output of the difference function 60 is coupled to an input of the second encoder 70.
  • An encoded output from the second encoder 70 is arranged to provide an enhancement layer output ELOP.
  • the composite encoder 10 is defined as being a multi-layer encoder on account of input image data presented at the input IP being represented in a plurality of encoded outputs, for example at the BLOP and ELOP outputs, each output corresponding to a "layer".
  • the composite encoder 10 is susceptible to being implemented in software, hardware, or a mixture of both software and hardware.
  • the scaling-down function 20 and the scaling-up function 50 are preferably arranged to have matched and mutually inverse image scaling characteristics.
  • the first encoder 30 and the local decoder 40 are preferably arranged to provide matched but inverse characteristics.
  • the first and second encoders 30, 70 are preferably endowed with mutually similar encoding characteristics.
  • An input stream of pixel data corresponding to a sequence of images is provided at the input IP of the encoder 10.
  • the stream is passed on a frame-by-frame basis to the non- inverting input (+) of the difference function 60 and also to the scaling-down function 20.
  • a scaled-down version of the input IP provided from the scaling-down function 20 is presented to the first encoder 30 which encodes the scaled-down version to provide the base layer BLOP output.
  • the first encoder 30 also provides a similar encoded output to the local decoder 40, which reconstitutes a version of the scaled-down input presented to the first encoder 30.
  • the reconstituted version is then passed via the scaling-up function 50 to the inverting input (-) of the difference function 60.
  • the difference function 60 thereby provides at its output presented to an input of the second encoder 70 an error signal corresponding to errors introduced by a combination of the first encoder 30 and its associated decoder 40, ignoring deviations introduced by the scaling functions 20, 50.
  • This error signal is encoded to give rise to the enhancement-layer ELOP output.
  • when the BLOP and ELOP outputs are conveyed via a transmission medium to a receiver which is operable to decode the BLOP and ELOP outputs using one or more decoders similar in operating characteristics to the local decoder 40, and the resulting decoded ELOP and BLOP signals are then combined, it is feasible to reconstitute the input IP at the receiver with enhanced accuracy, as encoding and decoding errors are susceptible to being compensated at the receiver by effect of the ELOP signal.
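The two-layer scheme just described can be sketched in a few lines. The stand-ins below are hypothetical simplifications (average-pooling for the scaling-down function 20, nearest-neighbour repetition for the scaling-up function 50, and plain quantisation for the H.264 encode/decode round trip); they illustrate only the data flow, not the actual codecs:

```python
import numpy as np

def downscale(img, f=2):
    """Average-pool by factor f (toy stand-in for scaling-down function 20)."""
    h, w = img.shape
    return img.reshape(h // f, f, w // f, f).mean(axis=(1, 3))

def upscale(img, f=2):
    """Nearest-neighbour zoom by factor f (toy stand-in for scaling-up function 50)."""
    return np.repeat(np.repeat(img, f, axis=0), f, axis=1)

def codec_round_trip(img, step=8.0):
    """Quantise and dequantise: a lossy stand-in for encoder 30 plus local decoder 40."""
    return np.round(img / step) * step

def encode_two_layers(img):
    base_rec = codec_round_trip(downscale(img))  # base layer, as the decoder will see it
    prediction = upscale(base_rec)               # scaled-up local reconstruction
    enhancement = img - prediction               # difference function 60: the residual
    return base_rec, enhancement

img = np.arange(64, dtype=float).reshape(8, 8)
base_rec, enh = encode_two_layers(img)
# Receiver side: upscale the base layer and add the enhancement layer back.
reconstructed = upscale(base_rec) + enh
```

Because the enhancement layer is formed against the locally decoded base layer, the receiver cancels the base-layer quantisation error exactly when both layers arrive intact.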
  • the inventors have appreciated that the ELOP output typically will have a relatively high spatial-frequency noise-like characteristic which corresponds to demanding material for a video encoder such as an H.26L encoder; in the following, the term "noiselike" is to be construed to refer to a relative lack of spatial correlation concurrently with a significant part of signal energy being distributed at higher spatial frequencies. Therefore, it is not uncommon in practice that the quantity of data used to encode a given part of the enhancement layer exceeds the quantity of data needed for encoding a corresponding part of the original image. Such a high data quantity requirement for encoding the enhancement layer signal ELOP potentially represents a problem which the present invention seeks to address.
  • a first object of the invention is to provide an image encoding system, and a corresponding complementary decoding system, utilizing multi-layer image encoding and decoding which is susceptible to providing greater image data compression.
  • a second object of the invention is to provide a more efficient method of encoding images whilst conveying substantially complete information present within a sequence of images.
  • an image encoding system including an encoder for receiving input image data and generating corresponding encoded image output data, the encoder including image processing means for processing said input image data to generate for each input image therein a plurality of corresponding image layers including at least one basic layer and at least one enhancement layer, and encoding means for receiving said image layers and generating therefrom the encoded image output data, said encoding means further comprising block selecting means for selecting one or more sub-regions of said at least one enhancement layer and modelling said one or more sub-regions for representation thereof in the image output data by way of descriptive model parameters.
  • the invention is of advantage in that it is capable of providing enhanced image encoding and decoding which is susceptible to greater data compression.
  • the processing means is operable to represent one or more principal features of each input image in its corresponding at least one basic layer, and to represent residual image information corresponding to a difference between information in said input image and its corresponding at least one basic layer in said at least one enhancement layer.
  • Subdivision of the input image to several layers is of benefit because it enables image subtleties to be isolated from principal features thereof, thereby enabling more efficient coding of the principal features, whilst allowing for progressive degrees of encoding residual details depending on a quality of eventual decoded image desired.
  • said one or more sub-regions are represented in the encoded output data from the encoding means as corresponding data when determined by the selecting means to be unsuitable for modelling, and represented by equivalent model parameters when determined by the selecting means to be suitable for modelling.
  • Applying modelling to features which are most appropriately modelled is of benefit in that an optimal compromise between image data compression and decoded quality is susceptible to being thereby achieved.
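One way the block selecting means might discriminate modellable sub-regions is a spatial-correlation test: residual blocks with little lag-1 correlation behave as noise and are candidates for parametric description, while structured blocks are left to direct encoding. The statistic and the threshold below are illustrative assumptions, not taken from the patent:

```python
import numpy as np

def lag1_correlation(block):
    """Mean magnitude of the lag-1 spatial correlation along rows and columns."""
    z = block - block.mean()
    denom = (z * z).sum()
    if denom == 0:
        return 1.0  # a flat block is perfectly predictable: encode it directly
    r_h = (z[:, :-1] * z[:, 1:]).sum() / denom
    r_v = (z[:-1, :] * z[1:, :]).sum() / denom
    return (abs(r_h) + abs(r_v)) / 2.0

def suitable_for_modelling(block, threshold=0.3):
    """Noise-like (weakly correlated) blocks qualify for model parameters."""
    return lag1_correlation(block) < threshold

rng = np.random.default_rng(0)
noise_block = rng.standard_normal((16, 16))                   # noise-like residual
ramp_block = np.add.outer(np.arange(16.0), np.arange(16.0))   # smooth structure
noise_ok = suitable_for_modelling(noise_block)   # expected: modellable
ramp_ok = suitable_for_modelling(ramp_block)     # expected: encode directly
```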
  • the encoding means is arranged to encode the input image data in at least one of substantially ITU-T H.264 and ISO/IEC MPEG-4 AVC standards enhanced by inclusion of said model parameters. More preferably, on account of such contemporary standards allowing for dynamically assignable private data fields, said model parameters are included into one or more private data regions of said encoded image output data.
  • said encoding means is operable to apply a spatial transform for translating said at least one selected sub-region to its corresponding model parameters for inclusion in said encoded image output data.
  • said transform includes a discrete cosine transform (DCT).
  • such DCT transform can be substituted with other types of mathematical transform.
  • the transform is operable to generate a corresponding 2-dimensional data set for each corresponding sub-region, and the encoding means is arranged to concatenate said 2-dimensional data set to generate a corresponding 1-dimensional data set for inclusion in said model parameters in the encoded image output data.
  • the inventors have identified that use of a DCT is especially suitable for a type of feature encountered in each sub-region whilst resulting in an acceptably small amount of data when, for example, subject to 2-D to 1-D concatenation.
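A minimal illustration of the transform step: an orthonormal 2-D DCT over a sub-region, followed by concatenation of the 2-D coefficient set into a 1-D sequence using the familiar zig-zag (anti-diagonal) scan. The scan order is borrowed from JPEG/MPEG practice as one plausible concatenation; the patent does not mandate a particular ordering:

```python
import numpy as np

def dct2(block):
    """Orthonormal 2-D DCT-II built from the DCT matrix (no SciPy needed)."""
    n = block.shape[0]
    k = np.arange(n)
    C = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    C[0, :] = np.sqrt(1.0 / n)
    return C @ block @ C.T

def zigzag(coeffs):
    """Flatten the 2-D set to 1-D along anti-diagonals, low frequencies first."""
    n = coeffs.shape[0]
    order = sorted(((i, j) for i in range(n) for j in range(n)),
                   key=lambda ij: (ij[0] + ij[1],
                                   ij[0] if (ij[0] + ij[1]) % 2 else ij[1]))
    return np.array([coeffs[i, j] for i, j in order])

block = np.full((8, 8), 10.0)       # a constant sub-region
seq = zigzag(dct2(block))           # all energy lands in the DC term, seq[0]
```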
  • the present invention is susceptible to being implemented without a need for 2-D to 1-D concatenation.
  • direct parameter modelling of 2-D transform data from selected macroblocks can, if required, be employed.
  • the encoding means is arranged to select a model order for use in encoding said one or more sub-regions in said corresponding model parameters by way of an optimization between quantity of model parameter data and accuracy to which said modelling parameters represent their one or more corresponding sub-regions.
  • Use of optimization is capable of enabling the system to provide better-optimized data compression whilst substantially maintaining image quality.
  • the encoding means is arranged to apply a statistical test to calculate a statistical error between image data corresponding to said one or more sub-regions and their corresponding model parameters, and apply selective parameter estimation to determine a model order to employ for generating the modelling parameters for the encoded output data.
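Model-order selection can be framed as penalized error minimisation: increase the order until the reduction in fitting error no longer justifies the extra parameters. The sketch below uses polynomial fitting and an AIC-style score purely as stand-ins for whichever parametric model and statistical test the encoder actually applies:

```python
import numpy as np

def select_model_order(seq, max_order=10):
    """Pick the order minimising n*log(residual variance) + 2*(order + 1),
    trading model-parameter quantity against representation accuracy."""
    n = len(seq)
    x = np.linspace(-1.0, 1.0, n)
    best_order, best_score = 0, np.inf
    for order in range(max_order + 1):
        coeffs = np.polyfit(x, seq, order)
        resid = seq - np.polyval(coeffs, x)
        var = max(resid.var(), 1e-12)   # guard log(0) when the fit is exact
        score = n * np.log(var) + 2 * (order + 1)
        if score < best_score:
            best_order, best_score = order, score
    return best_order

rng = np.random.default_rng(1)
x = np.linspace(-1.0, 1.0, 64)
seq = 3 * x**2 - x + rng.normal(scale=0.05, size=64)  # quadratic trend + mild noise
order = select_model_order(seq)   # a low order suffices for this sequence
```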
  • Use of interpolation is susceptible to reducing computational effort required to encode said one or more sub-regions and therefore rendering the system at least one of simpler to implement, capable of more rapid image encoding and less expensive to implement.
  • said one or more sub-regions correspond substantially to spatial noise-like features present in said at least one input image.
  • spatial noise is susceptible to giving rise to considerable amounts of data if not represented by model parameters.
  • inclusion of spatial noise-like features is important for accurate image recreation on decoding, the inventors have appreciated that the exact nature of the spatial noise is not so greatly important to image intelligibility and quality.
  • the inventors have appreciated that the statistical properties of spatial noise-like features are more important for intelligibility and quality, rather than exact image pixel values.
  • the system further includes a decoder for receiving the encoded output data from the encoder and for decoding said output data to recreate said input image, the decoder including decoding means for isolating said model parameters from directly encoded image data in the encoded output data, the decoder further including sub-region synthesizing means for receiving said decoded model parameters and generating data corresponding to said one or more sub-regions from said parameters, the decoder further comprising data merging means for combining said synthesized sub-region data with decoded direct image data to generate decoded output image data corresponding to said image input provided to the encoder.
  • a decoder for receiving the encoded output data from the encoder and for decoding said output data to recreate said input image
  • the decoder including decoding means for isolating said model parameters from directly encoded image data in the encoded output data
  • the decoder further including sub-region synthesizing means for receiving said decoded model parameters and generating data corresponding to said one or more sub-regions from said parameters
  • said encoded output image data from the encoder is conveyed to the decoder via a transmission medium, the medium including at least one of: the Internet, an optical data disc, a magnetic data disc, a DVD, CD, a solid-state memory device, a wireless communication network.
  • a transmission medium including at least one of: the Internet, an optical data disc, a magnetic data disc, a DVD, CD, a solid-state memory device, a wireless communication network.
  • an encoder for receiving input image data and generating corresponding encoded image output data
  • the encoder including image processing means for processing said input image data to generate for each input image therein a plurality of corresponding image layers including at least one basic layer and at least one enhancement layer, and encoding means for receiving said image layers and generating therefrom the encoded image output data, said encoding means further comprising block selecting means for selecting one or more sub-regions of said at least one enhancement layer and modelling said one or more sub-regions for representation thereof in the image output data by way of descriptive model parameters.
  • the invention is of advantage in that the encoder is susceptible to addressing at least one aforementioned object of the invention.
  • the processing means is operable to represent one or more principal features of each input image in its corresponding at least one basic layer, and to represent residual image information corresponding to a difference between information in said input image and its corresponding at least one basic layer in said at least one enhancement layer.
  • said one or more sub-regions are represented in the encoded output data from the encoding means as corresponding data when determined by the selecting means to be unsuitable for modelling, and represented by equivalent model parameters when determined by the selecting means to be suitable for modelling.
  • the encoding means is arranged to encode the input image data in at least one of substantially ITU-T H.264 and ISO/IEC MPEG-4 AVC standards enhanced by inclusion of said model parameters. More preferably, said model parameters are included into one or more private data regions of said encoded image output data. Such use of private data regions is susceptible to rendering the encoder backwardly compatible.
  • said encoding means is operable to apply a spatial transform for translating said at least one selected sub-region to its corresponding model parameters for inclusion in said encoded image output data.
  • said transform includes a discrete cosine transform (DCT).
  • alternative transforms are also susceptible to being employed.
  • the transform is operable to generate a corresponding 2-dimensional data set for each corresponding sub-region, and the encoding means is arranged to concatenate said 2-dimensional data set to generate a corresponding 1-dimensional data set for inclusion in said model parameters in the encoded image output data.
  • the encoding means is arranged to select a model order for use in encoding said one or more sub-regions in said corresponding model parameters by way of an optimization between quantity of model parameter data and accuracy to which said modelling parameters represent their one or more corresponding sub-regions.
  • the encoding means is arranged to apply a statistical test to calculate a statistical error between image data corresponding to said one or more sub-regions and their corresponding model parameters, and apply selective parameter estimation to determine a model order to employ for generating the modelling parameters for the encoded output data.
  • said one or more sub-regions correspond substantially to spatial noise-like features present in said input image.
  • a decoder for use for an encoder according to the second aspect of the invention, the decoder being operable to receive encoded output data from the encoder and for decoding said output data to recreate a corresponding input image, the decoder including decoding means for isolating model parameters from directly encoded image data in the encoded output data, the decoder further including sub-region synthesizing means for receiving said decoded model parameters and generating data corresponding to one or more sub-regions from said parameters, the decoder further comprising data merging means for combining said synthesized sub-region data with decoded direct image data to generate decoded output image data corresponding to said image input provided to the encoder.
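At the decoder, sub-regions conveyed as model parameters are synthesized rather than decoded directly. As a sketch, suppose the hypothetical parameter set for a noise-like block is just its mean and standard deviation; the regenerated pixels differ from the originals, but their statistics (which, per the discussion above, matter most for perceived quality) match:

```python
import numpy as np

def synthesize_block(params, shape=(16, 16), rng=None):
    """Regenerate a noise-like sub-region from descriptive model parameters.
    Here the (hypothetical) parameters are simply (mean, std)."""
    if rng is None:
        rng = np.random.default_rng()
    mean, std = params
    return rng.normal(loc=mean, scale=std, size=shape)

def merge(decoded, synthesized_blocks):
    """Paste each synthesized sub-region into the directly decoded image
    at its (row, col) anchor (the data merging means)."""
    out = decoded.copy()
    for (r, c), block in synthesized_blocks.items():
        h, w = block.shape
        out[r:r + h, c:c + w] = block
    return out

decoded = np.zeros((32, 32))                 # directly decoded image data
params = (5.0, 1.0)                          # model parameters from the bitstream
blocks = {(0, 0): synthesize_block(params, rng=np.random.default_rng(2))}
out = merge(decoded, blocks)                 # decoded output image data
```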
  • a transmission medium for conveying said encoded output image data thereon from an encoder according to the first aspect of the invention, the medium including at least one of: an optical data disc, a magnetic data disc, a DVD, CD, a solid-state memory device. It will be appreciated that other types of data carrier are also possible.
  • a method of encoding image data in an encoder including the steps of:
  • the processing means is operable to represent one or more principal features of each input image in its corresponding at least one basic layer, and to represent residual image information corresponding to a difference between information in said input image and its corresponding at least one basic layer in said at least one enhancement layer.
  • the at least one basic layer includes most of the principal details necessary for rendering the image recognizable when decoded again.
  • the at least one enhancement layer includes fine detail to complement and refine the image conveyed in the at least one basic layer.
  • said one or more sub-regions are represented in the encoded output data from the encoding means as corresponding data when determined by the selecting means to be unsuitable for modelling, and represented by equivalent model parameters when determined by the selecting means to be suitable for modelling.
  • the encoding means is arranged to encode the input image data in at least one of substantially ITU-T H.264 and ISO/IEC MPEG-4 AVC standards enhanced by inclusion of said model parameters. More preferably, said model parameters are included into one or more private data regions of said encoded image output data.
  • said encoding means is operable to apply a spatial transform for translating said at least one selected sub-region to its corresponding model parameters for inclusion in said encoded image output data.
  • said transform includes a discrete cosine transform (DCT).
  • other types of transform are susceptible to being alternatively or additionally employed.
  • the transform is operable to generate a corresponding 2- dimensional data set for each corresponding sub-region
  • the encoding means is arranged to concatenate said 2-dimensional data set to generate a corresponding 1- dimensional data set for inclusion in said model parameters in the encoded image output data.
  • the encoding means is arranged to select a model order for use in encoding said one or more sub-regions in said corresponding model parameters by way of an optimization between quantity of model parameter data and accuracy to which said modelling parameters represent their one or more corresponding sub-regions. More preferably, the encoding means is arranged to apply a statistical test to calculate a statistical error between image data corresponding to said one or more sub-regions and their corresponding model parameters, and apply selective parameter estimation to determine a model order to employ for generating the modelling parameters for the encoded output data.
  • said one or more sub-regions correspond substantially to spatial noise-like features present in said at least one input image.
  • Such spatial noise-like features are capable of enabling the method to operate more efficiently as more sub-regions are then susceptible to being represented by model parameters.
  • the step of including a decoder for receiving the encoded output data from the encoder and for decoding said output data to recreate said input image, the decoder including decoding means for isolating said model parameters from directly encoded image data in the encoded output data, the decoder further including sub-region synthesizing means for receiving said decoded model parameters and generating data corresponding to said one or more sub-regions from said parameters, the decoder further comprising data merging means for combining said synthesized sub-region data with decoded direct image data to generate decoded output image data corresponding to said image input provided to the encoder.
  • said encoded output image data from the encoder is conveyed to the decoder via a transmission medium, the medium including at least one of: the Internet, an optical data disc, a magnetic data disc, a DVD, CD, a solid-state memory device, a wireless communication network.
  • the invention is capable of being implemented in one or more of hardware, software and a combination of software and hardware.
  • Figure 1 is a schematic diagram of a composite encoder utilizing multi-layer image encoding
  • Figure 2 is a diagram of a group of images subject to encoding in the encoder of Figure 1;
  • Figure 3 is a schematic diagram of a composite encoder utilizing multi-layer image encoding wherein an error difference signal is subject to detail analysis for purposes of generating an enhancement layer ELOP data stream;
  • Figure 4 is a schematic diagram of a composite encoder according to the invention, the encoder utilizing model parameter data to represent one or more selected macroblocks in enhancement layer ELOP data generated by the encoder;
  • Figure 5 is a corresponding decoder according to the invention to complement the encoder of Figure 4;
  • Figure 6 is an example enhancement layer ELOP image with selected macroblocks B1 to B4 marked thereon;
  • Figures 7 to 10 are Discrete Cosine Transforms (DCT) of the macroblocks B1 to B4 of Figure 6;
  • Figure 11 is a set of graphs of 2-D to 1-D data concatenation pertaining to the macroblocks B1 to B4 of Figure 6;
  • Figure 12 is a schematic diagram of a noise synthesizer for use in the invention;
  • Figures 13 and 14 are illustrations of synthesis of noise-like signals for the selected macroblock B2;
  • Figure 15 is a Power Spectral Density (PSD) comparison relating to selected macroblocks B1, B3 and B4; and
  • Figure 16 is a graph illustrating interpolated ELOP macroblock model parameter optimization.
  • the signal from the scaling-up function 50 that is subtracted from the input IP to generate the enhancement-layer ELOP signal via the second encoder 70 is obtained by passing the input IP through several processing steps, namely down-scaling, encoding, decoding and up-scaling.
  • Each of these steps are operable to introduce distortions; for example, re-sampling is susceptible to distorting higher spatial frequency information present in the images of the input IP on account of the use of imperfect filtering in a manner associated with the Nyquist criterion, whilst coding introduces artefacts which are mostly attributable to the quantization of higher transform coefficients.
  • In Figure 2 there is shown an example image from a sequence of images provided in the input signal IP, Figure 2 thereby representing a "snap-shot" situation.
  • An original image presented at the input IP is denoted by 100.
  • An image denoted by 110 corresponds to the image 100 subjected to down-scaling by a factor of 2 in the scaling-down function 20 and then JPEG encoded in the first encoder 30 followed by corresponding decoding.
  • an image denoted by 120 corresponds to the image 110 after it has been subjected to up-scaling in the scaling-up function 50.
  • an image denoted by 130 corresponds to a spatial difference of the images 100, 120, namely equivalent to difference image information provided from the difference function 60 to the second encoder 70; the image 130 is suitable for use in generating an enhancement-layer ELOP signal.
  • a 7-tap FIR filter is utilized for image filtering purposes, and 1.5 bits per pixel (bpp) for JPEG encoding.
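The down-scale / encode / decode / up-scale / subtract chain described above can be sketched as follows. This is an illustrative approximation only: 2x2 block averaging stands in for the scaling-down function 20, coarse quantization stands in for the JPEG (or H.264) encode/decode round trip, and pixel replication stands in for the scaling-up function 50; all function names are hypothetical.

```python
def downscale2(img):
    """Average each 2x2 block into one pixel (stand-in for scaling-down function 20)."""
    h, w = len(img), len(img[0])
    return [[(img[y][x] + img[y][x + 1] + img[y + 1][x] + img[y + 1][x + 1]) / 4.0
             for x in range(0, w, 2)] for y in range(0, h, 2)]

def quantize(img, step=8):
    """Crude stand-in for the first encoder 30 plus local decoder 40 round trip."""
    return [[round(p / step) * step for p in row] for row in img]

def upscale2(img):
    """Replicate each pixel 2x2 (stand-in for scaling-up function 50)."""
    return [[img[y // 2][x // 2] for x in range(2 * len(img[0]))]
            for y in range(2 * len(img))]

def residual(original, reconstructed):
    """Difference function 60: the enhancement-layer residual RS = IP - DS."""
    return [[o - r for o, r in zip(orow, rrow)]
            for orow, rrow in zip(original, reconstructed)]

# toy 4x4 "image" standing in for the input IP
ip = [[10, 12, 40, 42], [11, 13, 41, 43], [90, 92, 20, 22], [91, 93, 21, 23]]
ds = upscale2(quantize(downscale2(ip)))   # reconstituted signal DS
rs = residual(ip, ds)                     # residual signal RS
```

The residual is largest where the round trip destroyed detail, which is exactly the information the enhancement layer must carry.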
  • the composite encoder 200 includes component parts of the encoder 10, namely the scaling-down function 20, the first H.264 encoder 30, the local H.264 decoder 40, the scaling-up function 50, the difference function 60 and the second H.264 encoder 70.
  • the composite encoder 200 additionally comprises a D-modifier 210, a detail analyzer 220 and a multiplier function 230. Connection topology of the encoder 200 will now be described.
  • the video signal input IP is coupled to the non-inverting input (+) of the difference function 60, to a first input of the detail analyzer 220 and to the scaling-down function 20.
  • An output of the scaling down function 20 is connected to an input of the first H.264 encoder 30 whose output corresponds to the base layer BLOP output.
  • An auxiliary encoded output of the encoder 30 is coupled via the local H.264 decoder 40 whose output is connected via the scaling-up function 50 to a second input of the detail analyzer 220 and also the inverting input (-) of the difference function 60; the scaling-up function 50 and the scaling-down function 20 are preferably arranged to provide mutually opposing effects.
  • An output SG of the analyzer 220 is coupled to an input of the modifier 210.
  • an output (1-D) from the modifier 210 is coupled to a first multiplying input of the multiplying function 230.
  • a summing output of the difference function 60 is connected to a second multiplying input of the multiplying function 230.
  • a multiply output MRS of the function 230 is coupled to an input of the second H.264 encoder 70 whose output is arranged to provide the enhancement layer ELOP signal.
  • the composite encoder 200 is susceptible to being implemented in at least one of dedicated hardware, in software executing on computer hardware, and in a mixture of software and dedicated hardware.
  • the composite encoder 200 depicted in Figure 3 is arranged to function to a major extent in a similar manner to the encoder 10 in Figure 1, namely:
  • the input signal IP is propagated through the scaling-down function 20 to the first encoder 30 to generate the encoded base layer BLOP.
  • the first encoder 30 is also operable to provide a signal equivalent to BLOP which is decoded in the local decoder 40 and subject to scaling-up in the scaling-up function 50 to generate a signal DS;
  • the input signal IP is propagated through the difference function 60 whereat a reconstituted version of the input signal IP subject to encoding and decoding, namely the signal DS, is subtracted from the original signal IP to generate a corresponding residual difference signal RS.
  • the residual signal RS is presented to the multiplier function 230 whereat it is multiplied by a signal (1-D) to generate the modulated difference signal MRS which is subsequently encoded at the second encoder 70 to generate the enhancement layer ELOP.
  • the detail analyzer 220 is operable to receive the input signal IP and the residual signal DS and to derive therefrom a measure of spatial regions of the images conveyed in the input signal IP where:
  • the multiplier 230 is operable to reduce attenuation applied to the signal RS so that the encoder 70 correspondingly generates sufficient data at the ELOP output to allow the features of visual significance to be subsequently decoded and reconstituted. Conversely, where the residual signal RS includes image information of low significance, the multiplier 230 is operable to increase attenuation applied to the signal RS so that encoder 70 generates less data.
  • the detail analyzer 220 generates a numerical value D having associated therewith pixel parameters (x, y, fr#) for each pixel or group of pixels present in incoming images of the input data IP; "x" and "y" are image pixel spatial co-ordinates whereas "fr#" is a colour and/or luminance data indicator.
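A minimal sketch of the modulation performed by the detail analyzer 220 and multiplier 230: a per-pixel measure D in [0, 1] is derived from local activity of the input, and the residual RS is scaled by (1 - D) so that low-detail regions contribute less data to the enhancement layer. The gradient-based activity metric and its 1/16 scaling are illustrative assumptions, not taken from the description.

```python
def detail_measure(ip, x, y):
    """Hypothetical activity metric: absolute gradient to the right-hand
    neighbour, mapped so that flat regions yield D close to 1 (strong
    attenuation) and strong edges yield D close to 0 (no attenuation)."""
    grad = abs(ip[y][x] - ip[y][min(x + 1, len(ip[0]) - 1)])
    return max(0.0, 1.0 - grad / 16.0)

def modulate(rs, ip):
    """Multiplier function 230: MRS = (1 - D) * RS, element-wise."""
    return [[(1.0 - detail_measure(ip, x, y)) * rs[y][x]
             for x in range(len(rs[0]))] for y in range(len(rs))]
```

With this convention, flat image regions produce a modulated residual of zero, so the second encoder 70 generates correspondingly little enhancement-layer data there.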
  • An effect provided by the composite encoder 200 is to filter regions of the images in the input IP which include relatively little detail. In such regions of relatively little detail, a considerable amount of data for the ELOP output would have been generated in the encoder 10, the regions corresponding in practice to substantially irrelevant little details and noise.
  • the composite encoder 200 is an advance on the encoder 10.
  • the composite encoder 200 depicted in Figure 3 is capable of being further improved.
  • the inventors have appreciated that even apparently low-detail, noise-like regions in the ELOP output are capable of improving spatial resolution when reconstituted in combination with the corresponding BLOP signal, in other words, even apparently low-detail noise-like regions in the ELOP images can improve spatial resolution of corresponding images in the BLOP output.
  • exact pixel values are often not a major concern in spatial noise-like regions, but the overall contribution of these regions when reconstructing images from the ELOP and BLOP outputs is perceptually important.
  • the inventors propose to model such noise-like regions and send corresponding model parameters to an enabled decoder; the enabled decoder is then capable of applying the model parameters to a synthesizer to synthesize an approximation of the original noise-like data.
  • Such an approach devised by the inventors is capable of not only preserving more decoded image spatial resolution in comparison to decoded images derived from the encoders 10, 200, but also capable of reducing bit-rate in BLOP and ELOP outputs correspondingly generated, provided that coding of fewer model parameters in the approach is more efficient than coding corresponding original image data described by the model parameters.
  • the inventors have appreciated that excluding parts of the signal IP from full encoding, and conveying model data corresponding to the excluded parts, can be implemented in practice by utilizing conventional macro-block skipping procedures.
  • In Figure 4 there is provided a schematic diagram of a composite encoder according to the invention; the encoder is indicated generally by 300.
  • the encoder 300 comprises the scaling-down function 20, the first H.264 encoder 30, the local H.264 decoder 40, the scaling-up function 50, the difference function 60 and the second encoder 70, for example as employed in the aforementioned composite encoders 10, 200.
  • the composite encoder 300 is distinguished from the aforementioned composite encoders 10, 200 in that the encoder 300 includes a detail analyzer 310, a buffer 320, an analyzer 330, a block select function 340, a model extraction function 350, an encoder 360 and finally a multiplexer 370.
  • BLOP and ELOP image data generated by the composite encoder 300 is coupled to a transmission/storage medium 380.
  • the medium 380 is preferably at least one of a communication network such as the Internet, a CD, a DVD, an optical fibre network and a wireless transmission network such as used for mobile telephones.
  • the input signal IP corresponding to a sequence of digital pixel images is conveyed to the scaling-down function 20 which scales the images, for example in a manner as illustrated in Figure 2, and then feeds them to the first H.264 encoder 30 which processes the images to generate corresponding encoded data in the form of the BLOP output.
  • an auxiliary encoded output LE from the encoder 30 is passed through the local H.264 decoder 40 and the scaling-up function 50 to provide a reconstituted signal RD for input to the inverting input of the difference function 60.
  • the signal RD corresponds to the signal IP, except that the signal RD includes encoding errors arising within the first encoder 30 and corresponding errors arising in the local decoder 40.
  • the scaling-down function 20 and the scaling-up function 50 are arranged to provide mutually inverse characteristics, whilst the encoder 30 and its local decoder 40 are arranged to provide substantially mutually complementary characteristics.
  • the signal RD and the input signal IP are mutually subtracted at the difference function 60 to generate a residual signal RS which is conveyed to an input of the second H.264 encoder 70 for encoding therein.
  • the second encoder 70 is operable to generate corresponding encoded data which is selectively transmitted through the multiplexer 370 to generate the enhancement layer ELOP output in a manner which will be elucidated in further detail later.
  • the BLOP and ELOP outputs are conveyed to the transmission/storage medium 380.
  • the buffer 320, for example operable in a manner corresponding to a FIFO, is arranged to receive sequences of images present in the input IP and store them to feed into the analyzer 330. Subsequently, the analyzer 330 is operable to receive image data from the buffer 320 and to analyze the data to determine regions thereof which are susceptible to having their ELOP residual data implemented by way of a parameter model; these regions shall hereinafter also be referred to as "image blocks".
  • the block select function 340 communicates to the second encoder 70 that it should encode the signal RS in a normal manner, for example as occurs in the composite encoder 200.
  • the block select function 340 disables the second encoder 70 by way of an enable block EB signal and causes the model extraction function 350 to process the one or more selected blocks and calculate corresponding model parameters MP. Moreover, the block select function 340 also passes a corresponding block index BI to the encoder 360 so that the encoder 360 not only receives the model parameters MP from the extraction function 350 but also an indication of the corresponding block from the select function 340. In substitution for the second encoder 70, the encoder 360 outputs model parameters corresponding to the selected blocks to the ELOP output.
  • the composite encoder 300 functions in a similar manner to the composite encoder 10 except when one or more image blocks are identified in the input signal IP which are susceptible to having their residual image represented by model parameters, in which case model parameters are inserted in the ELOP output instead of equivalent encoded data from the second encoder 70.
  • the detail analyzer 310 is optionally incorporated into the encoder 300 for use in pre-selecting suitable image blocks suitable for being represented by model parameters in the ELOP output; the detail analyzer 310 is provided with input data from at least one of the difference function 60 and input signal IP as illustrated.
  • the analyzer 310 is operable to provide an output D indicative of enhancement layer image density.
  • the composite encoder 300 is preferably implemented in at least one of hardware, software executing on computing hardware and a mixture of software and hardware.
  • the composite encoder 300 will now be elucidated in further detail.
  • the buffer 320 is capable of providing a benefit that images present in the signal IP are susceptible to being analyzed both spatially and temporally, namely across several images in a sequence.
  • the model extraction function 350 is beneficially based on statistical and spectral analysis which will be elucidated in more detail later.
  • the block select function 340 provides the control signal EB to the second encoder 70 which empties memory locations therein corresponding to image blocks selected for parameter modelling; such emptying occurs through so-called skip macro-block code.
  • the block co-ordinates and model parameters are encoded by the encoder 360 which preferably employs fixed length coding (FLC), for example at least one of Pulse Code Modulation (PCM) and Natural Binary Coding; alternatively, or additionally, Variable Length Coding (VLC) is susceptible to being employed, for example Huffman Coding and/or Arithmetic Coding.
  • coded model parameters can be multiplexed as private data with a standard bit stream arrangement provided from the second encoder 70 at a high transport level, or internally in the second encoder 70 itself, for example by way of utilizing contemporary "reserved SEI messages"; SEI is here an abbreviation for "Supplemental Enhancement Information" as accommodated in the H.264/AVC standard, since SEI messages are well specified parts of H.264/AVC syntax.
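As one possible reading of the fixed-length-coding option, model parameters for a selected macroblock could be serialized into a private data payload roughly as follows. The field layout (16-bit block index, 8-bit model order, 32-bit floats) is purely an illustrative assumption and does not reflect actual H.264/AVC SEI syntax.

```python
import struct

def pack_private_data(block_index, a, var, mean):
    """Hypothetical fixed-length (PCM-style) serialization of one modelled
    macroblock: block index BI, AR coefficient count, the coefficients
    themselves, then variance and mean. Big-endian throughout."""
    payload = struct.pack(">HB", block_index, len(a))
    for ak in a:
        payload += struct.pack(">f", ak)
    payload += struct.pack(">ff", var, mean)
    return payload

def unpack_private_data(payload):
    """Inverse of pack_private_data, as decoder 460 might apply it."""
    idx, p = struct.unpack_from(">HB", payload, 0)
    a = [struct.unpack_from(">f", payload, 3 + 4 * k)[0] for k in range(p)]
    var, mean = struct.unpack_from(">ff", payload, 3 + 4 * p)
    return idx, a, var, mean
```

A round trip through these two helpers recovers the transmitted index, coefficients, variance and mean.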
  • the encoder 300 illustrated in Figure 4 is complemented by a corresponding decoder as illustrated in Figure 5.
  • the decoder 400 comprises a primary signal processing path for receiving BLOP image layer data from the transmission/storage medium 380, the primary path comprising in sequence an H.264 decoder 430 arranged to complement the first encoder 30 of the composite encoder 300, a scaling-up function 410 arranged to complement the scaling-down function 20 of the composite encoder 300, and a summing function 420 whose output OP provides a final decoded output from the decoder 400.
  • the secondary path comprises a demultiplexer 440 providing an input for receiving the ELOP data, a first output coupled to an H.264 decoder 450 and a second output PRD denoting "private data" coupled to a decoder 460 operable to decode aforementioned parameter model data.
  • An output EP+SB namely "enhanced pictures and skipped macroblocks" is coupled from the H.264 decoder 450 to a block overwrite function 480 whose output is coupled to the summing function 420 as illustrated.
  • the decoder 460 comprises a first output coupled to a block select function 470 whose output is coupled in turn to the block overwrite function 480 as illustrated.
  • the block overwrite function 480 includes an output which is connected to a summing input of the summing function 420.
  • the decoder 460 includes a second output MP, namely "model parameters", connected to a macroblock synthesizer 490 arranged to receive noise input data from a random noise generator 510.
  • a simulated noise output from the synthesizer 490 is coupled via a post-processing function 500 to an input of the block overwrite function 480.
  • the post-processing function 500 includes features such as macroblock clipping but is also susceptible to including other types of image editing functions.
  • Layer image data, namely BLOP and corresponding ELOP data, from the composite encoder 300 of Figure 4 is coupled via the medium 380 to the decoder 430 and the demultiplexer 440 as illustrated.
  • BLOP layer image data is decoded in the decoder 430 and is passed to the scaling-up function 410 which scales up the decoded BLOP data to provide BLOP layer output data to the summing function 420 for subsequent output at OP.
  • the ELOP data is received at the de-multiplexer 440 and is selectively directed to the decoder 450 where macroblock parameter modelling has not been implemented at the encoder 300.
  • where macroblock parameter modelling has been implemented at the encoder 300, corresponding parameters are encoded into private data areas of the ELOP data conveyed via the transmission/storage medium 380.
  • the demultiplexer 440 extracts the private data, namely "PRD", from the ELOP data and passes this PRD to the decoder 460 which is operable to generate corresponding model parameters MP from the PRD.
  • the model parameters MP are passed to the synthesizer 490 functioning in tandem with the noise generator 510 which are operable to recreate noise-like structures of macroblocks identified and encoded in the encoder 300 as described in the foregoing.
  • a synthesized output corresponding to the selected encoded macroblocks passes via the post-processing function 500 to the block overwrite function 480 which is operable to utilize synthesized output received from the post-processing function 500 in preference to output from the decoder 450 for macroblocks selected by the encoder 300.
  • the summing function 420 combines decoded outputs corresponding to the BLOP and ELOP data to generate the reconstituted image output OP suitable for final viewing.
  • the analyzer 330 of the composite encoder 300 of Figure 4 is operable to distinguish between noise-like and texture-like structures in the enhancement layer information after BLOP -type image information has been subtracted, or to redefine such a distinction if already executed by the detail analyzing function 310 when optionally included.
  • the analyzer 330 performs a discrete cosine transformation, namely a "DCT", of macroblocks identified for transformation to corresponding model parameters.
  • the DCT generates information about a spectral energy distribution within each selected block of images in the input IP for parameter modelling, such spectral energy distributions being appropriate to use for categorizing various types of texture and noise-like structures present in the images.
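The spectral analysis step described above can be sketched as a plain separable 2-D DCT applied to a macroblock; the orthonormal scaling used here is a common convention and an assumption on our part, not dictated by the description.

```python
import math

def dct2(block):
    """Separable 2-D DCT-II of a square block, orthonormal scaling:
    transform each row, then each column of the row-transformed data."""
    n = len(block)
    def c(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
    def dct1(v):
        return [c(k) * sum(v[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                           for i in range(n)) for k in range(n)]
    rows = [dct1(r) for r in block]                              # horizontal pass
    cols = [dct1([rows[i][j] for i in range(n)]) for j in range(n)]  # vertical pass
    return [[cols[j][i] for j in range(n)] for i in range(n)]    # transpose back
```

A flat block concentrates all energy in the DC coefficient, whereas noise-like blocks spread energy across the whole coefficient field, which is precisely the distinction the analyzer 330 exploits.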
  • Examples of DCT analysis are illustrated in Figures 6 to 10, wherein DCT analyses for macroblocks B1, B2, B3, B4 selected by the analyzer 330 are indicated generally by 560, 565, 570, 575 respectively.
  • In Figure 6 there is shown an enhancement layer image of the portrait image shown in Figure 2.
  • In the enhancement layer image of Figure 6, spatial locations of the macroblocks B1 to B4 are shown; each block comprises a field of 16 x 16 pixels.
  • the macroblock B2 is distinguished to be a low-detail noise-like block, whereas the macroblocks B1, B3, B4 include more texture-like detail; the macroblocks B1 to B4 are all susceptible to being modelled and thereby being represented by corresponding model parameters.
  • the macroblock B1 includes a clear vertical edge, whereas the blocks B3 and especially B4 are more spatially uniform than the block B1.
  • the block B3 includes spatially gradually changing diagonal texture whereas the macroblock B4 includes highly detailed spatially irregular texture.
  • the macroblock B4 gives rise to a more peaked DCT characteristic whereas the block B3 has a relatively uniform DCT characteristic.
  • the DCTs of the macroblocks B1, B3 include several dominant coefficients shown in Figures 7 and 9 to be disposed in specific directions, namely substantially horizontally for the macroblock B1 and substantially diagonally for the macroblock B3.
  • DCTs are susceptible to being used to model selected macroblocks in the ELOP image layer
  • other methods can additionally or alternatively be utilized.
  • Such other methods are preferably arranged not only to process data within each selected macroblock but also from pixels in regions surrounding such macroblocks, for example by utilizing 2-dimensional (2-D) cross-correlation.
  • various properties of each selected macroblock are susceptible to temporal analysis from image to image in a sequence of images presented to the analyzer 330. For example, an analysis of time-consistency of certain DCT characteristics is potentially susceptible to being used for distinguishing spatial image detail from temporal noise.
  • operation of the analyzer 330 preferably also involves coding parameters and content analysis decisions available from H.264 encoding within the composite encoder 300.
  • a first step in generating a 1-D representation of a 2-D block of data is by concatenating block columns or rows in a fixed or random order.
  • Results of such 2-D to 1-D conversion in respect of Figures 7 to 10 are illustrated in Figure 11, wherein the deterministic nature of the macroblocks B1 and B3 is contrasted with the relatively more random nature of the macroblocks B2 and B4.
  • coefficients b3 are generated by concatenating columns corresponding to the macroblock B3 whereas coefficients b1, b2 and b4 are generated by concatenating rows of macroblocks B1, B2 and B4 respectively.
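The 2-D to 1-D conversion of Figure 11 amounts to simple concatenation of block rows or columns; a sketch, with hypothetical helper names:

```python
def concat_rows(block):
    """Concatenate a 2-D block row by row into one 1-D sequence
    (as for coefficients b1, b2 and b4)."""
    return [v for row in block for v in row]

def concat_cols(block):
    """Concatenate a 2-D block column by column into one 1-D sequence
    (as for coefficients b3)."""
    return [block[i][j] for j in range(len(block[0])) for i in range(len(block))]
```

Either ordering yields a single sequence to which the 1-D auto-regressive model of Equation 1 can then be fitted.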
  • Equation 1 (Eq. 1) describes an auto-regressive (AR) model of order P, namely x[n] + a1·x[n-1] + ... + aP·x[n-P] = e[n], wherein:
  • x[n] an observed output of the system
  • e[n] an unobserved input to the system
  • ak coefficients describing the system.
  • a power spectral density (PSD) function Pxx(f) of x[n] is susceptible to being computed as determined by Equation 2 (Eq. 2) where a parameter f is used to represent frequency.
  • the PSD function can be determined by estimating the AR coefficients ak and an associated noise variance denoted by σ2.
  • Several methods are capable of being employed for estimating the AR coefficients ak, for example at least one of a Yule-Walker method, a covariance method and a Burg method as described in "The Digital Signal Processing Handbook" by Vijay Madisetti, Douglas Williams, published by CRC Press, Florida, 1998.
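A sketch of AR parameter estimation by the Yule-Walker method mentioned above, solved with the standard Levinson-Durbin recursion, together with a PSD evaluation of the Eq. 2 form Pxx(f) = σ² / |1 + Σk ak·e^(-j2πfk)|². The demo AR(1) sequence and its 0.8 pole are illustrative only.

```python
import math
import random

def yule_walker(x, p):
    """Estimate AR(p) coefficients a_1..a_p and the noise variance sigma^2
    from the biased autocorrelation, via Levinson-Durbin recursion."""
    n = len(x)
    r = [sum(x[i] * x[i + k] for i in range(n - k)) / n for k in range(p + 1)]
    a, var = [1.0], r[0]
    for m in range(1, p + 1):
        acc = r[m] + sum(a[k] * r[m - k] for k in range(1, m))
        k_refl = -acc / var                      # reflection coefficient
        a = [1.0] + [a[k] + k_refl * a[m - k] for k in range(1, m)] + [k_refl]
        var *= 1.0 - k_refl * k_refl             # prediction-error variance
    return a[1:], var

def psd(a, var, f):
    """Evaluate Pxx(f) = var / |1 + sum_k a_k exp(-j 2 pi f k)|^2."""
    re = 1.0 + sum(ak * math.cos(2 * math.pi * f * (k + 1)) for k, ak in enumerate(a))
    im = -sum(ak * math.sin(2 * math.pi * f * (k + 1)) for k, ak in enumerate(a))
    return var / (re * re + im * im)

# demo: fit an AR(1) process x[n] = 0.8 x[n-1] + e[n]; expect a1 near -0.8
rng = random.Random(42)
x, prev = [], 0.0
for _ in range(3000):
    prev = 0.8 * prev + rng.gauss(0.0, 1.0)
    x.append(prev)
a, var = yule_walker(x, 1)
```

Because the biased autocorrelation estimate is positive semi-definite, every reflection coefficient stays below one in magnitude and the recursion is numerically stable.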
  • the synthesizer 600 is operable to generate a synthesis of b2[n].
  • the synthesizer 600 generates a synthesis s[n] of, for example, b2[n] such that s[n] has a mean and a variance which are substantially an exact match to those of b2[n] as described by Equation 3 (Eq. 3) where a parameter G corresponds to gain:
  • the synthesizer 600 is susceptible to being used to implement the synthesizer 490 and its associated noise generator in the decoder 400 illustrated in Figure 5.
  • the synthesizer 600 comprises a parameter decoder 630, a noise generator 640, a parametrically-driven shaping filter 650, a variance computing function 660 and an associated gain computing function 670, a mean computing function 680, and finally a multiplying function 690 and its associated difference function 700. It will be appreciated that the synthesizer 600 is capable of being implemented in hardware, in software executable on computer apparatus and/or a mixture of software and hardware.
  • the noise generator 640 includes an output e[n] coupled to an input of the shaping filter 650; the filter 650 is also connected to the decoder 630 to receive AR coefficients therefrom.
  • the shaping filter 650 comprises an output s[n] coupled to a first input of the multiplier function 690 and to respective inputs of the mean computing function 680 and the variance computing function 660.
  • a second input of the multiplying function 690 denoted by "G" is connected to an output of the gain computing function 670.
  • This function 670 is arranged to receive inputs from the variance computing function 660 and the parameter decoder 630 as illustrated.
  • a multiplication output from the function 690 is coupled to a first input of the difference function 700.
  • the difference function 700 includes a subtraction input whereat the mean computing function 680 is operable to provide a mean value "mean s"; moreover, the function 700 also includes an addition input for receiving an output from the decoder 630 corresponding to a mean of the parameters b2, namely "mean b2".
  • the noise generator 640 generates a noise-like data set for e[n] which is passed to the filter 650.
  • the filter 650 receives AR coefficients from the decoder 630 and filters corresponding components of the data set e[n] to generate the output s[n].
  • the output s[n] passes to the mean computing function 680 which generates its corresponding mean "mean s" which is passed to the difference function 700 which is operable to subtract this mean and thereby ensure that the output b̂2[n] has a mean of substantially zero.
  • the variance computing function 660 is operable to determine s[n]'s variance and pass this variance to the gain computing function 670.
  • the gain computing function 670 receives a desired variance σ2b2 from the decoder 630 and accordingly adjusts the gain G so that the output G·s[n] provided from the multiplier function 690 has a desired variance as dictated by the decoder 630. Finally, the decoder 630 provides its output "mean b2" for adjusting a mean of the output b̂2[n] from the difference function 700.
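The behaviour of the synthesizer 600 can be sketched as follows; for simplicity the mean is removed before the gain is applied, which yields the target mean and variance exactly, whereas Figure 12 orders the multiplier 690 and difference function 700 slightly differently. Function and parameter names are hypothetical.

```python
import random

def synthesize(a, var_target, mean_target, length, seed=0):
    """Sketch of the synthesizer 600: white noise e[n] (noise generator 640)
    is shaped by the AR coefficients (shaping filter 650), then gain G and
    a mean offset are applied so that the output matches the transmitted
    variance and mean of the modelled coefficients (cf. Eq. 3)."""
    rng = random.Random(seed)
    e = [rng.gauss(0.0, 1.0) for _ in range(length)]
    s = []
    for n in range(length):  # all-pole filter: s[n] = e[n] - sum_k a_k s[n-k]
        s.append(e[n] - sum(ak * s[n - k - 1]
                            for k, ak in enumerate(a) if n - k - 1 >= 0))
    mean_s = sum(s) / length                               # mean computing 680
    var_s = sum((v - mean_s) ** 2 for v in s) / length     # variance computing 660
    g = (var_target / var_s) ** 0.5 if var_s > 0 else 0.0  # gain computing 670
    # multiplier 690 plus difference function 700
    return [g * (v - mean_s) + mean_target for v in s]
```

By construction the returned sequence has exactly the requested mean and variance, while its spectral shape follows the transmitted AR coefficients.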
  • the synthesizer 600 is capable of simulating parameters b[n] as demonstrated in Figures 13 and 14.
  • a first graph indicated by 746 includes an abscissa axis and an ordinate axis corresponding respectively to DCT sample pixel index and pixel value with regard to aforementioned parameter b2.
  • a graph indicated by 748 is a power spectral density against normalised spatial frequency corresponding to the graph 746.
  • Content of the graph 746 is capable of being synthesized by the synthesizer 600 to generate equivalent data in a graph indicated by 750; a corresponding power spectral density graph is indicated generally by 752.
  • the original graphs 746, 748 are to be compared with the synthesized graphs 750, 752 respectively.
  • the synthesizer 600 is capable of generating a resemblance from concise model parameter data fed to it.
  • In Figure 15 there is shown a graph indicated generally by 754 including an abscissa axis 756 for normalised spatial frequency and an ordinate axis 758 for power spectral density (PSD).
  • the graph 754 illustrates PSD estimates for the parameters bl, b3, b4, the graph 754 showing variations between different selected ELOP-layer macroblocks of images presented to the encoder 300 of Figure 4.
  • the encoder 300 and corresponding decoder 400 are susceptible to substantially maintaining image quality and detail in comparison to the known encoder 10 whilst providing enhanced data compression in data output from the encoder 300; such data compression arises, as elucidated in the foregoing, by representing one or more selected macroblocks in the ELOP enhanced layer by model parameters, such parameters being derived by DCT and subsequent 2-D to 1-D concatenation of generated DCT coefficients, such concatenation giving rise to the aforementioned AR coefficients susceptible of being communicated with ELOP-layer data in private data fields thereof.
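The derivation of the 1-D sequence to which AR parameters are fitted — a DCT of a selected macroblock followed by 2-D to 1-D concatenation — can be sketched as follows. The row-by-row scan order and the function name are assumptions for illustration; the patent does not fix the scan (a zig-zag scan is an alternative).

```python
import numpy as np
from scipy.fft import dctn

def block_to_sequence(block):
    """2-D DCT of a selected macroblock followed by 2-D to 1-D concatenation.
    AR model parameters would subsequently be fitted to the returned 1-D
    sequence of DCT coefficients."""
    coeffs = dctn(np.asarray(block, dtype=float), norm="ortho")  # 2-D DCT
    return coeffs.reshape(-1)                                    # concatenate rows into 1-D
```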
  • model order can be made dynamically variable.
  • model order can be set at a preferred compromise value.
  • An accurate procedure that can be employed in the model extraction function 350 is to estimate parameter values in increasing order and to determine an optimal compromise such that an increase in model order does not give rise to a corresponding increase in perceived image quality from the decoder 400.
  • such an approach to determine an optimum model order is computationally demanding.
  • the inventors have appreciated that it is computationally more beneficial to calculate the fit of a limited number of sets of model parameters for different model orders and then use the properties of a fit criterion to determine an optimal model order.
  • Such a preferred approach circumvents a need to generate laboriously an entire sequence of model parameter sets and check encoding quality for each set.
  • statistical analysis is applied in the model extraction function 350 and used to determine a quality of fit to be used, for example, for image reconstruction purposes.
  • the interpolation is advantageously driven by a difference in noise components between an original version of an image and a reconstituted image for given model orders on account of noise components being able to yield considerable information for interpolation purposes.
  • the graph 800 includes an abscissa axis 810 denoting model order P and an ordinate axis 820 denoting a fit function F(P) having the model order P as one of its arguments and expressing the difference between the model and the data.
  • the fit function F(P) is implemented as a part of the model extraction function 350 and is indicative of a quality of statistical fit of model parameters to corresponding selected macroblocks in the ELOP enhanced layer.
  • the graph 800 illustrates an iterative selection of an optimal model order P based on a Generalized Information Criterion (GIC) as described in "Automatic Spectral Analysis with Time Series Models" by P. M. T. Broersen, IEEE Transactions on Instrumentation and Measurement.
  • Equation 4 (Eq. 4): GIC(P) = F(P) + 3P
  • 3P stands for a penalty Q(P).
  • Q(P) is a known function which does not depend on M(P) or the data but that increases with P and is easily calculated.
  • the particular penalty denoted 3P arises when the penalty Q(P) is a linear function of P, namely Q(P) = alpha·P with penalty factor alpha.
  • here alpha = 3.
  • Equation 5 (Eq. 5): Psel = arg min_P GIC(P)
  • Psel is the selected model order for use in representing an ELOP-layer selected macroblock as model parameters.
  • the following steps are executed (e.g. for each ELOP-layer selected macroblock to be represented as equivalent parameters) when a standard non-interpolation approach is employed: (a) model parameters are estimated for every candidate order P = 1, ..., Pm and the corresponding fit F(P) is computed; (b) GIC(P) is evaluated for each order according to Equation 4; and (c) the order Psel minimizing GIC(P) is selected according to Equation 5.
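A minimal sketch of the standard full-search selection follows, using a least-squares AR fit with F(P) taken as N·ln(residual variance); this fit measure and the helper names are assumptions for illustration, not the patent's prescribed implementation.

```python
import numpy as np

def ar_fit(x, p):
    """Least-squares AR(p) fit; returns the residual (prediction-error) variance."""
    n = len(x)
    # lagged regressor matrix: predict x[t] from x[t-1], ..., x[t-p]
    X = np.column_stack([x[p - k - 1:n - k - 1] for k in range(p)])
    y = x[p:]
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.mean((y - X @ coef) ** 2)

def select_order_full_search(x, p_max, alpha=3.0):
    """Standard (non-interpolation) selection: estimate every candidate order,
    evaluate GIC(P) = F(P) + alpha*P (Eq. 4 with alpha = 3) and return the
    minimizing order Psel (Eq. 5)."""
    n = len(x)
    gic = {p: n * np.log(ar_fit(x, p)) + alpha * p for p in range(1, p_max + 1)}
    return min(gic, key=gic.get)
```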
  • a preferred interpolation approach can be employed as follows: (0) a model M(P0) with a low number of parameters P0 is estimated (which does not require many computations) and its GIC(P0) is computed; (A) a most complex model M(Pm) is calculated and its fit F(Pm) and corresponding GIC calculated; (B) assuming that the fit of lower-order models, namely M(Pm-1), M(Pm-2), ..., is no better than F(Pm), a new highest candidate for selection for use in the function 350 is a model of order P whose corresponding GIC is potentially smaller than GIC(P0), namely satisfying F(Pm) + 3P < GIC(P0); and (C) step (B) is repeated if required until the fit F(P) is lower than GIC(P0).
  • the value of P0 can be increased by estimating the parameters of additional low-order models (M(P0+1), M(P0+2), ...). This can be illustrated with reference to Fig. 16, where P0 should correspond to a low-order model close to zero.
  • the interpolation approach enables a minimum value of GIC to be found at a greatly reduced computational cost within the function 350.
  • the procedure for selective estimation for statistical order selection described here yields an exact minimum if the fit F(P) is monotonically decreasing, as is the case for many parameter estimation methods. So, in this case, the resulting order is exactly equal to the order found with the standard, "full search" method, as described in the aforementioned points (a) to (c).
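Assuming a monotonically decreasing fit F(P), the reduced-cost search of steps (0) to (C) can be sketched as below; `fit` is assumed to be any callable returning F(P) for an order P, and the function name is illustrative.

```python
import math

def select_order_reduced(fit, p0, p_max, alpha=3.0):
    """Sketch of the reduced-cost search: if F(P) decreases monotonically,
    then for any order P below the last estimated order p we have
    F(P) >= F(p), so P can only beat the best GIC found so far when
    F(p) + alpha*P < best GIC; only a shrinking set of highest candidate
    orders is therefore ever estimated."""
    best_p = p0
    gic_best = fit(p0) + alpha * p0        # step (0): cheap low-order model
    p = p_max                              # step (A): start from the most complex model
    while p > p0:
        f = fit(p)
        if f + alpha * p < gic_best:       # this order improves on the best GIC
            gic_best, best_p = f + alpha * p, p
        # step (B): highest order that could still beat gic_best
        p_next = math.ceil((gic_best - f) / alpha) - 1
        p = min(p_next, p - 1)             # step (C): repeat with the new candidate
    return best_p
```

Because every skipped order is provably no better than the best GIC already found, the result equals the full-search minimum whenever F(P) is monotonically decreasing, at fewer model estimations.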
  • interpolation approach is susceptible to increasing execution speed of the function 350 for model parameter determination.
  • Such an advantage is susceptible to enabling the function 350 to be implemented in less expensive and compact hardware and/or software, for example in inexpensive consumer products.
  • model order selection is susceptible to being applied in technical fields outside that of aforesaid video encoding and corresponding decoding, for example in other situations where curve fitting and analysis of stationary stochastic signals are required; stationary stochastic signals are also known as "coloured noise".
  • the model order selection approach elucidated with respect to Figure 16 and its associated example, is effectively an "ARMAsel" algorithm and represents a general tool for the analysis of stationary stochastic signals.
  • the approach is not only usable in AR models but optionally also where Moving Averages (MA) are involved, and also for combined ARMA models.
  • the model order selection approach is capable of ensuring that more accurate models can be obtained for a wide range of mutually different types of signal.
  • noise modelling of medical images is capable of providing decoded images that are perceptually similar to corresponding original images, even where high compression ratios are used to generate associated image data.
  • the approach employed in Figure 16 is susceptible to being applied to determine more accurate models for use in generating such compressed data.
  • model order selection approach is also applicable to general medical data analysis, for example to the monitoring of heartbeat signals, to the analysis of lung noise for diagnostic purposes and to EEG electrical signal analysis.
  • model order selection approach employed in Figure 16 is not restricted to 1-dimensional model order selection.
  • model order selection for 2-dimensional AR, MA or ARMA models is also implementable using the approach. Selection of the most suitable model for yet higher-dimensional data, for example 3-dimensional data and above, is also accommodated by the approach.
  • the aforementioned approach of model order selection is also susceptible to being employed in sound processing, especially for processing temporal noise-like components in audio signals, for example in speech and/or music. Sound signal compression is also enhanced by representation of temporal noise-like components by corresponding noise-description model parameters; selection of a suitable model order to employ in such an audio application is capable of being addressed by the aforesaid approach.
  • the approach to model order selection as described above is capable of being applied in a wide range of applications, for example in general digital signal processing, for example as in telecommunication systems.
  • the approach is applicable in radar systems for processing signals corresponding to radar reflections from sea waves, for example modelled using the aforementioned ARMAsel algorithm; such radar reflections are susceptible to generating corresponding signals which are highly complex and correspond to a superposition of both desired signal and noise-like components.
  • the approach to model order selection is also susceptible for use in modelling turbulent systems, for example as in vortex modelling.
  • the approach is also applicable to vibration analysis of mechanical structures: such structures are susceptible to exhibiting complex harmonic vibration mode spectra; moreover, vibration spectral measurement is often executed against a background of ambient temporal and/or harmonic noise; such characteristics are susceptible to being determined by the aforementioned approach.
