WO1999007157A1 - Method and device for compressing video sequences - Google Patents

Method and device for compressing video sequences

Info

Publication number
WO1999007157A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
images
motion
model
prediction
Application number
PCT/EP1998/002950
Other languages
English (en)
Inventor
Klaus Diepold
Max Griessl
Helmut Hanika
Christof Knobloch
Harald Martens
Klaus Müller
Jan Otto Reberg
Clemens RÖTTGERMANN
Markus Wittkop
Thomas Zeitler
Original Assignee
Idt International Digital Technologies Deutschland Gmbh
Application filed by Idt International Digital Technologies Deutschland GmbH
Priority to AU80188/98A priority Critical patent/AU8018898A/en
Priority to EP98928295A priority patent/EP0998826A1/fr
Publication of WO1999007157A1 publication Critical patent/WO1999007157A1/fr


Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/10: ... using adaptive coding
    • H04N 19/102: ... characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N 19/132: Sampling, masking or truncation of coding units, e.g. adaptive resampling, frame skipping, frame interpolation or high-frequency transform coefficient masking
    • H04N 19/169: ... characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N 19/17: ... the unit being an image region, e.g. an object
    • H04N 19/176: ... the region being a block, e.g. a macroblock
    • H04N 19/50: ... using predictive coding
    • H04N 19/503: ... involving temporal prediction
    • H04N 19/60: ... using transform coding
    • H04N 19/61: ... in combination with predictive coding

Definitions

  • This invention relates to the compression of sequences of digital video images, so as to allow more efficient transmission or storage of the sequences.
  • the invention is based on modelling sets of consecutive images according to the IDLE principle.
  • Each set of consecutive images is modelled by choosing one image as reference image and representing the other images in terms of the reference image and a subspace model of motion.
  • the sets of consecutive images may overlap.
  • the reference image and the subspace model of motion can themselves be compressed by means of e.g. standard image compression or data compression schemes.
  • Residuals can be transmitted.
  • the length of the set of consecutive images can be adaptively adjusted depending on the sequence content.
  • the complexity of the subspace model of motion can be adjusted adaptively.
  • Bidirectional (forward and backward) prediction can be used to improve coding efficiency or image quality. Changes in intensity can be modelled. Arbitrarily shaped video objects can be supported. Redundancy between neighbouring sets of consecutive images can be exploited.
  • Various sampling grids, image resolutions, and aspect ratios can be supported. In particular, interlaced and progressive scanned video input as well as different colour formats can be supported.
  • Warping Mapping a source to an instantiation using motion information (displacement).
  • Forward warping Mapping every pixel of the source to a displaced point in the instantiation and keeping its attribute.
  • the attribute of a pixel in the instantiation is calculated by interpolating the attributes of mapped pixels in its neighbourhood.
  • Backward warping Mapping onto every pixel of the instantiation from a displaced point in the source.
  • the attribute of a pixel in the instantiation is calculated by interpolating the attributes of pixels in the neighbourhood of its displaced point in the source.
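  • As an illustration, a backward warping step might be sketched as follows (a minimal numpy sketch with bilinear interpolation; the function name and interpolation details are assumptions, not the patent's prescribed implementation):

```python
import numpy as np

def backward_warp(source, d_v, d_h):
    """Backward warping: every pixel (y, x) of the instantiation is
    mapped onto from the displaced point (y + d_v, x + d_h) in the
    source; its attribute is bilinearly interpolated from the
    neighbourhood of that displaced point."""
    n_v, n_h = source.shape
    ys, xs = np.mgrid[0:n_v, 0:n_h].astype(np.float64)
    sy = np.clip(ys + d_v, 0, n_v - 1)      # displaced source rows
    sx = np.clip(xs + d_h, 0, n_h - 1)      # displaced source columns
    y0 = np.floor(sy).astype(int); x0 = np.floor(sx).astype(int)
    y1 = np.minimum(y0 + 1, n_v - 1); x1 = np.minimum(x0 + 1, n_h - 1)
    wy, wx = sy - y0, sx - x0               # bilinear weights
    return ((1 - wy) * (1 - wx) * source[y0, x0]
            + (1 - wy) * wx * source[y0, x1]
            + wy * (1 - wx) * source[y1, x0]
            + wy * wx * source[y1, x1])
```

  • Forward warping goes the other way: source pixels are scattered to displaced points in the instantiation, and the instantiation's attributes are then interpolated from the scattered samples, which is why backward warping is usually the simpler operator to implement.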
  • Prediction target To be predicted from a source.
  • Prediction Mapped source as an approximation of a prediction target.
  • Forward prediction A prediction, where the prediction target lies in the future of the source.
  • Backward prediction A prediction, where the prediction target lies in the past of the source.
  • Prediction error Difference between the original prediction target and the prediction.
  • Blending Pixel by pixel combination of predictions.
  • D_v and D_h (Displacement Vertically and Horizontally): Vertical and horizontal components of a motion field.
  • Reference image Source to be mapped for the prediction of a target.
  • I-image Intra, non-predicted reference image.
  • P-image Predicted reference image.
  • B-image Bi-directional predicted image.
  • U-image Unidirectional predicted image.
  • image comprises frames as well as fields, as defined in the standard ISO/IEC 13818-2 (MPEG-2 video), already included by reference. Moreover an image may contain colour information, too.
  • By fidelity is meant how well a prediction or reconstruction predicts or reconstructs an original, expressed in the root mean square (RMS) of the prediction error, or other error measures.
  • By quality is meant the subjective, visual quality of the prediction or reconstruction, which in addition to fidelity takes into account how annoying various types of image artefacts are for the human visual system.
  • An encoder/decoder system works according to the IDLE principle if it models sets of consecutive images by a reference image and a subspace model of motion, as described above.
  • An IDLE model for an SCI (set of consecutive images) of length N accordingly consists of a reference image and a subspace model of motion.
  • the combination of the IDLE models is performed using blend-fields.
  • the attributes may be components of motion-(displacement-) vectors or intensity etc.
  • the encoder/decoder system described in the following can work using two different motion compensation schemes, a forward and a backward one.
  • In the forward motion compensation scheme, the sources (e.g. reference images) are warped to the targets by forward warping, wherein the motion fields are defined in the source positions.
  • In the backward motion compensation scheme, the sources (e.g. reference images) are warped to the targets by backward warping, wherein the motion fields are defined in the target positions.
  • Two abstract modules/operators are used: Warp and OppositeWarp.
  • In the forward motion compensation scheme, Warp performs a forward warping and OppositeWarp a backward warping, whereas in the backward motion compensation scheme, Warp performs a backward warping and OppositeWarp a forward warping.
  • Figs. 1 a, b show the encoder in its basic form
  • figs. 2 a - d illustrate the concept of subspace modelling
  • fig. 3 illustrates a decoder in its basic form
  • figs. 4 a, b show the codec containing a compression module
  • fig. 5 illustrates the context for updating a subspace model of motion
  • fig. 6 illustrates using the first image in each set of consecutive images as reference image
  • fig. 7 illustrates using the centre image of each set of consecutive images as reference image
  • figs. 8 a - d illustrate adaptive determination of the length of a set of consecutive images
  • fig. 9 illustrates the context for determining rank of the subspace model of motion
  • figs. 10 a - d illustrate the motivation for using an extended reference image
  • figs. 11 a - h illustrate one method for estimating an extended reference image
  • fig. 12 illustrates the use of P-images
  • figs. 13 a - c illustrate the need for bi-directional prediction
  • fig. 14 illustrates the use of B-images
  • figs. 15 a - c show the encoding using bi-directional predictions
  • fig. 16 shows a decoder adapted to bi-directional prediction
  • figs. 17 and 18 show two possible contexts for calculation of residuals
  • fig. 19 shows how residual dampening may be integrated into the system
  • fig. 20 illustrates the context for subspace modelling of intensity
  • figs. 21 a - b show how subspace modelling of intensity changes can be implemented into an encoder
  • figs. 22 a - b show how to predict images under presence of a subspace model of intensity changes
  • fig. 23 illustrates preconditioning of intensity changes before subspace modelling
  • figs. 24 a - c show how to exploit redundancy of subspace models for consecutive SCIs.
  • Fig. 1a shows the invention in a first preferred embodiment.
  • An encoder Enc 100 receives as input a sequence of video images 101 and transmits a representation of the sequence as an IDLE model 104.
  • the encoder has an image store ImageStore 110 buffering a certain number of images of the incoming sequence.
  • the encoder has a module BuildMod 120 for building an IDLE model. It is built from images read from the image store ImageStore 110.
  • the module TraMod 130 transmits the IDLE model.
  • BuildMod 150 reads as input the set of consecutive images 151, consisting of a number of images N .
  • the number of images N is fixed, e. g. 15.
  • One of the images in the set of consecutive images 151 is selected as a reference image. This can be the first image.
  • a switch 160 sends this image into a reference image store ReflmgSto 170.
  • the switch 160 sends each of the other images to a motion estimator ME 180. Motion is estimated between the reference image and the other images. This results in N - 1 motion fields.
  • the motion estimator ME 180 can be based on several principles. One possibility is to use a block based motion estimator, finding, for each block of pixels in an image, a motion vector to a similar block of pixels in another image.
  • the motion fields are sent to a module BuildSSMod 190 that builds a subspace model of the motion.
  • the individual elements a_{n,k} of A can explicitly be written as a_{n,k} = Σ_{j=1..m} t_{n,j} · p_{j,k}.
  • the m rows of the matrix P constitute a set of basis vectors for the m -dimensional vector space spanned by the rows of A .
  • the matrix T indicates the linear combination of the basis vectors in P to rebuild the data matrix A .
  • the collection of the k -th column of the matrix T and the k -th row of the matrix P is called the k -th factor.
  • The collection of the first f ≤ m rows of the matrix P constitutes a set of basis vectors for an f-dimensional vector space.
  • This vector space is a subspace of the space spanned by the rows of A. Hence we will call it a subspace model.
  • the case f < m is now illustrated with reference to figure 2a.
  • the n-th row a_n 210 of the matrix A 200 is rebuilt according to fig. 2b, where the pertaining n-th row t_n 212 of the matrix T 202 is multiplied with the matrix P 214 and the pertaining n-th row e_n 216 of the matrix E 206 is added.
  • the matrix A 200 may stand for any kind of data.
  • Here, A is used to apply the above described subspace modelling technique to motion data.
  • One row of the matrix P 204 represents a basis image of motion corresponding to one factor.
  • One column of the matrix T 202 contains the temporal weight coefficients corresponding to one factor. The collection of all retained factors will be referred to as a subspace model of motion.
  • Motion data may consist of one or more motion fields, which e.g. represent computed motion for a set of consecutive images.
  • Motion fields generally consist of one motion vector for each pixel or for each group of pixels.
  • the motion fields have one vertical and one horizontal component, which are denoted by D_v and D_h, respectively.
  • the vertical and the horizontal components of a motion vector can be considered as the real and imaginary parts of a complex valued vector field.
  • the modelling can then be performed using complex SVD algorithms or similar methods.
  • the motion fields can also be represented in terms of magnitude and angle.
  • One subspace model for magnitude and one for angle can be used.
  • the arrangement of the motion data in matrix form can be done in several ways:
  • E.g., the vertical component of a motion field, D_v 220, having size n_v vertically and n_h horizontally, can be strung out to be represented by a row vector 222 of length n = n_v · n_h.
  • Each of the constructed vectors represents one row of the data matrix A 200.
  • the same procedure can be applied to the horizontal component of a motion field, D_h.
  • The result is a separate subspace model for each of the vertical and the horizontal components of the motion data.
  • Alternatively, the vertical and horizontal components of the motion fields, D_v 230 and D_h 232, can be strung out jointly as one row vector 234, which then has the length n = 2 · n_v · n_h.
  • the rank f can be chosen to be significantly lower than the number of images in the corresponding set of consecutive images.
  • the motion residual E 206 will contain small values, or can be dropped without too much degradation in reconstruction quality. In total, this gives a potential for compression, in that a low number of temporal weight coefficients and basis image vectors, T 202 and P 204, and optionally the motion residual matrix E 206, can be transmitted instead of the full motion fields A 200.
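  • As a minimal illustration of this subspace modelling (a numpy sketch; names, shapes and the use of a plain truncated SVD are assumptions, one of several factorisation processes the description allows for):

```python
import numpy as np

def build_subspace_model(motion_fields, f):
    """Stack strung-out motion fields into the data matrix A (one row
    per image) and keep the first f factors of a truncated SVD, so
    that A = T @ P + E with temporal weights T and basis images P."""
    A = np.stack([d.ravel() for d in motion_fields])
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    T = U[:, :f] * s[:f]        # temporal weight coefficients
    P = Vt[:f]                  # basis images of motion
    E = A - T @ P               # motion residual, optionally dropped
    return T, P, E

def rebuild_motion_field(T, P, n, shape):
    """Rebuild the motion field of image n as the linear combination
    of the basis images weighted with that image's coefficients."""
    return (T[n] @ P).reshape(shape)
```

  • The rebuilding step is exactly the linear combination performed by the decoder's RebuildMotion module described below.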
  • One simple implementation of TraMod 130 is to just transmit the data on a given communication channel.
  • the decoder Dec 300 receives an IDLE model 301 with its module ReceiveMod 310 and sends it to a module Predict 320 which provides a prediction 302 of each of the original images within the set of consecutive images.
  • the reference image part of the IDLE model is sent to the reference image store ReflmgSto 330.
  • the subspace model of motion is sent to the module RebuildMotion 340, where motion fields are rebuilt from the subspace model.
  • the rebuilding of the motion field corresponding to a certain image is performed by calculating a linear combination of the basis images using temporal weight coefficients pertaining to that image.
  • the module Warp 350 uses the reference image and the rebuilt motion fields to provide the predictions 302 of the images within the set of consecutive images.
  • the reference image itself passes the module Warp 350 unchanged.
  • the Warp module 350 warps the image content of a reference image according to a motion field. Several methods for doing this are given in "Digital Image Warping", third edition, George Wolberg, ISBN 0-8186-8944-7, IEEE Computer Society Press, 1994, which is hereby included by reference.
  • a related method is to interpolate motion between motion vectors computed by block matching, as described in "Feedback Loop for Coder Control in a Block-Based Hybrid Coder with Mesh-Based Motion Compensation", Jörn Ostermann, Proceedings of ICASSP '97, Munich, IEEE Computer Society Press, 1997, which is hereby included by reference.
  • the IDLE model produced by the encoder may be compressed.
  • the reference image, the temporal weight coefficients and basis images of the motion model can be compressed before being transmitted in module TraMod 130 from fig. 1.
  • Such an encoder scheme is shown in fig. 4a.
  • the encoder Enc 400 receives as input a sequence of video images 401 which are buffered in the module ImageStore 410.
  • the module BuildMod 420 reads SCIs from ImageStore 410, builds the corresponding IDLE models and delivers the models to the module CompMod 425.
  • the module CompMod 425 compresses the models.
  • the compressed models are then transmitted by the module TraMod 430.
  • the decoder Dec 450 receives an IDLE model 451 with its module ReceiveMod 460 and sends it to a module DecompMod 470 which decompresses the model.
  • the decompressed model is sent to the module Predict 480 which provides a prediction 452 of each of the original images within the set of consecutive images.
  • Module Predict 480 works as module Predict 320 in fig. 3.
  • the above procedure is applied for each IDLE model.
  • the residual matrix E 206 may also be compressed and transmitted to the decoder. For all those types of data, various compression methods can be used.
  • the reference image is a still image, and may be compressed using one of many still image compression methods available.
  • transform coding may be used.
  • One such method based on the Discrete Cosine Transform, DCT, is JPEG, as described in "JPEG Still Image Compression Standard", William P. Pennebaker, Joan L Mitchell, ISBN 0-442-01272-1, 1993, which is hereby included by reference.
  • Other methods are vector quantisation, predictive coding, fractal coding, wavelet based transform coding and other still image coding techniques as mentioned in "Digitale Bildcodierung", Jens-Rainer Ohm, Springer Verlag, 1995, which is hereby included by reference.
  • each basis image usually has some characteristics in common with conventional still images; in particular, there is usually spatial redundancy.
  • each basis image can be compressed using any still image compression tool mentioned above with an adaptation to match the dynamic range pertaining to the different types of data.
  • the basis images may be represented in low-resolution.
  • the corresponding sub-sampling or up-sampling may be done as part of the compression and decompression.
  • the basis images may be represented by adaptive meshes (wire-frames) which can be generated by e.g., the Delaunay triangulation process.
  • Various matrix factorisation processes can produce scaled basis images suitable for the used set of compression tools.
  • One method, e.g., is to adjust the basis images so that the maximum impact of the quantisation error is the same for all factors, provided the characteristic of the quantiser is the same for all factors.
  • appropriate scaling factors for the temporal weight coefficients as well as for the basis image for a given factor can be calculated and transmitted individually.
  • Another method is to add or subtract an offset value to the elements of a basis image, and scale the basis images to fit into an appropriate dynamic range.
  • the scaling factors as well as the offsets need to be transmitted additionally to the decoder. Instead of transmitting the scaling factors directly, they can be used to adjust the quantiser step size. This method may also be applied to all basis images simultaneously.
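  • A minimal sketch of this offset-and-scale preparation for one basis image (the 8-bit quantiser range and all names are assumptions):

```python
import numpy as np

def scale_basis_image(p, lo=0, hi=255):
    """Subtract an offset and scale the basis image into the
    quantiser's dynamic range; offset and scale (or an adjusted
    quantiser step size) must also be sent to the decoder."""
    offset = float(p.min())
    span = float(p.max()) - offset
    scale = span / (hi - lo) if span > 0 else 1.0
    q = np.round((p - offset) / scale).astype(np.uint8)
    return q, offset, scale

def unscale_basis_image(q, offset, scale):
    """Decoder side: invert the scaling."""
    return q.astype(np.float64) * scale + offset
```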
  • the temporal weight coefficients for the first factors often have a high level of temporal correlation.
  • predictive coding techniques may be applied for their compression.
  • the subspace model of motion is updated recursively.
  • the motion fields are estimated for each image of the set of consecutive images. Subsequently, the subspace modelling is performed.
  • memory consumption may be important. It is advantageous to have a method that does not need intermediate storage for all motion fields. It may be important to have parallel algorithms, so that motion estimation and calculation of subspace models can be performed concurrently. For such applications, the subspace models can be updated each time a new image is processed instead of being built at the end of a set of consecutive images.
  • the BuildMod module 500 again takes a set of consecutive images 501 as input, a switch 510 sends one image to a reference image store ReflmgSto 520 and the other images to a motion estimator module ME 530.
  • the resulting motion fields from ME 530 are now sent to a module for updating a subspace model, UpdateSSMod 540.
  • a model store StoreSSMod 550 for a subspace model is initialised to zero.
  • UpdateSSMod 540 reads the current subspace model from StoreSSMod 550, updates the subspace model, and stores it back.
  • the position of the reference image within an SCI can be determined to satisfy various performance criteria.
  • So far, the first image of each SCI was selected as the reference image.
  • Fig. 6 illustrates the resulting structure for the ordering of the images pertaining to an IDLE model 600.
  • the first image in the set of consecutive images, denoted by / 610, is used as a reference image.
  • the other images, denoted by U 620, 630, 640, 650, are represented as forward predictions of the reference image.
  • the corresponding motion fields are modelled using subspace modelling as described above.
  • Fig. 7 shows a centre image 730 of a set of consecutive images pertaining to an IDLE model 700.
  • Earlier images 710, 720 are backward predicted versions and later images 740, 750 are forward predicted versions of the centre image.
  • the corresponding motion fields can be modelled as one motion model.
  • one subspace model can be used for the earlier images 710, 720 and one separate model can be used for the later images 740, 750
  • the most representative image in the set of consecutive images may serve as the reference image in order to enable an optimum prediction using the pertaining IDLE model.
  • the selection of the reference image can be based on motion compensated predictions: e.g., motion estimation and compensation can be performed between the first image in the set of consecutive images and each of the following images, as long as the corresponding prediction satisfies a fidelity criterion, a quality criterion or a cost (in terms of bits) criterion.
  • the original image corresponding to the last satisfying prediction is chosen as the reference image.
  • the reference image can be determined by computing an average or median histogram over intensity or colour for all images, and then selecting the image whose histogram has the highest similarity with the average or median histogram.
  • Selecting a suitable image to be used as reference image for a set of consecutive images can also be done using analysis by synthesis. Several images can be tried as the reference image, and the selection that results in the most efficient modelling can be accepted.
  • Fifth preferred embodiment
  • the number of images to be included in one set of consecutive images is determined adaptively by BuildMod 420, referring back to fig.4a.
  • subspace modelling can bring advantages, e.g. with regard to compression ratios.
  • the subspace modelling may bring disadvantages regarding fidelity vs. compressibility, both for motion subspace models and the overall system.
  • BuildMod 420 determines the number of images to be included in one set of consecutive images. To this end, the next image to be processed is read from the ImageStore 410. Motion estimation calculates the motion field between the reference image and the current image. A fidelity criterion is used to decide whether this motion field will be included in the process of building the current subspace model.
  • BuildMod 420 passes the current IDLE model to CompMod 425.
  • the subspace model in StoreSSMod 550 and UpdateSSMod 540 must additionally be re-initialised to empty before building the next model.
  • Motion fields often have little similarity across scene shifts.
  • One way to determine the length of the set of consecutive images is therefore to use a scene shift detector.
  • Fig. 8 a shows how the motion estimation module ME 530 from fig. 5 or 180 from fig. 1b can be used for defining sets of consecutive images, in addition to its normal function.
  • the functions of switch 810, ReflmgSto 812, ME 814, are the same as described in fig. 5.
  • the motion field, resulting from ME 814, is used to warp the reference image in the module Warp 816 to produce a predicted image.
  • the warping is performed in the same direction (forward or backward) as in the decoder.
  • the absolute value of the difference between the predicted image and the original image is summed up in SumAbsDiff 818. For related images, this will often give a relatively small result, while for unrelated images, this will often give a large result.
  • the sum is fed to the encoder control EncCont 819, where it can be compared with a given threshold. As long as the sum stays below the threshold, the motion field is passed to BuildSSMod 820 so that it is included in a subspace model of motion. If the sum is larger than the threshold, the end of the set of consecutive images is reached.
  • the reference image 821 and the subspace model of motion 822, together constituting an IDLE model 823, are given as output.
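  • The decision logic of fig. 8 a can be sketched compactly (a hypothetical Python sketch; estimate_motion and warp are stand-ins for ME 814 and Warp 816, and the threshold value is an assumption):

```python
import numpy as np

def build_sci(images, estimate_motion, warp, threshold):
    """Grow a set of consecutive images: keep including images while
    the summed absolute difference between the warped reference and
    the original (SumAbsDiff 818) stays below the threshold."""
    reference = images[0]
    motion_fields = []                           # fed to BuildSSMod 820
    for image in images[1:]:
        d = estimate_motion(reference, image)    # ME 814
        prediction = warp(reference, d)          # Warp 816
        sad = np.abs(prediction - image).sum()   # SumAbsDiff 818
        if sad > threshold:                      # EncCont 819
            break                                # end of this SCI
        motion_fields.append(d)
    return reference, motion_fields
```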
  • Figure 8b illustrates an alternative method, which uses the module OppositeWarp 836, where the warping is performed in the opposite direction as in the decoder.
  • the functions of switch 832, ReflmgSto 834, ME 835, BuildSSMod 837 are the same as described in fig. 8a.
  • the current image is warped in module OppositeWarp 836 using the motion field estimated by ME 835.
  • the resulting prediction is expected to be similar to the reference image for the case of highly related images. Therefore, the result from a module 838 for summing absolute differences between the warped image and the reference image can be compared against a given threshold in the encoder control EncCont 839. This way the start of a new set of consecutive images may be triggered.
  • the similarity between an original image and the prediction of the image can also be expressed in terms of spatial size of prediction failure areas.
  • One possibility is to calculate an intensity residual image as the difference between the original image and the prediction, divide the residual image into spatial blocks, and then count the number of spatial blocks where the average error is above a given threshold.
  • a spatial block denotes a group of pixels or one pixel.
  • the average error can be found by the sum of absolute values of the spatial block divided by the number of pixels in the spatial block, or by root mean square (RMS), or by another fidelity criterion.
  • Other characterisations can also be used, like a median value.
  • a threshold can be used to distinguish between spatial blocks, which exhibit a prediction failure or prediction success.
  • Another threshold can be used for the allowed number of spatial blocks designated as prediction failure. The end of the set of consecutive images is not reached as long as the number of spatial blocks with prediction failure stays below a certain threshold, e.g. 5 % of the total number of spatial blocks in the image.
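  • A sketch of this block-wise criterion (block size and both thresholds are illustrative assumptions):

```python
import numpy as np

def prediction_failure_fraction(original, prediction, block=8,
                                err_thresh=10.0):
    """Return the fraction of spatial blocks whose average absolute
    error exceeds err_thresh; the end of the SCI is reached once this
    fraction rises above e.g. 0.05 (5 % of all blocks)."""
    residual = np.abs(original.astype(np.float64) - prediction)
    n_v, n_h = residual.shape
    failures = total = 0
    for y in range(0, n_v - block + 1, block):
        for x in range(0, n_h - block + 1, block):
            total += 1
            if residual[y:y + block, x:x + block].mean() > err_thresh:
                failures += 1
    return failures / total
```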
  • characteristics of the human visual system may be considered to get an estimate of how visible artefacts will be.
  • One possible method is to slightly blur the residual image before thresholding.
  • Another possible method is to process the thresholded image with morphological filters like "opening" and "closing".
  • Also methods from "Method and apparatus for compression of video images and image residuals", already included by reference, may be used to discriminate between highly and less visible artefacts. Residual values which correspond to invisible artefacts may be removed from the residual image before each spatial block is characterised as a prediction success or a prediction failure.
  • Fig. 8 c shows a module BuildMod 850 for building a model.
  • EncCont 869, ReflmgSto 861, ME 862, OppositeWarp 863 and BuildSSMod 864 are the same as described for fig. 8 b.
  • In addition, there is a module DefBounds 866 for defining bounds within which artefacts are declared to be invisible.
  • This module can exploit masking effects of the human visual system. Examples include entropy masking and non-linear response for intensity. It produces an upper and a lower bound for each spatial block. A more detailed description is given in "Method and Apparatus for compression of video images and image residuals", already included by reference.
  • A module RelToBounds 867 relates, for each pixel or block of pixels, the difference between the original image and the predicted image, given as output of the difference module 865, to the bounds. It can be based on several principles. One is to check whether the residual is within the bounds, producing a yes/no result, e.g. encoded as 0 if the residual is within the bounds and 1 otherwise. Another principle is to compute the ratio between the residual and the upper or lower bound. Optionally, if the ratio is smaller than 1, then it can be set to zero. Yet another method is to compute the difference between the residual and the upper or lower bound.
  • a summation module 868 then sums up the results of RelToBounds 867 and gives the sum to EncCont 869, which can make a decision about the length of the set of consecutive images based on this sum.
  • the difference module 865, DefBounds 866, RelToBounds 867 and the summation module 868 may also be used within the structure according to fig. 8a instead of SumAbsDiff 818, where DefBounds and the difference module get as input the current image instead of the reference image.
  • the fit of a new motion field to a subspace model can be used for deciding whether to start a new set of consecutive images. This is illustrated in fig. 8 d, showing a module BuildMod 875.
  • the result from ME 884 is projected on a subspace model of motion in ProjOnSSMod 886, producing a residual Res which cannot be described within the subspace model. If the sum of absolute values of the residual, as computed in SumAbs 888, is above a threshold, then EncCont 874 may decide to start a new set of consecutive images.
  • Otherwise, UpdateSSMod 892 updates the subspace model of motion stored in StoreSSMod 890. When the result of SumAbs 888 is small, the motion field fits well into the subspace model.
  • the threshold can be absolute, e.g. corresponding to an allowed average or maximum deviation, expressed in the number of pixels, between the motion field as estimated by ME 884 and the same motion field as modelled by BuildSSMod.
  • Some methods for updating a subspace model have a projection of new data on old factors as one of the first steps.
  • ProjOnSSMOD 886 can therefore be seen as a first step of UpdateSSMod 892.
  • random access into the video sequence is required. Examples include channel change in connection with broadcasting, or fast forward mode for storage based systems. In such systems, some maximum temporal distance between reference images is prescribed.
  • An encoder control module EncCont can then count the processed images, and force reference images at given intervals by starting a new set of consecutive images. This can be combined with any other method for defining the length of a set of consecutive images.
  • a system could have an image counter. Each time a new set of consecutive images is triggered using any of the methods outlined in figs. 8 a-d; the image counter is reset to zero. For each processed image, the image counter is increased by one.
  • the threshold may be set to 20 if random access with at most one second delay is prescribed.
  • One method is to first determine the length of the set of consecutive images according to any of the described methods using a preliminary reference image, and then choose the final reference image adaptively as described in the third embodiment.
  • Another method is based on selecting the length for a set of consecutive images and choosing the reference image simultaneously.
  • the goal is a representation as illustrated in fig. 7, such that the set of consecutive images will consist of as many images as possible.
  • the search is done using motion estimation. All images between image 710 and the found reference image I 730 are included into the set of consecutive images. Then the end of the set of consecutive images is found, such that all the images after I 730 can still be well predicted from I 730, as described in the fifth embodiment.
  • Different sets of consecutive images may have a different optimum number of motion factors f, also called the rank of the subspace model.
  • E.g., when the images of a set of consecutive images only show a still image without motion, then zero motion factors are needed.
  • a subspace model that captures much of the motion needs several factors, that is, it requires a model with higher rank.
  • motion information is lost in several stages in the described system resulting in a loss of fidelity of the subspace model and therefore a loss of prediction fidelity.
  • motion estimators will in general not be able to estimate the "true motion".
  • One way of reducing the rank is to first update or build a subspace model using a higher number of factors, and then to determine a final rank after all images of the set of consecutive images have been processed. This is illustrated in fig. 9.
  • BuildMod 900, corresponding to the module 420 with the same name in fig. 4a, has a module UpdateSSMod 950 for updating a subspace model stored in StoreSSMod 960. Before the subspace model is passed on from BuildMod 900, the rank of the model is determined in DetRank 970 and factors may be deleted.
  • DetRank may then be based on a set of rules or a table giving the number of factors f to be used depending on the length N of a set of consecutive images; a set of rules could be, e.g., f = 0 for N ≤ 2.
  • DetRank 970 may also be based on the content of the subspace model.
  • one method is to discard those factors that have singular values below a threshold.
  • Another method is to define a relative threshold. E.g., if the singular value of a factor is less than 5 % of the largest singular value of a subspace model, then the factor is discarded.
  • Another method is to define a smallest relevant motion. E.g., one may define that the impact of motion smaller than one half pixel is not visible. Therefore, those factors are cut away whose largest absolute value of the temporal weight coefficients, multiplied with the largest absolute value of the basis image, does not exceed 0.5.
  • the image content may have an influence on the necessary fidelity of the motion subspace model and therefore on its rank. E.g., in areas of the image without texture some "false" motion may be of low visual importance, while in textured parts of the image even small errors in motion may be visible.
  • each basis image can be coded separately using an image-compression tool. This tool can be controlled so that only significant parts of the basis image are transmitted. E.g., if the image-compression tool is based on dividing the basis image into spatial blocks and transform coding each block using DCT, then it can be checked whether the motion related to this factor and block is below a given threshold, say, one half pixel. If this is the case, then the block may be skipped, otherwise it may be coded.
  • the rank of the model may be reduced by considering the singular values, or the sum of squares of basis images or temporal weight coefficients.
  • the first factors have mainly to do with "signal".
  • the last factors have mainly to do with "noise".
  • the "noise” may include e.g., errors originating from the motion estimation process or from thermal noise in the image.
  • the factors mainly related to "signal” have strongly decreasing singular values, while the factors related to "noise” will have a much flatter distribution of the singular values. The transition from strongly decreasing to flatter distribution can often be seen as a "knee point" on a plot of the singular values as a function of the factor numbers.
  • An encoder can determine an appropriate knee point and discard all factors with singular values smaller than the singular value at the knee point.
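  • Two of the described rank rules in sketch form (the knee detection via the largest drop of the log singular values is one possible heuristic, not prescribed by the patent; singular values are assumed in descending order):

```python
import numpy as np

def rank_by_relative_threshold(s, rel=0.05):
    """Keep factors whose singular value is at least rel (e.g. 5 %)
    of the largest singular value."""
    s = np.asarray(s, dtype=np.float64)
    return int(np.sum(s >= rel * s[0]))

def rank_by_knee_point(s):
    """Keep the factors before the 'knee point', here located at the
    largest drop between consecutive log singular values."""
    logs = np.log(np.asarray(s, dtype=np.float64) + 1e-12)
    return int(np.argmax(logs[:-1] - logs[1:])) + 1
```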
  • DetRank 970 may be performed as part of UpdateSSMod 950 for every update operation. This may have advantages in terms of computational complexity or storage requirements, due to the fact that UpdateSSMod 950 and StoreSSMod 960 only need to consider the selected number of factors.
  • the methods for determining an intermediate value for the rank in UpdateSSMod 950 and the final determination in DetRank 970 may be combined.
  • the rank may be reduced from full to intermediate in UpdateSSMod 950, mostly in order to save resources, while the rank may be further reduced in DetRank 970, mostly to optimise compression ratios.
  • the above described methods for rank reductions may be spread out to different locations in the encoding process to achieve a multi-stage processing scheme.
  • the reason for this is to add flexibility to the rank determination in order to take into account the potentially changed importance of factors throughout the encoding process.
  • the rank can be determined in common for both dimensions, or the rank can be determined independently for each dimension.
  • the shape information for those areas may be selected by a manual or an automatic segmentation process.
  • the shape information designates the set of valid pixels in an image.
  • the notion of 'video objects' is commonly used, e.g. within the context of the MPEG-4 standardisation body.
  • the shape information of video objects is often given in terms of a so-called 'alpha channel'.
  • shape field' will be used.
  • Binary shapes may be used as well as grey-level shapes.
  • the MPEG-4 video coding toolbox describes numerous methods for handling shaped video objects. Examples are the shape-adaptive DCT and various padding methods in the context of block-based video compression.
  • Shape information may be defined only for a reference image.
  • the shape information for the remaining images in an SCI is then generated by a prediction from the shape assigned to the reference image using motion information. To this end, the shape of the reference image is warped to the position of the respective image using the motion fields.
  • the shape information may be defined for each image.
  • the shape information for the reference image in conjunction with the estimated motion information for the images in the SCI is used to determine a prediction model for the shape including a prediction error for the shape.
  • the shape residual can be formed by an arithmetic difference between original and predicted shape. In this case the residual of a binary shape will be a tri-level image. Alternatively, the residual of a binary shape can be formed using an exclusive-or operation between original and prediction, in which case it will be a bi-level image (binary image).
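  • The two residual variants for binary shapes in a few lines of numpy (values purely illustrative):

```python
import numpy as np

original  = np.array([[1, 1, 0], [0, 1, 0]], dtype=np.int8)
predicted = np.array([[1, 0, 0], [1, 1, 0]], dtype=np.int8)

tri_level = original - predicted   # arithmetic difference: {-1, 0, 1}
bi_level = np.logical_xor(original, predicted).astype(np.int8)  # {0, 1}
```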
  • shape information for all images can be coded separately using, e.g., shape coding tools from MPEG-4 or binary image coding standards such as JBIG.
  • shaped video objects may be encoded independently.
  • the individual shaped video objects may be composed into common images making use of composition information such as depth layering or spatial as well as temporal position for the individual video objects.
  • Fig. 10 a shows a reference image.
  • Fig. 10 b shows a predicted image, i.e. a reference image warped with an arbitrary motion field.
  • There may be parts 1030, 1031, 1032, 1033 of the image that are not covered by the corresponding prediction.
  • One way to cope with uncovered parts is to transmit a residual for those parts.
  • Another way is to create an extended reference image, containing the reference image plus new content. This is illustrated in fig. 10 c. To the same reference image as was shown in fig. 10 a, more pixels 1060, 1061, 1062, 1063 have been added, thus forming an extended reference image. The warped reference image now covers the whole image to be reconstructed. An illustration is depicted in fig. 10 d. The subspace model of motion must be adjusted correspondingly. 1080-1083 are reconstructed areas corresponding to 1060-1063.
  • One method for creating an extended reference image is based on warping pixels from the original image, corresponding to the uncovered pixels in the predicted image, into the position of the reference image. This method will now be described in more detail, with reference to figs. 11 a - h.
  • An original reference image I_r 1100 is given, depicting a first tree 1102 and a person 1104.
  • An image I_n 1110 is also given, depicting the same person 1112 but a second tree 1114, due to camera panning.
  • a motion field D 1120 is calculated between I_r 1100 and I_n 1110.
  • Fig. 11c shows the forward motion compensation case.
  • One of the motion vectors 1122 for the tree and one of the motion vectors 1124 for the person in the reference image have been drawn. The task is to prepare a new reference image with space for both instances of the person 1104, 1112, the first tree 1102, and the second tree 1114.
  • An extended version 1130 of I_r (horizontal size n_h, vertical size n_v) is created, so that it contains I_r 1100, with additional space around it so that pixels from I_n 1110 can be filled in.
  • One conservative method comprises the following steps:
  • the extending may imply a change of the co-ordinate system. E.g., if the origin of I_r was in the upper left pixel and the origin of the extended image shall also be in the upper left pixel, then values corresponding to the extension are added to the motion field.
  • the extended regions of the motion field are extrapolated from the given motion field.
  • the extrapolation should be done so as to minimise discontinuities, both between extended and original areas and inside extended areas.
  • One useful technique is the following diffusion process, performed on the motion field D_e:
  • the algorithm starts with a "processed" area containing valid motion, defined by a given shape field (i.e. the shape field of the extended reference image). For each pass through the field, all pixels that are adjacent to the processed area are set to an average of those neighbouring pixels that lie inside the processed area. As a final result, the whole area of the resulting motion field contains valid motion information.
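  • One pass of such a diffusion process might be sketched as follows (a numpy sketch; the 4-neighbourhood and the shift helper are assumptions, and the loop presumes at least one valid pixel):

```python
import numpy as np

def _shift(a, dy, dx):
    """Shift a 2-D array by (dy, dx) without wrap-around, zero-filled."""
    out = np.zeros_like(a)
    out[max(dy, 0):a.shape[0] + min(dy, 0),
        max(dx, 0):a.shape[1] + min(dx, 0)] = \
        a[max(-dy, 0):a.shape[0] + min(-dy, 0),
          max(-dx, 0):a.shape[1] + min(-dx, 0)]
    return out

def diffuse_motion(d, valid):
    """Extrapolate the motion-field component d into the area where
    valid is False: per pass, every pixel adjacent to the processed
    area is set to the average of its valid 4-neighbours, until the
    whole field contains valid motion."""
    d = d.astype(np.float64).copy()
    valid = valid.astype(bool).copy()
    while not valid.all():
        acc = np.zeros_like(d)                  # sum of valid neighbours
        cnt = np.zeros_like(d)                  # number of valid neighbours
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            acc += _shift(np.where(valid, d, 0.0), dy, dx)
            cnt += _shift(valid.astype(np.float64), dy, dx)
        frontier = ~valid & (cnt > 0)           # adjacent to processed area
        d[frontier] = acc[frontier] / cnt[frontier]
        valid |= frontier                       # grow the processed area
    return d
```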
  • Tenth preferred embodiment
  • There may be cases where a sequence is broken into multiple sets of consecutive images, though some similarity between the images remains across the sets of consecutive images.
  • One such similarity is when the reference image of a first set of consecutive images is similar to the reference image of a second set of consecutive images.
  • the second reference image, called a P-image, can then be reconstructed from a prediction based on the first reference image plus a residual.
  • One method is to first encode each set of consecutive images independently. Then, as part of compressing the resulting model, the reference image of the second set of consecutive images can be encoded as a predicted version of the first one. To this end, a motion estimation between the two reference images can be performed and the motion field plus a residual is transmitted.
  • Another method is to let the motion between the first and the second reference image be included in the subspace model of motion for the first set of consecutive images.
  • a possible resulting structure is illustrated in fig. 12.
  • a first reference image 1210 is used as a basis for motion estimation to several images 1212, 1214, 1216, 1218.
  • the motion fields 1240 can be modelled together as a subspace model.
  • One image 1218 is then used as a reference image for motion estimation to further images 1220, 1222, 1224, 1226, 1228.
  • One image 1228 of these can once more be used as a reference for motion estimation to further images 1230, 1232, 1234.
  • a related technique without using a subspace model of motion is used in the MPEG class of encoders.
  • For the compression of the residuals pertaining to P-images, "inter" tools may be used instead of "intra" tools.
  • a first image 1300 in a unit of images shows a woman 1305 partly occluded by a car 1310, together with a man 1315.
  • a later image 1360 shows the woman 1365 fully, but this time the man 1375 is partly occluded by the car 1370.
  • An intermediate image 1330 shows the woman 1335 and the man 1345 partly occluded by the car 1340.
  • One early (e.g. the first) and one late (e.g. the last) image of a unit of images are selected as reference images for two IDLE models, such that their sets of consecutive images (SCIs) overlap, that is both contain a common subset of the unit of images.
  • Each non reference image within the intersection can therefore be predicted by the first, early IDLE model, i.e. by a forward prediction, and by the second, later IDLE model, i.e. by a backward prediction.
  • for each spatial block of pixels or for each pixel in such an image it is measured whether the forward prediction or the backward prediction produces the better result.
  • the encoder sends one bit for each spatial block of pixels or pixel telling the decoder whether to use the forward or the backward prediction.
  • An IDLE model 1440 with reference image 1412 is used for predicting nearby images 1410, 1411, 1413, 1414, 1415, 1416.
  • Another IDLE model 1445 with reference image 1416 also covers nearby images 1414, 1415, 1417, 1418, 1419, 1420.
  • Those images 1414, 1415 that are covered by both models are called bi-directional predicted images, or B-images in short.
  • Those images 1410, 1411, 1413 that are only covered by one model are called unidirectional predicted images, or U-images in short.
  • the structure can be continued, in that a further IDLE model 1450 with reference image 1421 covers nearby images 1417, 1418, 1419, 1420, 1422, 1423, 1424 and 1425, etc.
  • Whether an image is encoded as a U-image or a B-image depends on the quality with which it can be predicted from a single IDLE model. The decision may be made while building the IDLE models.
  • a B-image can change to a U-image within the decoder if random access makes the earlier of the two models unavailable. For example the decoder gets first access to the sequence at IDLE model 1450 and has therefore no information from the previous IDLE model 1445. Hence the images 1417, 1418, 1419 and 1420 may be decoded as U-images.
  • a blend field is calculated in the encoder and transmitted to the decoder.
  • the blend field contains one value for each pixel or each spatial block of pixels of the image, indicating whether it has to be taken from the forward prediction of the earlier IDLE model or from the backward prediction of the later IDLE model. For example, the value 0 may indicate the usage of the forward prediction and the value 1 the usage of the backward prediction.
  • blend fields will often have spatial redundancy. If one pixel or one block of pixels in a B- image is taken from the forward prediction, then there is an increased probability that the neighbouring pixel or block of pixels will also be taken from the forward prediction.
  • Each blend field can therefore be compressed using a compression tool for binary images. Due to the binary character of the blend fields all methods for coding shape fields can also be used for blend fields.
  • the blend fields will also often have temporal redundancy. If one pixel or one block of pixels in a B-image is taken from the backward prediction, then there is an increased probability that the corresponding pixel or block of pixels in the following B-image will be taken from the backward prediction, too.
  • the temporal and spatial redundancy of the blend fields occurs for areas of the image where the prediction is good in one direction and less good in the other. In areas of the image where both the forward and backward prediction produce equally good quality or fidelity the selection of the best may be influenced by secondary effects, like noise in the images. The alternation between forward and backward predictions might produce visual artefacts in the B-image.
  • the blend fields will tend to be less compressible. It may be advantageous to apply techniques which simplify the blend fields in those areas where the forward and backward predictions are of similar quality.
  • One such technique is to apply a median filter to the blend fields in those areas where the forward and backward predictions have similar fidelity or quality.
  • Another technique is to use methods which adapt the blend value for each element according to a compromise between prediction quality and similarity to neighbours.
  • One such method, called the Agree filter, minimises a cost function by performing the following steps:
  • I_1' denotes the forward prediction of the first IDLE model
  • I_2' the backward prediction of the second IDLE model
  • I the original image to be reconstructed, B the blend field
  • k a weighting constant
  • Block(p) a spatial block of pixels in the image corresponding to the element p in the blend field
  • Neigh(p) a set of spatial neighbour elements of element p in B.
  • One option is to perform no iteration, i.e. no repetition of steps (2) and (3).
  • the second term measures how many of the neighbouring blocks have a blend value that is different from the current one.
  • the constant k determines the strength of influence of the second quantity.
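  • A sketch of one pass of such an Agree-style filter (block size, the absolute-error measure and the single-sweep update are assumptions; the patent only specifies the two competing cost terms):

```python
import numpy as np

def agree_filter(I, I1, I2, k=4.0, block=8):
    """Block-wise blend field B (0 = forward I1, 1 = backward I2):
    each element is set to the value minimising its block prediction
    error plus k times the number of disagreeing 4-neighbours."""
    nby, nbx = I.shape[0] // block, I.shape[1] // block
    err = np.zeros((2, nby, nbx))            # per-block error for b = 0, 1
    for by in range(nby):
        for bx in range(nbx):
            sl = np.s_[by*block:(by+1)*block, bx*block:(bx+1)*block]
            err[0, by, bx] = np.abs(I[sl] - I1[sl]).sum()
            err[1, by, bx] = np.abs(I[sl] - I2[sl]).sum()
    B = np.argmin(err, axis=0)               # step (1): best prediction
    for by in range(nby):                    # steps (2)/(3): local sweep
        for bx in range(nbx):
            neigh = [B[y, x] for y, x in ((by-1, bx), (by+1, bx),
                                          (by, bx-1), (by, bx+1))
                     if 0 <= y < nby and 0 <= x < nbx]
            cost = [err[b, by, bx] + k * sum(n != b for n in neigh)
                    for b in (0, 1)]
            B[by, bx] = int(np.argmin(cost))
    return B
```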
  • the blend field can be seen as a weight field. Instead of using only either 0 or 1 weights, the weights can be feathered. Consider a transition from backward to forward prediction in the middle of eight pixels on a scan line. Without feathering, the weights for the forward prediction may be e.g. 0, 0, 0, 0, 1, 1, 1, 1; with feathering, the transition is smoothed out over several pixels.
  • blend fields with real valued weights in the unit interval which are well compressible may be generated by a procedure as discussed in "Compactly-Encoded Optical Flow Fields For Bidirectional Prediction and Video Coding", Ravi Krishnamurthy, Pierre Moulin, John W. Woods, Visual Communications and Image Processing, San Jose, California, 1997, which is hereby included by reference. A similar concrete realisation of such a procedure is described in the following:
  • the final prediction I' is a linear combination of the forward prediction I_1' and the backward prediction I_2' with the corresponding value of the blend field B: I'(p) = (1 − B(p)) · I_1'(p) + B(p) · I_2'(p).
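  • In sketch form (the feathered weight values are purely illustrative):

```python
import numpy as np

def blend(I1, I2, B):
    """Blend forward prediction I1 and backward prediction I2 with a
    real-valued blend field B in [0, 1] (0 = forward, 1 = backward)."""
    return (1.0 - B) * I1 + B * I2

# forward-prediction weights across the eight-pixel transition above:
hard      = np.array([0, 0, 0, 0, 1, 1, 1, 1], dtype=float)  # no feathering
feathered = np.array([0, 0, 0, 0.25, 0.5, 0.75, 1, 1])       # feathered
```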
  • the prediction errors (residuals) E may be pre-processed by methods described in "Method and apparatus for compression of video images and image residuals", already included by reference. Residuals corresponding to artefacts below a visual threshold may be removed from the residual images.
  • the set of blend fields of successive images can be compressed by building a subspace model with reduced rank, as described for the motion data (module BuildSSMod 190).
  • Each basis image of the subspace model has the size of a blend field.
  • Each factor has one temporal weight coefficient for each B- image covered by the subspace model.
  • the blend field for one bidirectionally predicted image (B-image) is found by multiplying the temporal weight coefficients corresponding to the image with the basis images.
  • the subspace model may be built in one step using stored blend fields or may be updated recursively (as done for motion fields in module UpdateSSMod 540, see third preferred embodiment). An intermediate determination of the rank and a final determination may be combined (see seventh preferred embodiment). All blend fields between two reference images may be described by one subspace model.
  • the blend fields may be warped to a common position within the sequence before modelling. One way of doing this is to calculate the blend fields in the corresponding image positions, as described above. Then each blend field may be warped to the position corresponding to the reference image of e.g. the later IDLE model. This can be done by using the OppositeWarp routine with the motion field obtained from the later IDLE model. Then a subspace model of these blend fields is built. On the decoder side, each blend field resulting from the blend field subspace model is warped to its corresponding image position with the corresponding motion field obtained from the subspace model of motion, using the Warp routine. This warped field is used to perform the blending of the forward and backward predictions.
  • the basis images of the model can be compressed using standard image compression techniques, like entropy coding, run-length coding, predictive coding, DCT or similar. In particular, they can be compressed using the same or similar routines as for the subspace models of motion, as described in the second preferred embodiment. Moreover the basis images may be sub-sampled. Predictive coding techniques and frequency transforms may be applied for the compression of the temporal weight coefficients.
  • Blend fields with values exclusively 0 and 1 can be compressed by applying techniques for shape fields or shape residuals as described in the ninth embodiment.
  • An example for a realisation of an encoder which works with subspace models for blend fields containing values in the unit interval will now be described with reference to fig. 15a.
  • the encoder Enc 1500 receives as input a sequence of images 1501 which are buffered in the module ImageStore 1502.
  • the module BuildModel 1503 builds an IDLE model which is compressed in module CompMod 1504. This compressed model is simultaneously put to the module DecompMod 1505 and to the module TraMod 1509 where it is transmitted.
  • the module DecompMod 1505 decompresses the IDLE model.
  • the decompressed model is simultaneously stored in the module ModelBuffer 1506 and put to the module CalcBlendModel 1507.
  • the module CalcBlendModel 1507 receives the previous IDLE model stored in the module ModelBuffer 1506, the current IDLE model from the module DecompMod 1505 and original images from the module ImageStore 1502.
  • the module CalcBlendModel 1507 calculates the corresponding blend subspace model which is then compressed in the module CompBlendMod 1508.
  • Finally the module TraBlendMod 1510 transmits the compressed blend subspace model.
  • the module CalcBlendModel 1507 will now be described with reference to fig. 15b.
  • the module 1520 receives an earlier IDLE model 1521, a later IDLE model 1522 and original images 1523.
  • the module BlendControl 1524 calculates the intersection of the SCIs of the two IDLE models and uses this information in order to control the modules Predict 1525 and 1526.
  • These modules build predictions (see module Predict 320 in fig. 3) for all images lying in the intersection of the SCIs.
  • the module CalcBlendFields 1527 receives these predictions as well as the original images and calculates blend fields for all predictions.
  • the module BuildSSModBlend 1528 builds a subspace model of these blend fields.
  • the module CalcBlendFields 1527 will now be described with respect to fig. 15c.
  • the module 1530 receives a forward prediction 1531 from an earlier IDLE model, a backward prediction 1533 from a later IDLE model and the corresponding original image 1532.
  • the modules CalcResidual 1534, 1535 calculate the residuals (prediction errors) taking into account the predictions and the original image.
  • the modules CalcResidual 1534, 1535 may also process the residuals by methods described in "Method and apparatus for compression of video images and image residuals", already included by reference. Using the two residuals, the module CalcValues 1536 calculates all values for the output blend field by solving the variational problem described above.
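  • The variational problem itself is specified earlier in the document; as a stand-in, the following Python sketch shows one plausible way a CalcValues-like module could derive blend values in the unit interval from the two residuals, trading prediction fidelity against roughness by simple local averaging. Both the per-pixel formula and the smoothing are assumptions, not the patent's solver.

    import numpy as np

    def calc_blend_values(res_fwd, res_bwd, smooth_iters=4):
        e_f = res_fwd.astype(np.float64) ** 2
        e_b = res_bwd.astype(np.float64) ** 2
        # Favour the prediction with the smaller error: alpha near 1 selects
        # the backward prediction, alpha near 0 the forward one.
        alpha = e_f / (e_f + e_b + 1e-12)
        for _ in range(smooth_iters):
            # Crude roughness penalty: mix each value with its 4-neighbour mean.
            pad = np.pad(alpha, 1, mode='edge')
            mean4 = 0.25 * (pad[:-2, 1:-1] + pad[2:, 1:-1]
                            + pad[1:-1, :-2] + pad[1:-1, 2:])
            alpha = 0.5 * alpha + 0.5 * mean4
        return alpha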
  • the decoder Dec 1600 receives an IDLE model 1601 in the module ReceiveMod 1610 and a blend subspace model 1602 in the module ReceiveBlendMod 1620.
  • the module DecompMod 1615 decompresses the IDLE model and the module DecompBlendMod 1616 decompresses the blend subspace model.
  • the decompressed blend subspace model is put to module RebuildBlendFields 1650.
  • the decompressed IDLE model is simultaneously stored in the module ModelBuffer 1630.
  • the module BlendControl 1640 calculates the intersection of the SCIs of the current IDLE model and the previous IDLE model received from the module ModelBuffer 1630, as well as the union of images between the reference images of both models.
  • the module Predict 1660 calculates all forward predictions of the previous IDLE model.
  • the module Predict 1670 calculates all backward predictions of the current IDLE model. Both modules 1660, 1670 receive control information from the module BlendControl 1640. If a forward and a backward prediction exist for a certain image, the module BlendPredictions 1680 blends them according to their corresponding blend field received from the module RebuildBlendFields 1650. If only one prediction exists for a certain image, this prediction passes the module BlendPredictions 1680 unaltered. The described process is controlled by information received from the module BlendControl 1640.
  • residuals as the differences between original images and predictions can then be transmitted.
  • the original area instead of the difference can be transmitted if it needs fewer bits.
  • FIG. 17 shows an encoder Enc 1700.
  • a sequence 1701 of images is kept in an image store 1705.
  • BuildMod 1710 reads an SCI from the image store 1705 and builds an IDLE model.
  • the model is compressed in the module CompMod 1722 and then transmitted by TraMod 1725.
  • the model is also used for predicting the images in a module Predict 1715 (see module 320 in fig.3).
  • a module 1720 calculates the differences between the original images and the predictions, producing residuals. These are compressed in module CompRes 1723 and then transmitted by TraRes 1730. In this scenario, the residuals are calculated with the predictions obtained by the unprocessed model.
  • quantisation or other compression artefacts coming from lossy compression of the model can be considered when computing the residuals. This can be done by calculating the residuals with the predictions obtained by the processed model, which has passed compression and decompression.
  • a corresponding encoder is illustrated in fig. 18.
  • the encoder Enc 1800 receives a sequence of images 1801 and stores it in the ImageStore 1810.
  • BuildMod 1820 reads an SCI from the ImageStore 1810 and builds an IDLE model.
  • the IDLE model is compressed in module CompMod 1830, where quantisation, thresholding or other types of lossy data reduction take place.
  • the compressed model is transmitted by TraMod 1880 and put to module DecompMod 1840, where it is decompressed. This decompressed model is used for predicting the images in module Predict 1850.
  • the residuals as the differences between the original images and these predictions are calculated in module 1860.
  • the residuals are compressed in module CompRes 1870 and then transmitted by TraRes 1890.
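  • The closed-loop structure of fig. 18 can be summarised in a few lines of Python. The callables stand for the corresponding modules and are placeholders, not an API defined by the document; the point is only that residuals are formed against predictions from the decompressed model, so they match the predictions the decoder will actually produce.

    def encode_sci_closed_loop(images, build_model, compress, decompress, predict):
        model = build_model(images)            # BuildMod 1820
        bitstream = compress(model)            # CompMod 1830 (lossy)
        decoded_model = decompress(bitstream)  # DecompMod 1840: decoder simulation
        predictions = predict(decoded_model)   # Predict 1850
        residuals = [img - pred for img, pred in zip(images, predictions)]  # 1860
        return bitstream, residuals            # residuals go to CompRes 1870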
  • All residuals can be compressed as described for the residuals pertaining to P-images.
  • parameter sets leading to worse quality may be used. Assuming that the relevant parts of such residuals are located in a relatively small area, direct addressing and coding only the blocks containing these relevant parts may yield better results.
  • Predictive coding techniques in combination with variable length coding may be used for a lossless compression of these addresses.
  • the set of addresses can be interpreted as a binary field and can therefore be compressed with tools for bi-level images.
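  • A minimal Python sketch of this block addressing follows: blocks whose residual energy exceeds a threshold are marked in a binary field, which can then be coded with bi-level tools, while only the marked blocks are transmitted. Block size and threshold are illustrative assumptions.

    import numpy as np

    def relevant_blocks(residual, block=16, threshold=64.0):
        h, w = residual.shape
        mask = np.zeros((h // block, w // block), dtype=bool)
        for by in range(h // block):
            for bx in range(w // block):
                tile = residual[by * block:(by + 1) * block,
                                bx * block:(bx + 1) * block]
                # Mark the block if its residual energy is significant.
                mask[by, bx] = float(np.sum(tile.astype(np.float64) ** 2)) > threshold
        return mask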
  • the residuals should be computed on the encoder side based on a full simulation of the decoder. That is, the encoder should incorporate all artefacts of compression of the reference images and the subspace model of motion. This way it is guaranteed that transmitting the residuals will indeed improve the fidelity of the reconstructed images.
  • When the reference image is compressed with a lossy technique, it will exhibit coding artefacts. Those artefacts will then also manifest themselves in predictions, which implies that one compression artefact may have to be corrected in residuals for several images. This will degrade the compression ratios.
  • In order to avoid unnecessary coding of quantisation artefacts, the residual can be computed as if the reference image had been compressed in a loss-less way. That is, for purposes of computing residuals, the original reference image is used instead.
  • the residual may be computed as if the basis images had been compressed in a loss-less way.
  • the encoder chooses a number of factors as a compromise between bit rate and quality. Some factors that are not transmitted describe motion that is visually insignificant, but whose removal still leads to an increase in residuals. It may be beneficial to compute the residuals as if a higher number of factors had been used for the subspace model of motion.
  • Characteristics of the current motion field and the subspace model of motion may be considered by estimating the visibility of artefacts caused by the rebuilt motion.
  • a spatially and temporally smooth error in motion with a magnitude of e.g. 1 pixel may be hardly visible in a larger area with a smooth motion field.
  • a motion artefact with a magnitude of 0.5 pixels might be easily visible in an area with a non-smooth motion field. Such information can be exploited when deciding whether original or quantised motion should be used for the determination of predictions.
  • the P-images defined in the seventh embodiment are reconstructed by adding a residual image to the prediction of an IDLE model containing an earlier reference image.
  • This residual image can in principle be treated similarly to the residuals occurring in the previous embodiment, that is, visually insignificant residuals may be suppressed or dampened.
  • Principles according to "Method and apparatus for compression of video images and image residuals", already included by reference, may therefore be used for dampening the residuals, so that only visually significant residuals need to be compressed.
  • One fact influencing the visibility of image artefacts is the temporal extent to which such artefacts are present.
  • One method for limiting the temporal extent is to switch between dampening residuals and not dampening residuals for P-images.
  • FIG. 19 shows an encoder Enc 1900.
  • a sequence 1901 of images is buffered in an image store 1910.
  • BuildMod 1920 reads a set of consecutive images from the image store 1910 and builds a corresponding IDLE model.
  • CompMod 1930 compresses this model, which is then transmitted in TraMod 1980. The same model is decompressed in module DecompMod 1935.
  • the module Predict 1940 receives the decompressed model and predicts the images (see module 320).
  • the difference module 1950 calculates the difference between the original image and its prediction, producing a residual.
  • a switch 1960, controlled by the encoder control module EncCont 1905, directs the residual either to the module DampRes 1970 for dampening, or directly to the module CompRes 1975.
  • the output of module DampRes 1970 is also put to module CompRes 1975, which compresses the residuals.
  • the compressed residual is transmitted by module TraRes 1990.
  • the switching can be done in a way such that residuals are transmitted without being dampened for a certain fraction of the P-images, e.g. every second P-image.
  • the switching can be done with respect to time, expressed in the number of images or in seconds. E.g., the switching can be done so that if more than one second has elapsed since the last I-image or not-dampened P-image residual was transmitted, then dampening should be disabled for the next P-image residual, else it should be enabled.
  • the encoder can control the parameters of DampRes instead of operating a switch.
  • a bit rate control parameter affecting bounds for allowed residuals can be used.
  • the parameters can be set so that P-image residuals are normally transmitted with strong dampening, while some chosen P-image residuals are transmitted without or with very weak dampening.
  • the dampening of a P-image residual may be performed for certain pixels or certain blocks of pixels only. These pixels or blocks of pixels may be selected randomly, such that a certain number of pixels or blocks of pixels are dampened. This number may be a function of the time since the last I-image was transmitted.
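  • The time-based switching described above can be sketched as follows in Python; the frame rate, the one-second bound and the control flow are illustrative assumptions about how EncCont 1905 might drive the switch 1960.

    def dampening_enabled(images_since_anchor, frame_rate, max_seconds=1.0):
        # Dampening stays enabled while the last I-image or undampened
        # P-image residual is at most max_seconds old.
        return images_since_anchor / frame_rate <= max_seconds

    frame_schedule = [True] * 60      # hypothetical run of 60 P-images at 25 fps
    images_since_anchor = 0
    for is_p_image in frame_schedule:
        if is_p_image and not dampening_enabled(images_since_anchor, 25.0):
            images_since_anchor = 0   # this residual is sent undampened
        else:
            images_since_anchor += 1  # this residual is dampened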
  • an input sequence may also exhibit systematic changes in intensity. Those intensity changes may also be modelled efficiently using subspace models.
  • the basic method is explained in the patent application "Method and Apparatus for data analysis", already included by reference.
  • Subspace modelling of intensity can be incorporated in the present invention, as shown in fig. 20.
  • An encoder Enc 2000 receives as input a sequence of images 2001. For a set of consecutive images SCI read from the image store 2030, a subspace model of motion is built in BuildModMotion 2040. This subspace model of motion, together with the reference image and the original images are used to build a subspace model of intensity changes in BuildModlnt 2050. The total output is compressed in module CompMod 2055 and transmitted in module TraMod 2060.
  • the intensity change ΔI_n′ is rebuilt from the corresponding subspace model of intensity changes.
  • Such a subspace model can be built in two ways, described in the two alternative versions of the module BuildModInt, which are shown in figures 21a and 21b, respectively.
  • One possible version of module BuildModInt will now be described with reference to fig. 21a.
  • a set of consecutive images SCI 2101 and a corresponding IDLE model, consisting of a motion model MotionSSMod 2102 and a reference image I_ref 2103, are given as input.
  • each image I_n of the SCI 2101 is warped to the reference position using the corresponding motion field.
  • the difference ΔI_n between this image and the reference image is calculated by module 2130. Hence ΔI_n is given by
  • ΔI_n = OppositeWarp(I_n − Warp(I_ref, D_n′), D_n′)
  • the module 2200 receives an IDLE model 2201 and sends the reference image to the module RefImgSto 2202.
  • the subspace model of intensity changes is sent to the module RebuildIntensChange 2203 and the subspace model of motion is sent to the module RebuildMotion 2204.
  • the rebuilt intensity change from 2203 is added to the reference image in module 2205.
  • the module Warp 2206 warps the resulting images according to the corresponding rebuilt motion fields from 2204 and delivers predictions 2207 for all images of the SCI.
  • the reference image itself passes the modules 2205 and 2206 unchanged.
  • the Predict modules 480, 1525, 1526, 1660, 1670, 1715, 1850 and 1940 must work like module 2200.
  • One such difference image for each image in the SCI is used to build the subspace model of intensity changes.
  • each image I_n of the SCI is warped to the reference position according to the corresponding rebuilt motion field D_n′.
  • the difference between this image and the reference image is then warped to the position of I_n according to D_n′.
  • the final warped difference ΔI_n is given by:
  • ΔI_n = Warp(OppositeWarp(I_n, D_n′) − I_ref, D_n′)
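  • The two variants can be contrasted in Python. The warp below is a deliberately crude nearest-neighbour stand-in for the document's Warp and OppositeWarp routines (whose exact sampling and handling of uncovered pixels are specified elsewhere); only the order of operations in the two ΔI_n formulas is the point here.

    import numpy as np

    def warp(image, dy, dx):
        # Backward-sampling warp: output pixel (y, x) is read from
        # (y + dy, x + dx), nearest-neighbour, clamped at the borders.
        h, w = image.shape
        yy, xx = np.mgrid[0:h, 0:w]
        sy = np.clip(np.round(yy + dy).astype(int), 0, h - 1)
        sx = np.clip(np.round(xx + dx).astype(int), 0, w - 1)
        return image[sy, sx]

    def opposite_warp(image, dy, dx):
        # Stand-in: warp with the negated motion field.
        return warp(image, -dy, -dx)

    def intensity_change_a(img_n, ref, dy, dx):
        # Variant of fig. 21a: difference formed in the image position,
        # then moved to the reference position.
        return opposite_warp(img_n - warp(ref, dy, dx), dy, dx)

    def intensity_change_b(img_n, ref, dy, dx):
        # Variant of fig. 21b: difference formed in the reference position,
        # then warped back to the image position.
        return warp(opposite_warp(img_n, dy, dx) - ref, dy, dx)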
  • the module Predict 320 of the decoder (fig. 3) must work like module Predict 2220, which will now be described with reference to fig. 22b.
  • the module 2220 receives an IDLE model 2221 and sends the reference image to the module RefImgSto 2222.
  • the subspace model of intensity changes is sent to the module RebuildIntensChange 2224 and the subspace model of motion is sent to the module RebuildMotion 2223.
  • the module Warp 2225 warps the reference image according to the corresponding rebuilt motion field from 2223.
  • the module 2226 adds the corresponding rebuilt intensity changes from 2224 to the warped reference images and delivers predictions 2227 for all images of the SCI.
  • the reference image itself passes the modules 2225 and 2226 unchanged.
  • Some types of residuals are visually more disturbing than others. This may be considered when modelling intensity changes. E.g., if a sharp intensity edge is positioned one quarter of a pixel out of position in a predicted image, this will lead to a large intensity residual, though the artefact may not be visible. Further, it is a difficult problem to estimate the motion fields for moving objects at object boundaries, and the residuals there may even be unsystematic in magnitude. Since such residuals are large, they may influence the subspace modelling. Since they are unsystematic, they may lead to a high number of factors necessary to obtain good fidelity. Since they occur at edges, they may sometimes be visually insignificant. It is therefore advantageous to let such pixels have less influence on the subspace modelling of intensity changes.
  • One method is to pre-process the intensity residual image before modelling. This is illustrated in fig. 23, showing an alternative version of BuildModInt 2100.
  • a reference image 2303, a subspace model of motion MotionSSMod 2302 and a corresponding set of consecutive images SCI 2301 are given.
  • Motion fields are reconstructed in RebuildMotion 2310, every image is warped to the reference position in OppositeWarp 2320, and the difference 2330 between this image and the reference image is fed to Damp 2340, where the residual is dampened, considering masking effects derived from the reference image as described in "Method and apparatus for compression of video images and image residuals", already included by reference.
  • the dampened differences are then used for building a subspace model in BuildSSMod 2350.
  • the same procedure can be applied for all variations to build a subspace model of intensity changes.
  • Another method is to use weights in the subspace modelling. Instead of removing intensity differences of relatively little visual importance from the residual before subspace modelling, these differences can be allowed to remain, and the weights in the subspace modelling are changed instead. Residuals of little visual importance can then be given low weights. For this purpose, BuildSSMod 2350 must be able to handle pixel-by-pixel weights for each image (see the sketch below).
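  • A one-shot approximation of such weighted subspace modelling is sketched below in Python: the vectorised intensity-change images are scaled by per-pixel weights before a truncated SVD. A proper weighted factorisation would iterate (e.g. weighted alternating least squares); the shapes and names are assumptions.

    import numpy as np

    def weighted_subspace_model(diffs, weights, rank):
        # diffs:   (N, H, W) intensity-change images
        # weights: (N, H, W) per-pixel weights, low where residuals are
        #          of little visual importance
        X = diffs.reshape(len(diffs), -1)
        W = weights.reshape(len(weights), -1)
        U, s, Vt = np.linalg.svd(X * W, full_matrices=False)
        coeffs = U[:, :rank] * s[:rank]   # temporal coefficient vectors
        basis = Vt[:rank]                 # flattened basis images
        return coeffs, basis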
  • the modelling of intensity depends on the rebuilt motion field, represented by D_n′.
  • Three kinds of rebuilt motion fields are possible, which differ in the kind of processing they had been exposed to.
  • a first possibility is to use each motion field directly as calculated by the motion estimation tool.
  • a second possibility is to rebuild the motion field from the subspace model of motion after determining the rank of the subspace model.
  • a third possibility is to rebuild the motion field from the subspace model of motion, after the subspace model has been compressed and decompressed.
  • the three possibilities have an increasing ability to produce a model that will lead to high fidelity in the final decoded image, in that the two last possibilities compensate for the effects of motion subspace modelling and the third possibility further compensates for the effects of compression and decompression of the motion subspace model.
  • IDLE models corresponding to consecutive SCIs may still exhibit a high amount of correlation.
  • One way of exploiting this for the reference image was shown in the seventh preferred embodiment.
  • a similar effect can be exploited for the basis images of the subspace model of motion or of the subspace model of intensity changes for consecutive SCIs; the pertaining basis images span similar vector spaces.
  • Let I_f(k) denote the row vector representing the f-th basis image of a subspace model for a certain attribute (motion, intensity changes, etc.) belonging to the k-th IDLE model.
  • Let I_B(k) = [I_1(k)^T I_2(k)^T ... I_F(k)^T]^T represent the matrix containing the F row vectors describing a basis B of the subspace model for that attribute belonging to the k-th IDLE model.
  • the set of current basis images I_B(k) may be represented as the sum of the preceding basis images I_B(k − 1) and a correction term R_B(k), which can be regarded as a residual, that is
  • I_B(k) = I_B(k − 1) + R_B(k).
  • the residual R_B(k) may be transmitted to the decoder instead of the full basis images I_B(k). This may bring a gain in terms of data compression if the representation of the residual R_B(k) requires less data than the representation of the current basis images I_B(k).
  • Although the subspaces spanned by the IDLE models corresponding to two consecutive SCIs may be very similar, the pertaining basis images may still differ substantially.
  • an invertible matrix M(k) may be determined to achieve
  • I_B(k) = M(k) · I_B(k − 1) + R_B,M(k).
  • the purpose of the procedure described in the following is to provide a matrix M(k) and a corresponding matrix R_B,M(k) so that I_B(k) can be reconstructed in the decoder.
  • the representation of the matrices M(k) and R_B,M(k) should require fewer bits than the representation of the current basis images I_B(k).
  • the matrix M(k) should be computed so as to minimise the amount of data required for the representation of R_B,M(k) without sacrificing fidelity in the prediction of I_B(k).
  • One method to compute such a matrix M(k) is to minimise the sum of squares of R_B,M(k), e.g. by using a standard least squares estimation technique, as given by M(k) = I_B(k) · I_B(k − 1)^T · (I_B(k − 1) · I_B(k − 1)^T)^(−1).
  • a final decision may be made on the transmission of either the current basis images I_B(k) directly, or the matrix M(k) together with the residual R_B,M(k), whichever needs fewer data.
  • This decision can also be performed row by row, i.e. for each factor.
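  • A Python sketch of the least squares step and of the per-factor decision follows. The rows of the matrices are basis images; bit_cost is a placeholder for the actual entropy coder's bit count and is an assumption, as is the simplification that M(k) is transmitted whenever any factor uses the residual form.

    import numpy as np

    def calc_m_and_residual(Ib_prev, Ib_curr):
        # Least squares fit of I_B(k) ~ M(k) @ I_B(k-1); rows are basis images.
        M, *_ = np.linalg.lstsq(Ib_prev.T, Ib_curr.T, rcond=None)
        M = M.T
        R = Ib_curr - M @ Ib_prev          # residual R_B,M(k)
        return M, R

    def choose_per_factor(Ib_curr, R, bit_cost):
        # For each factor, send whichever row needs fewer bits.
        return [('residual', R[f]) if bit_cost(R[f]) < bit_cost(Ib_curr[f])
                else ('direct', Ib_curr[f]) for f in range(Ib_curr.shape[0])]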
  • the encoder Enc 2400 receives images 2401 which are buffered in the module ImageStore 2402.
  • BuildMod 2404 reads images from the module ImageStore 2402 and builds an IDLE model for a set of consecutive images (SCI).
  • the resulting IDLE model is simultaneously passed to the module CalcModRes 2410, to the module ModelBuffer 2406, to the module Predict 2408 and to the module CompMod 2418.
  • the module CalcModRes 2410 receives the current IDLE model from BuildMod 2404 and the preceding IDLE model from ModelBuffer 2406 and calculates the matrix M(k) and the residual R_B,M(k), which are both compressed in module CompModRes 2416.
  • the outputs of module CompModRes 2416 and of module CompMod 2418, which compresses the current IDLE model, are given to the module CombMod 2420, where they are combined.
  • the combination is the result of a decision process based on the amount of data necessary for the transmission of either the current basis images I_B(k) or both the matrix M(k) and the residual R_B,M(k). This decision can also be performed for each factor.
  • TraCombMod 2422 transmits this combined model.
  • the module Predict 2408 provides predictions for the images of the SCI.
  • the module 2412 calculates the difference between the original images and these predictions.
  • the resulting residuals are compressed in module CompRes 2414 and transmitted in module TraRes 2424.
  • the above procedure can be applied independently for each attribute, e.g. for the vertical motion subspace model, the horizontal one and the subspace model of intensity changes.
  • the procedure described above may be applied jointly on basis images of different attributes.
  • the transformation matrix M(k) was computed by projecting the basis images of the current SCI on the basis images of the previous SCI. This is the optimum solution if the sum of squares of the residual is used as the cost function to be minimised. However, it need not be an optimum solution with regard to the task of data compression. Therefore, an alternative cost function for optimisation may be defined which better represents the relation between the required amount of data and the matrices M(k) and R_B,M(k). Such a cost function takes the elements of the transformation matrix M(k) as parameters.
  • An appropriately chosen optimisation method transforms the previous basis images I_B(k − 1) according to an initial matrix M(k), and computes the residual R_B,M(k) as the difference between the current basis images I_B(k) and the transformed basis images.
  • the cost function evaluates the quality of this step; the optimisation method then produces a correction term for M(k) and decides on the further continuation of the algorithm until convergence (see the sketch below).
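  • The iteration can be sketched with a generic optimiser; rate_cost stands for the alternative, compression-oriented cost function and is a placeholder. Using scipy's Nelder-Mead here is an assumption for illustration, not the document's method.

    import numpy as np
    from scipy.optimize import minimize

    def optimise_m(Ib_prev, Ib_curr, rate_cost):
        F = Ib_prev.shape[0]

        def cost(x):
            M = x.reshape(F, F)
            R = Ib_curr - M @ Ib_prev     # residual for this candidate M(k)
            return rate_cost(M, R)

        res = minimize(cost, np.eye(F).ravel(), method='Nelder-Mead')
        M = res.x.reshape(F, F)
        return M, Ib_curr - M @ Ib_prev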
  • the basis images from the previous SCI, I_B(k − 1), are transformed using a matrix M(k).
  • the result is warped according to the motion field D between the two reference images of the models.
  • a residual R_B,M(k) can be added optionally.
  • in this variant, the matrix M(k) may be determined by the same least squares estimation, applied after warping the current basis images to the co-ordinate system of the preceding model, i.e. M(k) = OppositeWarp(I_B(k), D) · I_B(k − 1)^T · (I_B(k − 1) · I_B(k − 1)^T)^(−1), the warp being applied to each basis image separately.
  • An implementation structure for the functions to be performed inside the module CalcModRes 2410 is depicted in fig. 24b.
  • the module 2430 receives the preceding IDLE model 2431 and the current IDLE model 2432.
  • the module CalcMotion 2434 calculates the motion field D between the two reference images of the models. This motion field and the current IDLE model are given to the module OppositeWarp 2436, where the basis images I_B(k) of the current IDLE model are warped to the co-ordinate system of the basis images of the preceding IDLE model.
  • the module CalcM 2438 computes the matrix M(k) using the basis images of the preceding IDLE model and the warped basis images of the current IDLE model according to the equation given above.
  • the matrix M(k) is fed to the module MatrixMult 2440 and returned as output 2446.
  • the module MatrixMult 2440 performs the transformation M(k) · I_B(k − 1).
  • the result is put to the module Warp 2442, which warps its input to the co-ordinate system of the basis images of the current IDLE model according to the motion field D.
  • the module 2444 calculates the residual R_B,M(k) as the difference between the basis images of the current IDLE model and the output of the module Warp 2442.
  • the pertaining transformation can also be computed after applying the Warp operator.
  • the further procedure continues along the lines described above.
  • in this variant, the matrix M(k) may be determined by M(k) = I_B(k) · Warp(I_B(k − 1), D)^T · (Warp(I_B(k − 1), D) · Warp(I_B(k − 1), D)^T)^(−1), the warp being applied to each basis image separately.
  • An implementation structure for the functions to be performed inside the module CalcModRes 2410 is depicted in Fig. 24c.
  • the module 2450 receives the preceding IDLE model 2452 and the current IDLE model 2451.
  • the module CalcMotion 2454 calculates the motion field D between the two reference images of the models.
  • This motion field and the preceding IDLE model are given to the module Warp 2456, where the basis images I_B(k − 1) of the preceding IDLE model are warped to the co-ordinate system of the basis images of the current IDLE model.
  • the module CalcM 2458 computes the matrix M(k) using the basis images of the current IDLE model and the warped basis images of the preceding IDLE model according to the equation given above.
  • the matrix M(k) is fed to the module MatrixMult 2460 and returned as output 2465.
  • the module MatrixMult 2460 performs the transformation M(k) · Warp(I_B(k − 1), D).
  • the module 2462 calculates the residual R_B,M(k) as the difference between the basis images of the current IDLE model and the output of the module MatrixMult 2460.
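  • The fig. 24c variant can be sketched in Python, reusing the nearest-neighbour warp() stand-in from the earlier sketch; the flattened-row layout and the least squares fit in the warped domain are assumptions consistent with the module descriptions above.

    import numpy as np

    def calc_mod_res_warped(Ib_prev, Ib_curr, dy, dx, height, width):
        # Warp each preceding basis image to the current co-ordinate system.
        warped_prev = np.stack([warp(row.reshape(height, width), dy, dx).ravel()
                                for row in Ib_prev])
        # Fit I_B(k) ~ M(k) @ Warp(I_B(k-1), D) by least squares.
        M, *_ = np.linalg.lstsq(warped_prev.T, Ib_curr.T, rcond=None)
        M = M.T
        R = Ib_curr - M @ warped_prev      # residual R_B,M(k), module 2462
        return M, R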
  • a decoder corresponding to the procedure described above has to provide the means to perform the transformation of basis images already resident in memory according to the matrix M(k), or
  • the decoder must provide the means to perform a warp function on the basis images stored in memory.
  • the method described for compression according to the present invention, in any of its embodiments as described herein, can be implemented on a standard computer by a corresponding computer program, as will be readily apparent to an expert. It may also be implemented by a special purpose chip, such as an ASIC, or by a Digital Signal Processor (DSP) performing the method of the present invention.
  • a corresponding method for decoding may, for example, comprise decompressing, rebuilding motion fields from subspace models, warping images and adding image residuals in order to reconstruct the sequence of video images.
  • the corresponding apparatus may, for example, comprise means to perform the aforementioned steps.
  • the method may be used to reduce the amount of bits necessary for storing the video sequence on a storage medium, for transmitting it via radio broadcasting, satellite links, optical fibers, or via any communications link, e.g. also through the internet or any computer network.
  • Encoding and/or decoding may be performed either on data written to and/or read from a storage medium, or on data transmitted and/or received over any communications link; in the latter case, the decoding may also be performed in real time.
  • the method may also be performed by a device connected to a TV set or a satellite receiver; it may then be performed in combination with any methods assuring security against unauthorised use, such as encryption/decryption methods.
  • the program code which enables a computer to perform the method of the invention may be stored on a floppy disk, in a memory of the computer, or it may be delivered to the computer via any transmission medium, e.g. an internet link.
  • a method for encoding a sequence of video images, each video image comprising pixels, comprising the steps of: representing the sequence by at least one set of consecutive images, each set of consecutive images comprising a plurality of video images; representing each set of consecutive images by one reference image and a subspace model of motion, the reference image and the subspace model of motion together comprising an
  • IDLE model, wherein the subspace model of motion comprises a plurality of temporal coefficient vectors and a plurality of basis images, so that each element of each temporal coefficient vector corresponds to one video image and so that each element of each basis image contributes to the motion for a pixel, the combination of one temporal coefficient vector and one basis image together being called a factor; and representing the sequence in encoded form by the collection of IDLE models.
  • step (7) transmitting and/or storing the reference image and the subspace model of motion, wherein the results transmitted and/or stored in step (7) together comprise the representation of the sequence of video images, which can be used to calculate predictions by warping the pertaining reference image according to the pertaining rebuilt motion field from the pertaining subspace model of motion.
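  • For illustration, a Python sketch of the decoder-side prediction implied by this representation follows, reusing the warp() stand-in from the earlier sketch; the split into vertical and horizontal coefficient/basis pairs is an assumed layout.

    import numpy as np

    def predict_sci(ref, coeffs_y, coeffs_x, basis_y, basis_x):
        # For each image n, rebuild the motion field as the coefficient-
        # weighted sum of basis images, then warp the reference image.
        predictions = []
        for n in range(coeffs_y.shape[1]):
            dy = np.tensordot(coeffs_y[:, n], basis_y, axes=(0, 0))
            dx = np.tensordot(coeffs_x[:, n], basis_x, axes=(0, 0))
            predictions.append(warp(ref, dy, dx))
        return predictions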
  • the step of selecting a reference image comprises a method for adaptively selecting the reference image which comprises the following steps: selecting the first image as reference image, and/or selecting an image with a given distance from the first image as reference image, and/or selecting an image as the reference image whose histogram has the highest similarity with an average or median histogram of a given range of images; and/or said method for adaptively selecting comprises the following steps:
  • in step (3) of claim 2, the choice of the last image to be represented by the IDLE model is made dependent on one of, or a combination of, the following:
  • method (4) comprises the steps:
  • a norm of its basis image is larger than a threshold.
  • a norm of the product of its temporal coefficient vector and its basis image, called the norm of the factor, is larger than a threshold.
  • a norm of the factor is larger than the value at the knee point in the sequence of norm values of all factors of the subspace model, the norm values being sorted to achieve a monotonic order.
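  • The factor-selection criteria can be sketched in Python; the Frobenius norm of the factor (the outer product of c_f and B_f) equals the product of the two vector norms, and the knee point is located here by the largest discrete second difference of the sorted norms, which is one heuristic among several.

    import numpy as np

    def keep_factors(coeffs, basis, threshold=None):
        # coeffs: (F, N) temporal coefficient vectors; basis: (F, P) basis images.
        norms = np.linalg.norm(coeffs, axis=1) * np.linalg.norm(basis, axis=1)
        sorted_norms = np.sort(norms)[::-1]          # monotonically decreasing
        if threshold is None and len(sorted_norms) >= 3:
            # Knee point: maximum discrete curvature of the sorted sequence.
            curv = sorted_norms[:-2] - 2 * sorted_norms[1:-1] + sorted_norms[2:]
            threshold = sorted_norms[1 + int(np.argmax(curv))]
        elif threshold is None:
            threshold = 0.0
        return [f for f in range(len(norms)) if norms[f] >= threshold]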
  • the shape field is warped according to the subspace model of motion and the warped shape field is used to indicate the valid pixels in the prediction.
  • the extending of the IDLE model comprises an extension of the reference image by warping pixels from the original images corresponding to uncovered pixels of the predicted images into the position of the reference image, and extrapolating the subspace model of motion correspondingly if given in the reference position.
  • the extending of the IDLE model in the case of forward motion compensation for each image to be covered comprises the steps:
  • step (7) including the backward warped parts of step (6) as new parts of the enlarged reference image, thereby producing a new reference image.
  • step (5) including the forward warped parts of step (4) as new parts of the enlarged reference image, thereby producing a new reference image.
  • step (1) the motion field is calculated from the subspace model of motion.
  • the blend field is a bi-level field, one level indicating that the prediction from the first IDLE model is used, the other level indicating that the prediction of the second IDLE model is used.
  • the blend field value for each pixel or block of pixels of the B-image is a real number α between zero and one, defining each pixel or block of pixels of the B-image as a convex combination of the corresponding pixels or blocks of pixels of the predictions from the first and the second IDLE model, where the first is weighted by 1 − α and the second is weighted by α.
  • the cost function is a weighted sum of the prediction fidelity or quality of the resulting B-image and the roughness of the blend field.
  • a subspace model is built for the blend fields corresponding to the B-images within the overlap of the sets of consecutive images of the two IDLE models, the subspace model of blend fields comprising a plurality of temporal coefficient vectors and a plurality of basis images, so that each element of each temporal coefficient vector corresponds to one blend field, so that each element of each basis image contributes to the blend field for a pixel, the combination of one temporal coefficient vector and one basis image together being called a factor.
  • the IDLE model also comprises a subspace model of intensity changes, the subspace model of intensity changes comprising a plurality of temporal coefficient vectors and a plurality of basis images, so that each element of each temporal coefficient vector corresponds to one video image and so that each element of each basis image contributes to the intensity change for a pixel, the combination of one temporal coefficient vector and one basis image together being called a factor.

Abstract

The invention relates to a method for encoding a sequence of video images, each image comprising pixels. The method consists in representing the sequence by at least one set of consecutive images, each set comprising a plurality of video images; representing each set of consecutive images by a reference image and a subspace model of motion, the reference image and said model together comprising an IDLE model, the subspace model of motion comprising a plurality of temporal coefficient vectors and a plurality of basis images, such that each element of each temporal coefficient vector corresponds to one video image and each element of each basis image contributes to the motion for a pixel, the combination of one temporal coefficient vector and one basis image being called a factor; and finally, representing the sequence in encoded form by the collection of IDLE models.
PCT/EP1998/002950 1997-07-28 1998-05-20 Procede et dispositif permettant de comprimer des sequences video WO1999007157A1 (fr)

Priority Applications (2)

Application Number Priority Date Filing Date Title
AU80188/98A AU8018898A (en) 1997-07-28 1998-05-20 Method and apparatus for compressing video sequences
EP98928295A EP0998826A1 (fr) 1997-07-28 1998-05-20 Procede et dispositif permettant de comprimer des sequences video

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
EP97112951.5 1997-07-28
EP97112951 1997-07-28

Publications (1)

Publication Number Publication Date
WO1999007157A1 true WO1999007157A1 (fr) 1999-02-11

Family

ID=8227129

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP1998/002950 WO1999007157A1 (fr) 1997-07-28 1998-05-20 Procede et dispositif permettant de comprimer des sequences video

Country Status (4)

Country Link
EP (1) EP0998826A1 (fr)
AU (1) AU8018898A (fr)
WO (1) WO1999007157A1 (fr)
ZA (1) ZA986661B (fr)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2003017649A1 (fr) * 2001-08-20 2003-02-27 Koninklijke Philips Electronics N.V. Extension de la taille d'image
EP2009921A3 (fr) * 2007-06-26 2011-06-22 Mitsubishi Electric Corporation Procédé et système pour mappage de ton inversé d'une image, et codec pour effectuer ce mappage de ton inversé d'une image
WO2012133469A1 (fr) * 2011-03-31 2012-10-04 Mitsubishi Electric Corporation Procédé de compression d'images texturées

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5319468A (en) * 1991-04-18 1994-06-07 Matsushita Electric Industrial Co., Ltd. Image signal recording and reproducing system with use of bandwidth compression coding
WO1995008240A2 (fr) * 1993-09-08 1995-03-23 Idt, Inc. Procede et appareil d'analyse de donnees
WO1995020863A1 (fr) * 1994-01-26 1995-08-03 British Technology Group Usa Inc. Procede et appareil pour la compression de donnees video, utilisant une technique d'interpolation de mouvement adaptative dans le temps
WO1995034172A2 (fr) * 1994-06-03 1995-12-14 Idt International Digital Technologies Deutschland Gmbh Appareil et procede de decodage d'images video
EP0705041A2 (fr) * 1994-09-29 1996-04-03 Sony Corporation Codeur vidéo adaptif aux changements de scènes
WO1996029679A1 (fr) * 1995-03-22 1996-09-26 Idt International Digital Technologies Deutschland Gmbh Procede et dispositif de determination de mouvement, coordonnee entre plusieurs trames
WO1997016025A1 (fr) * 1995-10-20 1997-05-01 Nokia Mobile Phones Ltd. Codage de champ vectoriel de deplacement

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5319468A (en) * 1991-04-18 1994-06-07 Matsushita Electric Industrial Co., Ltd. Image signal recording and reproducing system with use of bandwidth compression coding
WO1995008240A2 (fr) * 1993-09-08 1995-03-23 Idt, Inc. Procede et appareil d'analyse de donnees
WO1995020863A1 (fr) * 1994-01-26 1995-08-03 British Technology Group Usa Inc. Procede et appareil pour la compression de donnees video, utilisant une technique d'interpolation de mouvement adaptative dans le temps
WO1995034172A2 (fr) * 1994-06-03 1995-12-14 Idt International Digital Technologies Deutschland Gmbh Appareil et procede de decodage d'images video
EP0705041A2 (fr) * 1994-09-29 1996-04-03 Sony Corporation Codeur vidéo adaptif aux changements de scènes
WO1996029679A1 (fr) * 1995-03-22 1996-09-26 Idt International Digital Technologies Deutschland Gmbh Procede et dispositif de determination de mouvement, coordonnee entre plusieurs trames
WO1997016025A1 (fr) * 1995-10-20 1997-05-01 Nokia Mobile Phones Ltd. Codage de champ vectoriel de deplacement

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
KIRBY M ET AL: "A MODEL PROBLEM IN THE REPRESENTATION OF DIGITAL IMAGE SEQUENCES", PATTERN RECOGNITION, vol. 26, no. 1, 1 January 1993 (1993-01-01), pages 63 - 73, XP000355494 *
VERMEULEN P J E ET AL: "KARHUNEN-LOEVE TECHNIQUES FOR OPTIMAL PROCESSING OF TIME -SEQUENTIAL IMAGERY", OPTICAL ENGINEERING, vol. 30, no. 4, 1 April 1991 (1991-04-01), pages 415 - 423, XP000202133 *
WU H -J ET AL: "TRACKING SUBSPACE REPRESENTATIONS OF FACE IMAGES", PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AN SIGNAL PROCESSING (ICASSP), I. IMAGE AND MULTIDIMENSIONAL SIGNAL PROCESSING ADELAIDE, APR. 19 - 22, 1994, vol. 5, no. CONF. 19, 19 April 1994 (1994-04-19), INSTITUTE OF ELECTRICAL AND ELECTRONICS ENGINEERS, pages V-389 - V-392, XP000533746 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2003017649A1 (fr) * 2001-08-20 2003-02-27 Koninklijke Philips Electronics N.V. Extension de la taille d'image
EP2009921A3 (fr) * 2007-06-26 2011-06-22 Mitsubishi Electric Corporation Procédé et système pour mappage de ton inversé d'une image, et codec pour effectuer ce mappage de ton inversé d'une image
WO2012133469A1 (fr) * 2011-03-31 2012-10-04 Mitsubishi Electric Corporation Procédé de compression d'images texturées
US20120251013A1 (en) * 2011-03-31 2012-10-04 Fatih Porikli Method for Compressing Textured Images
US8433148B2 (en) * 2011-03-31 2013-04-30 Mitsubishi Electric Research Laboratories, Inc. Method for compressing textured images

Also Published As

Publication number Publication date
EP0998826A1 (fr) 2000-05-10
ZA986661B (en) 1999-01-27
AU8018898A (en) 1999-02-22

Similar Documents

Publication Publication Date Title
US4849810A (en) Hierarchial encoding method and apparatus for efficiently communicating image sequences
EP0894404B1 (fr) Codeur et decodeur de donnees video
US8086052B2 (en) Hybrid video compression method
US8023562B2 (en) Real-time video coding/decoding
EP1138152B8 (fr) Procede et appareil effectuant l'estimation du mouvement hierarchique a l'aide d'une pyramide non lineaire
US6252905B1 (en) Real-time evaluation of compressed picture quality within a digital video encoder
US6895051B2 (en) Video data encoder and decoder
WO2001015457A2 (fr) Codage d'image
Barthel et al. Three-dimensional fractal video coding
KR100238889B1 (ko) 형태 부호화를 위한 보더 화소 예측 장치 및 방법
EP1505837B1 (fr) Appareil et Méthode pour la compression d'images numériques
EP0978200A2 (fr) Procede et dispositif de codage d'une sequence d'images
WO1999007157A1 (fr) Procede et dispositif permettant de comprimer des sequences video
Segall et al. Super-resolution from compressed video
Ho et al. A pyramidal image coder with contour-based interpolative vector quantization
Eryurtlu et al. Very low-bit-rate segmentation-based video coding using contour and texture prediction
KR100238888B1 (ko) 적응적인 형태 부호화 장치 및 그 부호화 방법
Burgett et al. Multiresolution multiplicative autoregressive coding of images
CA1323093C (fr) Methode et appareil de transmission de sequences d'images a filtrage adaptatif
Cordell et al. An interpolative spatial domain technique for coding image sequences
Yao et al. Hybrid Fractal Video Coding with Neighbourhood Vector Quantisation.
Hijazi LOCMIC: LOw Complexity Multi-resolution Image Compression
De Natale et al. A mesh‐interpolation scheme for very‐low bitrate coding of video sequences
Khalil et al. DCT of spatially adaptive subsampled interframes for image sequence coding
Mathur et al. A low bit rate video coder using content-based automatic segmentation and shape-adaptive DCT

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AL AM AT AU AZ BA BB BG BR BY CA CH CN CU CZ DE DK EE ES FI GB GE GH GM GW HU ID IL IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT UA UG US UZ VN YU ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GH GM KE LS MW SD SZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN ML MR NE SN TD TG

DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
121 Ep: the epo has been informed by wipo that ep was designated in this application
NENP Non-entry into the national phase

Ref country code: KR

WWE Wipo information: entry into national phase

Ref document number: 1998928295

Country of ref document: EP

WWE Wipo information: entry into national phase

Ref document number: 09463795

Country of ref document: US

WWP Wipo information: published in national office

Ref document number: 1998928295

Country of ref document: EP

REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

NENP Non-entry into the national phase

Ref country code: CA

WWW Wipo information: withdrawn in national office

Ref document number: 1998928295

Country of ref document: EP