WO2009130540A1 - High-definition video encoding/decoding method suitable for real-time video streaming

Publication number
WO2009130540A1
Authority
WO
WIPO (PCT)
Prior art keywords
frame
pixel
pixels
value
compressed
Prior art date
Application number
PCT/IB2008/051565
Other languages
English (en)
Inventor
Victor Stepanov
Original Assignee
Maxtu S.A.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Maxtu S.A.
Priority to PCT/IB2008/051565
Publication of WO2009130540A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/59 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving spatial sub-sampling or interpolation, e.g. alteration of picture size or resolution
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/136 Incoming video signal characteristics or properties
    • H04N19/14 Coding unit complexity, e.g. amount of activity or edge presence estimation
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/189 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding
    • H04N19/196 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding being specially adapted for the computation of encoding parameters, e.g. by averaging previously computed encoding parameters
    • H04N19/30 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability
    • H04N19/33 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability in the spatial domain

Definitions

  • the present invention relates to the domain of digital audio/video transmission or storage and in particular to the compression and decompression of an audio/video signal in order to minimise the amount of data to be transmitted or stored while allowing for the accurate reconstruction of the audio/video information contained within the transmitted or stored signal.
  • Spatial and temporal redundancy results from the fact that pixel values in an image are not independent: they are correlated with their neighbours, both in the spatial domain, i.e. from pixel to pixel within the same frame, and in the temporal domain, i.e. for the same equivalent pixel from one frame to the next. This means that, to some extent, the value of a pixel may be predicted from the values of its neighbouring pixels.
  • the analysis and subsequent treatment of video signals in this way can result in a type of encoding known as variable length encoding. This occurs when the quantization of the signal is such that regularly occurring events are mapped to short codes while more rare events are mapped to longer codes.
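  • By way of illustration only, variable length encoding can be sketched with a Huffman code, one well-known way of mapping frequent events to short codes and rare events to longer ones. The function name `huffman_code` and the sample data are assumptions made for this sketch; they do not come from the original text.

```python
import heapq
from collections import Counter


def huffman_code(symbols):
    """Return a dict mapping each symbol to a prefix-free bit string.

    Frequent symbols receive short codes, rare symbols longer ones.
    """
    counts = Counter(symbols)
    if not counts:
        return {}
    if len(counts) == 1:
        # A single distinct symbol still needs one bit per occurrence.
        return {next(iter(counts)): "0"}

    # Heap entries: (total count, unique tiebreaker, {symbol: code so far}).
    # The tiebreaker keeps the dicts from ever being compared.
    heap = [(n, i, {s: ""}) for i, (s, n) in enumerate(counts.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)

    while len(heap) > 1:
        n1, _, codes1 = heapq.heappop(heap)  # least frequent subtree
        n2, _, codes2 = heapq.heappop(heap)  # next least frequent subtree
        merged = {s: "0" + c for s, c in codes1.items()}
        merged.update({s: "1" + c for s, c in codes2.items()})
        heapq.heappush(heap, (n1 + n2, tiebreak, merged))
        tiebreak += 1

    return heap[0][2]


if __name__ == "__main__":
    # Hypothetical event stream: symbol 0 is by far the most common.
    events = [0] * 12 + [1] * 3 + [2]
    code = huffman_code(events)
    total_bits = sum(len(code[s]) for s in events)
    print(code, total_bits)
```

  • In this hypothetical stream the most frequent symbol receives a one-bit code, so the sixteen events occupy fewer bits than a fixed two-bit code would require.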
  • video compression techniques involve the analysis of an input video stream with a view to discarding information which is deemed indiscernible to the human brain. This usually results in some form of filtering whereby small details, which are considered indiscernible to the human eye, are discarded or whereby the available colour palette is reduced where small changes in tone are not easily perceived.
  • DCT discrete cosine transform
  • VQ vector quantization
  • FC fractal compression
  • DWT discrete wavelet transform
  • JPEG Joint Photographic Experts Group
  • JPEG and MPEG, from the International Organization for Standardization and the International Electrotechnical Commission's Joint Photographic Experts Group and Moving Picture Experts Group, respectively
  • H.261 and H.263, both from the United Nations' International Telecommunication Union
  • Vector Quantization techniques look at an array of data rather than individual values, thus forming a general view of the subject. Redundant data is then rejected while at the same time retaining the desired object or the data stream's original intent. Fractal compression is a form of vector quantization. Compression is achieved by locating repeating sections of an image, then using a fractal algorithm to generate those sections from the original.
  • the aim of the present invention is to overcome currently perceived limitations in transmission bandwidth as well as in the speed of current encoding algorithms by optimising a technique for encoding and decoding a video signal in order to allow for the delivery of high definition digital video of a very high quality and at a rate suitable for real-time streaming applications.
  • the present invention relates to a method for encoding a digital data video file comprising a plurality of frames, said method yielding a plurality of compressed data files and comprising the following steps: - downsampling a first frame to obtain a first downsized frame having a lower resolution than the first frame;
  • FIG.1 shows a block diagram representing the encoding procedure used in one embodiment of the present invention
  • FIG.2 shows a block diagram of the decoding procedure used in one embodiment of the present invention
  • FIG.3 shows a test frame and an example of what is described in the present invention as a contour frame or outline frame extracted from said test frame;
  • FIG.4 shows a chart which summarises the comparison of colour intensities (in this case grayscale) in corresponding pixels between two frames;
  • FIG.5 shows an alternative representation of the comparison of colour intensities in corresponding pixels between two frames
  • FIG.6 illustrates an image processing technique which may be used in an embodiment of the present invention as part of the compression process
  • FIG.7 shows a block diagram of the encoding and decoding processes in an alternative embodiment of the current invention
  • FIG.8 shows a block diagram of the encoding and decoding processes in a further embodiment of the current invention.
  • the present invention makes use of a technique referred to as contour compression, which involves the accentuation of contrast regions in an image.
  • contour compression involves the accentuation of contrast regions in an image. This technique allows for the extraction of the most important details of an image, while neglecting the parts of the image which do not contribute additional information affecting the quality of the image with respect to the requirements of the human eye in assessing said image.
  • the encoding procedure is carried out on a frame-by-frame basis on the input video file, which is generally in some uncompressed format such as RAW or BMP. See FIG.1.
  • the encoding technique involves the generation of a first compressed frame (CF1) and a second compressed frame (CF2), both of which are either stored or transmitted to a receiver (RX) for subsequent decoding.
  • HDVID input video file
  • DN downsampled
  • a picture whose x and y dimensions are reduced by a factor of two is referred to either as a 50% downsize or a 4x downsize, since there are four times fewer pixels in the resulting picture.
  • the method used to achieve a 50% downsize could be simply the suppression of every second pixel in the horizontal direction and in the vertical direction.
  • An alternative method could involve the grouping of pixels into groups of n pixels, calculating an average of the pixel intensities in each group and then replacing the n pixels by n/4 pixels, whose intensity is equal to the calculated average.
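  • The two downsizing methods described above may be sketched as follows. This is a minimal illustration, assuming a grayscale frame stored as a list of rows of integers with even dimensions, groups of n = 4 pixels arranged as 2x2 blocks, and integer (floor) averaging; the function names are hypothetical.

```python
def downsize_by_decimation(frame):
    """50% downsize: keep every second pixel horizontally and vertically."""
    return [row[::2] for row in frame[::2]]


def downsize_by_averaging(frame):
    """50% downsize: replace each 2x2 group (n = 4 pixels) by a single
    pixel (n/4 = 1) whose intensity is the mean of the group.

    Assumes even width and height; integer division is an assumption of
    this sketch, not something the source specifies.
    """
    out = []
    for y in range(0, len(frame) - 1, 2):
        row = []
        for x in range(0, len(frame[0]) - 1, 2):
            total = (frame[y][x] + frame[y][x + 1]
                     + frame[y + 1][x] + frame[y + 1][x + 1])
            row.append(total // 4)
        out.append(row)
    return out


if __name__ == "__main__":
    frame = [
        [10, 20, 30, 40],
        [50, 60, 70, 80],
        [90, 100, 110, 120],
        [130, 140, 150, 160],
    ]
    print(downsize_by_decimation(frame))  # 2x2 result: a quarter of the pixels
    print(downsize_by_averaging(frame))
```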
  • the purpose of this downsizing is two-fold: to reduce the amount of data to be transmitted and to accentuate the regions of contrast in the picture during the generation of a so-called outline frame or contour frame.
  • the downsized frame is compressed (CMP1) using a first compression scheme, which may be a standard scheme such as JPEG, MPEG, H.261 or H.263, any other such lossy compression scheme, or any lossless compression scheme, thus giving a first compressed frame (CF1), which is either stored or transmitted to the receiver (RX) for subsequent decoding.
  • the second part of the encoding procedure requires that the first compressed frame (CF1) be decompressed (DECMP1) using the same scheme that was used in the compression.
  • the resulting decompressed frame, i.e. the second downsized frame, is then resized or upconverted (UP) by the same factor that was previously used in the downsizing process.
  • This upconvert step produces the second frame. Upconverting can be achieved by interpolation or resampling. Any of the standard techniques can be used, such as nearest-neighbour or piecewise-constant interpolation or any of linear, bilinear, bicubic, polynomial, spline or fractal interpolation.
  • a process of smoothing or averaging is carried out to smooth out boundaries between blocks present in the second frame.
  • the averaging is done over 3x3 pixels or 5x5 pixels in order to overlap block boundaries present in the upsized version, since said block boundaries, which are created as a result of the interpolation procedure, normally occur at even numbers of pixels.
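  • The upconvert and smoothing steps may be illustrated as follows. The sketch assumes nearest-neighbour interpolation (one of the options listed above) and a simple box average; at the frame borders it averages only the pixels that exist, which is an assumption of this sketch. The function names are hypothetical.

```python
def upconvert_nearest(frame, factor=2):
    """Resize by repeating each pixel `factor` times in both directions."""
    out = []
    for row in frame:
        wide = [p for p in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def smooth_box(frame, size=3):
    """Average each pixel over a size x size window (3x3 or 5x5).

    An odd window always straddles the block boundaries, which fall at
    even pixel positions after a 2x upconvert.
    """
    h, w = len(frame), len(frame[0])
    r = size // 2
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [frame[j][i]
                      for j in range(max(0, y - r), min(h, y + r + 1))
                      for i in range(max(0, x - r), min(w, x + r + 1))]
            row.append(sum(window) // len(window))
        out.append(row)
    return out


if __name__ == "__main__":
    small = [[0, 100]]
    blocky = upconvert_nearest(small)  # hard step between the two blocks
    print(blocky)
    print(smooth_box(blocky))          # the step becomes a gradual ramp
```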
  • the smoothed frame resulting from the smoothing process is compared (DIFF) to the first frame.
  • the result of the comparison is an outline or contour frame (CONT).
  • the outline frame is then compressed using a second compression scheme (CMP2), which may or may not be different from the first compression scheme, to give a second compressed frame (CF2).
  • CMP2 second compression scheme
  • the second compressed frame is either stored or transmitted to the receiver to be used with the first compressed frame in the decoding procedure.
  • FIG.3 illustrates a sample frame (ORIG) and one possible outline frame or contour frame (OUTLINE) associated with the sample frame.
  • ORIG sample frame
  • OUTLINE outline frame or contour frame
  • the pixel attributes comprise, for each colour contributing to the overall tone of the pixel, a digital value representing the intensity of that colour.
  • the value of each pixel's tone in the original frame (first frame) is compared to the value of the corresponding pixel's tone in the smoothed frame. If the difference is less than a given threshold value, then a default value is retained as the result of the comparison. In practice the default value is usually 0. Conversely, if the difference is greater than the given threshold, then the difference value is retained as the result of the comparison.
  • the comparison could be made between blocks of pixels.
  • the encoding technique used in the present invention ensures that the largest difference between the two frames being compared will occur at regions where the contrast is the highest i.e. at regions of abrupt changes in tone. This is due to the effects resulting from the resizing, compression, decompression and smoothing processes. Therefore the resulting outline frame will only contain information relative to areas of the original frame which have high contrast or significant changes in tone.
  • the level of sensitivity to tone changes is selected by choosing an appropriate threshold: the higher the threshold, the less detail remains in the outline frame; the lower the threshold, the more detail appears in the outline frame.
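  • The thresholded comparison that yields the outline frame may be sketched as below. The sketch assumes grayscale frames stored as lists of rows, keeps the signed difference (original minus smoothed), and treats a difference exactly equal to the threshold as below it; the source does not specify either choice. The function name is hypothetical.

```python
def contour_frame(original, smoothed, threshold, default=0):
    """Compare two frames pixel by pixel.

    Differences whose magnitude exceeds `threshold` are retained;
    all other positions receive `default` (usually 0).
    """
    return [
        [(o - s) if abs(o - s) > threshold else default
         for o, s in zip(orig_row, smooth_row)]
        for orig_row, smooth_row in zip(original, smoothed)
    ]


if __name__ == "__main__":
    original = [[100, 50, 200]]
    smoothed = [[98, 80, 200]]
    # A higher threshold leaves less detail in the outline frame.
    print(contour_frame(original, smoothed, threshold=5))
    # A lower threshold lets more detail through.
    print(contour_frame(original, smoothed, threshold=1))
```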
  • FIG.4 illustrates all the possible differences between two frames whose grayscale tone is encoded onto eight bits.
  • a diagonal line from the bottom left to the top right of the chart would represent a threshold of zero and indicates the zone where both frames are exactly equal. As the line is broadened, so the threshold is increased to encompass ever-increasing differences between the two frames.
  • FIG.5 shows another representation of the threshold concept. If the threshold were c, then the range of differences subtended by the area c would all be ignored. Areas 1a/1b, 2a/2b, 3a/3b and 4a/4b represent different compression ratios. For example, if the chosen threshold covers areas 2, 3, 4 and c, then only areas of very high contrast would be stored in the outline frame, whereas if the chosen threshold covers areas c and 4, then areas of lesser contrast would also be stored in the outline frame.
  • the threshold can be dynamically modified from frame to frame. Pre-processing of the image is carried out in order to determine the distribution of contrast present in the original image. In this way a tradeoff can be reached on a frame-to-frame basis whereby the threshold is chosen such that only the necessary minimum amount of data is kept. Another type of pre-processing can also be carried out which takes the rate of change of the position of objects from one frame to the next into consideration. In this manner the threshold is modified depending on the speed of a moving object in the video. For a portion of video with fast-moving objects, the threshold could be set high, since the contrast from frame to frame at the regions of interest will be large. Conversely, for video with slow-moving subjects, the threshold could be lowered. This dynamic modification of the threshold allows the efficiency of the encoding to be optimized.
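  • The source does not state how the per-frame threshold is derived from the contrast distribution. Purely as one possible heuristic, and not as the method of the invention, the threshold could be taken from the sorted distribution of absolute differences so that at most a chosen percentage of pixels survives in the outline frame. The function name and the `keep_percent` parameter are assumptions of this sketch.

```python
def pick_threshold(original, smoothed, keep_percent=10):
    """Choose a threshold such that at most `keep_percent` percent of the
    pixels have an absolute difference strictly greater than it.

    Hypothetical heuristic: the source only says that the threshold is
    chosen from the distribution of contrast in the frame.
    """
    diffs = sorted(
        abs(o - s)
        for orig_row, smooth_row in zip(original, smoothed)
        for o, s in zip(orig_row, smooth_row)
    )
    if not diffs:
        return 0
    # Index of the first difference allowed to survive, using integer
    # arithmetic to avoid floating-point rounding.
    cut = len(diffs) * (100 - keep_percent) // 100
    cut = min(max(cut, 0), len(diffs) - 1)
    return diffs[cut]


if __name__ == "__main__":
    original = [[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]]
    smoothed = [[0] * 10]
    print(pick_threshold(original, smoothed, keep_percent=20))
```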
  • the resulting outline frame from this procedure allows for good dynamic image definition and represents a high compression rate due to the presence of many zeros in the resulting file.
  • the above discussion covers frame-by-frame compression.
  • Inter-frame compression techniques may also be used. Inter-frame compression can be readily realized based on the pre-processing of a selection of stored outline frames. In such a scheme, every fourth frame, for example, would be saved (or every fifth, sixth, etc.) and the outline frame extracted. The saved outline frames are used as a basis for calculating the missing frames.
  • the compression processes mentioned above can be of type JPEG, MPEG, H.261 or H.263 or any other such standard compression scheme or some proprietary scheme.
  • the present invention also makes use of another scheme, which will be described using the following example.
  • the example applies to a frame comprising a grayscale image, but it can be extrapolated to apply to full colour images.
  • Consider an image comprising an array of pixels.
  • Each pixel has a tone attribute associated with it, said tone attribute being defined by a digital value of n-bits. See FIG.6.
  • a copy (A1) of an original image (O) is made, wherein only the most significant bit of each pixel is retained while all remaining n-1 bits are set to zero.
  • the image A1 then has a maximum of only two tones.
  • The regions where changes in tone occur are detected and stored in an outline frame (C1).
  • C1 is compressed using a standard lossless compression scheme (e.g. Huffman coding) and stored.
  • the resulting compression ratio is very high due to the presence of many consecutive zeros.
  • a further copy (A2) of the original image is made wherein the two most significant bits are retained while all remaining bits are set to zero. A2 therefore has a maximum of four tones.
  • An outline frame (C2) is generated.
  • An exclusive-OR between C1 and C2 is done, thus keeping only the data which is different between the two frames.
  • the resulting key frame (K2) is compressed (e.g. Huffman) and stored.
  • the process continues, using three most significant bits to generate an outline frame C3, which is XOR-ed with C2 to give key frame K3, which is compressed (e.g. Huffman) and stored and so on until all n bits have been taken into account.
  • the final compression file to be transmitted comprises all of the compressed key frames, Kn, and the first compressed outline frame C1. This effectively gives a lossless compression.
  • the final compression file can contain just C1, K2 and K4 for a good quality image to be recovered.
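  • The copies A1, A2, ... used in this scheme may be sketched as a bit mask that keeps only the k most significant bits of each n-bit pixel. Only the masking step is shown; the detection of tone changes and the exclusive-OR of outline frames are not illustrated because the source does not detail them. The function name is hypothetical.

```python
def retain_msbs(frame, k, n_bits=8):
    """Return a copy of `frame` in which only the k most significant bits
    of each n-bit pixel are kept and the remaining n-k bits are zeroed.
    """
    mask = ((1 << k) - 1) << (n_bits - k)
    return [[pixel & mask for pixel in row] for row in frame]


if __name__ == "__main__":
    original = [[0, 127, 128, 255]]
    a1 = retain_msbs(original, 1)  # at most two tones
    a2 = retain_msbs(original, 2)  # at most four tones
    print(a1, a2)
```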
  • a practical method for achieving the result described above consists in treating the original image bit by bit, rather than pixel by pixel.
  • Key files for each bit of an n-bit pixel can easily be extracted by copying each bit of the first n-bit pixel in the original frame to n corresponding key files, K1 to Kn, then inspecting each bit of the next pixel in the original frame and appending a 0 or a 1 in the bit's corresponding key file depending on the following condition: append a 0 whenever the next bit is the same as the previous bit; append a 1 whenever the next bit is different from the previous bit.
  • n key files are then compressed using a standard lossless compression scheme such as Huffman coding, thus creating a set of very compact compressed files due to the presence of many repeated zeros in each of the key files.
  • When all n key files are used, lossless compression can be realized.
  • For image compression, lossy compression is usually sufficient, wherein only a subset of the totality of the compressed (e.g. Huffman) key files is used.
  • Using the compressed key files, rebuilding the image is easily realized by generating a first pixel using the first bit of each of the key files (LSB from the first key file, MSB from the nth key file), thereby defining the colour or tone of the first pixel.
  • the rest of the pixels in the frame are generated by analyzing each of the key files. For example, the MSB bit of the next pixel is determined by inspecting the next value in the Kn key file in order to determine whether the MSB of the next pixel should have the same value as the MSB in the preceding pixel or the opposite value.
  • the K1 key file is used to determine the LSB values of the consecutive pixels and so on.
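  • The bit-by-bit key file method and the corresponding rebuilding of the image may be sketched as follows. The sketch assumes the frame is supplied as a flat list of n-bit pixels in scan order and omits the final lossless (e.g. Huffman) compression of each key file; function names are hypothetical. Index 0 of the returned list corresponds to K1 (LSB) and index n-1 to Kn (MSB).

```python
def encode_key_files(pixels, n_bits=8):
    """Build n key files from a sequence of n-bit pixels.

    The first entry of each key file is the bit itself; every later
    entry is 0 if the bit equals the same bit of the previous pixel
    and 1 if it differs.
    """
    keys = [[] for _ in range(n_bits)]
    prev = 0  # starting from 0 makes the first entries equal the raw bits
    for p in pixels:
        changed = p ^ prev
        for b in range(n_bits):
            keys[b].append((changed >> b) & 1)
        prev = p
    return keys


def decode_key_files(keys):
    """Rebuild the pixel sequence from the key files."""
    pixels = []
    prev = 0
    for entries in zip(*keys):
        changed = sum(bit << b for b, bit in enumerate(entries))
        prev ^= changed  # flip exactly the bits marked as changed
        pixels.append(prev)
    return pixels


if __name__ == "__main__":
    pixels = [200, 200, 201, 73, 73, 73, 255, 0]
    keys = encode_key_files(pixels)
    assert decode_key_files(keys) == pixels  # all key files: lossless

    # Lossy variant: discard K1 (the LSB key file) by zeroing it.
    lossy = [[0] * len(pixels)] + keys[1:]
    print(decode_key_files(lossy))
```

  • Because each bit plane is coded independently, discarding a key file affects only the corresponding bit of the rebuilt pixels.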
  • FIG.2 illustrates the decoding procedure using the first and second compressed frames (CF1, CF2) which were generated during the encoding procedure.
  • the first compressed frame (CF1) is decompressed (DECMP1) using the same compression scheme as was used to create said frame and then resized or upconverted (UP) by the same factor that was used in the downsizing step of the encoding procedure, resulting in a resized frame.
  • the resized frame is averaged (AVE), thus smoothing out any blocks which may be present. As in the averaging process carried out during the encoding procedure, the averaging is done over 3x3 pixels or 5x5 pixels.
  • the second compressed frame (CF2) is also decompressed to give a second outline frame (CONT). The second outline frame and the averaged frame are combined to give a decompressed final frame (FIN) representative of the original frame.
  • CONT second outline frame
  • FIN decompressed final frame
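  • The encoding and decoding procedures of FIG.1 and FIG.2 may be illustrated end to end as follows. This is a sketch under stated assumptions: 8-bit grayscale frames, downsizing by decimation, nearest-neighbour upconversion, 3x3 averaging, identity functions standing in for the compression schemes CMP1/CMP2, and recombination by adding the outline frame to the averaged frame (the source says only that the two are "combined"). All function names are hypothetical, and the helpers are defined here so that the block is self-contained.

```python
def downsize(frame):
    """DN: 50% downsize by keeping every second pixel."""
    return [row[::2] for row in frame[::2]]


def upconvert(frame, height, width):
    """UP: nearest-neighbour resize back to height x width."""
    return [[frame[y // 2][x // 2] for x in range(width)]
            for y in range(height)]


def smooth(frame, size=3):
    """AVE: box average over a size x size window (edges use what exists)."""
    h, w = len(frame), len(frame[0])
    r = size // 2
    return [
        [
            sum(win) // len(win)
            for x in range(w)
            for win in [[frame[j][i]
                         for j in range(max(0, y - r), min(h, y + r + 1))
                         for i in range(max(0, x - r), min(w, x + r + 1))]]
        ]
        for y in range(h)
    ]


def encode_frame(first_frame, threshold):
    """Return (CF1, CF2); compression is an identity stand-in here."""
    h, w = len(first_frame), len(first_frame[0])
    cf1 = downsize(first_frame)                 # DN, then CMP1 (stand-in)
    smoothed = smooth(upconvert(cf1, h, w))     # DECMP1 -> UP -> smoothing
    cf2 = [[(o - s) if abs(o - s) > threshold else 0   # DIFF -> CONT
            for o, s in zip(orow, srow)]
           for orow, srow in zip(first_frame, smoothed)]
    return cf1, cf2


def decode_frame(cf1, cf2):
    """Return FIN from the two (stand-in) compressed frames."""
    h, w = len(cf2), len(cf2[0])
    averaged = smooth(upconvert(cf1, h, w))     # DECMP1 -> UP -> AVE
    # Assumed combination rule: add the outline frame, clip to 8 bits.
    return [[min(255, max(0, a + c)) for a, c in zip(arow, crow)]
            for arow, crow in zip(averaged, cf2)]


if __name__ == "__main__":
    frame = [
        [10, 12, 200, 198],
        [11, 10, 201, 200],
        [10, 13, 199, 200],
        [12, 10, 200, 202],
    ]
    cf1, cf2 = encode_frame(frame, threshold=5)
    print(decode_frame(cf1, cf2))
```

  • Under these assumptions, a threshold of zero reproduces the original frame exactly, and a threshold t bounds the per-pixel reconstruction error by t, since every difference larger than t is carried in the outline frame.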
  • In an alternative embodiment, the outline frame in the encoding part of the process is generated by comparing the downsized frame with the second downsized frame, as shown in FIG.7.
  • One advantage of this embodiment is that there are fewer steps in the encoding, resulting in a faster encoding process.
  • a further advantage is that the second compressed frame is smaller, resulting in the transmission or storage of a smaller file.
  • In a further embodiment, the outline frame in the encoding part of the process is generated by comparing the first frame and the second frame. See FIG.8. This allows for a slightly shorter encoding process and it does not necessitate any upsizing of the decompressed second compressed frame on the decoding side.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computing Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The present invention relates to a method for encoding and decoding digital video files in order to facilitate the storage of said files or their transmission in bandwidth-sensitive applications. The method is based on the detection of regions of contrast in the video data and involves the generation of two compressed data sets which, once recombined, yield a high-quality representation of the original data. One of the compressed data sets is produced from the original data after its resolution has been reduced, while the other compressed data set is produced from a data set representing areas of contrast in the original data.
PCT/IB2008/051565 2008-04-23 2008-04-23 High-definition video encoding/decoding method suitable for real-time video streaming WO2009130540A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/IB2008/051565 WO2009130540A1 (fr) 2008-04-23 2008-04-23 High-definition video encoding/decoding method suitable for real-time video streaming

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/IB2008/051565 WO2009130540A1 (fr) 2008-04-23 2008-04-23 High-definition video encoding/decoding method suitable for real-time video streaming

Publications (1)

Publication Number Publication Date
WO2009130540A1 (fr) 2009-10-29

Family

ID=40342573

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IB2008/051565 WO2009130540A1 (fr) 2008-04-23 2008-04-23 High-definition video encoding/decoding method suitable for real-time video streaming

Country Status (1)

Country Link
WO (1) WO2009130540A1 (fr)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2005057933A1 (fr) * 2003-12-08 2005-06-23 Koninklijke Philips Electronics N.V. Spatially scalable compression method with dead zone

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
ELMER P M ED - CHIARIGLIONE L: "THE DESIGN OF A HIGH BIT RATE HDTV CODEC", SIGNAL PROCESSING OF HDTV, 2. TURIN, AUG. 30 - SEPT. 1, 1989; [PROCEEDINGS OF THE INTERNATIONAL WORKSHOP ON HDTV], AMSTERDAM, ELSEVIER, NL, vol. WORKSHOP 3, 30 August 1989 (1989-08-30), pages 619 - 631, XP000215280 *
RABBANI M ET AL: "An overview of the JPEG 2000 still image compression standard", SIGNAL PROCESSING. IMAGE COMMUNICATION, ELSEVIER SCIENCE PUBLISHERS, AMSTERDAM, NL, vol. 17, no. 1, 1 January 2002 (2002-01-01), pages 3 - 48, XP004326797, ISSN: 0923-5965 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10992943B2 (en) 2016-09-08 2021-04-27 V-Nova International Limited Data processing apparatuses, methods, computer programs and computer-readable media
US12034943B2 (en) 2016-09-08 2024-07-09 V-Nova International Limited Data processing apparatuses, methods, computer programs and computer-readable media
CN112714323A (zh) * 2020-12-25 2021-04-27 人和未来生物科技(长沙)有限公司 Medical image compression method and decoding method

Similar Documents

Publication Publication Date Title
CN112913237B (zh) Artificial intelligence encoding and artificial intelligence decoding methods and apparatuses using a deep neural network
US7656561B2 (en) Image compression for rapid high-quality imaging
US7991052B2 (en) Variable general purpose compression for video images (ZLN)
US8416847B2 (en) Separate plane compression using plurality of compression methods including ZLN and ZLD methods
US8537898B2 (en) Compression with doppler enhancement
US6912318B2 (en) Method and system for compressing motion image information
US20030103680A1 (en) Block boundary artifact reduction for block-based image compression
JPH11513205A (ja) Video encoding device
EP0482180A1 (fr) Block-adaptive linear predictive coding method with adaptive gain and bias
WO2007040765A1 (fr) Content-adaptive noise reduction filtering for image signals
CN1695381A (zh) Sharpness enhancement in post-processing of digital video signals using coding information and local spatial features
US5831677A (en) Comparison of binary coded representations of images for compression
CN110896483A (zh) Method for compressing and decompressing image data
EP1769459B1 (fr) Image compression for rapid high-quality imaging
JP4293912B2 (ja) Data compression of colour images using wavelet transform
KR20230108286A (ko) Video encoding using preprocessing
US6631161B1 (en) Method and system for compressing motion image information
JP2003531553A (ja) Efficient video data access using a fixed compression ratio
AU2002230101A2 (en) Moving picture information compressing method and its system
WO2009130540A1 (fr) High-definition video encoding/decoding method suitable for real-time video streaming
JP3627291B2 (ja) Block distortion removal apparatus and method
EP1170956A2 (fr) Method and system for compressing motion image information
US6628708B1 (en) Method and system for compressing color video data within a data processing system
JP2000165873A (ja) Method and system for compressing moving image information
JP3958033B2 (ja) Method and system for compressing moving image information

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 08737969; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: PCT application non-entry in European phase (Ref document number: 08737969; Country of ref document: EP; Kind code of ref document: A1)