GB2574425A

GB2574425A - Video coding and decoding

Info

Publication number: GB2574425A
Application number: GB1809236.1A
Authority: GB
Inventors: Laroche Guillaume; Taquet Jonathan; Gisquet Christophe; Onno Patrice
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2018-06-05
Filing date: 2018-06-05
Publication date: 2019-12-11
Also published as: GB201809236D0; WO2019234001A1; TW202005370A

Abstract

The application relates to encoding and decoding video, for the VVC standard, involving in-loop filtering in the form of Sample Adaptive Offset (SAO) filtering, which helps to redefine edges and boundaries that may have been smoothed by the encoding process. Determining whether an image part is of a predetermined group of parts, and including (or obtaining) a first syntax element, in the bitstream, for signalling that SAO parameters are inferred from parameters used for another image part, or else not including (or obtaining) the syntax element. Determining whether an image part satisfies a predetermined criterion, thus enabling or disabling (or obtaining this information) a syntax element (i.e. merge flags), in a bitstream, for signalling the use of one or more other syntax elements for signalling that SAO parameters are inferred from parameters used for another image part. Determining statistics about an image part when it is grouped according to at least two different groups, selecting the best group for the image part, applying SAO parameters based on the selected group, and providing a filtered image. Encoding comprising: predicting image parts from other image parts according to a first or second prediction mode, and performing SAO filtering based on the grouping.

Description

VIDEO CODING AND DECODING

Recently, the Joint Video Experts Team (JVET), a collaborative team formed by MPEG and ITU-T Study Group 16’s VCEG, commenced work on a new video coding standard referred to as Versatile Video Coding (VVC). The goal of VVC is to provide significant improvements in compression performance over the existing HEVC standard (i.e., typically twice as much as before) and to be completed in 2020. The main target applications and services include, but are not limited to, 360-degree and high-dynamic-range (HDR) videos. In total, JVET evaluated responses from 32 organizations using formal subjective tests conducted by independent test labs. Some proposals demonstrated compression efficiency gains of typically 40% or more when compared to using HEVC. Particular effectiveness was shown on ultra-high definition (UHD) video test material. Thus, we may expect compression efficiency gains well beyond the targeted 50% for the final standard.

The JVET exploration model (JEM) uses all the HEVC tools. One of these tools is sample adaptive offset (SAO) filtering. However, SAO is less efficient in the JEM reference software than in the HEVC reference software. This arises from fewer evaluations and from signalling inefficiencies compared to other loop filters.

US 9769450 discloses an SAO filter for three dimensional or 3D Video Coding or 3DVC such as implemented by the HEVC standard. The filter directly re-uses SAO filter parameters of an independent view or a coded dependent view to encode another dependent view, or re-uses only part of the SAO filter parameters of the independent view or a coded dependent view to encode another dependent view. The SAO parameters are re-used by copying them from the independent view or coded dependent view.

US 2014/0192860 A1 relates to the scalable extension of HEVC. The HEVC scalable extension aims at allowing coding/decoding of a video made of multiple scalability layers, each layer being made up of a series of frames. Coding efficiency is improved by inferring, or deriving, SAO parameters to be used at an upper layer (e.g. an enhancement layer) from the SAO parameters actually used at a lower (e.g. base) layer. This is because inferring some SAO parameters makes it possible to avoid transmitting them.

US 2013/0051455 A1 describes a merging process for SAO parameters. SAO parameters for Coding Tree Units are grouped in order to reduce the encoder delay. It is proposed to provide SAO parameters for a region (for example a row) that enable the determination of SAO parameters for the filtering regions to be performed in parallel.

The HEVC standard proposes to signal, by means of flags (sao_merge_up_flag, sao_merge_left_flag), whether or not the SAO parameter set is derived from the above or left Coding Tree Unit.

It is desirable to improve the coding efficiency of images subjected to the SAO filtering.

Different aspects of the present invention are described below.

In a first aspect, there is proposed a method and corresponding device for signalling Sample Adaptive Offset (SAO) filtering. The first aspect also concerns a method and a corresponding device for performing Sample Adaptive Offset (SAO) filtering.

The first aspect also covers corresponding encoding and decoding methods and associated devices.

In a second aspect, there is proposed a method and corresponding device for encoding an image comprising a plurality of image parts, one or more image parts being predicted from one or more other image parts according to a first or a second prediction mode.

According to a third aspect of the present invention there is provided a method of signalling, in a bitstream, Sample Adaptive Offset (SAO) filtering parameters for use in performing SAO filtering on an image comprising a plurality of image parts, the image parts being groupable into groups of image parts using two or more different available groupings, the method comprising: determining which of said available groupings applies to an image part to be filtered; if the determined grouping is a predetermined one of the different available groupings, including in the bitstream an inferring-permitted syntax element, which indicates that it is permitted to infer the SAO parameters for performing SAO filtering on the image part to be filtered from the SAO parameters used for filtering another image part; and if the determined grouping is another one of the different available groupings, not including the syntax element in the bitstream.
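By way of illustration only, the conditional signalling of the third aspect may be sketched as follows. The grouping labels, the choice of which grouping is the predetermined one, and the list-based representation of the bitstream are all assumptions made for this sketch and are not taken from the description.

```python
# Hypothetical label for the predetermined grouping; the description does not fix one here.
PREDETERMINED_GROUPING = "ctu_level"

def write_infer_syntax(bitstream, grouping, infer_permitted):
    """Append the inferring-permitted syntax element only for the predetermined grouping.

    `bitstream` is modelled as a plain list of (name, value) pairs (an assumption).
    """
    if grouping == PREDETERMINED_GROUPING:
        bitstream.append(("inferring_permitted", int(infer_permitted)))
    # For any other grouping the syntax element is not included at all.
    return bitstream
```

A decoder following the same rule would only attempt to read the element when it has determined that the predetermined grouping applies.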

Reference will now be made, by way of example, to the accompanying drawings, in which:

Figure 1 is a diagram for use in explaining a coding structure used in HEVC;

Figure 2 is a block diagram schematically illustrating a data communication system in which one or more embodiments of the invention may be implemented;

Figure 3 is a block diagram illustrating components of a processing device in which one or more embodiments of the invention may be implemented;

Figure 4 is a flow chart illustrating steps of an encoding method according to embodiments of the invention;

Figure 5 is a flow chart illustrating steps of a loop filtering process in accordance with one or more embodiments of the invention;

Figure 6 is a flow chart illustrating steps of a decoding method according to embodiments of the invention;

Figures 7A and 7B are diagrams for use in explaining edge-type SAO filtering in HEVC;

Figure 8 is a diagram for use in explaining band-type SAO filtering in HEVC;

Figure 9 is a flow chart illustrating the steps of a process to decode SAO parameters according to the HEVC specifications;

Figure 10 is a flow chart illustrating in more detail one of the steps of the Figure 9 process;

Figure 11 is a flow chart illustrating how SAO filtering is performed on an image part according to the HEVC specifications;

Figure 12 is a flow chart illustrating steps carried out by an encoder to determine SAO parameters for the CTUs of a group (frame or slice) in a CTU-level derivation of SAO parameters;

Figure 13 shows one of the steps of Figure 12 in more detail;

Figure 14 shows another one of the steps of Figure 12 in more detail;

Figure 15 shows yet another one of the steps of Figure 12 in more detail;

Figure 16 shows various different groupings 1201-1206 of CTUs in a slice;

Figure 17 is a diagram showing image parts of a frame in a derivation of SAO parameters in which a first method of sharing SAO parameters is used;

Figure 18 is a flowchart of an example of a process for setting SAO parameters in the derivation of Figure 17;

Figure 19 is a flowchart of an example of a process for setting of SAO parameters in derivation using the first sharing method to share SAO parameters among a column of CTUs;

Figure 20 is a flowchart of an example of a process for setting of SAO parameters in derivation using the first sharing method to share SAO parameters among a group of NxN CTUs;

Figure 21 is a diagram showing image parts of one NxN group in the derivation of Figure 20;

Figure 22 illustrates an example of how to select the SAO parameter derivation according to the sixth embodiment of the invention;

Figure 23 is a flow chart illustrating a decoding process suitable for a second method of sharing SAO parameters among image parts of a group;

Figure 24 is a diagram showing image parts of multiple 2x2 groups;

Figure 25 is a flow chart illustrating an encoding process according to the first embodiment of the invention;

Figure 26 is a flow chart illustrating a decoding process according to the third embodiment of the invention;

Figure 27 is a flow chart illustrating a decoding process according to the fourth embodiment of the invention; and

Figure 28 is a diagram showing a system comprising an encoder or a decoder and a communication network according to embodiments of the present invention.

Figure 1 relates to a coding structure used in the High Efficiency Video Coding (HEVC) video standard. A video sequence 1 is made up of a succession of digital images i. Each such digital image is represented by one or more matrices. The matrix coefficients represent pixels.

An image 2 of the sequence may be divided into slices 3. A slice may in some instances constitute an entire image. These slices are divided into non-overlapping Coding Tree Units (CTUs) 4. A Coding Tree Unit (CTU) is the basic processing unit of the High Efficiency Video Coding (HEVC) video standard and conceptually corresponds in structure to macroblock units that were used in several previous video standards. A CTU is also sometimes referred to as a Largest Coding Unit (LCU).

A CTU is generally of size 64 pixels x 64 pixels. Each CTU may in turn be iteratively divided into smaller variable-size Coding Units (CUs) 5 using a quadtree decomposition.
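By way of illustration only, the iterative quadtree decomposition of a CTU into CUs may be sketched as follows. The split decision is passed in as a callback, and the minimum CU size of 8 and the function names are assumptions of this sketch; a real encoder would decide each split using a cost criterion.

```python
def quadtree_split(x, y, size, should_split, min_size=8):
    """Return the leaf coding units of one CTU as a list of (x, y, size) tuples."""
    if size > min_size and should_split(x, y, size):
        half = size // 2
        leaves = []
        # Visit the four quadrants: top-left, top-right, bottom-left, bottom-right.
        for dy in (0, half):
            for dx in (0, half):
                leaves.extend(
                    quadtree_split(x + dx, y + dy, half, should_split, min_size)
                )
        return leaves
    return [(x, y, size)]

# Example: split the 64x64 CTU once, then split only its top-left 32x32 quadrant again.
cus = quadtree_split(
    0, 0, 64, lambda x, y, s: s == 64 or (s == 32 and x == 0 and y == 0)
)
```

In this example the CTU ends up as four 16x16 CUs in the top-left quadrant plus three 32x32 CUs, which together still cover the whole 64x64 area.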

Coding units are the elementary coding elements and are constituted by two kinds of sub-unit called a Prediction Unit (PU) and a Transform Unit (TU). The maximum size of a PU or TU is equal to the CU size. A Prediction Unit corresponds to the partition of the CU for prediction of pixel values. Various different partitions of a CU into PUs are possible as shown by 6 including a partition into 4 square PUs and two different partitions into 2 rectangular PUs. A Transform Unit is an elementary unit that is subjected to spatial transformation using DCT. A CU can be partitioned into TUs based on a quadtree representation 7.

Each slice is embedded in one Network Abstraction Layer (NAL) unit. In addition, the coding parameters of the video sequence are stored in dedicated NAL units called parameter sets. In HEVC and H.264/AVC two kinds of parameter sets NAL units are employed: first, a Sequence Parameter Set (SPS) NAL unit that gathers all parameters that are unchanged during the whole video sequence. Typically, it handles the coding profile, the size of the video frames and other parameters. Secondly, a Picture Parameter Set (PPS) NAL unit includes parameters that may change from one image (or frame) to another of a sequence. HEVC also includes a Video Parameter Set (VPS) NAL unit which contains parameters describing the overall structure of the bitstream. The VPS is a new type of parameter set defined in HEVC, and applies to all of the layers of a bitstream. A layer may contain multiple temporal sub-layers, and all version 1 bitstreams are restricted to a single layer. HEVC has certain layered extensions for scalability and multiview and these will enable multiple layers, with a backwards compatible version 1 base layer.

Figure 2 illustrates a data communication system in which one or more embodiments of the invention may be implemented. The data communication system comprises a transmission device, in this case a server 201, which is operable to transmit data packets of a data stream to a receiving device, in this case a client terminal 202, via a data communication network 200. The data communication network 200 may be a Wide Area Network (WAN) or a Local Area Network (LAN). Such a network may be for example a wireless network (WiFi / 802.11a or b or g), an Ethernet network, an Internet network or a mixed network composed of several different networks. In a particular embodiment of the invention the data communication system may be a digital television broadcast system in which the server 201 sends the same data content to multiple clients.

The data stream 204 provided by the server 201 may be composed of multimedia data representing video and audio data. Audio and video data streams may, in some embodiments of the invention, be captured by the server 201 using a microphone and a camera respectively. In some embodiments data streams may be stored on the server 201 or received by the server 201 from another data provider, or generated at the server 201. The server 201 is provided with an encoder for encoding video and audio streams in particular to provide a compressed bitstream for transmission that is a more compact representation of the data presented as input to the encoder.

In order to obtain a better ratio of the quality of transmitted data to quantity of transmitted data, the compression of the video data may be for example in accordance with the HEVC format or H.264/AVC format.

The client 202 receives the transmitted bitstream and decodes the reconstructed bitstream to reproduce video images on a display device and the audio data by a loud speaker.

Although a streaming scenario is considered in the example of Figure 2, it will be appreciated that in some embodiments of the invention the data communication between an encoder and a decoder may be performed using for example a media storage device such as an optical disc.

In one or more embodiments of the invention a video image is transmitted with data representative of compensation offsets for application to reconstructed pixels of the image to provide filtered pixels in a final image.

Figure 3 schematically illustrates a processing device 300 configured to implement at least one embodiment of the present invention. The processing device 300 may be a device such as a micro-computer, a workstation or a light portable device. The device 300 comprises a communication bus 313 connected to:

-a central processing unit 311, such as a microprocessor, denoted CPU;

-a read only memory 307, denoted ROM, for storing computer programs for implementing the invention;

-a random access memory 312, denoted RAM, for storing the executable code of the method of embodiments of the invention as well as the registers adapted to record variables and parameters necessary for implementing the method of encoding a sequence of digital images and/or the method of decoding a bitstream according to embodiments of the invention; and

-a communication interface 302 connected to a communication network 303 over which digital data to be processed are transmitted or received.

Optionally, the apparatus 300 may also include the following components:

-a data storage means 304 such as a hard disk, for storing computer programs for implementing methods of one or more embodiments of the invention and data used or produced during the implementation of one or more embodiments of the invention;

-a disk drive 305 for a disk 306, the disk drive being adapted to read data from the disk 306 or to write data onto said disk;

-a screen 309 for displaying data and/or serving as a graphical interface with the user, by means of a keyboard 310 or any other pointing means.

The apparatus 300 can be connected to various peripherals, such as for example a digital camera 320 or a microphone 308, each being connected to an input/output card (not shown) so as to supply multimedia data to the apparatus 300.

The communication bus provides communication and interoperability between the various elements included in the apparatus 300 or connected to it. The representation of the bus is not limiting and in particular the central processing unit is operable to communicate instructions to any element of the apparatus 300 directly or by means of another element of the apparatus 300.

The disk 306 can be replaced by any information medium such as for example a compact disk (CD-ROM), rewritable or not, a ZIP disk or a memory card and, in general terms, by an information storage means that can be read by a microcomputer or by a microprocessor, integrated or not into the apparatus, possibly removable and adapted to store one or more programs whose execution enables the method of encoding a sequence of digital images and/or the method of decoding a bitstream according to the invention to be implemented.

The executable code may be stored either in read only memory 307, on the hard disk 304 or on a removable digital medium such as for example a disk 306 as described previously. According to a variant, the executable code of the programs can be received by means of the communication network 303, via the interface 302, in order to be stored in one of the storage means of the apparatus 300 before being executed, such as the hard disk 304.

The central processing unit 311 is adapted to control and direct the execution of the instructions or portions of software code of the program or programs according to the invention, instructions that are stored in one of the aforementioned storage means. On powering up, the program or programs that are stored in a non-volatile memory, for example on the hard disk 304 or in the read only memory 307, are transferred into the random access memory 312, which then contains the executable code of the program or programs, as well as registers for storing the variables and parameters necessary for implementing the invention.

In this embodiment, the apparatus is a programmable apparatus which uses software to implement the invention. However, alternatively, the present invention may be implemented in hardware (for example, in the form of an Application Specific Integrated Circuit or ASIC).

Figure 4 illustrates a block diagram of an encoder according to at least one embodiment of the invention. The encoder is represented by connected modules, each module being adapted to implement, for example in the form of programming instructions to be executed by the CPU 311 of device 300, at least one corresponding step of a method implementing at least one embodiment of encoding an image of a sequence of images according to one or more embodiments of the invention.

An original sequence of digital images i_0 to i_n 401 is received as an input by the encoder 40. Each digital image is represented by a set of samples, known as pixels.

A bitstream 410 is output by the encoder 40 after implementation of the encoding process. The bitstream 410 comprises a plurality of encoding units or slices, each slice comprising a slice header for transmitting encoding values of encoding parameters used to encode the slice and a slice body, comprising encoded video data.

The input digital images i_0 to i_n 401 are divided into blocks of pixels by module 402. The blocks correspond to image portions and may be of variable sizes (e.g. 4x4, 8x8, 16x16, 32x32, 64x64 pixels). A coding mode is selected for each input block. Two families of coding modes are provided: coding modes based on spatial prediction coding (Intra prediction), and coding modes based on temporal prediction (Inter coding, Merge, SKIP). The possible coding modes are tested.

Module 403 implements an Intra prediction process, in which the given block to be encoded is predicted by a predictor computed from pixels of the neighbourhood of said block to be encoded. An indication of the selected Intra predictor and the difference between the given block and its predictor is encoded to provide a residual if the Intra coding is selected.

Temporal prediction is implemented by motion estimation module 404 and motion compensation module 405. Firstly a reference image from among a set of reference images/pictures 416 is selected, and a portion of the reference image, also called reference area or image portion, which is the closest area to the given block to be encoded, is selected by the motion estimation module 404. Motion compensation module 405 then predicts the block to be encoded using the selected area. The difference between the selected reference area and the given block, also called a residual block, is computed by the motion compensation module 405. The selected reference area is indicated by a motion vector.

Thus in both cases (spatial and temporal prediction), a residual is computed by subtracting the prediction from the original block.
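By way of illustration only, the residual computation common to both prediction families may be sketched as follows, with blocks represented as lists of rows of integer samples. The function names are assumptions of this sketch.

```python
def residual_block(original, prediction):
    """Residual = original block minus its prediction, sample by sample."""
    return [
        [o - p for o, p in zip(orig_row, pred_row)]
        for orig_row, pred_row in zip(original, prediction)
    ]

def reconstruct_block(prediction, residual):
    """Inverse operation: prediction plus residual gives back the block."""
    return [
        [p + r for p, r in zip(pred_row, res_row)]
        for pred_row, res_row in zip(prediction, residual)
    ]
```

Without quantization the round trip is exact; it is the quantization of the transformed residual that makes the reconstruction differ from the original block.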

In the INTRA prediction implemented by module 403, a prediction direction is encoded. In the temporal prediction, at least one motion vector is encoded.

Information relative to the motion vector and the residual block is encoded if the Inter prediction is selected. To further reduce the bitrate, assuming that motion is homogeneous, the motion vector is encoded by difference with respect to a motion vector predictor. Motion vector predictors of a set of motion information predictors are obtained from the motion vector field 418 by a motion vector prediction and coding module 417.
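By way of illustration only, the differential coding of a motion vector with respect to a predictor may be sketched as follows. Motion vectors are modelled as (x, y) integer tuples and the function names are assumptions of this sketch; the selection of the predictor itself is not shown.

```python
def encode_mv(mv, predictor):
    """Return the motion vector difference that would be written to the bitstream."""
    return (mv[0] - predictor[0], mv[1] - predictor[1])

def decode_mv(mvd, predictor):
    """Recover the motion vector by adding the predictor to the decoded difference."""
    return (mvd[0] + predictor[0], mvd[1] + predictor[1])
```

When motion is homogeneous the predictor is close to the actual vector, so the difference is small and costs fewer bits than the vector itself.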

The encoder 40 further comprises a selection module 406 for selection of the coding mode by applying an encoding cost criterion, such as a rate-distortion criterion. In order to further reduce redundancies a transform (such as DCT) is applied by transform module 407 to the residual block. The transformed data obtained is then quantized by quantization module 408 and entropy encoded by entropy encoding module 409. Finally, the encoded residual block of the current block being encoded is inserted into the bitstream 410.

The encoder 40 also performs decoding of the encoded image in order to produce a reference image for the motion estimation of the subsequent images. This enables the encoder and the decoder receiving the bitstream to have the same reference frames. The inverse quantization module 411 performs inverse quantization of the quantized data, followed by an inverse transform by inverse transform module 412. The intra prediction module 413 uses the prediction information to determine which predictor to use for a given block and the motion compensation module 414 actually adds the residual obtained by module 412 to the reference area obtained from the set of reference images 416.
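By way of illustration only, the reason the encoder must run its own inverse quantization can be seen with a simple uniform scalar quantizer. The uniform step size and the rounding rule below are assumptions of this sketch; the quantization actually used in HEVC is more elaborate.

```python
def quantize(coeffs, step):
    """Uniform scalar quantization with rounding to nearest (ties away from zero)."""
    levels = []
    for c in coeffs:
        if c >= 0:
            levels.append(int(c / step + 0.5))
        else:
            levels.append(-int(-c / step + 0.5))
    return levels

def dequantize(levels, step):
    """Inverse quantization: scale each level back by the step size."""
    return [level * step for level in levels]
```

Because dequantized coefficients generally differ from the original ones, the encoder builds its reference images from the dequantized data, which is exactly what the decoder will have available.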

Post filtering is then applied by module 415 to filter the reconstructed frame of pixels. In the embodiments of the invention an SAO loop filter is used in which compensation offsets are added to the pixel values of the reconstructed pixels of the reconstructed image.

Figure 5 is a flow chart illustrating steps of a loop filtering process according to at least one embodiment of the invention. In an initial step 51, the encoder generates the reconstruction of the full frame. Next, in step 52 a deblocking filter is applied on this first reconstruction in order to generate a deblocked reconstruction 53. The aim of the deblocking filter is to remove block artifacts generated by residual quantization and block motion compensation or block Intra prediction. These artifacts are visually important at low bitrates. The deblocking filter operates to smooth the block boundaries according to the characteristics of two neighboring blocks. The encoding mode of each block, the quantization parameters used for the residual coding, and the neighboring pixel differences in the boundary are taken into account. The same criterion/classification is applied for all frames and no additional data is transmitted. The deblocking filter improves the visual quality of the current frame by removing blocking artifacts and it also improves the motion estimation and motion compensation for subsequent frames. Indeed, high frequencies of the block artifact are removed, and so these high frequencies do not need to be compensated for with the texture residual of the following frames.

After the deblocking filter, the deblocked reconstruction is filtered by a sample adaptive offset (SAO) loop filter in step 54 using SAO parameters 58 determined in accordance with embodiments of the invention. The resulting frame 55 may then be filtered with an adaptive loop filter (ALF) in step 56 to generate the reconstructed frame 57 which will be displayed and used as a reference frame for the following Inter frames.

In step 54 each pixel of the frame region is classified into a class or group. The same offset value is added to every pixel value which belongs to a certain class or group.
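By way of illustration only, the principle of step 54 (classify each pixel, then add the offset of its class) may be sketched as follows. The classifier is passed in as a callback and the example classifier in the usage note is purely hypothetical; clipping to the valid sample range is omitted for brevity.

```python
def apply_offsets(pixels, classify, offsets):
    """Add offsets[k] to every pixel of class k; pixels of other classes are unchanged.

    `classify` maps a pixel value to a class identifier (an assumption of this sketch:
    real SAO classifiers may also look at neighboring pixels).
    """
    return [p + offsets.get(classify(p), 0) for p in pixels]
```

For example, with a hypothetical classifier `lambda p: p // 64` and offsets `{0: 2, 1: -1}`, the pixels 10 and 70 are shifted by +2 and -1 respectively while 130 is left untouched.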

The determination of the SAO parameters for the sample adaptive offset filtering will be explained in more detail hereafter with reference to any one of Figures 10 to 11.

Figure 6 illustrates a block diagram of a decoder 60 which may be used to receive data from an encoder according to an embodiment of the invention. The decoder is represented by connected modules, each module being adapted to implement, for example in the form of programming instructions to be executed by the CPU 311 of device 300, a corresponding step of a method implemented by the decoder 60.

The decoder 60 receives a bitstream 61 comprising encoding units, each one being composed of a header containing information on encoding parameters and a body containing the encoded video data. As explained with respect to Figure 4, the encoded video data is entropy encoded, and the motion vector predictors’ indexes are encoded, for a given block, on a predetermined number of bits. The received encoded video data is entropy decoded by module 62. The residual data are then dequantized by module 63 and then an inverse transform is applied by module 64 to obtain pixel values.

The mode data indicating the coding mode are also entropy decoded and based on the mode, an INTRA type decoding or an INTER type decoding is performed on the encoded blocks of image data.

In the case of INTRA mode, an INTRA predictor is determined by intra prediction module 65 based on the intra prediction mode specified in the bitstream.

If the mode is INTER, the motion prediction information is extracted from the bitstream so as to find the reference area used by the encoder. The motion prediction information is composed of the reference frame index and the motion vector residual. The motion vector predictor is added to the motion vector residual in order to obtain the motion vector by motion vector decoding module 70.

Motion vector decoding module 70 applies motion vector decoding for each current block encoded by motion prediction. Once an index of the motion vector predictor for the current block has been obtained, the actual value of the motion vector associated with the current block can be decoded and used to apply motion compensation by module 66. The reference image portion indicated by the decoded motion vector is extracted from a reference image/picture 68 to apply the motion compensation 66. The motion vector field data 71 is updated with the decoded motion vector in order to be used for the inverse prediction of subsequent decoded motion vectors.

Finally, a decoded block is obtained. Post filtering is applied by post filtering module 67 similarly to post filtering module 415 applied at the encoder as described with reference to Figure 4. A decoded video signal 69 is finally provided by the decoder 60.

The aim of SAO filtering is to improve the quality of the reconstructed frame by sending additional data in the bitstream in contrast to the deblocking filter where no information is transmitted. As mentioned above, each pixel is classified into a predetermined class or group and the same offset value is added to every pixel sample of the same class/group. One offset is encoded in the bitstream for each class. SAO loop filtering has two SAO types: an Edge Offset (EO) type and a Band Offset (BO) type. An example of Edge Offset type is schematically illustrated in Figures 7A and 7B, and an example of Band Offset type is schematically illustrated in Figure 8.

In HEVC, SAO filtering is applied CTU by CTU. In this case the parameters needed to perform the SAO filtering (set of SAO parameters) are selected for each CTU at the encoder side and the necessary parameters are decoded and/or derived for each CTU at the decoder side. This offers the possibility of easily encoding and decoding the video sequence by processing each CTU at once without introducing delays in the processing of the whole frame. Moreover, when SAO filtering is enabled, only one SAO type is used: either the Edge Offset type filter or the Band Offset type filter according to the related parameters transmitted in the bitstream for each classification. One of the SAO parameters in HEVC is an SAO type parameter sao_type_idx which indicates for the CTU whether EO type, BO type or no SAO filtering is selected for the CTU concerned.

The SAO parameters for a given CTU can be copied from the upper or left CTU, for example, instead of transmitting all the SAO data. One of the SAO parameters in HEVC is sao_merge_up_flag, which when set indicates that the SAO parameters for the subject CTU should be copied from the upper CTU. Another of the SAO parameters in HEVC is sao_merge_left_flag, which when set indicates that the SAO parameters for the subject CTU should be copied from the left CTU.
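By way of illustration only, the copying of SAO parameters under the two merge flags may be sketched as follows. The dictionary keyed by CTU coordinates, the function name and the order in which the flags are tested are assumptions of this sketch.

```python
def resolve_sao_params(ctu_params, x, y, merge_left, merge_up, explicit=None):
    """Return the SAO parameters for the CTU at column x, row y.

    `ctu_params` maps (x, y) to the parameters of already-processed CTUs.
    `explicit` stands for parameters transmitted for this CTU when no merge applies.
    """
    if merge_left and x > 0:
        return ctu_params[(x - 1, y)]   # copy from the left CTU
    if merge_up and y > 0:
        return ctu_params[(x, y - 1)]   # copy from the upper CTU
    return explicit
```

When either flag is set, a single bin replaces the whole set of SAO parameters for the CTU, which is where the bitrate saving comes from.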

SAO filtering may be applied independently for different color components (e.g. YUV) of the frame. For example, one set of SAO parameters may be provided for the luma component Y and another set of SAO parameters may be provided for both chroma components U and V in common. Also, within the set of SAO parameters one or more SAO parameters may be used as common filtering parameters for two or more color components, while other SAO parameters are dedicated (per-component) filtering parameters for the color components. For example, in HEVC, the SAO type parameter sao_type_idx is common to U and V, and so is an EO class parameter which indicates a class for EO filtering (see below), whereas a BO class parameter which indicates a group of classes for BO filtering has dedicated (per-component) SAO parameters for U and V.

A description of the Edge Offset type in HEVC is now provided with reference to Figures 7A and 7B.

Edge Offset type involves determining an edge index for each pixel by comparing its pixel value to the values of two neighboring pixels. Moreover, these two neighboring pixels depend on a parameter which indicates the direction of these two neighboring pixels with respect to the current pixel. These directions are the 0-degree (horizontal direction), 45-degree (diagonal direction), 90-degree (vertical direction) and 135-degree (second diagonal direction). These four directions are schematically illustrated in Figure 7A.
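By way of illustration only, the selection of the two neighboring pixels from the direction parameter may be sketched as follows. The table is keyed by the angle in degrees, the image is a list of rows with y growing downwards, and the assignment of each displacement to the first or second neighbor is an assumption of this sketch. Border handling is omitted.

```python
# (dx, dy) displacements of the two neighbors for each direction.
EO_NEIGHBORS = {
    0:   ((-1, 0), (1, 0)),    # horizontal: left and right
    45:  ((1, -1), (-1, 1)),   # diagonal: top-right and bottom-left
    90:  ((0, -1), (0, 1)),    # vertical: above and below
    135: ((-1, -1), (1, 1)),   # second diagonal: top-left and bottom-right
}

def neighbors(img, x, y, direction):
    """Return the values of the two neighbors of pixel (x, y) along `direction`."""
    (dx1, dy1), (dx2, dy2) = EO_NEIGHBORS[direction]
    return img[y + dy1][x + dx1], img[y + dy2][x + dx2]
```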

The table of Figure 7B gives the offset value to be applied to the pixel value of a particular pixel “C” according to the value of the two neighboring pixels Cn1 and Cn2 at the decoder side.

When the value of C is less than the two values of neighboring pixels Cn1 and Cn2, the offset to be added to the pixel value of the pixel C is “+O1”. When the pixel value of C is less than one pixel value of its neighboring pixels (either Cn1 or Cn2) and C is equal to the value of the other neighbor, the offset to be added to this pixel sample value is “+O2”.

When the pixel value of C is greater than one of the pixel values of its neighbors (Cn1 or Cn2) and the pixel value of C is equal to the value of the other neighbor, the offset to be applied to this pixel sample is “-O3”. When the value of C is greater than the two values of Cn1 and Cn2, the offset to be applied to this pixel sample is “-O4”.

When none of the above conditions is met on the current sample and its neighbors, no offset value is added to the current pixel C as depicted by the Edge Index value “2” of the table.

It is important to note that for the particular case of the Edge Offset type, the absolute value of each offset (O1, O2, O3, O4) is encoded in the bitstream. The sign to be applied to each offset depends on the edge index (or the Edge Index in the HEVC specifications) to which the current pixel belongs. According to the table represented in Figure 7B, for Edge Index 0 and for Edge Index 1 (O1, O2) a positive offset is applied. For Edge Index 3 and Edge Index 4 (O3, O4), a negative offset is applied to the current pixel.

In the HEVC specifications, the direction for the Edge Offset amongst the four directions of Figure 7A is specified in the bitstream by a “sao_eo_class_luma” field for the luma component and a “sao_eo_class_chroma” field for both chroma components U and V.

The SAO Edge Index corresponding to the index value is obtained by the following formula:

$$\mathrm{EdgeIndex} = \mathrm{sign}(C - C_{n2}) - \mathrm{sign}(C_{n1} - C) + 2$$

where the definition of the function sign(.) is given by the following relationships:

$$\mathrm{sign}(x) = \begin{cases} 1, & \text{when } x > 0 \\ -1, & \text{when } x < 0 \\ 0, & \text{when } x = 0 \end{cases}$$

In order to simplify the Edge Offset determination for each pixel, the difference between the pixel value of C and the pixel value of both its neighboring pixels Cn1 and Cn2 can be shared for current pixel C and its neighbors. Indeed, when SAO Edge Offset filtering is applied using a raster scan order of pixels of the current CTU or frame, the term sign(Cn1 - C) has already been computed for the previous pixels (to be precise it was computed as C’ - Cn2’ at a time when the current pixel C’ at that time was the present neighboring pixel Cn1 and the neighboring pixel Cn2’ was what is now the current pixel C). As a consequence this sign(Cn1 - C) does not need to be computed again.
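By way of illustration only, the Edge Offset classification and the sign convention described above might be sketched as follows. The function names are hypothetical and not taken from the HEVC specifications, and clipping of the result to the valid pixel range is omitted for brevity.

```python
def sign(x):
    """Return 1, -1 or 0 according to the sign of x."""
    return (x > 0) - (x < 0)


def edge_index(c, cn1, cn2):
    """EdgeIndex = sign(C - Cn2) - sign(Cn1 - C) + 2, a value in 0..4."""
    return sign(c - cn2) - sign(cn1 - c) + 2


def apply_edge_offset(c, cn1, cn2, abs_offsets):
    """Add the signed offset selected by the edge index.

    abs_offsets holds the four absolute values (O1, O2, O3, O4) read from
    the bitstream. Edge indexes 0 and 1 use a positive sign, edge indexes
    3 and 4 a negative sign, and edge index 2 leaves the pixel unchanged.
    """
    idx = edge_index(c, cn1, cn2)
    if idx == 2:
        return c
    if idx < 2:
        return c + abs_offsets[idx]
    return c - abs_offsets[idx - 1]
```

For instance, a pixel of value 10 lying between neighbors of value 20 is a local minimum (edge index 0) and receives +O1.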

A description of the Band Offset type is now provided with reference to Figure 8.

Band Offset type in SAO also depends on the pixel value of the sample to be processed. A class in SAO Band offset is defined as a range of pixel values. Conventionally, for all pixels within a range, the same offset is added to the pixel value. In the HEVC specifications, the number of offsets for the Band Offset filter is four for each reconstructed block or frame area of pixels (CTU), as schematically illustrated in Figure 8.

One implementation of SAO Band offset splits the full range of pixel values into 32 ranges of the same size. These 32 ranges are the classes of SAO Band offset. The minimum value of the range of pixel values is systematically 0 and the maximum value depends on the bit depth of the pixel values according to the following relationship:

$$\mathrm{Max} = 2^{\mathrm{Bitdepth}} - 1$$

Classifying the pixels into 32 ranges of the full interval requires checking only 5 bits in a fast implementation, i.e. only the first 5 bits (the 5 most significant bits) are checked to classify a pixel into one of the 32 classes/ranges of the full range.

For example, when the bitdepth is 8 bits per pixel, the maximum value of a pixel can be 255. Hence, the range of pixel values is between 0 and 255. For this bitdepth of 8 bits, each class contains 8 pixel values.

In conventional Band Offset type filtering, the start of the band, represented by the grey area (40), that contains four ranges or classes, is signaled in the bitstream to identify the position of the first class of pixels or the first range of pixel values. The syntax element representative of this position is the “sao_band_position” field in the HEVC specifications. This corresponds to the start of class 41 in Figure 8. According to the HEVC specifications, 4 consecutive classes (41, 42, 43 and 44) of pixel values are used and 4 corresponding offsets are signaled in the bitstream.
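As a minimal sketch of the Band Offset classification just described, the following assumes signed offsets that have already been decoded and omits clipping of the result. The function names are hypothetical.

```python
def band_index(pixel, bit_depth=8):
    """Classify a pixel into one of 32 equal bands using its 5 most significant bits."""
    return pixel >> (bit_depth - 5)


def apply_band_offset(pixel, band_position, offsets, bit_depth=8):
    """Add offsets[k] when the pixel falls in band band_position + k, k in 0..3.

    Pixels outside the four consecutive signalled bands are left unchanged.
    """
    k = band_index(pixel, bit_depth) - band_position
    if 0 <= k < 4:
        return pixel + offsets[k]
    return pixel
```

With a bitdepth of 8 bits, the shift is by 3, so each band covers 8 consecutive pixel values as stated above.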

Figure 9 is a flow chart illustrating the steps of a process to decode SAO parameters according to the HEVC specifications. The process of Figure 9 is applied for each CTU to generate a set of SAO parameters for all components. In order to avoid encoding one set of SAO parameters per CTU (which is very costly), a predictive scheme is used for the CTU mode. This predictive mode involves checking if the CTU on the left of the current CTU uses the same SAO parameters (this is specified in the bitstream through a flag named “sao_merge_left_flag”). If not, a second check is performed with the CTU above the current CTU (this is specified in the bitstream through a flag named “sao_merge_up_flag”). This predictive technique enables the amount of data representing the SAO parameters for the CTU mode to be reduced. Steps of the process are set out below.

In step 503, the “sao_merge_left_flag” is read from the bitstream 502 and decoded. If its value is true, then the process proceeds to step 504 where the SAO parameters of left CTU are copied for the current CTU. This enables the types for YUV of the SAO filter for the current CTU to be determined in step 508.

If the outcome is negative in step 503 then the “sao_merge_up_flag” is read from the bitstream and decoded. If its value is true, then the process proceeds to step 505 where the SAO parameters of the above CTU are copied for the current CTU. This enables the types of the SAO filter for the current CTU to be determined in step 508.

If the outcome is negative in step 505, then the SAO parameters for the current CTU are read and decoded from the bitstream in step 507 for the Luma Y component and both U and V components (501) (551) for the type. The offsets for Chroma are independent.

The details of this step are described later with reference to Figure 10. After this step, the parameters are obtained and the type of SAO filter is determined in step 508.

In subsequent step 511, a check is performed to determine if the three colour components (Y and U & V) for the current CTU have been processed. If the outcome is positive, the determination of the SAO parameters for the three components is complete and the next CTU can be processed in step 510. Otherwise (only Y was processed), U and V are processed together and the process restarts from initial step 512 previously described.
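The merge-based prediction of Figure 9 can be sketched, purely as an illustration, as follows. The callbacks read_flag and read_new_params are hypothetical stand-ins for the entropy decoder, and a neighboring CTU that is unavailable is assumed to be represented by None, in which case the corresponding flag is not read.

```python
def decode_ctu_sao(read_flag, read_new_params, left_params=None, up_params=None):
    """Return the SAO parameters of the current CTU.

    read_flag() returns the next decoded merge flag and read_new_params()
    parses a new parameter set; both are hypothetical callbacks.
    """
    if left_params is not None and read_flag():   # sao_merge_left_flag
        return left_params                        # copy from the left CTU
    if up_params is not None and read_flag():     # sao_merge_up_flag
        return up_params                          # copy from the above CTU
    return read_new_params()                      # explicit parameters follow
```

A new set of parameters is therefore parsed only when neither merge flag is set.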

Figure 10 is a flow chart illustrating steps of a process of parsing of SAO parameters in the bitstream 601 at the decoder side. In initial step 602, the “sao_type_idx_X” syntax element is read and decoded. The code word representing this syntax element can use a fixed length code or could use any method of arithmetic coding. The syntax element sao_type_idx_X enables determination of the type of SAO applied for the frame area to be processed for the colour component Y or for both Chroma components U & V. For example, for a YUV 4:2:0 sequence, two components are considered: one for Y, and one for U and V. The “sao_type_idx_X” can take 3 values as follows depending on the SAO type encoded in the bitstream. ‘0’ corresponds to no SAO, ‘1’ corresponds to the Band Offset case illustrated in Figure 8 and ‘2’ corresponds to the Edge Offset type filter illustrated in Figures 7A and 7B.

In the same step 602, a test is performed to determine if the “sao_type_idx_X” is strictly positive. If “sao_type_idx_X” is equal to “0”, this signifies that there is no SAO for this frame area (CTU) for Y if X is set equal to Y, and that there is no SAO for this frame area for U and V if X is set equal to U and V. In that case the determination of the SAO parameters is complete and the process proceeds to step 608. Otherwise, if the “sao_type_idx_X” is strictly positive, this signifies that SAO parameters exist for this CTU in the bitstream.

Then the process proceeds to step 606 where a loop is performed for four iterations. The four iterations are carried out in step 607 where the absolute value of offset j is read and decoded from the bitstream. These four offsets correspond either to the four absolute values of the offsets (O1, O2, O3, O4) of the four Edge indexes of SAO Edge Offset (see Figure 7B) or to the four absolute values of the offsets related to the four ranges of the SAO band Offset (see Figure 8).

Note that for the coding of an SAO offset, a first part is transmitted in the bitstream corresponding to the absolute value of the offset. This absolute value is coded with a unary code. The maximum value for an absolute value is given by the following formula:

MAX_abs_SAO_offset_value = (1 << (Min(bitDepth, 10) - 5)) - 1

where << is the left (bit) shift operator.

This formula means that the maximum absolute value of an offset is 7 for a pixel value bitdepth of 8 bits, and 31 for a pixel value bitdepth of 10 bits and beyond.
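The two values quoted above can be checked with a one-line evaluation of the formula. The function name below is hypothetical.

```python
def max_abs_sao_offset(bit_depth):
    """Maximum absolute SAO offset: (1 << (Min(bitDepth, 10) - 5)) - 1."""
    return (1 << (min(bit_depth, 10) - 5)) - 1
```

For a bitdepth of 8 bits the shift is by 3, giving 7. From 10 bits upwards the Min() term saturates, so the result stays at 31.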

The current HEVC standard amendment addressing extended bitdepth video sequences provides a similar formula for a pixel value having a bitdepth of 12 bits and beyond.

The absolute value decoded may be a quantized value which is dequantized before it is applied to pixel values at the decoder for SAO filtering. An indication of whether or not this quantization is used is transmitted in the slice header.

For Edge Offset type, only the absolute value is transmitted because the sign can be inferred as explained previously.

For Band Offset type, the sign is signaled in the bitstream as a second part of the offset if the absolute value of the offset is not equal to 0. The bit of the sign is bypassed when CABAC is used.

After step 607, the process proceeds to step 603 where a test is performed to determine if the type of SAO corresponds to the Band Offset type (sao_type_idx_X == 1).

If the outcome is positive, the signs of the offsets for the Band Offset mode are decoded in steps 609 and 610, except for each offset that has a zero value, before the following step 604 is performed in order to read in the bitstream and to decode the position “sao_band_position_X” of the SAO band as illustrated in Figure 8.

If the outcome is negative in step 603 (“sao_type_idx_X” is set equal to 2), this signifies that the Edge Offset type is used. Consequently, the Edge Offset class (corresponding to the direction 0, 45, 90 and 135 degrees) is extracted from the bitstream 601 in step 605. If X is equal to Y, the read syntax element is “sao_eo_class_luma” and if X is set equal to U and V, the read syntax element is “sao_eo_class_chroma”.

When the four offsets have been decoded, the reading of the SAO parameters is complete and the process proceeds to step 608.
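The parsing order of Figure 10 might be sketched as follows, for illustration only. The sketch assumes that the symbols have already been entropy decoded and are supplied as a plain iterator of integers, and that a sign symbol equal to 1 denotes a negative offset; both are assumptions of this example rather than statements about the HEVC syntax.

```python
def parse_sao_params(symbols):
    """Parse one component's SAO parameters from pre-decoded symbol values.

    symbols is a hypothetical iterator consumed in the order of Figure 10.
    """
    def read():
        return next(symbols)

    sao_type = read()                        # step 602: sao_type_idx_X
    if sao_type == 0:
        return {"type": 0}                   # no SAO for this frame area
    offsets = [read() for _ in range(4)]     # steps 606-607: absolute values
    params = {"type": sao_type}
    if sao_type == 1:                        # step 603: Band Offset
        # steps 609-610: a sign is present only for non-zero offsets
        offsets = [-o if o and read() else o for o in offsets]
        params["band_position"] = read()     # step 604
    else:                                    # Edge Offset
        params["eo_class"] = read()          # step 605
    params["offsets"] = offsets
    return params
```

For Edge Offset the returned offsets remain absolute values, since their signs are inferred from the edge index.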

Figure 11 is a flow chart illustrating how SAO filtering is performed on an image part according to the HEVC specifications, for example during the step 907 in Figure 6. In HEVC, this image part is a CTU. This same process is also applied in the decoding loop (step 715 in Figure 4) at the encoder in order to produce the reference frames used for the motion estimation and compensation of the following frames. This process is related to the SAO filtering for one color component (thus suffix “_X” in the syntax elements has been omitted below).

An initial step 701 comprises determining the SAO filtering parameters according to processes depicted in Figures 9 and 10. The SAO filtering parameters are determined by the encoder and the encoded SAO parameters are included in the bitstream. Accordingly, on the decoder side in step 701 the decoder reads and decodes the parameters from the bitstream. Step 701 gives the sao_type_idx and, if it equals 1, the sao_band_position 702 and, if it equals 2, the sao_eo_class_luma or sao_eo_class_chroma (according to the colour component processed). It may be noted that if the element sao_type_idx is equal to 0 the SAO filtering is not applied. Step 701 also gives the offsets table of the 4 offsets 703.

A variable i, used to successively consider each pixel Pi of the current block or frame area (CTU), is set to 0 in step 704. In step 706, pixel Pi is extracted from the frame area 705 (the current CTU in the HEVC standard) which contains N pixels. This pixel Pi is classified in step 707 according to the Edge offset classification described with reference to Figures 7A & 7B or Band offset classification as described with reference to Figure 8. The decision module 708 tests if Pi is in a class that is to be filtered using the conventional SAO filtering.

If Pi is in a filtered class, the related class number j is identified and the related offset value Offsetj is extracted in step 710 from the offsets table 703. In the case of the conventional SAO filtering this Offsetj is then added to the pixel value Pi in step 711 in order to produce the filtered pixel value P'i 712. This filtered pixel P'i is inserted in step 713 into the filtered frame area 716. In embodiments of the invention, steps 710 and 711 are carried out differently, as will be explained later in the description of those embodiments.

If Pi is not in a class to be SAO filtered then Pi (709) is inserted in step 713 into the filtered frame area 716 without filtering.

After step 713, the variable i is incremented in step 714 in order to filter the subsequent pixels of the current frame area 705 (if any - test 715). After all the pixels have been processed (i>=N) in step 715, the filtered frame area 716 is reconstructed and can be added to the SAO reconstructed frame (see frame 908 of Figure 6 or 716 of Figure 4).
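The per-pixel loop of Figure 11 for the conventional SAO filtering can be sketched as follows. The frame area is represented as a flat list and classify is a hypothetical callback returning the class number j of pixel i, or None when the pixel is not in a class to be filtered. Clipping is omitted.

```python
def sao_filter_area(pixels, classify, offsets):
    """Return the filtered copy of a flat list of N pixel values."""
    filtered = []
    for i, p in enumerate(pixels):           # steps 704, 714 and 715
        j = classify(i)                      # step 707
        if j is None:                        # test 708 negative
            filtered.append(p)               # inserted unfiltered (step 713)
        else:
            filtered.append(p + offsets[j])  # steps 710 and 711
    return filtered
```

The same loop serves both SAO types, the only difference being the classification callback that is supplied.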

As noted above, the JVET exploration model (JEM) for the future VVC standard uses all the HEVC tools. One of these tools is sample adaptive offset (SAO) filtering. However, SAO is less efficient in the JEM reference software than in the HEVC reference software. This arises from fewer evaluations and from signalling inefficiencies compared to other loop filters.

Figure 12 is a flow chart illustrating steps carried out at an encoder to determine SAO parameters for the CTUs of a group (frame or slice) at the CTU level. The process starts with a current CTU 1101. First the statistics for all possible SAO types and classes are accumulated in a variable CTUStats 1102. The process of step 1102 is described below with reference to Figure 13. According to the value set in the variable CTUStats, the RD cost for the SAO Merge Left is evaluated if the left CTU is in the current slice 1103, as is the RD cost of the SAO Merge Up (1104). Thanks to the statistics in CTUStats 1102, new SAO parameters are evaluated for Luma 1105 and for both Chroma components 1109 (both Chroma components because the Chroma components share the same SAO type in the HEVC standard). For each SAO type 1006, the best RD offsets and other parameters for Band offset classification are obtained 1107. Steps 1107 and 1110 are explained below for Edge and Band classification with reference to Figure 14 and Figure 15 respectively. All RD costs are computed thanks to their respective SAO parameters (1108). In the same way for both Chroma components, the optimal RD offsets and parameters are selected 1111. All these RD costs are compared in order to select the best SAO parameters set 1115. These RD costs are also compared to disable SAO independently for the Luma and the Chroma components 1113, 1114. The use of a new SAO parameters set 1115 is compared to the SAO parameters set “Merging” or sharing 1116 from the left and up CTU.

Figure 13 is a flow chart illustrating steps of an example of a statistics computation at the encoder side that can be applied for the Edge Offset type filter, in the case of the conventional SAO filtering. A similar approach may also be used for the Band Offset type filter.

Figure 13 illustrates the setting of the variable CTUStats containing all information needed to derive the best rate distortion offset for each class. Moreover, it illustrates the selection of the best SAO parameters set for the current CTU. For each colour component Y, U, V (or RGB) 811 each SAO type is evaluated. For each SAO type 812 the variables Sumj and SumNbPixj are set to zero in an initial step 801. The current frame area 803 contains N pixels.

j is the current range number to determine the four offsets (related to the four edge indexes shown in Figure 7B for Edge Offset type or to the 32 ranges of pixel values shown in Figure 8 for Band Offset type). Sumj is the sum of the differences between the pixels in the range j and their original pixels. SumNbPixj is the number of pixels in the frame area, the pixel value of which belongs to the range j.

In step 802, a variable i, used to successively consider each pixel Pi of the current frame area, is set to zero. Then, the first pixel of the frame area 803 is extracted in step 804. In step 805, the class of the current pixel is determined by checking the conditions defined in Figure 7B. Then a test is performed in step 806. During step 806, a check is performed to determine if the class of the pixel value corresponds to the value “none of the above” of Figure 7B.

If the outcome is positive, then the value “i” is incremented in step 808 in order to consider the next pixels of the frame area 803.

Otherwise, if the outcome is negative in step 806, the next step is 807 where the related SumNbPixj (i.e. the sum of the number of pixels for the class determined in step 805) is incremented and the difference between Pi and its original value Porgi is added to Sumj. In the next step 808, the variable i is incremented in order to consider the next pixels of the frame area 803.

Then a test is performed to determine if all pixels have been considered and classified. If the outcome is negative, the process loops back to step 804 described above. Otherwise, if the outcome is positive, the process proceeds to step 810 where the variable CTUStats for the current colour component X, the SAO type SAO_type and the current class j is set equal to Sumj for the first value and SumNbPixj for the second value. These variables can be used to compute for example the optimal offset parameter Offsetj of each class j. This offset Offsetj may be the average of the differences between the pixels of class j and their original values. Thus, Offsetj is given by the following formula:

$$\mathrm{Offset}_j = \frac{\mathrm{Sum}_j}{\mathrm{SumNbPix}_j}$$

Note that the offset Offsetj is an integer value. As a consequence, the ratio defined in this formula may be rounded, either to the closest value or using the ceiling or floor function.

Each offset Offsetj is an optimal offset Ooptj in terms of distortion.
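The accumulation of the statistics and the derivation of the distortion-optimal offset might be sketched as follows. The sketch assumes that each difference is taken as the original value minus the reconstructed value, so that a positive offset moves the reconstruction towards the original, and it rounds to the nearest integer. The function names and the flat-list representation are hypothetical.

```python
def accumulate_stats(rec, org, classes, num_classes):
    """Return per-class [Sum_j, SumNbPix_j] pairs for one frame area.

    classes[i] is the class of pixel i, or None when the pixel falls in
    the "none of the above" category and is skipped.
    """
    stats = [[0, 0] for _ in range(num_classes)]
    for r, o, j in zip(rec, org, classes):
        if j is None:
            continue
        stats[j][0] += o - r     # assumed convention: original minus reconstructed
        stats[j][1] += 1
    return stats


def optimal_offset(sum_j, nb_pix_j):
    """Offset_j = Sum_j / SumNbPix_j, rounded to the nearest integer."""
    if nb_pix_j == 0:
        return 0                 # empty class: no offset can be estimated
    return round(sum_j / nb_pix_j)
```

Floor or ceiling rounding could equally be substituted in optimal_offset.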

To evaluate an RD cost for a merge of SAO parameters, the encoder uses the statistics set in table CTUStats. In the following example for the SAO Merge Left, considering the type for Luma Left_Type_Y and the four related offsets O_Left_0, O_Left_1, O_Left_2, O_Left_3, the distortion can be obtained by the following formula:

DistortionLeftY =
((CTUStats[Y][Left_Type_Y][0][1] x O_Left_0 x O_Left_0 - CTUStats[Y][Left_Type_Y][0][0] x O_Left_0 x 2) >> Shift)
+ ((CTUStats[Y][Left_Type_Y][1][1] x O_Left_1 x O_Left_1 - CTUStats[Y][Left_Type_Y][1][0] x O_Left_1 x 2) >> Shift)
+ ((CTUStats[Y][Left_Type_Y][2][1] x O_Left_2 x O_Left_2 - CTUStats[Y][Left_Type_Y][2][0] x O_Left_2 x 2) >> Shift)
+ ((CTUStats[Y][Left_Type_Y][3][1] x O_Left_3 x O_Left_3 - CTUStats[Y][Left_Type_Y][3][0] x O_Left_3 x 2) >> Shift)

The variable Shift is designed for a distortion adjustment. The distortion should be negative as SAO is a post filtering.
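As an illustrative sketch, the merge distortion can be computed from the per-class statistics of the type used by the neighboring CTU. Here stats[j] is assumed to hold the pair (Sumj, SumNbPixj) for class j, with differences taken as original minus reconstructed. The function name is hypothetical.

```python
def merge_distortion(stats, offsets, shift=0):
    """Distortion change obtained by reusing a neighbor's four offsets.

    Each class contributes (SumNbPix_j * O_j * O_j - Sum_j * O_j * 2) >> shift.
    """
    total = 0
    for (sum_j, nb_pix_j), o in zip(stats, offsets):
        total += (nb_pix_j * o * o - sum_j * o * 2) >> shift
    return total
```

A negative result indicates that reusing the neighbor's offsets lowers the distortion for the current CTU.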

The same computation is applied for Chroma components. The Lambda of the rate distortion cost is fixed for the three components. For SAO parameters merged with the left CTU, the rate is only 1 flag which is CABAC coded.

The encoding process illustrated in Figure 14 is applied in order to find the best offset in terms of rate distortion criterion, offset referred to as ORDj. This process is applied in steps 1109 to 1112.

In an initial step 901 of the encoding process of Figure 14, the rate distortion value Jj is initialized to the maximum possible value. Then a loop on Oj from Ooptj to 0 is applied in step 902. Note that Oj is modified by 1 at each new iteration of the loop. If Ooptj is negative, the value Oj is incremented and if Ooptj is positive, the value Oj is decremented. The rate distortion cost related to Oj is computed in step 903 according to the following formula:

$$J(O_j) = \mathrm{SumNbPix}_j \times O_j \times O_j - \mathrm{Sum}_j \times O_j \times 2 + \lambda R(O_j)$$

where λ is the Lagrange parameter and R(Oj) is a function which provides the number of bits needed for the code word associated with Oj.

Formula ‘SumNbPixj x Oj x Oj - Sumj x Oj x 2’ gives the improvement in terms of the distortion provided by the use of the offset Oj. If J(Oj) is less than Jj then Jj = J(Oj) and ORDj is equal to Oj in step 904. If Oj is equal to 0 in step 905, the loop ends and the best ORDj for the class j is selected.

This algorithm of Figures 13 and 14 provides a best ORDj for each class j. This algorithm is repeated for each of the four directions of Figure 7A. Then the direction that provides the best rate distortion cost (sum of Jj for each direction) is selected as the direction to be used for the current CTU.
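The search of Figure 14 for one class j can be sketched as follows. The rate model is passed in as a hypothetical callback rate(o) returning a number of bits, since the actual code word lengths depend on the entropy coder. The statistics are assumed to use the original-minus-reconstructed convention.

```python
def best_rd_offset(sum_j, nb_pix_j, o_opt, lam, rate):
    """Scan O_j from O_opt_j towards 0 and keep the lowest-cost offset.

    Returns the pair (ORD_j, J_j).
    """
    best_cost = float("inf")                  # step 901
    best_offset = 0
    step = -1 if o_opt > 0 else 1             # move towards zero
    o = o_opt
    while True:                               # step 902
        cost = nb_pix_j * o * o - sum_j * o * 2 + lam * rate(o)  # step 903
        if cost < best_cost:                  # step 904
            best_cost, best_offset = cost, o
        if o == 0:                            # step 905
            break
        o += step
    return best_offset, best_cost
```

With λ equal to 0 the search returns the distortion-optimal offset, whereas a larger λ favors offsets that are cheaper to code.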

This algorithm (Figures 13 and 14) for selecting the offset values at the encoder side for the Edge offset tool can be easily applied to the Band Offset filter to select the best position (SAO_band_position) where j is in the interval [0,32[ instead of the interval [1,4[ in Figure 13. It involves changing the value 4 to 32 in modules 801, 810, 811. More specifically, for the 32 classes of Figure 8, the parameter Sumj (j=[0,32[) is computed. This corresponds to computing for each range j, the difference between the current pixel value (Pi) and its original value (Porgi), each pixel of the image belonging to a single range j. Then the best offset in terms of rate distortion ORDj is computed for the 32 classes, with the same process as described in Figure 14.

The next step involves finding the best position of the SAO band position of Figure 8. This is determined with the encoding process set out in Figure 15. The RD cost Jj for each range has been computed with the encoding process of Figure 14 with the optimal offset ORDj in terms of rate distortion. In Figure 15, in an initial step 1001 the rate distortion value J is initialized to the maximum possible value. Then a loop on the 28 positions i of 4 consecutive classes is run in step 1002. Next, the variable Ji corresponding to the RD cost of the band (of 4 consecutive classes) is initialized to 0 in step 1003. Then the loop on the four consecutive offsets j is run in step 1004. Ji is incremented by the RD costs of the four classes Jj in step 1005 (j = i to i+3).

If this cost Ji is inferior to the Best RD cost J, J is set to Ji, and sao_band_position = i in step 1007, and the next step is step 1008.

Otherwise, the next step is step 1008.

Test 1008 checks whether or not the loop on the 28 positions has ended. If not, the process continues in step 1002, otherwise the encoding process returns the best band position as being the current value of sao_band_position 1009.
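The band position search of Figure 15 amounts to a sliding-window minimum over the per-class RD costs. A minimal sketch, assuming a list class_costs holding the 32 values Jj obtained with the process of Figure 14, is given below. The function name and default arguments are hypothetical.

```python
def best_band_position(class_costs, num_positions=28, band_width=4):
    """Return (sao_band_position, J) minimizing the summed RD cost of the band."""
    best_cost = float("inf")                          # step 1001
    best_pos = 0
    for i in range(num_positions):                    # step 1002
        cost_i = sum(class_costs[i:i + band_width])   # steps 1003 to 1005
        if cost_i < best_cost:                        # step 1007 when better
            best_cost, best_pos = cost_i, i
    return best_pos, best_cost
```

Because the comparison is strict, the first of several equally good positions is retained.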

Thus, the CTUStats table in the case of determining the SAO parameters at the CTU level is created by the process of Figure 12. This corresponds to evaluating the CTU level in terms of the rate-distortion compromise. The evaluation may be performed for the whole image or for just the current slice.

Figure 16 shows various different groupings 1201-1206 of CTUs in a slice.

A first grouping 1201 has individual CTUs. This first grouping requires one set of SAO parameters per CTU. It corresponds to the CTU-level previously mentioned.

A second grouping 1202 makes all CTUs of the entire image one group. Thus, in contrast to the CTU-level, all CTUs of the frame (and hence the slice which is either the entire frame or a part thereof) share the same SAO parameters.

To make all CTUs of the image share the same SAO parameters one of two methods can be used. In both methods, the encoder first computes a set of SAO parameters to be shared by all CTUs of the image. Then, in the first method, these SAO parameters are set for the first CTU of the slice. For each remaining CTU from the second CTU to the last CTU of the slice, the sao_merge_left_flag is set equal to 1 if the flag exists (that is, if the current CTU has a left CTU). Otherwise, the sao_merge_up_flag is set equal to 1. Figure 17 shows an example of CTUs with SAO parameters set according to the first method. This method has the advantage that no signalling of the grouping to the decoder is required. Also, no changes to the decoder are required to introduce the groupings and only the encoder is changed. The groupings could therefore be introduced in an encoder based on HEVC without modifying the HEVC decoder. Surprisingly, groupings do not increase the rate too much. This is because the merge flags are generally CABAC coded in the same context. Since for the second grouping (entire image) these flags all have the same value (1), the context probability converges towards 1 and the rate consumed by these flags is very low.
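The first method can be illustrated by the following sketch, which lists the setting of each CTU in raster scan order. It assumes, for simplicity, that the slice covers the whole frame. The function name and the string labels are hypothetical.

```python
def frame_merge_settings(width_in_ctus, height_in_ctus):
    """Return per-CTU settings, in raster order, for the whole-image grouping."""
    settings = []
    for addr in range(width_in_ctus * height_in_ctus):
        if addr == 0:
            settings.append("new")         # shared parameters coded once here
        elif addr % width_in_ctus != 0:
            settings.append("merge_left")  # sao_merge_left_flag = 1
        else:
            settings.append("merge_up")    # no left CTU: sao_merge_up_flag = 1
    return settings
```

Only the first CTU carries explicit SAO parameters, while every other CTU costs a single merge flag (or two at the start of a line, where the left flag does not exist and only the up flag is coded).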

In the second method of making all CTUs of the image share the same SAO parameters, the grouping is signalled to the decoder in the bitstream. The SAO parameters are also signalled as SAO parameters for the group (whole image), for example in the slice header. In this case, the signalling of the grouping consumes bandwidth. However, the merge flags can be dispensed with, saving the rate related to the merge flags, so that overall the rate is reduced.

The first and second groupings 1201 and 1202 provide very different rate-distortion compromises. The first grouping 1201 is at one extreme, giving very fine control of the SAO parameters (CTU by CTU), which should lower distortion, but at the expense of a lot of signalling. The second grouping is at the other extreme, giving very coarse control of the SAO parameters (one set for the whole image), which raises distortion but has very light signalling.

Next, a description will be given of how to determine in the encoder the SAO parameters for the second grouping 1202. In the second grouping 1202 the determination is done for a whole image and all CTUs of the slice/frame share the same SAO parameters.

Figure 18 is an example of the setting of SAO parameters for a frame/slice level using the first method of sharing SAO parameters (i.e. without new SAO classifications at encoder side). This Figure is based on Figure 17. At the beginning of the process, the CTUStats table is set for each CTU (in the same way as the CTU level encoding choice). This CTUStats can be used for the traditional CTU level 1302. Then the table FrameStats is set by adding each value for all CTUs of the table CTUStats 1303. Then the same process as for CTU level is applied to find the best SAO parameters 1305 to 1315. To set the SAO parameters for all CTUs of the frame, the SAO parameters set selected at step 1315 is set for the first CTU of the slice/frame. Then for each CTU from the second CTU to the last CTU of the slice/frame, the sao_merge_left_flag is set equal to 1 if it exists, otherwise the sao_merge_up_flag is set equal to 1 (indeed for the second CTU to the last CTU a merge Left or Up or both exist) 1317. The syntax of the SAO parameters set is unchanged from that presented in Figure 9. At the end of the process the SAO parameters are set for the whole slice/frame.

Thus, the CTUStats table in the case of determining the SAO parameters for the whole image (frame level) is created by the process of Figure 18. This corresponds to evaluating the frame level in terms of the rate-distortion compromise.

The evaluations are then compared and the one with the best performance is selected.

The example of determining the SAO parameters in Figure 18 corresponds to the first method of sharing SAO parameters as it uses the merge flags to share the SAO parameters among all CTUs of the image (see steps 1316 and 1317). These steps can be omitted if a second method of sharing SAO parameters is used as described in further embodiments below.

Figure 19 is an example of the setting of SAO parameters sets for a third grouping 1203 at the encoder side. This Figure is based on Figure 12. To reduce the number of steps in the figure, the modules 1105 to 1115 have been merged in one step 1405 in this Figure 19. At the beginning of the process, the CTUStats table is set for each CTU. This CTUStats can be used for the traditional CTU level 1302 encoding choice. For each column 1403 of the current slice/frame, the table ColumnStats is set by adding each value 1405 from CTUStats 1402, for each CTU of the current column 1404. Then the new SAO parameters are determined as for CTU level 1406 encoding choice (cf. Figure 12). If it is not the first column, the RD cost to share the SAO parameters with the previous left column is also evaluated 1407, in the same way as the sharing of SAO parameters set between left and up CTU 1103, 1104 is evaluated. If the sharing of SAO parameters gives a better RD cost 1408 than the RD cost for the new SAO parameters set, the sao_merge_left_flag is set equal to 1 for the first CTU of the column. This CTU has the address number equal to the value “Column”. Otherwise, the SAO parameters set for this first CTU of the column is set equal (1409) to the new SAO parameters obtained in step 1406.

For all other CTUs of the column 1411, the sao_merge_left_flag is set equal to 0 if it exists and the sao_merge_up_flag is set equal to 1. Then the SAO parameters set for the next column can be processed 1403. Please note that, except for the first line of CTUs, all other CTUs of the frame have the sao_merge_left_flag equal to 0 if it exists and the sao_merge_up_flag equal to 1. So, step 1412 can be processed once per frame.
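The resulting per-CTU settings for the column grouping might be sketched as follows. The list merge_with_left_column is a hypothetical input recording, for each column, the outcome of the RD comparison between new parameters and sharing with the previous column; the string labels are likewise illustrative.

```python
def column_group_settings(width_in_ctus, height_in_ctus, merge_with_left_column):
    """Return per-CTU settings, in raster order, for the CTU-column grouping."""
    settings = []
    for row in range(height_in_ctus):
        for col in range(width_in_ctus):
            if row > 0:
                # sao_merge_left_flag = 0 (if it exists), sao_merge_up_flag = 1
                settings.append("merge_up")
            elif col > 0 and merge_with_left_column[col]:
                settings.append("merge_left")   # share with the previous column
            else:
                settings.append("new")          # new parameters for this column
    return settings
```

Only the first line of CTUs depends on the RD decisions, which is why the settings of all remaining lines can be fixed once per frame.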

The advantage of this CTU grouping is another RD compromise between the CTU level encoding choice and the frame level which can be useful for some conditions. Also, in this example, merge flags are used within the group, which means that the third grouping can be introduced without modifying the decoder (i.e. the grouping can be HEVC-compliant). Of course, the second method of sharing SAO parameters described in the third embodiment can be used instead. In that case, merge flags are not used in the group (CTU column) and steps 1411 and 1412 are omitted.

In one variant, the merge between columns does not need to be checked. This means that steps 1407, 1408 and 1410 are removed from the process of Figure 19. The advantage of removing this possibility is a simplification of the implementation and the ability to parallelize the process. This has a small impact on coding efficiency.

Another possible compromise intermediate between the CTU level and the frame level can be offered by a fourth grouping 1204 in Figure 16, which makes a line of CTUs a group. To determine the SAO parameters for this fourth grouping, a similar process to that of Figure 19 can be applied. In that case, the variable ColumnStats is replaced by LineStats. The step 1403 is replaced by “For Line = 0 to Num_CTU_in_Height”. The step 1404 is replaced by “For CTU_in_line = 0 to Num_CTU_in_Width”. Step 1405 is replaced by LineStats[][][][] += CTUStats[Line * Num_CTU_in_Width + CTU_in_line][][][][]. The new SAO parameters and the merge with the up CTU are evaluated based on this LineStats table (steps 1406, 1407). The step 1410 is replaced by setting the sao_merge_up_flag to 1 for the first CTU of the Line. And for all CTUs of the slice/frame except each first CTU of each Line, the sao_merge_left_flag is set equal to 1.

The advantage of the line is another RD compromise between the CTU level and Frame level. Please note that the frame or slice are most of the time rectangles and their width is larger than their height. So the line CTUs grouping 1204 is expected to be an RD compromise closer to the frame CTU grouping 1202 than the column CTU grouping 1203.

As for the other CTU groupings 1202 and 1203, the line CTU grouping can be HEVC compliant if the merge flags are used within the groups.

As for the column CTU grouping 1203, the evaluation of merging two lines can be removed.

Further RD compromises can be offered by putting two or more columns of CTUs or two or more lines of CTUs together as a group. The process of Figure 18 can be adapted to determine SAO parameters for such groups.

In one embodiment, the number N of columns or lines in a group may depend on the number of groups that are targeted.

The use of several columns or lines for the CTU groupings may be particularly advantageous when the slices or frames are large (for HD, 4K or beyond).

As described previously, in one variant, the merge between these groups containing two or more columns or two or more lines doesn’t need to be evaluated.

Another possible grouping includes split columns or split lines, where the split is tailored to the current slice/frame.

Another possible compromise between the CTU level and the frame level can be offered by square CTU groupings 1205 and 1206 as illustrated in Figure 18. The grouping 1205 makes 2x2 CTUs a group. The grouping 1206 makes 3x3 CTUs a group.

Figure 20 shows an example of how to determine the SAO parameters for such groupings. For each NxN group 1503, the table NxNStats 1507 is set 1504, 1505, 1506 based on CTUStats. This table is used to determine the new SAO parameters 1508 and their RD cost, in addition to the RD cost for a Left 1510 sharing or Up 1509 sharing of SAO parameters. If the best RD cost is that of the new SAO parameters 1511, the SAO parameters of the first CTU (top left CTU) of the NxN group are set equal to these new SAO parameters 1514. If the best RD cost is that of sharing SAO parameters with the up NxN group 1512, the sao_merge_up_flag of the first CTU (top left CTU) of the NxN group is set equal to 1 and the sao_merge_left_flag to 0 1515. If the best RD cost is that of sharing SAO parameters with the left NxN group 1513, the sao_merge_left_flag of the first CTU (top left CTU) of the NxN group is set equal to 1 1516. Then the sao_merge_left_flag and sao_merge_up_flag are set correctly for the other CTUs of the NxN group in order to form the SAO parameters for the current NxN group 1517. Figure 21 illustrates this setting for a 3x3 SAO group. The top left CTU is set according to the SAO parameters determined in steps 1508 to 1516. For the two other top CTUs, the sao_merge_left_flag is set equal to 1. As the sao_merge_left_flag is the first flag encoded or decoded and as it is set to 1, there is no need to set the sao_merge_up_flag to 0. For the two other CTUs in the first column, the sao_merge_left_flag is set equal to 0 and the sao_merge_up_flag is set equal to 1. For the other CTUs, the sao_merge_left_flag is set equal to 1.
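The setting of the merge flags inside an NxN group can be sketched as follows. This is only an illustrative sketch: the function name merge_flags_inside_group and the (row, column) keying are hypothetical, groups truncated by the frame or slice border are ignored, and a sao_merge_up_flag that would not be coded (because sao_merge_left_flag is 1) is simply stored as 0.

```python
from typing import Dict, Tuple


def merge_flags_inside_group(n: int) -> Dict[Tuple[int, int], Tuple[int, int]]:
    """Return {(row, col): (sao_merge_left_flag, sao_merge_up_flag)} for every
    CTU of an NxN group except the top-left CTU, which carries the group's own
    SAO parameters or its merge decision towards a neighbouring group."""
    flags: Dict[Tuple[int, int], Tuple[int, int]] = {}
    for row in range(n):
        for col in range(n):
            if row == 0 and col == 0:
                continue                      # handled by the group-level decision
            if col > 0:
                flags[(row, col)] = (1, 0)    # copy parameters from the left CTU
            else:
                flags[(row, col)] = (0, 1)    # first column: copy from the CTU above
    return flags


flags_3x3 = merge_flags_inside_group(3)
```

With this setting every CTU of the group ends up with the parameters of the top-left CTU while the bitstream still uses only the CTU-level merge syntax.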

The advantage of the NxN CTU groupings is to create several RD compromises for SAO. As for the other groupings, these groupings can be HEVC compliant if merge flags within the groups are used. As for the other groupings, the test of Merge left and Merge up between groups can be dispensed with in Figure 19. So steps 1509, 1510, 1512, 1513, 1515 and 1516 can be removed, especially when N is high.

In one variant, the value N depends on the size of the frame/slice. The advantage of this embodiment is to obtain an efficient RD compromise.

In a preferred variant, only N equal to 2 and 3 are evaluated. This offers an efficient compromise.

The possible groupings are in competition with one another as the SAO parameter derivation to be selected for the current slice. An example about how to select the SAO parameter derivation using a rate-distortion compromise comparison is described below according to a sixth embodiment of the invention in reference to Figure 22.

Figure 23 is a flow chart illustrating a decoding process when the CTU grouping is signalled in the slice header according to the second method of sharing SAO parameters among the CTUs of the group. First the flag SaoEnabledFlag is extracted from the bitstream 1801. If SAO is not enabled, the next slice header syntax element is decoded 1807 and SAO is not applied to the current slice. Otherwise the decoder extracts N bits from the slice header 1803. N depends on the number of available CTU groupings. Ideally, the number of CTU groupings should be equal to 2 to the power of N. The corresponding CTU grouping index 1804 is used to select the CTU grouping method 1805. This grouping method is applied to extract the SAO syntax and to determine the SAO parameter set for each CTU 1806. Then the next slice header syntax element is decoded.
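A minimal sketch of this slice-header parsing is given below. It is not a definitive implementation: the bitstream is modelled as an iterator of 0/1 values, the index is assumed to be read most significant bit first, and the function name parse_slice_header_sao and the example list of four groupings are hypothetical.

```python
from typing import Iterator, List, Optional


def parse_slice_header_sao(bits: Iterator[int],
                           groupings: List[str]) -> Optional[str]:
    """Return the selected CTU grouping, or None when SAO is disabled."""
    sao_enabled = next(bits)                    # SaoEnabledFlag
    if not sao_enabled:
        return None                             # SAO not applied to this slice
    n = (len(groupings) - 1).bit_length()       # N bits address up to 2**N groupings
    index = 0
    for _ in range(n):                          # extract N bits, MSB first (assumed)
        index = (index << 1) | next(bits)
    return groupings[index]                     # grouping index selects the method


# Hypothetical set of four groupings, so that N = 2.
EXAMPLE_GROUPINGS = ["CTU level", "column", "line", "frame"]
selected = parse_slice_header_sao(iter([1, 1, 0]), EXAMPLE_GROUPINGS)
```

When the number of groupings is not a power of two, some index values are unused and a real parser would need to reject them; the sketch simply raises IndexError in that case.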

The advantage of the signalling at slice header of the CTUs grouping is its low impact on the bitrate.

But when the number of slices is significant for a frame, it may be desirable to reduce this signalling. So, in one variant, the CTUs grouping index uses a unary max code in the slice header. In that case, the CTUs groupings are ordered according to their probabilities of occurrences (highest to lowest).
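A unary max (truncated unary) code of the kind mentioned above can be sketched as follows. The bit convention used here (a run of 1s terminated by a 0, with the terminating 0 omitted for the largest index) and the function names are assumptions made for illustration.

```python
from typing import Iterator, List


def encode_unary_max(index: int, max_index: int) -> List[int]:
    """Truncated unary code: `index` ones, then a zero unless index == max_index."""
    bits = [1] * index
    if index < max_index:
        bits.append(0)
    return bits


def decode_unary_max(bits: Iterator[int], max_index: int) -> int:
    """Inverse of encode_unary_max; stops reading once max_index is reached."""
    index = 0
    while index < max_index and next(bits) == 1:
        index += 1
    return index


# With six groupings (indices 0..5), the most probable grouping costs one bit.
most_probable = encode_unary_max(0, 5)
least_probable = encode_unary_max(5, 5)
```

Because the code length grows with the index, ordering the groupings from most to least probable keeps the average signalling cost low.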

For example, at least one SAO parameter derivation is an intermediate level derivation (SAO parameters not at CTU level or at group level). When applied to a group it causes the group (e.g. frame or slice) to be subdivided into subdivided parts (CTU groupings 1203-1206, e.g. columns of CTUs, lines of CTUs, NxN CTUs, etc.) and derives SAO parameters for each of the subdivided parts. Each subdivided part is made up of two or more said image parts (CTUs). The advantage of the intermediate level derivation(s) is introduction of one or more effective rate-distortion compromises. The intermediate level derivation(s) can be used without the CTU-level derivation or without the frame-level derivation or without either of those two derivations.

Preferably, the smallest grouping is the first grouping 1201 in which each CTU is a group and there is one set of SAO parameters per CTU. However, a set of SAO parameters can be applied to a smaller block than the CTU. In this case, the derivation is not at the CTU level, frame level or an intermediate level between the CTU and frame levels but at a sub-CTU level (a level smaller than an image part).

In this case, instead of signalling a grouping it is effective to signal an index representing a depth of the SAO parameters.

The table below shows one example of a possible indexing scheme:

Depth of SAO parameters	One set of SAO parameters per
0	1/16 CTU
1	1/4 CTU
2	CTU level
3	2x2 CTUs
4	3x3 CTUs
5	Frame Level

The index 0 means that each CTU is divided into 16 blocks and each may have its own SAO parameters. Index 1 means that each CTU is divided into 4 blocks, again each having its own SAO parameters.
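As a worked illustration of this indexing scheme, the following sketch counts how many sets of SAO parameters a frame would carry at each depth. The function name sao_sets_per_frame is hypothetical, and the rounding up for 2x2 and 3x3 groups that are truncated at the frame border is an assumption not stated in the table.

```python
import math


def sao_sets_per_frame(depth: int, width_in_ctus: int, height_in_ctus: int) -> int:
    """Number of SAO parameter sets for a frame at a given depth index."""
    ctus = width_in_ctus * height_in_ctus
    if depth == 0:
        return 16 * ctus                 # one set per 1/16 CTU
    if depth == 1:
        return 4 * ctus                  # one set per 1/4 CTU
    if depth == 2:
        return ctus                      # CTU level
    if depth in (3, 4):
        n = 2 if depth == 3 else 3       # 2x2 or 3x3 CTUs per group
        # Partial groups at the border are assumed to count as full groups.
        return math.ceil(width_in_ctus / n) * math.ceil(height_in_ctus / n)
    if depth == 5:
        return 1                         # frame level
    raise ValueError("unknown depth index")


# Example: a frame of 6 x 4 CTUs.
counts = [sao_sets_per_frame(d, 6, 4) for d in range(6)]
```

The count falls monotonically with the depth index, which is what makes a single index a convenient way to express the trade-off between signalling cost and adaptivity.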

The selected derivation is then signalled to the decoder in the bitstream. The signalling may comprise a depth syntax element (e.g. using the indexing scheme above).

In a variant, at least one derivation when applied to a group causes the group to be subdivided into subdivided parts and derives SAO parameters for each of the subdivided parts, and each image part is made up of two or more said subdivided parts.

In a variant, the first derivation when applied to a group causes the group to have SAO parameters at a first level, and the second derivation when applied to a group causes the group to have SAO parameters at a second level different from the first level. The levels may be any two levels from the frame level to a sub-CTU level. The levels may correspond to the groupings 1201-1206 in Figure 12.

Preferably, the level of the SAO parameters is signalled for a slice, which means that the derivation is used for all CTUs of the slice.

Also, when the selected level of the SAO parameters for a slice is an intermediate level between the slice level and the CTU level, a derivation may be selected per CTU group (e.g. each column of CTUs) of the slice or frame.

In Figure 24 the SAO merge flags are usable between groups of the CTU grouping. As depicted in Figure 24, for the 2x2 CTU grouping, the SAO merge left and SAO merge up flags are kept for each group of 2x2 CTUs, but they are removed for CTUs inside the group. Please note that only the sao_merge_left_flag is used for the grouping 1203 of a column of CTUs, and only the sao_merge_up_flag is used for the grouping 1204 of a line of CTUs.

In a variant, a flag signals if the current CTU group shares its SAO parameters or not. If it is true, a syntax element representing one of the previous groups is signalled. So each group of a slice can be predicted by a previous group except the first one. This improves the coding efficiency by adding several new possible predictors.

Referring back to a previous example, it was mentioned that a default set of SAO parameters can be used when the collocated CTU does not use SAO or when none of the collocated CTUs uses SAO. In a variant, the default set depends on the selected grouping. For example, a first default set may be associated with one grouping (or one level of SAO parameters) and a second default set may be associated with another grouping (or another level of SAO parameters). The size of the groups (or the level of SAO parameters) is found to have an influence on which SAO parameters work efficiently as the default set. The different default sets may be determined by the encoder and transmitted to the decoder in the sequence parameter set. Then, the decoder uses the appropriate default set according to the grouping selected for the current slice.

In a variant, a depth of the SAO parameters was selected for a slice, including depths smaller than a CTU, making it possible to have a set of SAO parameters per block in a CTU.

Embodiments of the present invention described below are intended to improve the coding efficiency of SAO by using various techniques for determining one or more SAO parameters of an image part in a current image.

First group of embodiments

In a first group of embodiments, it is proposed to improve the use of the syntax elements that enable, for one image part (for instance a CTU), the inferring of SAO parameters from another image part (another CTU), namely the flags “sao_merge_up_flag” and “sao_merge_left_flag” described by reference to Figure 9.

First embodiment

In a first embodiment, a method is proposed of signalling, in a bitstream, Sample Adaptive Offset (SAO) filtering on an image, the image comprising a plurality of image parts, the image parts being grouped into a plurality of groups. The method comprises: determining whether an image part is of a predetermined group, and if the image part is of the predetermined group, then including in the bitstream a first syntax element, for signalling that SAO parameters for performing SAO filtering on the image part are inferred from SAO parameters used for filtering another image part, else not including the first syntax element.

An example of the first embodiment is illustrated in Figure 25, which is a flow chart illustrating the steps of a process that may be implemented in an encoder, to encode SAO parameters according to the first embodiment. More precisely this figure illustrates as an example, the signalling of the inferring or not of SAO parameters provided for CTUs or group of CTUs within a bitstream 4114, based on CTU grouping.

Preferably, when there is no signalling for inferring the SAO parameters, for example, if the image part is not of the predetermined group, then SAO parameters for filtering the image part are included in the bitstream.

More precisely, the encoder first checks in a test 4102 whether an item of information, for example a grouping index 4101, is set equal to the predetermined level, for example the CTU level. As a variant, the SAO merge flags (first syntax elements) are not included in the bitstream if said image part is of a group comprising at least two image parts. As another variant, if the CTU is of a group made up of

- 2*2 CTUs, or

- 3*3 CTUs, or

- a line of CTUs (or partial line), or

- a column of CTUs (or partial column), or

- all the CTUs of the image,

then the SAO parameters for filtering the image part are included in the bitstream.

Moreover, if the image part is of a group comprising partitioned image parts, then the first syntax element may be included in the bitstream.

More precisely if the image part is of a group comprising image parts partitioned in 16 portions, or image parts partitioned in 8 portions, or image parts partitioned in 4 portions, then the first syntax element is included in the bitstream.

If the test result is false, then for the three components Y,U,V 4111, a new set of SAO parameters is inserted 4112 in the bitstream 4114. The steps 4111 and 4112 are repeated until the last CTU is processed.

If the grouping index is set equal to the CTU level value (“Yes” for the test 4102), and if the left CTU (meaning the CTU located at the left side of the processed CTU) exists (test 4103), the sao_merge_left_flag is inserted in a step 4104 in the bitstream 4114.

If the sao_merge_left_flag is equal to false (or the value ‘0’) in test 4105 and if the up CTU (meaning the CTU located above the processed CTU) exists in a test 4109, the sao_merge_up_flag is inserted in a step 4107, in the bitstream 4114.

If the flag sao_merge_up_flag is set equal to false (or the value ‘0’), test 4108, a new SAO parameter set is inserted in the bitstream in the steps 4111, 4112 and 4113.
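The encoder-side signalling for one CTU can be sketched as follows. This is an illustrative sketch under simplifying assumptions: the bitstream is modelled as a list of (name, value) symbols rather than entropy-coded bits, the per-component loop is collapsed into a single "sao_params" symbol, and the function name write_ctu_sao and its parameters are hypothetical.

```python
from typing import Any, List, Tuple

Symbol = Tuple[str, Any]


def write_ctu_sao(grouping_is_ctu_level: bool, left_exists: bool, up_exists: bool,
                  merge_left: bool, merge_up: bool, params: Any) -> List[Symbol]:
    """Return the symbols written for one CTU (or group of CTUs)."""
    out: List[Symbol] = []
    if grouping_is_ctu_level:                 # merge flags only for the predetermined level
        if left_exists:
            out.append(("sao_merge_left_flag", int(merge_left)))
            if merge_left:
                return out                    # parameters inferred from the left CTU
        if up_exists:
            out.append(("sao_merge_up_flag", int(merge_up)))
            if merge_up:
                return out                    # parameters inferred from the up CTU
    out.append(("sao_params", params))        # explicit new SAO parameter set
    return out


merged_left = write_ctu_sao(True, True, True, True, False, "P")
no_flags = write_ctu_sao(False, True, True, True, True, "P")
```

For any grouping other than the predetermined one, no merge flag is written at all, so the per-group cost is exactly that of one parameter set.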

In a variant, step 4112 may also comprise the insertion of a flag sao_merge_flags_enabled. The flag is a second syntax element, associated with the group of the image part, for signalling whether the use of the first syntax element(s) is enabled or disabled.

Second embodiment

In the second embodiment, the flag sao_merge_flags_enabled may be included in the bitstream based on a criterion (for instance the group the CTU belongs to, or the prediction or encoding mode which is used). This flag signals whether the use of the SAO merge flags, which signal that SAO parameters for performing SAO filtering on the image part are inferred from SAO parameters used for filtering another image part, is enabled or disabled.

The new steps of the process that may be implemented in an encoder.

For example, the index or the first or second syntax element when included in the bitstream, are inserted at:

the image level, or the sequence level, when the image is of a video sequence.

Third embodiment

Figure 25 is a flow chart illustrating the steps of a process to parse SAO parameters, as an alternative to the process illustrated in Figure 9, and in relation to the encoding steps implemented in an encoder described in Figure 22. The steps are preferably implemented in a decoder.

This third embodiment proposes a method of performing sample adaptive offset (SAO) filtering on an image comprising a plurality of image parts, the image parts being grouped into a plurality of groups. The method comprises:

obtaining information indicating whether the group of an image part is of predetermined groups, and if the group of the image part is of the predetermined groups, then obtaining a first syntax element signalling inferring SAO parameters for performing SAO filtering on said image part, from SAO parameters used for filtering another image part, else not obtaining the first syntax element.

Preferably, the process of Figure 25 is applied for each CTU to generate a set of SAO parameters for all components. Before decoding SAO parameters for a CTU or a group of CTUs (when there is grouping of CTUs), an item of information is tested in a test 4014. For instance, the information may be the grouping index 4013 previously mentioned. It is tested whether this index is set equal to a value indicating a predetermined group or a set of predetermined groups, for example corresponding to the CTU level index 4014. Said CTU level index may have been previously decoded from the header, for example. In that case, the sao_merge_left_flag (first syntax element) is extracted in a step 4003 from a bitstream 4002 and, if needed, the sao_merge_up_flag (first syntax element) is also extracted in a step 4005 from the bitstream 4002.

If the grouping index is not set equal to the CTU level index in the test 4014, the flags sao_merge_left_flag and sao_merge_up_flag are not considered and new SAO parameters are extracted in a step 4007 from the bitstream 4002.

The other steps 4004, 4006 and 4008-4012 are respectively similar to steps 504, 506, 508-512 in Figure 9 previously described.
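The decoder-side parsing decision can be sketched as follows. This is a simplified illustration: the bitstream is modelled as an iterator of 0/1 values, the parsing of the SAO parameters themselves is omitted, and the function name parse_ctu_sao and its parameters are hypothetical.

```python
from typing import Iterator


def parse_ctu_sao(bits: Iterator[int], merge_flags_present: bool,
                  left_exists: bool, up_exists: bool) -> str:
    """Return how the SAO parameters of the current CTU are obtained:
    'merge_left', 'merge_up', or 'new' (an explicit set follows in the bitstream)."""
    if merge_flags_present:                       # grouping index equals the CTU level index
        if left_exists and next(bits) == 1:       # sao_merge_left_flag
            return "merge_left"
        if up_exists and next(bits) == 1:         # sao_merge_up_flag
            return "merge_up"
    return "new"                                  # a new SAO parameter set is extracted


# When the merge flags are not present, no bit is consumed before the parameters.
no_flag_result = parse_ctu_sao(iter([]), False, True, True)
```

The same structure applies if the gating condition is a dedicated enabling flag instead of the grouping index: only the value passed as merge_flags_present changes.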

In one variant, the merge flags are kept for CTU level but removed for all other CTU groupings, as illustrated in Figure 25. The advantage is a flexibility of the CTU level.

In one variant, if said image part is of a group comprising at least two image parts, then the first syntax element(s) is (are) not obtained.

In another variant, the merge flags are used for CTUs when the SAO signalling is lower than or equal to the CTU level (1/16 CTU, 1/8 CTU or 1/4 CTU) and removed for other CTU groupings having larger groups.

The following table illustrates this embodiment:

SAO block size	SAO merge flag?
1/16 CTU	TRUE
1/8 CTU	TRUE
1/4 CTU	TRUE
CTU	TRUE
2x2 CTUs	FALSE
3x3 CTUs	FALSE
Line	FALSE
Column	FALSE
Frame level	FALSE

In other words, if said image part is of a group made up of

2*2 image parts, or

3*3 image parts, or

a line of image parts (or partial line), or

a column of image parts (or partial column), or

all the image parts of the image,

then the first syntax element(s) is (are) not included in the bitstream.

Fourth embodiment

Figure 26 is a flow chart illustrating a variant of the embodiment of Figure 25. Here, a new test 4214 evaluates whether the value of a syntax element, the flag sao_merge_flags_enabled (second syntax element), is equal to true in order to enable the decoding of the flags sao_merge_left_flag and sao_merge_up_flag tested in tests 4203 and 4205, instead of checking the value of the grouping index as illustrated in Figure 25.

In other words, the information is a second syntax element associated with the group of the image part, signalling whether the use of the first syntax element (sao_merge_left_flag and sao_merge_up_flag) is enabled or disabled.

Fifth embodiment

In a fifth embodiment, a method is proposed of performing sample adaptive offset (SAO) filtering on an image comprising a plurality of image parts, the image parts being grouped into a plurality of groups.

The method comprises:

obtaining a syntax element in the bitstream based on a predetermined criterion, for signalling whether the use of one or more other syntax elements for signalling that SAO parameters for performing SAO filtering on the image part are inferred from SAO parameters used for filtering another image part or not, is enabled or disabled.

For instance, the predetermined criterion is the fact that said image part is of a predetermined group or not.

In the third, fourth and fifth embodiments, when the steps are implemented in a decoder for decoding an image from a bitstream (respectively 4002 or 4202), said information or SAO parameters, or first or second syntax elements are parsed from the bitstream.

As for the encoder side (Figure 25), said information or SAO parameters, or first or second syntax elements, when parsed from the bitstream, concern:

the image level, or the sequence level, when the image is of a video sequence.

In the previous embodiments, a default set of SAO parameters can be used when the collocated CTU does not use SAO or when none of the collocated CTUs uses SAO. In a variant, the default set depends on the selected depth of SAO parameters. For example, a first default set may be associated with one depth (for example 1/16) and a second default set may be associated with another depth (for example 1/4). The depth is found to have an influence on which SAO parameters work efficiently as the default set. The different default sets may be determined by the encoder and transmitted to the decoder in the sequence parameter set. Then, the decoder uses the appropriate default set according to the depth selected for the current slice.

In a variant, one possibility is to remove the SAO merge flags for all levels. This means that steps 503, 504, 505 and 506 of Figure 9 are removed. The advantage is that it significantly reduces the signalling of SAO and consequently reduces the bitrate. Moreover, it simplifies the design by removing two syntax elements at CTU level.

The merge flags are important for small block sizes because an SAO parameter set is costly compared to the number of samples that it can improve. In that case, these syntax elements reduce the cost of SAO parameter signalling. For large groups, the SAO parameter set is relatively less costly, so the usage of merge flags is not efficient. So the advantage of these embodiments is a coding efficiency increase.

Sixth embodiment

In the sixth embodiment, several SAO derivations are evaluated at the encoder and the related SAO syntax at CTU level is inserted in the bitstream. Consequently, the decoder is not modified. One of the advantages is that this embodiment can be used with an HEVC-compliant decoder.

All these different groupings, defined in the previous embodiments, can be compared at encoder side to select the one which gives the best RD compromise for the current frame. Figure 22 illustrates this embodiment. More precisely Figure 22 illustrates an example of how to select the SAO parameter derivation using a rate-distortion compromise comparison.

One possibility to increase the coding efficiency at the encoder side is to test all possible SAO groupings, but this would increase the encoding time compared to the example of Figure 22, where a small subset of groupings is evaluated.

The current slice/frame 1701 is used to set the CTUStats table 1703 for each CTU 1702. This table 1703 is used to evaluate the CTU level 1704, the frame/slice grouping 1705, the column grouping 1706, the line grouping 1707, the 2x2 CTU grouping 1708 or 3x3 CTU grouping 1709, or any of the other CTU groupings described previously (in a non-limitative way). The best CTU grouping is selected according to the rate-distortion criterion computed for each grouping 1710. The SAO parameter sets for each CTU are set (1711) according to the grouping selected in step 1710. These SAO parameters 1712 are then used to apply the SAO filtering 1713 in order to obtain the filtered frame/slice. The SAO parameters for each CTU 1711 are then inserted inside the bitstream as described in Figure 9.

The advantage of this embodiment is that it doesn’t require any modification of HEVC SAO at decoder side, so this method is HEVC compliant.

The main advantage is a coding efficiency increase. The second advantage is that this competition method does not require any additional SAO filtering or classification. Indeed, the main impacts on encoder complexity are step 1702, which needs SAO classification for all possible SAO types, and step 1713, which filters the samples. All other CTU grouping evaluations are only additions of values already obtained during the CTU level encoding choice (set in the table CTUStats).
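The competition between groupings can be sketched as follows, to show that each candidate is evaluated purely by summing per-CTU statistics. This is a heavily simplified sketch, not the encoder's actual method: each CTU's statistics are reduced to a single pair (sample count N, sum of original-minus-decoded differences E) for one class, the distortion change for an offset h is estimated as N*h*h - 2*h*E, offsets are not clipped, and the rate is a fixed, hypothetical number of bits per parameter set. All function and parameter names are hypothetical.

```python
from typing import Dict, List, Tuple

Stats = Tuple[int, int]  # (sample count N, sum of differences E) for one CTU


def group_cost(stats: List[Stats], lam: float, bits_per_set: int) -> float:
    """RD cost of sharing one offset over all CTUs of a group."""
    n = sum(s[0] for s in stats)                 # statistics of a group are plain sums
    e = sum(s[1] for s in stats)
    offset = round(e / n) if n else 0            # offset minimising the estimate
    delta_d = n * offset * offset - 2 * offset * e
    return delta_d + lam * bits_per_set


def select_grouping(ctu_stats: List[Stats],
                    groupings: Dict[str, List[List[int]]],
                    lam: float, bits_per_set: int) -> Tuple[str, Dict[str, float]]:
    """Evaluate every grouping from the same CTU statistics and keep the cheapest."""
    costs = {name: sum(group_cost([ctu_stats[i] for i in group], lam, bits_per_set)
                       for group in groups)
             for name, groups in groupings.items()}
    return min(costs, key=costs.get), costs


# A 2x2-CTU frame in raster order; the two lines have opposite error signs.
example_stats = [(100, 200), (100, 210), (100, -300), (100, -290)]
example_groupings = {
    "CTU level": [[0], [1], [2], [3]],
    "frame": [[0, 1, 2, 3]],
    "column": [[0, 2], [1, 3]],
    "line": [[0, 1], [2, 3]],
}
best, costs = select_grouping(example_stats, example_groupings, 10.0, 20)
```

In this example the line grouping wins because CTUs of the same line have similar statistics, so sharing one parameter set per line keeps almost all of the distortion gain while halving the rate compared with the CTU level.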

Second group of embodiments

Seventh embodiment

Accordingly, in the seventh embodiment, the competition between the different permitted SAO parameter derivations is modified so that only one derivation is permitted in the encoder for any given slice or frame. The permitted derivation may be determined in dependence upon one or more characteristics of the slice or frame. For example, the permitted derivation may be selected based on the slice type (Intra, Inter P, Inter B), the quantization level (QP) of the slice, or the position in the hierarchy of a Group of Pictures (GOP).

The advantage of this embodiment is a complexity reduction. Instead of evaluating two or more competing derivations just one derivation is selected, which can be useful for a hardware encoder.

Thus, in a variant, a first derivation is associated with first groups of the image (e.g. Intra slices) and a second derivation is associated with second groups of the image (e.g. Inter P slices). It is determined whether a group to be filtered is a first group or a second group. If it is determined that the group to be filtered is a first group, the first derivation is used to filter the image parts of the group, and if it is determined that the group to be filtered is a second group, the second derivation is used to filter the image parts of the group. Evaluation of the two derivations is not required.
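Selecting a single derivation from a characteristic of the slice can be sketched as a simple lookup. The particular mapping below (smaller groups for Intra slices, larger groups for Inter slices) and all names are hypothetical and chosen only for illustration; the text does not fix a specific association.

```python
# Hypothetical association between slice type and the single permitted derivation.
DERIVATION_BY_SLICE_TYPE = {
    "I": "CTU level",   # Intra slices
    "P": "line",        # Inter P slices
    "B": "frame",       # Inter B slices
}


def permitted_derivation(slice_type: str) -> str:
    """Return the only derivation evaluated for a slice of the given type."""
    return DERIVATION_BY_SLICE_TYPE[slice_type]


intra_choice = permitted_derivation("I")
```

Since the choice is a table lookup, no rate-distortion comparison between derivations is needed, which is the complexity reduction referred to above.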

Whether a group to be filtered is determined to be a first group or a second group may depend on one or more of:

a slice type;

a frame type of the image to which the group to be filtered belongs;

a position in a quality hierarchy of a Group of Pictures of the image to which the group to be filtered belongs;

a quality of the image to which the group to be filtered belongs; and

a quantisation parameter applicable to the group to be filtered.

For example, when the first groups have a higher quality or higher position in the quality hierarchy than the second groups, the first derivation may have fewer image parts per group than the second derivation.

In a variant, a particular derivation of the SAO parameters was selected for a given slice or frame. However, if the encoder has the capacity to evaluate a limited number of competing derivations, it is unnecessary to eliminate the competition altogether. The competition for a given slice or frame is still permitted but the set of competing derivations is adapted to the slice or frame.

The set of competing derivations may depend on the slice type.

For Intra slices, the set preferably contains groupings with groups containing small numbers of CTUs (e.g. CTU level, 2x2 CTU, 3x3 CTU, and Column). Also, if depths lower than a CTU are available (as in the tenth embodiment), these depths are preferably also included.

For Inter slices, the set of derivations preferably contains groupings with groups containing large numbers of CTUs, such as Line and Frame level. However, smaller groupings can also be considered down to the CTU level.

The advantage of this embodiment is a coding efficiency increase thanks to the use of derivations adapted for a slice or frame.

In one variant, the set of derivations can be different for an Inter B slice from that for an Inter P slice.

In another variant, the set of competing derivations depends on the characteristics of the frame in the GOP. This is especially beneficial for frames which vary in quality (QP) based on a quality hierarchy. For the frames with the highest quality or highest position in the hierarchy, the set of competing derivations should include groups containing few CTUs or even sub-CTU depths (same as for Intra slices above). For frames with a lower quality or lower position in the hierarchy, the set of competing derivations should include groups with more CTUs.

The set of competing derivations can be defined in the sequence parameters set.

Thus, in the seventh embodiment a first set of derivations is associated with first groups of the image (e.g. Intra slices) and a second set of derivations is associated with second groups of the image (e.g. Inter P slices). It is determined whether a group to be filtered is a first group or a second group. If it is determined that the group to be filtered is a first group, a derivation is selected from the first set of derivations and used to filter the image parts of the group, and if it is determined that the group to be filtered is a second group, a derivation is selected from the second set of derivations and used to filter the image parts of the group. Evaluation of derivations not in the associated set of derivations is not required.

Whether a group to be filtered is a first group or a second group may be determined as in the preceding variant. For example, when the first groups have a higher quality or higher position in the quality hierarchy than the second groups, the first set of derivations may have at least one derivation with fewer image parts per group than the derivations of the second set of derivations.

The set of CTUs groupings can be defined in the sequence parameters set.

In other words, the seventh embodiment proposes a method of encoding an image comprising a plurality of image parts. The method comprises predicting one or more image parts from one or more other image parts according to a first or a second prediction mode, grouping the predicted image parts into one or more groups of a plurality of groups, according to the used prediction mode, and performing sample adaptive offset (SAO) filtering on predicted image parts based on the grouping.

For example, the image part is predicted from another image part within said image, using an intra prediction mode or from another image part within another reference image than said image, using an inter prediction mode.

Figure 28 shows a system 191, 195 comprising at least one of an encoder 150 or a decoder 100 and a communication network 199 according to embodiments of the present invention. According to an embodiment, the system 195 is for processing and providing a content (for example, a video and audio content for displaying/outputting or streaming video/audio content) to a user, who has access to the decoder 100, for example through a user interface of a user terminal comprising the decoder 100 or a user terminal that is communicable with the decoder 100. Such a user terminal may be a computer, a mobile phone, a tablet or any other type of a device capable of providing/displaying the (provided/streamed) content to the user. The system 195 obtains/receives a bitstream 101 (in the form of a continuous stream or a signal - e.g. while earlier video/audio are being displayed/output) via the communication network 199. According to an embodiment, the system 191 is for processing a content and storing the processed content, for example a video and audio content processed for displaying/outputting/streaming at a later time. The system 191 obtains/receives a content comprising an original sequence of images 151, which is received and processed (including filtering with a deblocking filter according to the present invention) by the encoder 150, and the encoder 150 generates a bitstream 101 that is to be communicated to the decoder 100 via a communication network 199. The bitstream 101 is then communicated to the decoder 100 in a number of ways, for example it may be generated in advance by the encoder 150 and stored as data in a storage apparatus in the communication network 199 (e.g. on a server or a cloud storage) until a user requests the content (i.e. the bitstream data) from the storage apparatus, at which point the data is communicated/streamed to the decoder 100 from the storage apparatus.
The system 191 may also comprise a content providing apparatus for providing/streaming, to the user (e.g. by communicating data for a user interface to be displayed on a user terminal), content information for the content stored in the storage apparatus (e.g. the title of the content and other meta/storage location data for identifying, selecting and requesting the content), and for receiving and processing a user request for a content so that the requested content can be delivered/streamed from the storage apparatus to the user terminal. Alternatively, the encoder 150 generates the bitstream 101 and communicates/streams it directly to the decoder 100 as and when the user requests the content. The decoder 100 then receives the bitstream 101 (or a signal) and performs filtering with a deblocking filter according to the invention to obtain/generate a video signal 109 and/or audio signal, which is then used by a user terminal to provide the requested content to the user.

In the preceding embodiments, the functions described may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software, the functions may be stored on or transmitted over, as one or more instructions or code, a computer-readable medium and executed by a hardware-based processing unit.

Computer-readable media may include computer-readable storage media, which corresponds to a tangible medium such as data storage media, or communication media including any medium that facilitates transfer of a computer program from one place to another, e.g., according to a communication protocol. In this manner, computer-readable media generally may correspond to (1) tangible computer-readable storage media which is non-transitory or (2) a communication medium such as a signal or carrier wave. Data storage media may be any available media that can be accessed by one or more computers or one or more processors to retrieve instructions, code and/or data structures for implementation of the techniques described in this disclosure. A computer program product may include a computer-readable medium.

By way of example, and not limitation, such computer-readable storage media can comprise RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage, or other magnetic storage devices, flash memory, or any other medium that can be used to store desired program code in the form of instructions or data structures and that can be accessed by a computer. Also, any connection is properly termed a computer-readable medium. For example, if instructions are transmitted from a website, server, or other remote source using a coaxial cable, fiber optic cable, twisted pair, digital subscriber line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of medium. It should be understood, however, that computer-readable storage media and data storage media do not include connections, carrier waves, signals, or other transient media, but are instead directed to non-transient, tangible storage media. Disk and disc, as used herein, includes compact disc (CD), laser disc, optical disc, digital versatile disc (DVD), floppy disk and Blu-ray disc, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.

Instructions may be executed by one or more processors, such as one or more digital signal processors (DSPs), general purpose microprocessors, application specific integrated circuits (ASICs), field programmable gate arrays (FPGAs), or other equivalent integrated or discrete logic circuitry. Accordingly, the term processor, as used herein, may refer to any of the foregoing structures or any other structure suitable for implementation of the techniques described herein. In addition, in some aspects, the functionality described herein may be provided within dedicated hardware and/or software modules configured for encoding and decoding, or incorporated in a combined codec. Also, the techniques could be fully implemented in one or more circuits or logic elements.

The techniques of this disclosure may be implemented in a wide variety of devices or apparatuses, including a wireless handset, an integrated circuit (IC) or a set of ICs (e.g., a chip set). Various components, modules, or units are described in this disclosure to emphasize functional aspects of devices configured to perform the disclosed techniques, but do not necessarily require realization by different hardware units. Rather, as described above, various units may be combined in a codec hardware unit or provided by a collection of interoperative hardware units, including one or more processors as described above, in conjunction with suitable software and/or firmware.

Claims

1. A method of signalling in a bitstream, a Sample Adaptive Offset (SAO) filtering on an image, the image comprising a plurality of image parts, the image parts being grouped into a plurality of groups, the method comprising:

determining whether an image part is of a predetermined group, and if the image part is of the predetermined group, then including in the bitstream a first syntax element, for signalling that SAO parameters for performing SAO filtering on the image part are inferred from SAO parameters used for filtering another image part, else not including the first syntax element.

2. The method of claim 1, wherein if the image part is not of the predetermined group, then including SAO parameters for filtering the image part in the bitstream.

3. The method of claim 1 or 2, wherein if said image part is of a group comprising one image part only, then the first syntax element is included in the bitstream.

4. The method of claim 1 or 2, wherein if said image part is of a group comprising at least two image parts, then the first syntax element is not included in the bitstream.

5. The method of claim 1 or 2, wherein the image is made up of lines and columns of image parts, and if said image part is of a group made up of 2*2 image parts, or 3*3 image parts, or a line of image parts, or a column of image parts, or all the image parts of the image, then not including the first syntax element in the bitstream.

6. The method of claim 1, 2 or 4, wherein if the image part is of a group comprising partitioned image parts, then the first syntax element is included in the bitstream.

7. The method of claim 6, wherein if the image part is of a group comprising image parts partitioned in 16 portions, or image parts partitioned in 8 portions, or image parts partitioned in 4 portions, then the first syntax element is included in the bitstream.

8. The method of claim 1, further comprising including in the bitstream an index for indicating the group of the image part.

9. The method of claim 1, further comprising, including in the bitstream a second syntax element associated with the group of the image part, for signalling whether the use of the first syntax element is enabled or disabled.

10. The method of claim 1 or 8 or 9, wherein the index or the first or second syntax element when included in the bitstream, are inserted at:

the image level, or the sequence level, when the image is of a video sequence.

11. A method of encoding an image comprising signalling sample adaptive offset (SAO) filtering in a bitstream, using the method of any one of claims 1 to 10.

12. A method of signalling in a bitstream, a Sample Adaptive Offset (SAO) filtering on an image, the image comprising a plurality of image parts, the image parts being grouped into a plurality of groups, the method comprising:

determining whether an image part satisfies a predetermined criterion, and including a syntax element in the bitstream based on said criterion, for signalling whether the use of one or more other syntax elements for signalling that SAO parameters for performing SAO filtering on the image part are inferred from SAO parameters used for filtering another image part or not, is enabled or disabled.

13. The method of claim 12, wherein the image part satisfies a predetermined criterion if said image part is of a predetermined group.

14. A method of encoding an image comprising signalling sample adaptive offset (SAO) filtering in a bitstream, using the method of any one of claims 12 to 13.

15. A method of performing sample adaptive offset (SAO) filtering on an image comprising a plurality of image parts, the image parts being grouped into a plurality of groups, the method comprising:

16. The method of claim 15, wherein if the group of the image part is not of the predetermined groups, then SAO parameters for filtering the image part are obtained.

17. The method of claim 15 or 16, wherein if said image part is of a group comprising one image part only, then the first syntax element is obtained.

18. The method of claim 15 or 16, wherein if said image part is of a group comprising at least two image parts, then the first syntax element is not obtained.

19. The method of claim 15 or 16, wherein the image is made up of lines and columns of image parts, and if said image part is of a group made up of

2*2 image parts, or

20. The method of claim 15 or 16, wherein if the image part is of a group comprising partitioned image parts, then the first syntax element is obtained.

21. The method of claim 20, wherein if the image part is of a group comprising image parts partitioned in 16 portions, or image parts partitioned in 8 portions, or image parts partitioned in 4 portions, then the first syntax element is obtained.

22. The method of claim 15 or 16, wherein said information is an index for indicating the group of the image part.

23. The method of claim 15 or 16, wherein said information is a second syntax element associated with the group of the image part, signalling whether the use of the first syntax element is enabled or disabled.

24. A method of decoding from a bitstream, an image comprising performing sample adaptive offset (SAO) filtering using the method of any one of claims 15 to 23, wherein said information or SAO parameters, or first or second syntax elements are parsed from the bitstream.

25. The method of claim 24, wherein said information or SAO parameters, or first or second syntax elements when parsed from the bitstream, concerns:

the image level, or the sequence level, when the image is of a video sequence.

26. A method of performing sample adaptive offset (SAO) filtering on an image comprising a plurality of image parts, the image parts being grouped into a plurality of groups, the method comprising:

27. The method of claim 26, wherein the predetermined criterion is the fact that said image part is of a predetermined group or not.

28. A method of decoding from a bitstream, an image comprising performing sample adaptive offset (SAO) filtering using the method of any one of claims 26 to 27, wherein said syntax elements or other syntax elements are parsed from the bitstream.

29. A method of applying a Sample Adaptive Offset (SAO) filtering on an image, the image comprising a plurality of image parts, the image parts being grouped into a plurality of groups, the method comprising for an image part:

determining statistics about the image part, evaluating based on the determined statistics of the image part, values of a predetermined criterion, when the image part is grouped according to at least two different groups, selecting based on said values a best group for the image part, filtering the image part by applying SAO parameters to the image part, based on the selected group, and providing the filtered image part.

30. A method of encoding an image made up of image parts, comprising a step for applying a Sample Adaptive Offset (SAO) filtering according to claim 29.

31. A method of encoding an image comprising a plurality of image parts, the method comprising predicting one or more image parts from one or more other image parts according to a first or a second prediction mode, grouping the predicted image parts into one or more groups of a plurality of groups, according to the used prediction mode, and performing sample adaptive offset (SAO) filtering on predicted image parts based on the grouping.

32. The method of claim 31, wherein the image part is predicted from another image part within said image, using an intra prediction mode, or from another image part within a reference image other than said image, using an inter prediction mode.

33. A computer program product for a programmable apparatus, the computer program product comprising a sequence of instructions for implementing each of the steps of the method according to any one of claims 1 to 32 when loaded into and executed by the programmable apparatus.

34. A non-transitory computer-readable storage medium storing instructions of a computer program for implementing each of the steps of the method according to any one of claims 1 to 32.

35. A device for signalling in a bitstream, a Sample Adaptive Offset (SAO) filtering on an image, according to any one of claims 1 to 32.

36. A device for encoding an image or a sequence of images according to claim 14 or any one of claims 30 to 32.

37. A device for performing sample adaptive offset (SAO) filtering on an image, according to any one of claims 26 to 27.

38. A device for applying a Sample Adaptive Offset (SAO) filtering on an image, according to claim 29.

39. A device for decoding an image or a sequence of images according to claim 28.
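
By way of illustration only, the conditional signalling recited in claims 1 and 2 may be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the group labels, the choice of the one-part group as the "predetermined group" (in the manner of claim 3), the element names `sao_merge_flag` and `sao_params`, and the sending of explicit parameters when the flag is zero are all hypothetical and chosen for readability.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical group labels; the claims only refer to "groups" of image parts.
GROUP_SINGLE = "single"  # group comprising one image part only
GROUP_2X2 = "2x2"        # group made up of 2*2 image parts
GROUP_LINE = "line"      # group made up of a line of image parts

# Assumption: the "predetermined group" is the one-part group.
MERGE_FLAG_GROUPS = {GROUP_SINGLE}


@dataclass
class ImagePart:
    group: str
    sao_params: Tuple[int, ...]  # placeholder for SAO type and offsets
    merge: bool = False          # True if parameters are inferred from another part


def signal_sao(part: ImagePart) -> List[Tuple[str, object]]:
    """Return the (name, value) syntax elements written for one image part."""
    elements: List[Tuple[str, object]] = []
    if part.group in MERGE_FLAG_GROUPS:
        # Predetermined group: the first syntax element is included.
        elements.append(("sao_merge_flag", int(part.merge)))
        if not part.merge:
            # Assumption: explicit parameters follow when nothing is inferred.
            elements.append(("sao_params", part.sao_params))
    else:
        # Other groups: no first syntax element, explicit parameters instead.
        elements.append(("sao_params", part.sao_params))
    return elements
```

In this sketch an image part of a 2*2 group never carries the flag, so the bits that would otherwise be spent on it are saved, whereas an image part of the one-part group can still inherit its parameters from another image part.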