WO2019233999A1 - Video coding and decoding - Google Patents
Video coding and decoding
- Publication number
- WO2019233999A1 (application PCT/EP2019/064456)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- scheme
- image
- sao
- changing
- grouping
- Prior art date
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/117—Filters, e.g. for pre-processing or post-processing
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N13/00—Stereoscopic video systems; Multi-view video systems; Details thereof
- H04N13/10—Processing, recording or transmission of stereoscopic or multi-view image signals
- H04N13/106—Processing image signals
- H04N13/161—Encoding, multiplexing or demultiplexing different image signal components
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/134—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
- H04N19/136—Incoming video signal characteristics or properties
- H04N19/14—Coding unit complexity, e.g. amount of activity or edge presence estimation
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/17—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
- H04N19/174—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a slice, e.g. a line of blocks or a group of blocks
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/17—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
- H04N19/176—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/182—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a pixel
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/189—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding
- H04N19/196—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding being specially adapted for the computation of encoding parameters, e.g. by averaging previously computed encoding parameters
- H04N19/198—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding being specially adapted for the computation of encoding parameters, e.g. by averaging previously computed encoding parameters including smoothing of a sequence of encoding parameters, e.g. by averaging, by choice of the maximum, minimum or median value
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/46—Embedding additional information in the video signal during the compression process
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/70—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/80—Details of filtering operations specially adapted for video compression, e.g. for pixel interpolation
- H04N19/82—Details of filtering operations specially adapted for video compression, e.g. for pixel interpolation involving filtering within a prediction loop
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/85—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression
- H04N19/86—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression involving reduction of coding artifacts, e.g. of blockiness
Definitions
- the present invention relates to video coding and decoding.
- VVC Versatile Video Coding
- the goal of VVC is to provide a significant improvement in compression performance over the existing HEVC standard (typically twice the compression of HEVC), with completion planned for 2020.
- the main target applications and services include, but are not limited to, 360-degree and high-dynamic-range (HDR) videos.
- HDR high-dynamic-range
- JVET evaluated responses from 32 organizations using formal subjective tests conducted by independent test labs.
- Some proposals demonstrated compression efficiency gains of typically 40% or more when compared to using HEVC. Particular effectiveness was shown on ultra-high definition (UHD) video test material. Thus, we may expect compression efficiency gains well beyond the targeted 50% for the final standard.
- UHD ultra-high definition
- JEM JVET exploration model
- SAO sample adaptive offset
- the basic concept of SAO filtering is to classify pixels into different categories and to calculate an offset value to apply to each reconstructed pixel in a given category to compensate for a difference between the reconstructed pixel and a corresponding original pixel.
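The offset calculation described above can be sketched as follows. This is a minimal illustration, assuming that the offset for a category is the rounded mean difference between original and reconstructed samples in that category; the averaging rule and the function name `category_offsets` are assumptions for illustration and are not taken from the text.

```python
def category_offsets(original, reconstructed, categories):
    """Return one offset per category (hypothetical helper).

    Assumption: the offset is the rounded mean of (original - reconstructed)
    over all samples that fall in the category.
    """
    sums, counts = {}, {}
    for orig, rec, cat in zip(original, reconstructed, categories):
        sums[cat] = sums.get(cat, 0) + (orig - rec)
        counts[cat] = counts.get(cat, 0) + 1
    return {cat: round(sums[cat] / counts[cat]) for cat in sums}


# Two samples in category 0 and two in category 1.
offsets = category_offsets([10, 12, 20, 22], [8, 10, 21, 25], [0, 0, 1, 1])
```

Adding the resulting offset to each reconstructed sample of a category moves the category's mean towards that of the original samples.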
- Two basic types of SAO filtering are available in HEVC.
- the first type is an edge type, also referred to sometimes as edge offset (EO) filtering.
- the second type is a band type, also referred to sometimes as band offset (BO) filtering.
- EO edge offset
- BO band offset
- LCU Largest Coding Unit
- Edge type filtering classifies pixels of an image area into different categories by comparing a target pixel with neighbouring pixels. The sample values of the pixels (samples) are compared.
- the edge type filtering in HEVC uses a scheme for classifying samples which has four different directions (0, 90, 135 and 45 degrees) and considers two neighbouring pixels in each direction.
- the two neighbouring pixels in the 0-degree direction are the pixels to the left and right of the target pixel.
- the two neighbouring pixels in the 90-degree direction are the pixels above and below the target pixel.
- the two neighbouring pixels in the 135-degree direction are the pixels to the upper left and lower right of the target pixel.
- the two neighbouring pixels in the 45-degree direction are the pixels to the lower left and upper right of the target pixel.
- Category 0 applies when the target sample is less than both neighbouring samples.
- Category 1 applies when the target sample is lower than one neighbouring sample and equal to the other neighbouring sample.
- Category 3 applies when the target sample is higher than one neighbouring sample and equal to the other neighbouring sample.
- Category 4 applies when the target sample is greater than both neighbouring samples.
- the remaining category (category 2) applies when none of the four other categories applies.
- An HEVC encoder evaluates all four directions and selects one direction, for example the direction which provides the best filtering result in terms of a rate-distortion criterion.
- the direction is signalled in the bitstream using an SAO parameter sao_eo_class.
- the encoder signals four offset values in the bitstream. These are the offset values for categories 0, 1, 3 and 4 respectively. No offset value is signalled for category 2 (offset value assumed to be 0).
- the decoder uses the same scheme as the encoder to categorise reconstructed pixels, the two neighbouring pixels used for the classification being determined by the received direction parameter. The decoder applies to the reconstructed pixel the appropriate one of the 4 offset values for the category of the reconstructed pixel concerned.
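The edge-type classification and offset application described above can be sketched as follows. The category numbering (0, 1, 3 and 4 with offsets, 2 without) follows the text; the function names, the handling of samples at the border of the area (left unfiltered) and the clipping to an 8-bit range are assumptions made for illustration only.

```python
# Neighbour positions (dy, dx) for each edge direction in degrees.
NEIGHBOURS = {
    0: ((0, -1), (0, 1)),      # left, right
    90: ((-1, 0), (1, 0)),     # above, below
    135: ((-1, -1), (1, 1)),   # upper left, lower right
    45: ((1, -1), (-1, 1)),    # lower left, upper right
}


def sign(v):
    return (v > 0) - (v < 0)


def edge_category(c, n1, n2):
    """Category 0..4 of target sample c given its two neighbours."""
    return 2 + sign(c - n1) + sign(c - n2)


def apply_edge_offset(img, direction, offsets, max_val=255):
    """offsets maps categories 0, 1, 3 and 4 to offset values."""
    (dy1, dx1), (dy2, dx2) = NEIGHBOURS[direction]
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(h):
        for x in range(w):
            y1, x1, y2, x2 = y + dy1, x + dx1, y + dy2, x + dx2
            # Assumption: samples lacking a neighbour are left unfiltered.
            if not (0 <= y1 < h and 0 <= x1 < w and 0 <= y2 < h and 0 <= x2 < w):
                continue
            cat = edge_category(img[y][x], img[y1][x1], img[y2][x2])
            off = offsets.get(cat, 0)  # category 2 has no offset
            out[y][x] = min(max_val, max(0, img[y][x] + off))
    return out


filtered = apply_edge_offset([[10, 8, 10], [10, 10, 12]], 0,
                             {0: 2, 1: 1, 3: -1, 4: -2})
```

In this example the middle sample of the first row is lower than both horizontal neighbours (category 0), and the middle sample of the second row is equal to one neighbour and lower than the other (category 1).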
- Band-type filtering divides a full range of sample values (e.g. 0 to 255) into bands and classifies pixels using the bands.
- the band-type filtering in HEVC uses a scheme for classifying samples which has 32 bands and selects a group of 4 successive bands for each image area to be filtered. The range covered by the group is one-eighth of the full range. There is one offset value for each of the 4 bands of the selected group.
- An HEVC encoder evaluates all possible 4-band groups and selects one group, for example the group which provides the best filtering result in terms of a rate-distortion criterion. The position of the first band of the selected group is signalled in the bitstream using an SAO parameter sao_band_position.
- the encoder signals four offset values in the bitstream. These are the offset values for the four bands of the selected group respectively. No offset value is signalled for any band outside the selected group (offset value assumed to be 0).
- the decoder uses the same scheme as the encoder to classify reconstructed pixels into bands. The group of 4 bands is determined by the received position parameter. If the reconstructed pixel is in one of the 4 bands of the selected group the decoder applies to the reconstructed pixel the appropriate one of the 4 offset values for the band of the reconstructed pixel concerned.
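The band-type classification described above can be sketched as follows, with 32 equal bands over the full range and a group of 4 successive bands starting at the signalled position. The function names are hypothetical, and the sketch assumes that the group does not wrap around the end of the range and that filtered samples are clipped to the valid range.

```python
def band_index(sample, bit_depth=8):
    """Index (0..31) of the band containing the sample."""
    # 32 equal bands: keep the 5 most significant bits of the sample.
    return sample >> (bit_depth - 5)


def apply_band_offset(samples, band_position, offsets, bit_depth=8):
    """Apply the 4 offsets to samples in the 4 bands from band_position."""
    max_val = (1 << bit_depth) - 1
    out = []
    for s in samples:
        k = band_index(s, bit_depth) - band_position
        if 0 <= k < 4:  # sample lies in the selected group
            s = min(max_val, max(0, s + offsets[k]))
        out.append(s)
    return out


# Bands 8 to 11 cover sample values 64 to 95 for 8-bit samples.
filtered = apply_band_offset([0, 64, 71, 72, 95, 96, 255], 8, [1, 2, 3, 4])
```

Samples outside the selected group (here 0, 96 and 255) pass through unchanged, which matches an offset value assumed to be 0.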
- JCT-VC D0122 entitled “CE8 Subtest3: Picture Quad tree Adaptive Offset” proposed using six patterns for EO classification.
- Four patterns, called 1-D patterns, were linear patterns of a target pixel and two neighbouring pixels in one of the four directions 0 degrees, 90 degrees, 135 degrees and 45 degrees. These four 1-D patterns collectively correspond to the HEVC EO classifying scheme described above.
- Two 2-D patterns were also proposed. These are two-dimensional patterns of a target pixel and four neighbouring pixels.
- the first 2-D pattern is like a plus sign (“+”) made up of the target pixel and its left, right, upper and lower neighbouring pixels.
- the second 2-D pattern is like an “X” made up of the target pixel and its upper left, upper right, lower left and lower right neighbouring pixels.
- the resulting scheme for classifying samples has 6 patterns so instead of simply classifying the target pixel as having an edge in one of four directions the target pixel is classified as having an edge in one of the four directions, a first 2-D contour conforming to a + or a second 2-D contour conforming to an X.
- the 2-D patterns have 7 different categories according to the different possible relationships between the sample values of the target pixel and the 4 neighbouring pixels, and 6 of the 7 categories have offset values.
- the 1-D patterns have the 5 different categories (with 4 offset values) of the HEVC scheme.
- the encoder selects one of the patterns (i.e.
- the decoder identifies the selected pattern from the bitstream, determines which one of the 5 or 7 categories applies to the target pixel, and applies the offset value corresponding to the determined category to the target pixel.
- group range 64 sample values when the full range is from 0 to 255) were proposed, one with 8 bands of band size 8 sample values, and the other with 16 bands of band size 4 sample values.
- Three groups of group range one-eighth of the full range (i.e. group range 32 sample values when the full range is from 0 to 255) were proposed, one with 4 bands of band size 8 sample values, another with 8 bands of band size 4 sample values, and the last with 16 bands of band size 2 sample values.
- Three groups of group range one-sixteenth of the full range (i.e. group range 16 sample values when the full range is from 0 to 255) were proposed, one with 2 bands of band size 8 sample values, another with 4 bands of band size 4 sample values, and the last with 8 bands of band size 2 sample values.
- the encoder selects a best group for band filtering an image part (LCU), the group having one of the four group ranges, one of the band sizes (for groups with group ranges smaller than one-half), and one of the 32, 64 or 128 positions according to the band size. The selection is made using a rate-distortion criterion.
- the selected group is signalled using an SAO type index parameter sao_type_idx having different values corresponding to the different group range and band size combinations.
- the number of offsets is 2, 4 or 8 depending on the combination (one offset per band in the group). These offsets are also determined by the encoder and signalled to the decoder.
- the selected position is also signalled in the bitstream as in the HEVC band scheme.
- the decoder identifies the group selected by the encoder using the received SAO type index parameter and, if the target pixel is in one of the bands of the group, applies the offset value for the band concerned to the target pixel.
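The relationship between group range, band size, number of bands and number of positions in the proposal above can be sketched as follows. This is an illustration under stated assumptions: the full range is 0 to 255, the number of positions is taken to be the full range divided by the band size (which gives the 32, 64 and 128 positions mentioned in the text), and the group is assumed to start at position × band size. The function names are hypothetical.

```python
FULL_RANGE = 256  # sample values 0 to 255


def band_group(group_range, band_size):
    """Number of bands in the group and number of possible positions."""
    if group_range % band_size or FULL_RANGE % band_size:
        raise ValueError("band size must divide the group and full ranges")
    return {"bands": group_range // band_size,
            "positions": FULL_RANGE // band_size}


def offset_index(sample, position, group_range, band_size):
    """Index of the band (and offset) for a sample, or None if outside."""
    start = position * band_size  # assumption: position counts in bands
    if start <= sample < start + group_range:
        return (sample - start) // band_size
    return None


# Group of range 32 with band size 4, starting at sample value 64.
idx = offset_index(70, position=16, group_range=32, band_size=4)
```

A sample for which `offset_index` returns `None` lies outside the selected group and is left unfiltered.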
- categories 1 and 3 apply only to one-step edges (edges where the target pixel is different from one neighbouring pixel but the same as the other). Also, categories 1 and 3 do not distinguish which side of the target pixel the step is on. For example, for direction 0 degrees, the step could be on the left or on the right.
- the 4 two-step edge categories enable checking of forward and reverse orders of the neighbouring pixels in each direction. For example, in the 0-degree direction, the left and right neighbours in the forward order are Cn1 and Cn2 respectively but are Cn2 and Cn1 respectively in the reverse order.
- In the forward order it is determined if the target pixel c is closer to Cn1 than to Cn2 (first further category) or if c is closer to Cn2 than to Cn1 (second further category).
- In the reverse order it is determined if c is closer to Cn1 than to Cn2 (third further category) or if c is closer to Cn2 than to Cn1 (fourth further category).
- a method of performing sample adaptive offset (SAO) filtering on an image comprising: applying to an image area to be filtered a scheme for classifying samples of the image area; and adapting the scheme based on a type of the image area to be filtered.
- This aspect of the present invention adapts the scheme for classifying samples based on a type of the image area to be filtered.
- schemes appropriate/efficient for different image-area types are used.
- an image-area type can be based on the number of samples, quality, content or color component. It can also be a CTU grouping (which is effectively another way of controlling the number of samples).
- Some embodiments have just one scheme per image-area type or a limited number of schemes per image-area type. These save evaluations in the encoder and save signalling.
- a method of performing sample adaptive offset (SAO) filtering on an image comprising: selecting different levels of SAO parameters for luma and chroma components respectively.
- This aspect of the present invention selects different levels of SAO parameters for luma and chroma components respectively.
- This aspect is independent of the first aspect because the scheme for classifying samples doesn’t have to be adapted in the second aspect.
- the adapting in the second aspect is an adapting of the grouping to the color component.
- a method of performing sample adaptive offset (SAO) filtering on an image comprising: adapting a scheme for classifying samples from one image area to another image area of the same slice or the same frame.
- the third aspect adapts a scheme for classifying samples from one image area to another image area of the same slice or the same frame.
- This aspect is independent of the first and second aspects because the third aspect does not need different image-area types and does not need to select different levels of SAO parameters for luma and chroma.
- Some embodiments have two or more available schemes for the slice or frame, the scheme to be applied to each image area to be filtered being selected from among the available schemes. This may be accompanied by certain signalling improvements including limiting the number of available schemes and/or signalling a list of the available schemes.
- a method of performing sample adaptive offset (SAO) filtering on an image comprising: applying to an image area to be filtered a scheme for classifying samples of the image area; wherein the scheme is or includes an edge scheme having: a first category applicable when a target pixel has a sample value lower than a first neighbouring pixel on a first side of the target pixel in an edge direction and lower than a second neighbouring pixel on a second side of the pixel opposite the first side in the said edge direction; a second category applicable when said target pixel has a sample value lower than said first neighbouring pixel and equal to said second neighbouring pixel; a third category applicable when said target pixel has a sample value equal to said first neighbouring pixel and lower than said second neighbouring pixel; a fourth category applicable when said target pixel has a sample value higher than said first neighbouring pixel and equal to said second neighbouring pixel; a fifth category applicable when said target pixel has a sample value equal to said first neighbouring
- the fourth aspect is directed to an edge scheme illustrated in Figure 13. It is independent of the first to third aspects but each of those aspects may use the edge scheme as the or one scheme for classifying samples.
- the techniques of all aspects can be applied exclusively to edge-type SAO filtering or exclusively to band-type SAO filtering or to both types of SAO filtering.
- Further aspects of the present invention relate to encoding and decoding methods using the methods of any of the first to fourth aspects. Yet further aspects of the present invention relate to a device for performing sample adaptive offset (SAO) filtering on an image as defined by claims 38 to 41 respectively.
- the program may be provided on its own or may be carried on, by or in a carrier medium.
- the carrier medium may be non-transitory, for example a storage medium, in particular a computer-readable storage medium.
- the carrier medium may also be transitory, for example a signal or other transmission medium.
- the signal may be transmitted via any suitable network, including the Internet.
- Figure 1 is a diagram for use in explaining a coding structure used in HEVC
- Figure 2 is a block diagram schematically illustrating a data communication system in which one or more embodiments of the invention may be implemented;
- Figure 3 is a block diagram illustrating components of a processing device in which one or more embodiments of the invention may be implemented;
- Figure 4 is a flow chart illustrating steps of an encoding method according to embodiments of the invention.
- Figure 5 is a flow chart illustrating steps of a loop filtering process in accordance with one or more embodiments of the invention.
- Figure 6 is a flow chart illustrating steps of a decoding method according to embodiments of the invention.
- Figures 7A and 7B are diagrams for use in explaining edge-type SAO filtering in HEVC
- Figure 8 is a diagram for use in explaining band-type SAO filtering in HEVC
- Figure 9 is a flow chart illustrating the steps of a process to decode SAO parameters according to the HEVC specifications
- Figure 10 is a flow chart illustrating in more detail one of the steps of the Figure 9 process
- Figure 11 is a flow chart illustrating how SAO filtering is performed on an image part according to the HEVC specifications;
- Figure 12 is a schematic view for use in explaining a method of performing SAO filtering in a first embodiment of the present invention;
- Figure 13 is a flow chart for use in explaining a scheme for classifying samples used in the first embodiment
- Figure 14 is a schematic view for use in explaining a method of performing SAO filtering in a second embodiment of the present invention.
- Figure 15 is a flow chart for use in explaining quality levels in the second embodiment
- Figure 16 is a schematic view for use in explaining a method of performing SAO filtering in a third embodiment of the present invention.
- Figure 17 is a schematic view for use in explaining a method of performing SAO filtering in a fourth embodiment of the present invention.
- Figure 18 is a schematic view for use in explaining a method of performing SAO filtering in a fifth embodiment of the present invention.
- Figure 19 is a schematic view for use in explaining a method of performing SAO filtering in a sixth embodiment of the present invention
- Figure 20 is a schematic view for use in explaining a method of performing SAO filtering in a seventh embodiment of the present invention
- Figure 21 is a schematic view for use in explaining a method of performing SAO filtering in an eighth embodiment of the present invention.
- Figure 22 shows various different groupings 1201-1206 of CTUs in a slice
- Figure 23 is a diagram showing image parts of a frame in which a first method of sharing SAO parameters is used
- Figure 24 is a flow chart illustrating steps carried out in an encoder to determine SAO parameters at a CTU level in a ninth embodiment of the present invention
- Figure 25 shows one of the steps of Figure 24 in more detail
- Figure 26 shows another one of the steps of Figure 24 in more detail
- Figure 27 is a flow chart illustrating steps carried out in an encoder to determine SAO parameters at a frame level in the ninth embodiment of the present invention.
- Figure 28 is a flowchart of an example of a process for comparing results for the two levels of SAO parameters and selecting a level in the ninth embodiment
- Figure 29 is a flow chart illustrating a decoding process suitable for a second method of sharing SAO parameters among image parts of a group in the ninth embodiment
- Figure 30 is a flow chart illustrating a decoding process suitable for a second method of sharing SAO parameters among image parts of a group in an eleventh embodiment of the present invention
- Figure 31 is a flow chart illustrating a decoding process suitable for a second method of sharing SAO parameters among image parts of a group in a twelfth embodiment of the present invention
- Figure 32 shows one of the steps of Figure 31 in more detail
- Figure 33 is a schematic view for use in explaining a method of performing SAO filtering in a thirteenth embodiment of the present invention
- Figure 34 is a flow chart illustrating steps carried out in an encoder to determine SAO parameters at one intermediate level in a fourteenth embodiment of the present invention
- Figure 35 is a flow chart illustrating steps carried out in an encoder to determine SAO parameters at another intermediate level in the fourteenth embodiment
- Figure 36 is a diagram showing image parts of one NxN group in the fourteenth embodiment.
- Figure 37 is a flowchart of an example of a process for comparing results for various levels of SAO parameters and selecting a level in the fourteenth embodiment
- Figure 38 is a flow chart illustrating a decoding process suitable for a second method of sharing SAO parameters among image parts of a group in the fourteenth embodiment
- Figure 39 is a schematic diagram for use in explaining a scheme for classifying samples used in a fifteenth embodiment of the present invention
- Figure 40 is a schematic diagram for use in explaining a scheme for classifying samples used in an eighteenth embodiment of the present invention.
- Figure 41 is a flow chart illustrating a decoding process in the fourteenth embodiment
- Figure 42 is a flow chart illustrating steps carried out in an encoder to determine SAO parameters for band filtering in a twenty-third embodiment of the present invention
- Figure 43 is a table presenting different groups usable for band schemes in the twenty-third embodiment of the present invention
- Figure 44 is a flow chart illustrating steps carried out in an encoder to determine SAO parameters at a CTU level in a twenty-fourth embodiment of the present invention.
- Figure 45 shows one of the steps of Figure 44 in more detail
- Figure 46 is a flow chart illustrating steps carried out in an encoder to determine SAO parameters at a frame level in the twenty-fourth embodiment.
- Figure 47 is a diagram showing a system comprising an encoder or a decoder and a communication network according to embodiments of the present invention.
- Figure 1 relates to a coding structure used in the High Efficiency Video Coding (HEVC) video standard.
- a video sequence 1 is made up of a succession of digital images i. Each such digital image is represented by one or more matrices. The matrix coefficients represent pixels.
- HEVC High Efficiency Video Coding
- An image 2 of the sequence may be divided into slices 3.
- a slice may in some instances constitute an entire image.
- These slices are divided into non-overlapping Coding Tree Units (CTUs).
- a Coding Tree Unit (CTU) is the basic processing unit of the High Efficiency Video Coding (HEVC) video standard and conceptually corresponds in structure to macroblock units that were used in several previous video standards.
- a CTU is also sometimes referred to as a Largest Coding Unit (LCU).
- a CTU has luma and chroma component parts, each of which component parts is called a Coding Tree Block (CTB). These different color components are not shown in Figure 1.
- CTB Coding Tree Block
- a CTU is generally of size 64 pixels x 64 pixels.
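The division of an image into non-overlapping CTUs can be sketched as a simple grid computation. The sketch assumes 64 x 64 CTUs and assumes that partial CTUs at the right and bottom edges of the image are counted as CTUs; the function name `ctu_grid` is hypothetical.

```python
def ctu_grid(width, height, ctu_size=64):
    """Number of CTU columns and rows needed to cover the image."""
    # Ceiling division: a partial CTU at an edge still counts as one CTU.
    cols = (width + ctu_size - 1) // ctu_size
    rows = (height + ctu_size - 1) // ctu_size
    return cols, rows


cols, rows = ctu_grid(1920, 1080)
total_ctus = cols * rows
```

For a 1920 x 1080 image this gives 30 columns and 17 rows, the last row being only partly covered by image samples.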
- Each CTU may in turn be iteratively divided into smaller variable-size Coding Units (CUs) 5 using a quadtree decomposition.
- CUs variable-size Coding Units
- Coding units are the elementary coding elements and are constituted by two kinds of sub-unit called a Prediction Unit (PU) and a Transform Unit (TU).
- the maximum size of a PU or TU is equal to the CU size.
- a Prediction Unit corresponds to the partition of the CU for prediction of pixel values.
- Various different partitions of a CU into PUs are possible as shown by 606 including a partition into 4 square PUs and two different partitions into 2 rectangular PUs.
- a Transform Unit is an elementary unit that is subjected to spatial transformation using DCT.
- a CU can be partitioned into TUs based on a quadtree representation 607.
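The iterative quadtree decomposition of a CTU into variable-size CUs can be sketched as a recursive split. The split decision is left to a caller-supplied predicate `should_split`, which is a hypothetical stand-in for whatever criterion an encoder uses; the minimum size of 8 and the function name are likewise assumptions for illustration.

```python
def split_ctu(x, y, size, should_split, min_size=8):
    """Return the (x, y, size) blocks produced by a quadtree split."""
    if size > min_size and should_split(x, y, size):
        half = size // 2
        blocks = []
        # Visit the four quadrants row by row, left to right.
        for dy in (0, half):
            for dx in (0, half):
                blocks.extend(
                    split_ctu(x + dx, y + dy, half, should_split, min_size))
        return blocks
    return [(x, y, size)]


# Example rule: split the 64x64 CTU, then split only its top-left 32x32 block.
def example_rule(x, y, size):
    return size == 64 or (size == 32 and x == 0 and y == 0)


cus = split_ctu(0, 0, 64, example_rule)
```

The resulting blocks are non-overlapping and together cover the whole CTU. The same recursion could be applied to partition a CU into TUs.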
- NAL Network Abstraction Layer
- coding parameters of the video sequence are stored in dedicated NAL units called parameter sets.
- SPS Sequence Parameter Set
- PPS Picture Parameter Set
- HEVC also includes a Video Parameter Set (VPS) NAL unit which contains parameters describing the overall structure of the bitstream.
- the VPS is a new type of parameter set defined in HEVC, and applies to all of the layers of a bitstream.
- a layer may contain multiple temporal sub-layers, and all version 1 bitstreams are restricted to a single layer.
- HEVC has certain layered extensions for scalability and multiview and these will enable multiple layers, with a backwards compatible version 1 base layer.
- FIG. 2 illustrates a data communication system in which one or more embodiments of the invention may be implemented.
- the data communication system comprises a transmission device, in this case a server 201, which is operable to transmit data packets of a data stream to a receiving device, in this case a client terminal 202, via a data communication network 200.
- the data communication network 200 may be a Wide Area Network (WAN) or a Local Area Network (LAN).
- WAN Wide Area Network
- LAN Local Area Network
- Such a network may be for example a wireless network (Wifi / 802.11a or b or g), an Ethernet network, an Internet network or a mixed network composed of several different networks.
- the data communication system may be a digital television broadcast system in which the server 201 sends the same data content to multiple clients.
- the data stream 204 provided by the server 201 may be composed of multimedia data representing video and audio data. Audio and video data streams may, in some embodiments of the invention, be captured by the server 201 using a microphone and a camera respectively. In some embodiments data streams may be stored on the server 201 or received by the server 201 from another data provider, or generated at the server 201.
- the server 201 is provided with an encoder for encoding video and audio streams in particular to provide a compressed bitstream for transmission that is a more compact representation of the data presented as input to the encoder. In order to obtain a better ratio of the quality of transmitted data to quantity of transmitted data, the compression of the video data may be for example in accordance with the HEVC format or H.264/AVC format.
- the client 202 receives the transmitted bitstream and decodes the reconstructed bitstream to reproduce video images on a display device and the audio data by a loud speaker.
- the data communication between an encoder and a decoder may be performed using for example a media storage device such as an optical disc.
- a video image is transmitted with data representative of compensation offsets for application to reconstructed pixels of the image to provide filtered pixels in a final image.
- FIG. 3 schematically illustrates a processing device 300 configured to implement at least one embodiment of the present invention.
- the processing device 300 may be a device such as a micro-computer, a workstation or a light portable device.
- the device 300 comprises a communication bus 313 connected to:
- central processing unit 311 such as a microprocessor, denoted CPU;
- ROM read only memory
- a random access memory 312, denoted RAM, for storing the executable code of the method of embodiments of the invention as well as the registers adapted to record variables and parameters necessary for implementing the method of encoding a sequence of digital images and/or the method of decoding a bitstream according to embodiments of the invention;
- the apparatus 300 may also include the following components:
- -a data storage means 304 such as a hard disk, for storing computer programs for implementing methods of one or more embodiments of the invention and data used or produced during the implementation of one or more embodiments of the invention;
- the disk drive being adapted to read data from the disk 306 or to write data onto said disk;
- the apparatus 300 can be connected to various peripherals, such as for example a digital camera 320 or a microphone 308, each being connected to an input/output card (not shown) so as to supply multimedia data to the apparatus 300.
- the communication bus provides communication and interoperability between the various elements included in the apparatus 300 or connected to it.
- the representation of the bus is not limiting and in particular the central processing unit is operable to communicate instructions to any element of the apparatus 300 directly or by means of another element of the apparatus 300.
- the disk 306 can be replaced by any information medium such as for example a compact disk (CD-ROM), rewritable or not, a ZIP disk or a memory card and, in general terms, by an information storage means that can be read by a microcomputer or by a microprocessor, integrated or not into the apparatus, possibly removable and adapted to store one or more programs whose execution enables the method of encoding a sequence of digital images and/or the method of decoding a bitstream according to the invention to be implemented.
- CD-ROM compact disk
- the executable code may be stored either in read only memory 306, on the hard disk 304 or on a removable digital medium such as for example a disk 306 as described previously.
- the executable code of the programs can be received by means of the communication network 303, via the interface 302, in order to be stored in one of the storage means of the apparatus 300 before being executed, such as the hard disk 304.
- the central processing unit 311 is adapted to control and direct the execution of the instructions or portions of software code of the program or programs according to the invention, instructions that are stored in one of the aforementioned storage means.
- the program or programs that are stored in a non-volatile memory for example on the hard disk 304 or in the read only memory 306, are transferred into the random access memory 312, which then contains the executable code of the program or programs, as well as registers for storing the variables and parameters necessary for implementing the invention.
- the apparatus is a programmable apparatus which uses software to implement the invention.
- the present invention may be implemented in hardware (for example, in the form of an Application Specific Integrated Circuit or ASIC).
- Figure 4 illustrates a block diagram of an encoder according to at least one embodiment of the invention.
- the encoder is represented by connected modules, each module being adapted to implement, for example in the form of programming instructions to be executed by the CPU 311 of device 300, at least one corresponding step of a method implementing at least one embodiment of encoding an image of a sequence of images according to one or more embodiments of the invention.
- An original sequence of digital images i0 to in 401 is received as an input by the encoder 400.
- Each digital image is represented by a set of samples, known as pixels.
- a bitstream 410 is output by the encoder 400 after implementation of the encoding process.
- the bitstream 410 comprises a plurality of encoding units or slices, each slice comprising a slice header for transmitting encoding values of encoding parameters used to encode the slice and a slice body, comprising encoded video data.
- the input digital images i0 to in 401 are divided into blocks of pixels by module 402.
- the blocks correspond to image portions and may be of variable sizes (e.g. 4x4, 8x8, 16x16, 32x32, 64x64, 128x128 pixels and several rectangular block sizes can be also considered).
- a coding mode is selected for each input block. Two families of coding modes are provided: coding modes based on spatial prediction coding (Intra prediction), and coding modes based on temporal prediction (Inter coding, Merge, SKIP). The possible coding modes are tested.
- Module 403 implements an Intra prediction process, in which the given block to be encoded is predicted by a predictor computed from pixels of the neighbourhood of said block to be encoded. An indication of the selected Intra predictor and the difference between the given block and its predictor is encoded to provide a residual if the Intra coding is selected.
- Temporal prediction is implemented by motion estimation module 404 and motion compensation module 405.
- a reference image from among a set of reference images 416 is selected, and a portion of the reference image, also called reference area or image portion, which is the closest area to the given block to be encoded, is selected by the motion estimation module 404.
- Motion compensation module 405 then predicts the block to be encoded using the selected area.
- the difference between the selected reference area and the given block, also called a residual block, is computed by the motion compensation module 405.
- the selected reference area is indicated by a motion vector.
- a residual is computed by subtracting the prediction from the original block.
- a prediction direction is encoded.
- at least one motion vector is encoded.
- Motion vector predictors of a set of motion information predictors are obtained from the motion vectors field 418 by a motion vector prediction and coding module 417.
- the encoder 400 further comprises a selection module 406 for selection of the coding mode by applying an encoding cost criterion, such as a rate-distortion criterion.
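- As a rough illustration of a rate-distortion criterion, the sketch below picks the candidate mode that minimises a Lagrangian cost D + lambda * R. The function name, the tuple layout of the candidates and the lambda values are assumptions made for illustration only; the text does not specify how module 406 computes its cost.

```python
def select_coding_mode(candidates, lam):
    """Return the name of the candidate with the lowest cost D + lam * R.

    candidates: iterable of (mode_name, distortion, rate_bits) tuples.
    This layout is hypothetical and chosen only for this sketch.
    """
    best_mode, best_cost = None, float("inf")
    for mode, distortion, rate_bits in candidates:
        cost = distortion + lam * rate_bits
        if cost < best_cost:
            best_mode, best_cost = mode, cost
    return best_mode


# Hypothetical measurements for one block.
modes = [("intra", 100.0, 50), ("inter", 120.0, 20), ("skip", 200.0, 1)]
print(select_coding_mode(modes, lam=1.0))
```

- A larger lambda weights the rate more heavily, so cheap modes such as SKIP tend to win at low bitrates.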
- a transform such as DCT
- the transformed data obtained is then quantized by quantization module 408 and entropy encoded by entropy encoding module 409.
- the encoded residual block of the current block being encoded is inserted into the bitstream 410.
- the encoder 400 also performs decoding of the encoded image in order to produce a reference image for the motion estimation of the subsequent images. This enables the encoder and the decoder receiving the bitstream to have the same reference frames.
- the inverse quantization module 411 performs inverse quantization of the quantized data, followed by an inverse transform by reverse transform module 412.
- the reverse intra prediction module 413 uses the prediction information to determine which predictor to use for a given block and the reverse motion compensation module 414 actually adds the residual obtained by module 412 to the reference area obtained from the set of reference images 416.
- Post filtering is then applied by module 415 to filter the reconstructed frame of pixels.
- an SAO loop filter is used in which compensation offsets are added to the pixel values of the reconstructed pixels of the reconstructed image
- Figure 5 is a flow chart illustrating steps of a loop filtering process according to at least one embodiment of the invention.
- the encoder generates the reconstruction of the full frame.
- a deblocking filter is applied on this first reconstruction in order to generate a deblocked reconstruction 53.
- the aim of the deblocking filter is to remove block artifacts generated by residual quantization and block motion compensation or block Intra prediction. These artifacts are visually important at low bitrates.
- the deblocking filter operates to smooth the block boundaries according to the characteristics of two neighboring blocks. The encoding mode of each block, the quantization parameters used for the residual coding, and the neighboring pixel differences in the boundary are taken into account.
- the deblocking filter improves the visual quality of the current frame by removing blocking artifacts and it also improves the motion estimation and motion compensation for subsequent frames. Indeed, high frequencies of the block artifact are removed, and so these high frequencies do not need to be compensated for with the texture residual of the following frames.
- the deblocked reconstruction is filtered by a sample adaptive offset (SAO) loop filter in step 54 using SAO parameters determined in accordance with embodiments of the invention.
- the resulting frame 55 may then be filtered with an adaptive loop filter (ALF) in step 56 to generate the reconstructed frame 57 which will be displayed and used as a reference frame for the following Inter frames.
- SAO sample adaptive offset
- ALF adaptive loop filter
- In step 54, each pixel of the frame region is classified into a class or group.
- the same offset value is added to every pixel value which belongs to a certain class or group.
- FIG. 6 illustrates a block diagram of a decoder 60 which may be used to receive data from an encoder according to an embodiment of the invention.
- the decoder is represented by connected modules, each module being adapted to implement, for example in the form of programming instructions to be executed by the CPU 311 of device 300, a corresponding step of a method implemented by the decoder 60.
- the decoder 60 receives a bitstream 61 comprising encoding units, each one being composed of a header containing information on encoding parameters and a body containing the encoded video data.
- the encoded video data is entropy encoded, and the motion vector predictors’ indexes are encoded, for a given block, on a predetermined number of bits.
- the received encoded video data is entropy decoded by module 62.
- the residual data are then dequantized by module 63 and then a reverse transform is applied by module 64 to obtain pixel values.
- the mode data indicating the coding mode are also entropy decoded and based on the mode, an INTRA type decoding or an INTER type decoding is performed on the encoded blocks of image data.
- an INTRA predictor is determined by intra reverse prediction module 65 based on the intra prediction mode specified in the bitstream.
- the motion prediction information is extracted from the bitstream so as to find the reference area used by the encoder.
- the motion prediction information is composed of the reference frame index and the motion vector residual.
- the motion vector predictor is added to the motion vector residual in order to obtain the motion vector by motion vector decoding module 70.
- Motion vector decoding module 70 applies motion vector decoding for each current block encoded by motion prediction. Once an index of the motion vector predictor, for the current block has been obtained the actual value of the motion vector associated with the current block can be decoded and used to apply reverse motion compensation by module 66. The reference image portion indicated by the decoded motion vector is extracted from a reference image 68 to apply the reverse motion compensation 66. The motion vector field data 71 is updated with the decoded motion vector in order to be used for the inverse prediction of subsequent decoded motion vectors.
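- The reconstruction of a motion vector from its predictor and its residual can be sketched as follows. The function name and the representation of vectors as (x, y) tuples are assumptions for illustration; the derivation of the predictor list itself is not shown.

```python
def decode_motion_vector(predictors, predictor_index, mv_residual):
    """Rebuild a motion vector as predictor + residual, component by component.

    predictors: list of candidate (x, y) predictors (hypothetical layout).
    predictor_index: index of the predictor signalled for the current block.
    mv_residual: decoded (x, y) motion vector residual.
    """
    pred_x, pred_y = predictors[predictor_index]
    res_x, res_y = mv_residual
    return (pred_x + res_x, pred_y + res_y)


print(decode_motion_vector([(4, -2), (0, 0)], 0, (1, 3)))
```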
- Post filtering is applied by post filtering module 67 similarly to post filtering module 415 applied at the encoder as described with reference to Figure 5.
- a decoded video signal 69 is finally provided by the decoder 60.
- the aim of SAO filtering is to improve the quality of the reconstructed frame by sending additional data in the bitstream, in contrast to the deblocking filter where no information is transmitted.
- each pixel is classified into a predetermined class or group and the same offset value is added to every pixel sample of the same class/group.
- One offset is encoded in the bitstream for each class.
- SAO loop filtering has two SAO types: an Edge Offset (EO) type and a Band Offset (BO) type.
- EO Edge Offset
- BO Band Offset
- An example of Edge Offset type is schematically illustrated in Figures 7A and 7B
- an example of Band Offset type is schematically illustrated in Figure 8.
- SAO filtering is applied CTU by CTU.
- the parameters needed to perform the SAO filtering (set of SAO parameters) are selected for each CTU at the encoder side and the necessary parameters are decoded and/or derived for each CTU at the decoder side.
- This offers the possibility of easily encoding and decoding the video sequence by processing each CTU at once without introducing delays in the processing of the whole frame.
- When SAO filtering is enabled, only one SAO type is used: either the Edge Offset type filter or the Band Offset type filter according to the related parameters transmitted in the bitstream for each classification.
- One of the SAO parameters in HEVC is an SAO type parameter sao_type_idx which indicates for the CTU whether EO type, BO type or no SAO filtering is selected for the CTU concerned.
- the SAO parameters for a given CTU can be copied from the upper or left CTU, for example, instead of transmitting all the SAO data.
- One of the SAO parameters in HEVC is a sao_merge_up flag, which when set indicates that the SAO parameters for the subject CTU should be copied from the upper CTU.
- Another of the SAO parameters in HEVC is a sao_merge_left flag, which when set indicates that the SAO parameters for the subject CTU should be copied from the left CTU.
- SAO filtering may be applied independently for different color components (e.g. YUV) of the frame.
- one set of SAO parameters may be provided for the luma component Y and another set of SAO parameters may be provided for both chroma components U and V in common.
- one or more SAO parameters may be used as common filtering parameters for two or more color components, while other SAO parameters are dedicated (per-component) filtering parameters for the color components.
- the SAO type parameter sao_type_idx is common to U and V, and so is an EO class parameter which indicates a class for EO filtering (see below), whereas a BO class parameter which indicates a group of classes for BO filtering has dedicated (per-component) SAO parameters for U and V.
- Edge Offset type involves determining an edge index for each pixel by comparing its pixel value to the values of two neighboring pixels. Moreover, these two neighboring pixels depend on a parameter which indicates the direction of these two neighboring pixels with respect to the current pixel. These directions are the 0-degree (horizontal direction), 45-degree (diagonal direction), 90-degree (vertical direction) and 135-degree (second diagonal direction). These four directions are schematically illustrated in Figure 7A.
- the table of Figure 7B gives the offset value to be applied to the pixel value of a particular pixel “C” according to the values of the two neighboring pixels Cn1 and Cn2 at the decoder side.
- when the value of C is less than the values of both neighboring pixels Cn1 and Cn2, the offset to be added to the pixel value of the pixel C is “+ O1”.
- when the value of C is less than the value of one neighboring pixel and equal to the value of the other, the offset to be added to this pixel sample value is “+ O2”.
- when the value of C is greater than the value of one neighboring pixel and equal to the value of the other, the offset to be applied to this pixel sample is “- O3”.
- when the value of C is greater than the values of both Cn1 and Cn2, the offset to be applied to this pixel sample is “- O4”.
- each offset (O1, O2, O3, O4) is encoded in the bitstream.
- the sign to be applied to each offset depends on the edge index (or the Edge Index in the HEVC specifications) to which the current pixel belongs. According to the table represented in Figure 7B, for Edge Index 0 and for Edge Index 1 (O1, O2) a positive offset is applied. For Edge Index 3 and Edge Index 4 (O3, O4), a negative offset is applied to the current pixel.
- the direction for the Edge Offset amongst the four directions of Figure 7A is specified in the bitstream by a “sao_eo_class_luma” field for the luma component and a “sao_eo_class_chroma” field for both chroma components U and V.
- the SAO Edge Index corresponding to the index value is obtained by the following formula:
- $\mathrm{EdgeIndex} = \mathrm{sign}(C - C_{n2}) - \mathrm{sign}(C_{n1} - C) + 2$
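- The Edge Index formula and the sign convention of Figure 7B can be sketched as below. The function names are hypothetical, and the sketch assumes the four offsets are passed as absolute values (O1, O2, O3, O4); clipping of the filtered value to the valid pixel range is deliberately left out.

```python
def sign(x):
    """Return -1, 0 or +1 according to the sign of x."""
    return (x > 0) - (x < 0)


def edge_index(c, cn1, cn2):
    """EdgeIndex = sign(C - Cn2) - sign(Cn1 - C) + 2, giving a value in 0..4."""
    return sign(c - cn2) - sign(cn1 - c) + 2


def apply_edge_offset(c, cn1, cn2, abs_offsets):
    """Add +O1 or +O2 for Edge Index 0 or 1, subtract O3 or O4 for 3 or 4.

    abs_offsets: (O1, O2, O3, O4) as absolute values (an assumption of
    this sketch). Edge Index 2 leaves the pixel unchanged.
    """
    o1, o2, o3, o4 = abs_offsets
    delta = {0: +o1, 1: +o2, 2: 0, 3: -o3, 4: -o4}[edge_index(c, cn1, cn2)]
    return c + delta


# A local minimum (C below both neighbours) has Edge Index 0 and gets +O1.
print(edge_index(10, 12, 12), apply_edge_offset(10, 12, 12, (1, 2, 3, 4)))
```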
- Band Offset type in SAO also depends on the pixel value of the sample to be processed.
- a class in SAO Band offset is defined as a range of pixel values. Conventionally, for all pixels within a range, the same offset is added to the pixel value. In the HEVC specifications, the number of offsets for the Band Offset filter is four for each reconstructed block or frame area of pixels (CTU), as schematically illustrated in Figure 8.
- SAO Band offset splits the full range of pixel values into 32 ranges of the same size. These 32 ranges are the bands (or classes) of SAO Band offset.
- Classifying the pixels into 32 ranges of the full interval requires checking only 5 bits for a fast implementation, i.e. only the first 5 bits (5 most significant bits) are checked to classify a pixel into one of the 32 classes/ranges of the full range.
- for a bitdepth of 8 bits, each band or class contains 8 pixel values (256 / 32).
- a group 40 of bands represented by the grey area (40), is used, the group having four successive bands 41, 42, 43 and 44, and information is signaled in the bitstream to identify the position of the group, for example the position of the first of the 4 bands.
- the syntax element representative of this position is the “sao_band_position” field in the HEVC specifications. This corresponds to the start of band
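- The band classification and the group of four successive bands can be sketched as follows. The function names are hypothetical; the sketch assumes that the four offsets are already signed and that band_position is the index of the first of the four bands. Clipping to the valid pixel range is omitted.

```python
def band_index(pixel, bit_depth=8):
    """Classify a pixel into one of 32 bands using its 5 most significant bits."""
    return pixel >> (bit_depth - 5)


def apply_band_offset(pixel, band_position, offsets, bit_depth=8):
    """Add an offset only if the pixel falls in one of the four signalled bands.

    band_position: index (0..31) of the first band of the group.
    offsets: four signed offsets, one per successive band (assumed layout).
    """
    k = band_index(pixel, bit_depth) - band_position
    if 0 <= k < 4:
        return pixel + offsets[k]
    return pixel


# With 8-bit pixels each band spans 8 values: 80..87 is band 10.
print(band_index(83), apply_band_offset(83, 10, [1, -2, 3, -4]))
```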
- FIG. 9 is a flow chart illustrating the steps of a process to decode SAO parameters according to the HEVC specifications.
- the process of Figure 9 is applied for each CTU to generate a set of SAO parameters for all components.
- a predictive scheme is used for the CTU mode. This predictive mode involves checking if the CTU on the left of the current CTU uses the same SAO parameters (this is specified in the bitstream through a flag named “sao_merge_left_flag”). If not, a second check is performed with the CTU above the current CTU (this is specified in the bitstream through a flag named “sao_merge_up_flag”). This predictive technique enables the amount of data representing the SAO parameters for the CTU mode to be reduced. Steps of the process are set out below.
- In step 503, the “sao_merge_left_flag” is read from the bitstream 502 and decoded. If its value is true, then the process proceeds to step 504 where the SAO parameters of the left CTU are copied for the current CTU. This enables the types for YUV of the SAO filter for the current CTU to be determined in step 508.
- If the outcome is negative in step 503, then the “sao_merge_up_flag” is read from the bitstream and decoded. If its value is true, then the process proceeds to step 505 where the SAO parameters of the above CTU are copied for the current CTU. This enables the types of the SAO filter for the current CTU to be determined in step 508.
- If the outcome is negative in step 505, then the SAO parameters for the current CTU are read and decoded from the bitstream in step 507 for the Luma Y component and both U and V components (501) (551) for the type.
- the offsets for Chroma are independent. The details of this step are described later with reference to Figure 10. After this step, the parameters are obtained and the type of SAO filter is determined in step 508.
- In step 511, a check is performed to determine if the three colour components (Y and U & V) for the current CTU have been processed. If the outcome is positive, the determination of the SAO parameters for the three components is complete and the next CTU can be processed in step 510. Otherwise (only Y was processed), U and V are processed together and the process restarts from initial step 512 previously described.
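- The merge-left, merge-up, then explicit-read order of Figure 9 can be sketched as below. The callables read_flag and read_new_params stand in for an entropy decoder and are hypothetical, as is the convention that an unavailable neighbour is passed as None (in which case its flag is not read).

```python
def decode_ctu_sao_params(read_flag, read_new_params, left_params, up_params):
    """Return the SAO parameters for the current CTU.

    read_flag(name): hypothetical callable returning the decoded flag value.
    read_new_params(): hypothetical callable parsing a full parameter set.
    left_params / up_params: parameters of the neighbouring CTUs, or None
    when the neighbour does not exist (an assumption of this sketch).
    """
    if left_params is not None and read_flag("sao_merge_left_flag"):
        return left_params          # copy from the left CTU
    if up_params is not None and read_flag("sao_merge_up_flag"):
        return up_params            # copy from the CTU above
    return read_new_params()        # explicit parameters in the bitstream


flags = {"sao_merge_left_flag": False, "sao_merge_up_flag": True}
params = decode_ctu_sao_params(flags.get, lambda: {"type": 2},
                               {"type": 0}, {"type": 1})
print(params)
```

- Only when both flags are false does the decoder pay for a full set of parameters, which is where the saving in signalled data comes from.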
- Figure 10 is a flow chart illustrating steps of a process of parsing of SAO parameters in the bitstream 601 at the decoder side.
- the “sao_type_idx_X” syntax element is read and decoded.
- the code word representing this syntax element can use a fixed length code or could use any method of arithmetic coding.
- the syntax element sao_type_idx_X enables determination of the type of SAO applied for the frame area to be processed for the colour component Y or for both Chroma components U & V. For example, for a YUV 4:2:0 sequence, two components are considered: one for Y, and one for U and V.
- the “sao_type_idx_X” can take 3 values as follows depending on the SAO type encoded in the bitstream. ‘0’ corresponds to no SAO, ‘1’ corresponds to the Band Offset case illustrated in Figure 8 and ‘2’ corresponds to the Edge Offset type filter illustrated in Figures 7A and 7B.
- While YUV color components are used in HEVC (sometimes called Y, Cr and Cb components), it will be appreciated that in other video coding schemes other color components may be used, for example RGB color components.
- the techniques of the present invention are not limited to use with YUV color components, and can be used with RGB color components or any other color components.
- a test is performed to determine if the “sao_type_idx_X” is strictly positive. If “sao_type_idx_X” is equal to “0”, this signifies that there is no SAO for this frame area (CTU) for Y if X is set equal to Y, and that there is no SAO for this frame area for U and V if X is set equal to U and V. The determination of the SAO parameters is then complete and the process proceeds to step 608. Otherwise, if the “sao_type_idx_X” is strictly positive, this signifies that SAO parameters exist for this CTU in the bitstream.
- In step 606, a loop is performed for four iterations.
- the four iterations are carried out in step 607 where the absolute value of offset j is read and decoded from the bitstream.
- These four offsets correspond either to the four absolute values of the offsets (O1, O2, O3, O4) of the four Edge indexes of SAO Edge Offset (see Figure 7B) or to the four absolute values of the offsets related to the four ranges of the SAO band Offset (see Figure 8).
- Note that for the coding of an SAO offset a first part is transmitted in the bitstream corresponding to the absolute value of the offset. This absolute value is coded with a unary code.
- the maximum value for an absolute value is given by the following formula:
- $\mathrm{MAX\_abs\_SAO\_offset\_value} = (1 \ll (\min(\mathrm{bitDepth}, 10) - 5)) - 1$
- where $\ll$ is the left (bit) shift operator.
- This formula means that the maximum absolute value of an offset is 7 for a pixel value bitdepth of 8 bits, and 31 for a pixel value bitdepth of 10 bits and beyond.
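- The arithmetic of the formula can be checked with a short sketch; the function name is an assumption, the expression is the one given above.

```python
def max_abs_sao_offset(bit_depth):
    """Maximum absolute SAO offset: (1 << (min(bitDepth, 10) - 5)) - 1."""
    return (1 << (min(bit_depth, 10) - 5)) - 1


for depth in (8, 10, 12):
    print(depth, max_abs_sao_offset(depth))
```

- Because of the Min(bitDepth, 10) term, this expression saturates at 31 for any bitdepth of 10 bits or more.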
- the current HEVC standard amendment addressing extended bitdepth video sequences provides a similar formula for a pixel value having a bitdepth of 12 bits and beyond.
- the absolute value decoded may be a quantized value which is dequantized before it is applied to pixel values at the decoder for SAO filtering. An indication of use or not of this quantification is transmitted in the slice header.
- the sign is signaled in the bitstream as a second part of the offset if the absolute value of the offset is not equal to 0.
- the bit of the sign is bypassed when CABAC is used.
- the signs of the offsets for the Band Offset mode are decoded in steps 609 and 610, except for each offset that has a zero value, before the following step 604 is performed in order to read in the bitstream and to decode the position “sao_band_position_X” of the SAO band as illustrated in Figure 8.
- If X is set equal to Y, the read syntax element is “sao_eo_class_luma” and if X is set equal to U and V, the read syntax element is “sao_eo_class_chroma”.
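- The parsing order of Figure 10 for one component can be sketched as below. To stay self-contained, the sketch consumes a sequence of already entropy-decoded values rather than a real bitstream; the function name, the dictionary keys and the convention that a sign value of 1 means a negative offset are assumptions of this sketch.

```python
def parse_sao_params(symbols):
    """Parse SAO parameters for one component from decoded symbols.

    symbols: values in bitstream order (hypothetical stand-in for an
    entropy decoder). Order: type, four absolute offsets, then either
    signs of non-zero offsets plus band position (Band Offset) or the
    Edge Offset class.
    """
    it = iter(symbols)
    params = {"sao_type_idx": next(it)}
    if params["sao_type_idx"] == 0:          # no SAO for this frame area
        return params
    abs_offsets = [next(it) for _ in range(4)]
    if params["sao_type_idx"] == 1:          # Band Offset
        offsets = []
        for value in abs_offsets:
            # A sign is present only when the absolute value is non-zero.
            negative = next(it) if value != 0 else 0
            offsets.append(-value if negative else value)
        params["offsets"] = offsets
        params["sao_band_position"] = next(it)
    else:                                     # Edge Offset
        # Signs are implied by the category: +O1, +O2, -O3, -O4.
        o1, o2, o3, o4 = abs_offsets
        params["offsets"] = [o1, o2, -o3, -o4]
        params["sao_eo_class"] = next(it)
    return params


print(parse_sao_params([1, 3, 0, 2, 1, 0, 1, 1, 12]))
```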
- FIG. 11 is a flow chart illustrating how SAO filtering is performed on an image part according to the HEVC specifications, for example during the step 67 in Figure 6.
- this image part is a CTU.
- This same process 700 is also applied in the decoding loop (step 415 in Figure 4) at the encoder in order to produce the reference frames used for the motion estimation and compensation of the following frames.
- This process is related to the SAO filtering for one color component (thus suffix “_X” in the syntax elements has been omitted below).
- An initial step 701 comprises determining the SAO filtering parameters according to processes depicted in Figures 9 and 10.
- the SAO filtering parameters are determined by the encoder and the encoded SAO parameters are included in the bitstream. Accordingly, on the decoder side in step 701 the decoder reads and decodes the parameters from the bitstream.
- Step 701 obtains the sao_type_idx and, if it equals 1, also obtains the sao_band_position 702 and, if it equals 2, also obtains the sao_eo_class_luma or sao_eo_class_chroma (according to the color component processed). If the element sao_type_idx is equal to 0 the SAO filtering is not applied.
- Step 701 obtains also an offsets table 703 of the 4 offsets.
- a variable i used to successively consider each pixel Pi of the current block or frame area (CTU), is set to 0 in step 704.
- “frame area” and “image area” are used interchangeably in the present specification.
- a frame area in this example is a CTU.
- In step 706, pixel Pi is extracted from the frame area 705, which contains N pixels.
- This pixel Pi is classified in step 707 according to the Edge offset classification described with reference to Figures 7A & 7B or the Band offset classification as described with reference to Figure 8.
- the decision module 708 tests if Pi is in a class that is to be filtered using the conventional SAO filtering.
- if so, the offset value for the class j of Pi is extracted in step 710 from the offsets table 703.
- The filtered pixel P'i is inserted in step 713 into the filtered frame area 716.
- the variable i is incremented in step 714 in order to filter the subsequent pixels of the current frame area 705 (if any - test 715).
- the filtered frame area 716 is reconstructed and can be added to the SAO reconstructed frame (see frame 68 of Figure 6 or 416 of Figure 4).
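- The per-pixel loop of Figure 11 can be sketched as follows. The function name and the classify callable are hypothetical: classify(i) is assumed to return the class number j into the offsets table, or None when pixel i is not in a filtered class. Offsets are assumed to be signed, and clipping is omitted.

```python
def sao_filter_area(pixels, classify, offsets):
    """Filter one frame area (e.g. a CTU) given as a flat list of pixels.

    classify(i): hypothetical callable returning the class j of pixel i,
    or None if the pixel is not to be filtered (steps 707 and 708).
    offsets: table of signed offsets indexed by class (table 703).
    """
    filtered = []
    for i in range(len(pixels)):        # variable i: steps 704, 714, 715
        p = pixels[i]                   # step 706: extract pixel Pi
        j = classify(i)
        if j is None:
            filtered.append(p)          # not in a filtered class
        else:
            filtered.append(p + offsets[j])   # offset from step 710
    return filtered


# Example with a Band Offset style classifier for 8-bit pixels,
# with the group of four bands starting at band 10.
area = [83, 100, 200]

def band_class(i):
    k = (area[i] >> 3) - 10
    return k if 0 <= k < 4 else None

print(sao_filter_area(area, band_class, [1, -2, 3, -4]))
```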
- JEM JVET exploration model
- SAO is less efficient in the JEM reference software than in the HEVC reference software. This arises from fewer evaluations and from signalling inefficiencies compared to other loop filters.
- a first group of embodiments of the present invention described below is intended to improve the coding efficiency of SAO by using various techniques for adapting a scheme for classifying samples in SAO filtering.
- the classification outcome for another partition may be a
- the first group of embodiments therefore adapts the scheme to the type of image area.
- the type of image area could be based on the number of samples in the image area, with first and second types of image area having higher and lower numbers of samples.
- the type of image area could alternatively be based on a quality of the image area.
- the type could alternatively be based on a content (or activity) of the image area.
- the type could alternatively be based on a color component (luma or chroma) of the image area.
- the image parts can be processed individually or in groups of two or more image parts.
- a Coding Tree Unit (CTU) is contemplated as an image part.
- CTU groupings can be used to create image areas with different number of samples.
- a CTU grouping may be selected for a slice or frame. Within the slice or frame the selected CTU grouping is used to create image areas.
- a CTU grouping is not limited to two or more CTUs.
- a single CTU may be a CTU grouping.
- a CTU may in turn comprise blocks, one block or an array of blocks within a CTU may also be a CTU grouping.
- the available CTU groupings may be any two or more of 1/16 CTU, 1/4 CTU, CTU, 2x2 CTUs, 3x3 CTUs, a column of CTUs, a line of CTUs, a whole slice or even a whole frame.
- SAO parameters are provided per image area of the slice or frame, making it possible to control the amount of SAO parameter data for the slice or frame concerned.
- the smaller the CTU grouping the smaller the image areas and hence the bigger the amount of SAO parameter data.
- the bigger amount of SAO parameters gives fine control of SAO parameters, and hence potentially lower distortion.
- the CTU groupings offer different rate distortion compromises for SAO.
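- The effect of the CTU grouping on the number of image areas, and hence on the number of SAO parameter sets, can be illustrated with a small count. The function name is hypothetical, only whole-CTU groupings are handled, and partial groups at the frame edge are assumed to count as one image area each.

```python
import math


def count_image_areas(frame_w_ctus, frame_h_ctus, group_w, group_h):
    """Number of image areas when a frame is tiled by group_w x group_h CTUs.

    Frame size and group size are expressed in CTUs; incomplete groups at
    the right and bottom edges still form an image area (an assumption).
    """
    return math.ceil(frame_w_ctus / group_w) * math.ceil(frame_h_ctus / group_h)


# A 1920x1080 frame with 64x64 CTUs is 30 CTUs wide and 17 CTUs high.
for name, (gw, gh) in {"CTU": (1, 1), "2x2 CTUs": (2, 2), "3x3 CTUs": (3, 3),
                       "column": (1, 17), "line": (30, 1),
                       "frame": (30, 17)}.items():
    print(name, count_image_areas(30, 17, gw, gh))
```

- In this example, going from single CTUs to 3x3 groups cuts the number of parameter sets from 510 to 60, at the cost of coarser control.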
- the data transmitted for SAO filtering should be increased to obtain a better rate distortion compromise.
- each colour component doesn’t need the same amount of data to obtain a good filtering.
- the Luma component should have more data to obtain a better SAO filtering than for Chroma components.
- a second group of embodiments of the present invention has SAO parameters per image area and adapts the image areas so that the amount of SAO parameter data gives a suitable rate-distortion compromise.
- In this group of embodiments, it is not necessary to adapt the scheme for classifying samples in the image area. The same scheme can be used for all image areas.
- the scheme for classifying samples in edge-type SAO filtering is adapted based on the number of samples in the image area to be filtered.
- Figure 12 shows a first image area IA1 having a first number of samples (pixels) and a second image area IA2 having a second number of samples (pixels). The first number is higher than the second number in this example.
- a first scheme S1 for classifying samples of the image area is used for the first image area IA1.
- the first scheme S1 is the only available scheme for the first image area IA1.
- a second scheme S2 for classifying samples of the image area is used for the second image area IA2.
- the second scheme is the only available scheme for the second image area IA2.
- the first scheme S1 is illustrated in Figure 13.
- in the scheme S1, instead of the 5 categories of the HEVC scheme (see Figure 7B), there are 7 categories. This increase in the number of categories occurs because category 1 in Figure 7B does not order the neighbouring pixels and applies whenever the target pixel c is lower than one neighbouring pixel and the same as the other neighbouring pixel.
- in the scheme S1, category 1 applies only when the target pixel c is lower than the first neighbouring pixel Cn1 and the same as the second neighbouring pixel Cn2.
- Category 2 applies only when the target pixel c is lower than the second neighbouring pixel Cn2 and the same as the first neighbouring pixel Cn1.
- similarly, category 3 in Figure 7B applies whenever the target pixel c is greater than one neighbouring pixel and the same as the other neighbouring pixel.
- in the scheme S1, category 3 applies only when the target pixel c is greater than the first neighbouring pixel Cn1 and the same as the second neighbouring pixel Cn2.
- Category 4 applies only when the target pixel c is greater than the second neighbouring pixel Cn2 and the same as the first neighbouring pixel Cn1.
- the other categories 0, 5 and 6 in the scheme S1 are the same as the categories 0, 4 and 2 respectively in Figure 7B.
- the categories 0 to 5 each have an offset.
- Category 6 has no offset.
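The seven categories of the scheme S1 can be sketched as follows. This is only an illustration, not the implementation of the patent: the function name `classify_s1` is hypothetical, and the sketch assumes that categories 0 and 4 of Figure 7B are the cases where the target pixel is lower than both neighbours and greater than both neighbours respectively, and that category 2 of Figure 7B is the "none of the above" case.

```python
def classify_s1(c, cn1, cn2):
    """Return the scheme-S1 category (0..6) of the target sample c.

    cn1 and cn2 are the first and second neighbours along the
    chosen direction. Assumption: category 0 means 'lower than
    both neighbours' and category 5 means 'greater than both'.
    """
    if c < cn1 and c < cn2:
        return 0  # assumed local minimum (category 0 of Figure 7B)
    if c < cn1 and c == cn2:
        return 1
    if c < cn2 and c == cn1:
        return 2
    if c > cn1 and c == cn2:
        return 3
    if c > cn2 and c == cn1:
        return 4
    if c > cn1 and c > cn2:
        return 5  # assumed local maximum (category 4 of Figure 7B)
    return 6      # none of the above: no offset


print(classify_s1(5, 6, 5))
print(classify_s1(5, 5, 6))
```

Categories 1 and 2 (and likewise 3 and 4) differ only in which neighbour is equal to the target pixel, which is the ordering that the HEVC scheme does not make.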
- the second scheme S2 is the HEVC scheme of Figures 7A and 7B, i.e. a scheme with 4 directions 0, 90, 135 and 45 degrees, 5 categories and 4 offsets.
- an encoder adapts the scheme applied to an image area based on the number of samples. Accordingly, the encoder applies the scheme S1 to the larger image area IA1 and applies the scheme S2 to the smaller image area IA2.
- the scheme S1 has a better coding efficiency for the larger image area IA1 because it offers a more precise classification of the edge offset than the scheme S2.
- the first scheme S1 uses two more offsets than the second scheme S2 but this increase in signalling (the rate of two additional offsets) is justified by the reduced distortion.
- the scheme S2 has a better coding efficiency for the smaller image area IA2. Using the scheme S1 would be inefficient because the more precise classification would make little or no difference to the distortion but would incur a rate of 2 additional offsets.
- the first and second image areas IA1 and IA2 were image areas with different numbers of samples (larger and smaller image areas).
- a first image area IA1 has a higher quality than a second image area IA2, as illustrated schematically in Figure 14.
- the quality of an image area may be measured in various different ways.
- the number of samples in the first image area IA1 may be the same as in the second image area IA2 in this embodiment.
- the coding efficiency of temporal dependencies exploited by Inter coding can be further exploited by considering balance, in terms of rate and distortion, between encoded frames.
- One way is to set different rate distortion compromises for several consecutive frames instead of setting the same rate distortion compromise for all frames.
- each frame is encoded with a particular balance between rate and distortion.
- Each compromise between rate and distortion corresponds to a level in a hierarchy of levels (compromises).
- Figure 15 shows an example of a Group of Pictures (GoP) comprising images associated with a hierarchy of rate-distortion compromises for the low delay case.
- the size of each image is related to the hierarchy in terms of quality. For example, images with a level equal to “2” have the biggest size and so have a higher quality than images with medium size and the level 1. The images with the smallest size (level equal to 0) have the lowest quality.
- Another way to evaluate the hierarchy of the images in terms of rate-distortion is to consider the Quantization Parameters of the images forming the GoP 280.
- the images with the highest quality are associated with Quantization Parameters QP. They have a level equal to 2.
- the images with the medium quality are associated with Quantization Parameters QP+1. They have a level equal to 1.
- the images with the lowest quality are associated with Quantization Parameters QP+2. They have a level equal to 0.
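The mapping between hierarchy level and Quantization Parameter described above can be written as a one-line rule. The sketch below is illustrative only; the function name `qp_for_level` and the `top_level` argument are hypothetical, and the three-level hierarchy is taken from the example in the text.

```python
def qp_for_level(base_qp, level, top_level=2):
    """Return the QP of an image at the given hierarchy level.

    Level 2 (highest quality) uses QP, level 1 uses QP+1 and
    level 0 (lowest quality) uses QP+2, as in the example above.
    """
    if not 0 <= level <= top_level:
        raise ValueError("level outside the hierarchy")
    return base_qp + (top_level - level)


for level in (2, 1, 0):
    print(level, qp_for_level(30, level))
```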
- the quality (absence of distortion) has higher importance (relative to cost) in the rate-distortion compromise for higher levels compared to lower levels.
- the effect is that an image at the top of the hierarchy (level) should have a higher quality than an image at a lower level.
- This quality is then propagated, with the help of the Inter coding mode, on the following encoded images which have a lower level in the hierarchy and which have a lower quality in terms of rate distortion compromise than the images with the higher level in the hierarchy.
- the compromise between rate and distortion can be fixed by the Quantization parameter or by the Lagrangian parameter (called lambda) in the rate distortion criterion or by both.
- an encoder adapts the scheme applied to an image area based on a measure of quality of the image area (hierarchy level, QP, Lagrangian parameter etc.). Other possible quality measures include SAD (Sum of Absolute Differences) and MSE (Mean Squared Error). Accordingly, the encoder applies the scheme S1 to the higher-quality image area IA1 and applies the scheme S2 to the lower-quality image area IA2.
- the scheme S1 has a better coding efficiency for the higher-quality image area IA1 because it offers a more precise classification of the edge offset than the scheme S2.
- the first scheme S1 uses two more offsets than the second scheme S2 but this increase in signalling (the rate of two additional offsets) is justified by the reduced distortion.
- the scheme S2 has a better coding efficiency for the lower-quality image area IA2. Using the scheme S1 would be inefficient because the more precise classification would make little or no difference to the distortion but would incur a rate of 2 additional offsets.
- the first and second image areas IA1 and IA2 were image areas with different numbers of samples (larger and smaller image areas).
- the first and second areas IA1 and IA2 were image areas of higher and lower content, as illustrated schematically in Figure 16.
- content may be measured in various different ways.
- the number of samples in the first image area IA1 may be the same as in the second image area IA2 in this embodiment.
- the quality of samples in the first image area IA1 may be the same as in the second image area IA2 in this embodiment too.
- One measure is the degree of motion in the image area. This may be measured based on the motion vectors and motion vector differences between neighbouring blocks.
- Another measure of content is the variation of sample values within the image area. This measure is generally used to determine if the image area content is smooth or not. This can be evaluated with a gradient measure for example.
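One possible gradient measure of the kind mentioned above is the mean absolute difference between horizontally and vertically adjacent samples. The sketch below is an assumption for illustration; the text does not specify which gradient measure is used, and the name `mean_abs_gradient` is hypothetical.

```python
def mean_abs_gradient(area):
    """Mean absolute difference between adjacent samples.

    `area` is a list of equally long rows of sample values.
    A value near 0 indicates smooth content; larger values
    indicate more variation within the image area.
    """
    total = 0
    count = 0
    for y, row in enumerate(area):
        for x, value in enumerate(row):
            if x + 1 < len(row):        # horizontal neighbour
                total += abs(row[x + 1] - value)
                count += 1
            if y + 1 < len(area):       # vertical neighbour
                total += abs(area[y + 1][x] - value)
                count += 1
    return total / count if count else 0.0


smooth = [[5, 5], [5, 5]]
edged = [[0, 10], [0, 10]]
print(mean_abs_gradient(smooth), mean_abs_gradient(edged))
```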
- an encoder adapts the scheme applied to an image area based on a measure of content of the image area. Accordingly, the encoder applies the scheme S1 to the higher-content image area IA1 and applies the scheme S2 to the lower-content image area IA2.
- the scheme S1 has a better coding efficiency for the higher-content image area IA1 because it offers a more precise classification of the edge offset than the scheme S2.
- the first scheme S1 uses two more offsets than the second scheme S2 but this increase in signalling (the rate of two additional offsets) is justified by the reduced distortion.
- the scheme S2 has a better coding efficiency for the lower-content image area IA2. Using the scheme S1 would be inefficient because the more precise classification would make little or no difference to the distortion but would incur a rate of 2 additional offsets.
- the first and second image areas IA1 and IA2 were image areas of the same color component.
- the first and second image areas IA1 and IA2 belong to different color components.
- the first image area IA1 may belong to the luma component
- the second image area IA2 may belong to one of the two chroma components, as illustrated in Figure 17.
- an image area in the luma component and its corresponding image area in a chroma component may contain different numbers of samples because of the color scheme (e.g. 4:2:2 or 4:2:0). In such color schemes the number of samples per image area in the luma component is higher than the number of samples per image area in a chroma component.
- even when the luma and chroma components have the same number of samples per image area (as in the case of the 4:4:4 color scheme) it is possible to adapt the scheme for classifying samples based on the color component. This may be useful because in an image each color component doesn’t need the same amount of data to obtain a satisfactory SAO filtering result. In particular, the luma component should have more data to obtain satisfactory SAO filtering results than each chroma component.
- the number of samples in the first image area IA1 may be the same as in the second image area IA2.
- the first image area IA1 may be the same as the second image area IA2 in this embodiment too.
- an encoder adapts the scheme applied to an image area based on a color component. Accordingly, the encoder applies the scheme S1 to the luma image area IA1 and applies the scheme S2 to the chroma image area IA2.
- the scheme S1 has a better coding efficiency for the luma image area IA1 because it offers a more precise classification of the edge offset than the scheme S2.
- the first scheme S1 uses two more offsets than the second scheme S2 but this increase in signalling (the rate of two additional offsets) is justified by the reduced distortion.
- the scheme S2 has a better coding efficiency for the chroma image area IA2. Using the scheme S1 would be inefficient because the more precise classification would make little or no difference to the distortion but would incur a rate of 2 additional offsets.
- the scheme is adapted based on two or more of the number of samples, quality, content and color component of the image area in combination.
- the combination may be a weighted combination for example. This can lead to a more accurate adaptation of the scheme to the image area characteristics.
- the first scheme S1 was the sole scheme available for the first image area IA1 and the second scheme S2 was the sole scheme available for the second image area IA2. This can be efficient because when only one scheme is available it is not necessary to evaluate two or more competing schemes at the encoder side, and select one of them, for example based on an RD criterion. This saves encoder complexity and encoding time.
- it is also possible for an image area to have two or more available schemes.
- the first image area IA1 has the first and second schemes S1 and S2 available, but the second image area IA2 has only the second scheme S2 available.
- schemes S1 and S2 could be available for image area IA1 and the scheme S2 and a further scheme S3 could be available for image area IA2.
- the further scheme S3 is preferably a scheme with fewer categories and/or fewer offsets and/or fewer directions than the second scheme S2.
- the competing available schemes are evaluated by the encoder and one of them is selected.
- the available scheme(s) may be determined for a slice or a frame. Then, for each image area within the slice or frame, the encoder applies competition between schemes if there is more than one available scheme or uses the sole available scheme. As a result, as illustrated in Figure 19, different image areas within the same slice or frame can have different classification outcomes, including an outcome from one scheme that is not possible with another scheme.
- the available scheme(s) may be predetermined for different slice types. For example, the number of available schemes for an I-slice could be higher than for an Inter slice. In another variant, the same set of schemes is available for both image areas. In this variant, however, no evaluations can be avoided.
- When there are two or more available schemes for an image area, the encoder has to signal which of the schemes it selected for a given image area. This may be achieved by introducing a new syntax element SAO_scheme_index. Limiting the number of available schemes is advantageous because the syntax element SAO_scheme_index can be shorter in this case. For example, the encoder may have a wider menu of schemes available but may make fewer schemes available for each type of image area.
- the encoder selects two or more available schemes for a slice and signals a list of available schemes for the image areas in the slice header, as illustrated in Figure 20.
- the list preferably has fewer schemes than a wider menu of schemes S1-S3 usable by the encoder. Then, the syntax element which identifies the selected one of the available schemes may be shorter than if it had to identify any of the schemes in the wider menu of schemes.
- the encoder may select the two or more schemes from the wider menu based on an RD criterion, for example.
- the advantage of signalling a list is that the list may contain fewer schemes than the menu of schemes. Accordingly, the syntax element sao_scheme_index which is signalled per image area can use a smaller number of bits than if any scheme in the menu was available. Because there can be many image areas in one slice, even saving one bit from this syntax element is significant and the cumulative savings may easily outweigh the cost of signalling the list.
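The trade-off between the per-area index and the one-off cost of the list can be illustrated with a small calculation. The sketch assumes a fixed-length binary code for sao_scheme_index, which the text does not state (the element could equally be entropy coded), and the function names and the list cost in bits are hypothetical.

```python
def index_bits(num_schemes):
    """Bits of a fixed-length index among num_schemes choices.

    Assumption: fixed-length coding; a single choice needs 0 bits.
    """
    return (num_schemes - 1).bit_length()


def net_saving_bits(menu_size, list_size, num_areas, list_cost_bits):
    """Bits saved per slice by signalling a shorter list of schemes."""
    saved_per_area = index_bits(menu_size) - index_bits(list_size)
    return saved_per_area * num_areas - list_cost_bits


# Menu of 3 schemes, list of 2 schemes, 500 image areas in the slice,
# and a hypothetical 3-bit list (one presence bit per menu scheme).
print(net_saving_bits(menu_size=3, list_size=2, num_areas=500, list_cost_bits=3))
```

Under these assumptions one bit is saved per image area, so the saving grows with the number of image areas while the cost of the list stays constant.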
- the encoder selected two or more available schemes from a wider menu.
- the encoder selects a single available scheme for a slice from the wider menu S1-S3.
- all image areas of the slice must use the same scheme, but the scheme for one slice may be different from the scheme for another slice.
- the syntax element sao_scheme_index is just required for the slice, rather than for each image area, which leads to very light signalling.
- An image is generally composed of image parts.
- LCUs Largest Coding Units
- SAO selection edge, band, no SAO filtering
- SAO_eo_class for edge
- SAO_band_position for band
- signalling of SAO parameters was at the LCU level.
- the level of SAO parameter signalling is a compromise between the rate consumed by the SAO parameters and the distortion improvement obtained by having SAO parameters at the level concerned.
- Lower levels of SAO parameters for example at each quadtree partition level, i.e. sub-LCU
- Higher levels of SAO parameters for example at the slice or frame level, i.e. multi-LCU
- the compromise selected (designed into the encoder and decoder) was finally to have SAO parameters at the LCU level.
- in VVC, currently in the course of standardisation, instead of using a fixed compromise it is contemplated to make two or more different compromises available, and the encoder may then select one compromise for a slice or frame.
- Coding Tree Units or CTUs
- image parts although as described later it is possible to have image parts at the sub-CTU level too.
- CTUs Coding Tree Units
- signalling of SAO parameters is at the CTU level, i.e. each different CTU can have its own SAO parameters.
- a given LCU could inherit the SAO parameters of a neighbouring LCU above or to the left of the given LCU in order to avoid signalling separate sets of SAO parameters for the two LCUs.
- a flag sao_merge_left was used to signal that the SAO parameters of the left LCU are used for the given LCU.
- a flag sao_merge_up was used to signal that the SAO parameters of the above LCU are used for the given LCU.
- the signalling of SAO parameters is still at the LCU level in this case because a decision is still made for each LCU whether re-used or new SAO parameters are applied.
- signalling of SAO parameters is at the slice level, i.e. all CTUs of the slice share SAO parameters.
- the encoder first computes a set of SAO parameters to be shared by all CTUs of the image. Then, in the first method, these SAO parameters are set for the first CTU of the slice. For each remaining CTU from the second CTU to the last CTU of the slice, the sao_merge_left flag is set equal to 1 if the flag exists (that is, if the current CTU has a left CTU). Otherwise, the sao_merge_up flag is set equal to 1.
- Figure 23 shows an example of CTUs with SAO parameters set according to the first method.
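The first method of sharing SAO parameters can be sketched as a simple raster-scan assignment. The sketch is a simplification: it assumes the slice covers a complete rectangular grid of CTUs starting at CTU address 0, and the function name and the string labels are hypothetical.

```python
def assign_merge_flags(width_in_ctus, height_in_ctus):
    """Return, per CTU in raster order, how its SAO parameters are set.

    'new'        : the shared SAO parameters are signalled explicitly
    'merge_left' : sao_merge_left is set equal to 1
    'merge_up'   : sao_merge_up is set equal to 1
    """
    decisions = []
    for addr in range(width_in_ctus * height_in_ctus):
        x = addr % width_in_ctus
        if addr == 0:
            decisions.append("new")         # first CTU carries the parameters
        elif x > 0:
            decisions.append("merge_left")  # a left CTU exists
        else:
            decisions.append("merge_up")    # first CTU of a row: merge up
    return decisions


print(assign_merge_flags(3, 2))
```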
- This method has the advantage that no signalling of the grouping to the decoder is required. Also, no changes to the decoder are required to introduce the groupings and only the encoder is changed. The groupings could therefore be introduced in an encoder based on HEVC without modifying the HEVC decoder. Surprisingly, groupings do not increase the rate too much. This is because the merge flags are generally CABAC coded in the same context. Since for the second group (entire image) these flags all have the same value (1), the probability of that value is 1 and the rate consumed by these flags is very low.
- In the second method of making all CTUs of the image share the same SAO parameters, the grouping is signalled to the decoder in the bitstream.
- the SAO parameters are also signalled as SAO parameters for the group (whole image), for example in the slice header.
- the signalling of the grouping consumes bandwidth.
- the merge flags can be dispensed with, saving the rate related to the merge flags, so that overall the rate is reduced.
- the encoder selects either the first grouping 1201 or the second grouping 1202 for the slice.
- the selected grouping is signalled to the decoder if the second method of sharing SAO parameters is used.
- the slice header may contain a new syntax element sao_grouping_index.
- Figure 24 is a flow chart illustrating steps carried out by an encoder to determine SAO parameters for the first grouping 1201 (determination of SAO parameters at the CTU level).
- the process starts with a current CTU (1101).
- First the statistics for all possible SAO types and classes are accumulated in the variable CTUStats (1102).
- the process of Step 1102 is described below with reference to Figure 25.
- the RD cost for the SAO merge Left is evaluated if the Left CTU is in the current Slice (1103), as is the RD cost of the SAO merge Up (1104). Thanks to the statistics in CTUStats (1102), new SAO parameters are evaluated for Luma (1105) and for both Chroma components (1109).
- Figure 25 is a flow chart illustrating steps of an example of a statistics computed at the encoder side that can be applied for the Edge Offset type filter.
- Figure 25 illustrates the setting of the variable CTUStats containing all the information needed to derive the best rate-distortion offset for each class. Moreover, it illustrates the selection of the best SAO parameters set for the current CTU.
- for each colour component, each SAO type is evaluated.
- the variables Sumj and SumNbPixj are set to zero in an initial step 801.
- the current frame area 803 contains N pixels.
- j is the current range number to determine the four offsets (related to the four edge indexes shown in Figure 7B for Edge Offset type) or to determine the six offsets (related to
- in step 802, a variable i, used to successively consider each pixel Pi of the current frame area, is initialised.
- in step 805, the class of the current pixel is determined by checking the conditions defined in Figure 7B or 13. Then a test is performed in step 806 to determine whether the class of the pixel Pi is one to which no offset applies.
- If the outcome is positive, then the value “i” is incremented in step 808 in order to consider the next pixel of the frame area 803.
- if the outcome is negative in step 806, the next step is 807 where the related Sumj and SumNbPixj are updated for the class determined in step 805.
- in step 810, the variable CTUStats for the current colour component X, the SAO type SAO_type and the current class j is set equal to Sumj for the first value and SumNbPixj for the second value.
- the offset J is an integer value.
- the ratio defined in this formula may be rounded, either to the closest value or using the ceiling or floor function.
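The accumulation of the per-class statistics and the derivation of the offset as a rounded ratio can be sketched as follows. The sketch assumes that the first statistic accumulates the difference between each original sample and its decoded value, which is the usual convention for SAO encoders but is not spelled out in the text here; the function names are hypothetical.

```python
from collections import defaultdict


def accumulate_stats(original, decoded, classes, no_offset_class):
    """Return {class: [sum_of_differences, number_of_pixels]}.

    Assumption: the difference is original minus decoded, so a
    positive sum means the decoded samples should be raised.
    Samples in `no_offset_class` contribute no statistics.
    """
    stats = defaultdict(lambda: [0, 0])
    for org, rec, cls in zip(original, decoded, classes):
        if cls == no_offset_class:
            continue
        stats[cls][0] += org - rec
        stats[cls][1] += 1
    return dict(stats)


def optimal_offset(sum_j, sum_nb_pix_j):
    """Offset minimising distortion: the rounded ratio Sum / SumNbPix."""
    if sum_nb_pix_j == 0:
        return 0
    return round(sum_j / sum_nb_pix_j)  # floor or ceiling could be used instead


stats = accumulate_stats([10, 12, 9, 7], [8, 11, 9, 9], [0, 0, 6, 1], no_offset_class=6)
print(stats)
print({cls: optimal_offset(s, n) for cls, (s, n) in stats.items()})
```

Note that Python's `round` rounds exact halves to the nearest even integer; an encoder that needs a specific tie rule would implement it explicitly.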
- Each offset J is an optimal offset Ooptj in terms of distortion
- the encoder uses the statistics set in table CTUStats.
- the distortion can be obtained by the following formula:
- variable Shift is designed for a distortion adjustment.
- the distortion should be negative as SAO is a post filtering.
- the same computing is applied for Chroma components.
- the Lambda of the Rate distortion cost is fixed for the three components. For SAO parameters merged with the left CTU, the rate is only 1 flag, which is CABAC coded.
- the rate distortion value Jj is initialized to the maximum possible value. Then a loop on Oj from Ooptj to 0 is applied in step 902. Note that Oj is modified by 1 at each new iteration of the loop. If Ooptj is negative, the value Oj is incremented and if Ooptj is positive, the value Oj is decremented.
- the rate distortion cost related to Oj is computed in step 903 according to the following formula:
- J(Oj) = SumNbPixj x Oj x Oj - Sumj x Oj x 2 + λ R(Oj), where λ is the Lagrange parameter and R(Oj) is a function which provides the number of bits needed for the code word associated with Oj.
- This algorithm of Figures 25 and 26 provides a best ORDj for each class j. This algorithm is repeated for each of the four directions of Figure 7A ( Figure 13 also has four directions). Then the direction that provides the best rate distortion cost (sum of Jj for each direction) is selected as the direction to be used for the current CTU.
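The search for the rate-distortion offset of one class, stepping from the distortion-optimal offset towards 0, can be sketched as below. The rate function passed in the example is a placeholder assumption (the actual code word lengths are not given here), the loop is assumed to include the value 0, and the function names are hypothetical.

```python
def rd_cost(offset, sum_j, sum_nb_pix_j, lam, rate_bits):
    """J(Oj) = SumNbPixj*Oj*Oj - Sumj*Oj*2 + lambda*R(Oj)."""
    return (sum_nb_pix_j * offset * offset
            - sum_j * offset * 2
            + lam * rate_bits(offset))


def best_rd_offset(o_opt, sum_j, sum_nb_pix_j, lam, rate_bits):
    """Scan offsets from o_opt to 0 (inclusive) and keep the cheapest."""
    best_cost = float("inf")   # initialised to the maximum possible value
    best_offset = 0
    step = -1 if o_opt > 0 else 1
    offset = o_opt
    while True:
        cost = rd_cost(offset, sum_j, sum_nb_pix_j, lam, rate_bits)
        if cost < best_cost:
            best_cost, best_offset = cost, offset
        if offset == 0:
            break
        offset += step         # move one unit towards 0
    return best_offset, best_cost


def toy_rate(offset):
    """Placeholder rate model: longer code words for larger offsets."""
    return abs(offset) + 1


print(best_rd_offset(3, 30, 10, 0, toy_rate))
print(best_rd_offset(3, 30, 10, 100, toy_rate))
```

With a Lagrange parameter of 0 the distortion-optimal offset is kept; with a large Lagrange parameter the search falls back to a smaller, cheaper offset. Summing the best costs over the classes of one direction gives the cost used to compare directions.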
- CTUStats table in the case of determining the SAO parameters at the CTU level is created by the process of Figure 24. This corresponds to evaluating the CTU level in terms of the rate-distortion compromise. The evaluation may be performed for the whole image or for just the current slice.
- the determination is done for a whole image and all CTUs of the slice/frame share the same SAO parameters.
- Figure 27 is an example of the setting of SAO parameters for a frame/slice level using the first method of sharing SAO parameters (i.e. without new SAO classifications at encoder side).
- This Figure is based on Figure 17.
- the CTUStats table is set for each CTU (in the same way as the CTU level encoding choice).
- This CTUStats can be used for the traditional CTU level (1302).
- the table FrameStats is set by adding each value for all CTUs of the table CTUStats (1303).
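Setting FrameStats by adding the values of CTUStats over all CTUs can be sketched as follows. The layout of the tables is an assumption made for illustration: each table is modelled as a dictionary keyed by a hypothetical (component, SAO type, class) tuple and holding a (sum, number of pixels) pair.

```python
def sum_stats(ctu_stats_list):
    """Add per-CTU statistics entry by entry into one table."""
    frame_stats = {}
    for ctu_stats in ctu_stats_list:
        for key, (sum_j, sum_nb_pix_j) in ctu_stats.items():
            acc = frame_stats.setdefault(key, [0, 0])
            acc[0] += sum_j
            acc[1] += sum_nb_pix_j
    return frame_stats


ctu_a = {("Y", "EO_0", 1): (5, 2)}
ctu_b = {("Y", "EO_0", 1): (-1, 3), ("Y", "EO_0", 4): (2, 1)}
print(sum_stats([ctu_a, ctu_b]))
```

Because only additions of values already computed per CTU are involved, no further classification of samples is needed to evaluate the frame level.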
- the same process as for CTU level is applied to find the best SAO parameters (1305 to 1315).
- the selected SAO parameters set at step 1315 is set for the first CTU of the slice/frame. Then for each CTU from the second CTU to the last CTU of the slice/frame, the sao_merge_left_flag is set equal to 1 if it exists otherwise the sao_merge_up_flag is set equal to 1 (indeed for the second CTU to the last CTU a merge Left or Up or both exist) (1317).
- the syntax of the SAO parameters set is unchanged from that presented in Figure 9. At the end of the process the SAO parameters are set for the whole slice/frame.
- the two evaluations are then compared and the one with the best performance is selected.
- the selected grouping (CTU level or frame level) is then signalled to the decoder in the bitstream if necessary, for example using a new syntax element sao_grouping_index.
- Figure 28 illustrates the competition between the CTU level for SAO and the slice or frame level for SAO at the encoder side.
- the current slice/frame 1901 is used to set the CTUStats table (1903) for each CTU (1902). This table (1903) is used to evaluate the CTU level (1904) and the frame or slice level (1905) as described previously.
- the best grouping (level of SAO parameters) for the slice or frame is selected according to the rate distortion evaluation results for the two groupings (1910).
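The selection in step 1910 amounts to keeping the grouping with the lowest rate-distortion cost. The sketch below is illustrative only; the function name, the grouping labels and the cost values are hypothetical.

```python
def select_grouping(rd_costs):
    """Return the grouping whose RD cost is the lowest.

    `rd_costs` maps a grouping label to the RD cost obtained when
    the whole slice or frame is evaluated with that grouping.
    """
    return min(rd_costs, key=rd_costs.get)


costs = {"ctu_level_1201": 1250.0, "frame_level_1202": 1180.5}
print(select_grouping(costs))
```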
- the SAO parameters sets for each CTU are set (1911) according to the grouping selected in step 1910. These SAO parameters are then used to apply the SAO filtering (1913) in order to obtain the filtered frame/slice.
- the ninth embodiment has the advantage of offering a coding efficiency increase.
- a second advantage, obtained when the first method of sharing SAO parameters within the group is used but not when the second method is used, is that this competition method doesn’t require any additional SAO filtering or classification.
- the main impacts on encoder complexity are the step 1902 which needs SAO classification for all possible SAO types and the step 1913 which filters the samples.
- the frame level evaluation is only some additions of values already obtained during the CTU level encoding choice (set in the table CTUStats).
- a level of the SAO parameters is determined based on an RD criterion.
- Figure 29 is a flow chart illustrating a decoding process when the CTU grouping is signaled in the slice header according to the second method of sharing SAO parameters among the CTUs of the group.
- the flag SaoEnabledFlag is extracted from the bitstream (1801). If SAO is not enabled, the next slice header syntax element is decoded (1807) and SAO will not be applied to the current slice. Otherwise the decoder extracts 1 bit from the slice header (1803).
- the extracted bit is a CTU grouping index (1804) which identifies the CTU grouping method (1805) selected by the encoder.
- the grouping method may be either the first grouping 1201 (SAO parameters at the CTU level) or the second grouping 1202 (SAO parameters at the frame or slice level). This grouping method will be applied to extract the SAO syntax and to determine the SAO parameters set for each CTU (1806). Then the next slice header syntax element is decoded.
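The slice-header part of this decoding process can be sketched as a small parser. The sketch is a simplification made under stated assumptions: the header is modelled as a list of already-decoded bits containing only the two fields discussed here, and the mapping of index 0 to the first grouping 1201 and index 1 to the second grouping 1202 is hypothetical.

```python
# Hypothetical mapping from the 1-bit grouping index to a grouping.
GROUPINGS = {
    0: "CTU level (1201)",
    1: "slice or frame level (1202)",
}


def parse_sao_slice_header(bits):
    """Read SaoEnabledFlag and, if SAO is enabled, the grouping index."""
    reader = iter(bits)
    sao_enabled = next(reader) == 1          # step 1801
    if not sao_enabled:
        # SAO is not applied to the current slice; go on to step 1807.
        return {"sao_enabled": False, "grouping": None}
    grouping_index = next(reader)            # steps 1803 and 1804
    return {"sao_enabled": True, "grouping": GROUPINGS[grouping_index]}


print(parse_sao_slice_header([0]))
print(parse_sao_slice_header([1, 1]))
```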
- the two groupings have different schemes.
- the HEVC scheme S2 is used for the CTU-level grouping 1201 and another scheme, for example the scheme S1 of Figure 13, is used for the slice-level or frame-level grouping 1202.
- this approach corresponds to the first embodiment because the groupings 1201 and 1202 have different numbers of samples per image area.
- the whole-slice or whole-frame grouping 1202 corresponds to a larger image area IA1 and the CTU-level grouping 1201 corresponds to a smaller image area IA2.
- the sixth and other embodiments with two or more available schemes for at least one image area can also be modified for the groupings.
- the schemes S1 and S2 may both be available for the frame-level or slice-level grouping 1202 but only the scheme S2 may be available for the CTU-level grouping 1201.
- Figure 30 shows an example of a decoding process when at least one grouping has two or more available schemes. The process in Figure 30 is applied slice by slice.
- in step 3102, a flag SaoEnabledFlag in the slice header is read and tested. If the flag is 0, SAO filtering is disabled for the slice and the process moves to the next syntax element in the same slice header (step 3107).
- N may be 1.
- a default list of schemes may exist for a type of image area.
- a default list of schemes may exist for a type of grouping.
- in step 3109, it is checked whether the selected grouping is one for which a default list of schemes exists. If so, the default list is used and processing moves to step 3105. Otherwise, a list of schemes is read from the bitstream in step 3108 (M bits).
- the scheme selected by the encoder for each image area of the slice is obtained from the bitstream, for example the syntax element sao_scheme_index is obtained.
- the selected scheme is identified from among the list of available schemes for the selected grouping.
- the different levels of SAO parameters provided by the groupings 1201 and 1202 offer different RD compromises. Accordingly, it is not necessary to adapt the scheme for classifying samples according to the grouping.
- the HEVC scheme of Figures 7A and 7B could be used for both groupings.
- the same scheme is used for luma and chroma components but the luma grouping may be different from the chroma grouping.
- the SAO parameters may be at different levels for luma and chroma. This may be useful because in an image each color component doesn’t need the same amount of data to obtain a satisfactory SAO filtering result.
- the luma component should have more data to obtain satisfactory SAO filtering results than each chroma component.
- one grouping index is transmitted for the Luma component and another grouping index is transmitted for both Chroma components.
- where a CTU means a coding tree block for the 3 color components, the grouping is not a CTU grouping for all 3 color components but is modified to be one grouping of image parts for luma and another grouping of image parts for chroma.
- in a first variant, illustrated in Figure 31, there is a common chroma grouping for both chroma components (U and V components, sometimes called Cr and Cb components instead of U and V components). This figure is based on Figure 29.
- a loop on Luma and both Chroma components is added (2708, 2709) in order to extract one grouping index (2704) for Luma and one for both Chroma components.
- This index (2704) is used to extract the SAO parameters set syntax (2706).
- Figure 32 illustrates the impact of this embodiment on the decoding process of some syntax elements for a group.
- Figure 32 is based on Figure 9 which illustrates the SAO merge principle.
- the process receives the value X (2801) representing the Luma component or both Chroma components.
- the loop on components has been removed compared to Figure 9. So the Merge process is only for variable X and not for the 3 components.
- step 2702 is after the loop on components (2708) in Figure 31. This reduces the rate when only Luma or only Chroma components need to be filtered.
- the 3 colour components are separated and each has its own SAO syntax.
- X can be Luma, Chroma U or Chroma V.
- the advantage of this embodiment is more flexibility for the SAO filtering design. This is also suitable for an RGB colour representation.
- the advantage of this embodiment is a coding efficiency increase. Indeed, because the Luma component contains more information than the Chroma components, a group for Luma should contain a smaller number of samples than a group for each Chroma component (for 4:2:0). It is therefore worthwhile to separate these colour components.
- the encoder selects a grouping of image parts for luma and a grouping of image parts for both chroma components (or for each chroma component separately) and signals the selected groupings to the decoder.
- the frame-level or slice-level grouping 1202 is the only grouping available for chroma and the CTU-level grouping 1201 is the only grouping available for luma. In this case, the selected grouping does not need to be signalled.
- one or both groupings could have two or more available schemes with one scheme being selected, as illustrated in Figure 33. If two or more schemes are available for the frame-level or slice-level grouping 1202 the scheme may be selected for all CTUs of the slice or frame or for each CTU (using per-CTU signalling of sao_scheme_index for example). If two or more schemes are available for the CTU-level grouping 1201 the scheme is selected for each CTU (using per-CTU signalling of sao_scheme_index for example). A list of available schemes may be signalled, as shown schematically in Figure 33, or the list may be predetermined, in which case no signalling of a list is needed.
Fourteenth Embodiment
- the two groupings 1201 and 1202 in the eighth embodiment constitute extreme compromises.
- a third grouping 1203 makes a column of CTUs a group.
- Figure 34 is an example of the setting of SAO parameters sets for the third grouping 1203 at the encoder side.
- This Figure is based on Figure 27.
- the modules 1105 to 1115 have been merged in one step 1405 in this Figure 34.
- the CTUStats table is set for each CTU. This CTUStats can be used for the traditional CTU level (1302) encoding choice.
- the table ColumnStats is set by adding each value (1405) from CTUStats (1402), for each CTU of the current column (1404). Then the new SAO parameters are determined as for the CTU level (1406) encoding choice (cf. Figure 24).
- the RD cost to share the SAO parameters with the previous left column is also evaluated (1407), in the same way as the sharing of an SAO parameters set between left and up CTUs (1103, 1104) is evaluated. If the sharing of SAO parameters gives a better RD cost (1408) than the RD cost for the new SAO parameters set, sao_merge_left_flag is set equal to 1 for the first CTU of the column. This CTU has the address number equal to the value “Column”. Otherwise, the SAO parameters set for this first CTU of the column is set equal (1409) to the new SAO parameters obtained in step 1406.
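The column-level accumulation and the merge decision described above can be sketched as follows. This is a minimal illustration, not the encoder's actual data layout: the flat list of counters per CTU, the raster-order CTU addressing and the function names `accumulate_column_stats` and `first_ctu_decision` are all assumptions made for the example.

```python
def accumulate_column_stats(ctu_stats, ctus_per_row):
    """Sum the per-CTU counters over every CTU column.

    ctu_stats is indexed by CTU address in raster order; each entry is a
    flat list of counters (hypothetical layout of the CTUStats table).
    """
    num_counters = len(ctu_stats[0])
    column_stats = [[0] * num_counters for _ in range(ctus_per_row)]
    for address, counters in enumerate(ctu_stats):
        column = address % ctus_per_row  # column of this CTU
        for k, value in enumerate(counters):
            column_stats[column][k] += value
    return column_stats


def first_ctu_decision(column, new_params_cost, merge_left_cost):
    """Choose between merging with the left column and new parameters.

    The first column has no left neighbour, so it always uses new parameters.
    """
    if column > 0 and merge_left_cost < new_params_cost:
        return {"sao_merge_left_flag": 1}
    return {"sao_merge_left_flag": 0, "use_new_params": True}


stats = [[1, 2], [3, 4], [5, 6], [7, 8]]  # 2x2 CTUs, 2 counters each
print(accumulate_column_stats(stats, ctus_per_row=2))
print(first_ctu_decision(column=1, new_params_cost=10.0, merge_left_cost=9.0))
```

Because the column statistics are plain sums of values already gathered per CTU, no additional sample classification is needed to evaluate the column grouping.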
- step 1412 can be processed once per frame.
- CTU grouping is another RD compromise between the CTU level encoding choice and the frame level which can be useful for some conditions.
- merge flags are used within the group, which means that the third grouping can be introduced without modifying the decoder (i.e. the grouping can be HEVC-compliant).
- the second method of sharing SAO parameters described in the third embodiment can be used instead. In that case, merge flags are not used in the group (CTU column) and steps 1411 and 1412 are omitted.
- the merge between columns does not need to be checked. This means that steps 1407, 1408 and 1410 are removed from the process of Figure 34.
- the advantage of removing this possibility is a simplification of the implementation and the ability to parallelize the process. This has a small impact on coding efficiency.
- Another possible compromise intermediate between the CTU level and the frame level can be offered by a fourth grouping 1204 in Figure 22 which makes a line of CTUs a group.
- a similar process to that of Figure 34 can be applied. In that case, the variable ColumnStats is replaced by LineStats.
- the new SAO parameters and the merge with the up CTU are evaluated based on this LineStats table (steps 1406 and 1407).
- the step 1410 is replaced by setting sao_merge_up_flag to 1 for the first CTU of the line. For all CTUs of the slice/frame except the first CTU of each line, sao_merge_left_flag is set equal to 1.
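The flag pattern for the line grouping can be sketched as below. This is only an illustration under stated assumptions: the function name `line_grouping_flags`, the dictionary representation of the flags and the `merge_with_line_above` input (the outcome of the RD comparison for each line) are hypothetical.

```python
def line_grouping_flags(ctus_per_row, num_rows, merge_with_line_above):
    """Return one flag dictionary per CTU, in raster order.

    merge_with_line_above[row] is True when sharing the parameters of the
    line above gave the better RD cost for that row (assumed input).
    """
    flags = []
    for row in range(num_rows):
        for col in range(ctus_per_row):
            if col > 0:
                # Every CTU except the first of its line copies its left neighbour.
                flags.append({"sao_merge_left_flag": 1})
            elif row > 0 and merge_with_line_above[row]:
                # First CTU of the line re-uses the parameters of the line above.
                flags.append({"sao_merge_left_flag": 0, "sao_merge_up_flag": 1})
            else:
                # First CTU of the line carries a new SAO parameters set.
                flags.append({"sao_merge_left_flag": 0, "sao_merge_up_flag": 0})
    return flags


for entry in line_grouping_flags(3, 2, [False, True]):
    print(entry)
```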
- the advantage of the line is another RD compromise between the CTU level and Frame level. Please note that the frame or slice are most of the time rectangles and their width is larger than their height. So the line CTUs grouping 1204 is expected to be an RD compromise closer to the frame CTU grouping 1202 than the column CTU grouping 1203.
- the line CTU grouping can be HEVC-compliant if the merge flags are used within the groups.
- RD compromises can be offered by putting two or more columns of CTUs or two or more lines of CTUs together as a group.
- the process of Figure 34 can be adapted to determine SAO parameters to such groups.
- the number N of columns or lines in a group may depend on the number of groups that are targeted.
- the merge between these groups containing two or more columns or two or more lines doesn’t need to be evaluated.
- Another possible grouping includes split columns or split lines, where the split is tailored to the current slice/frame.
- the grouping 1205 makes 2x2 CTUs a group.
- the grouping 1206 makes 3x3 CTUs a group.
- Figure 35 shows an example of how to determine the SAO parameters for such groupings.
- the table NxNStats (1507) is set (1504, 1505, 1506) based on CTUstats. This table is used to determine the New SAO parameters (1508) and its RD cost, in addition to the RD cost for a Left (1510) sharing or Up (1509) sharing of SAO parameters. If the Best RD cost is the new SAO parameters (1511), the SAO parameters of the first CTU (top left CTU) of the NxN group is set equal to this new SAO parameters (1514).
- the sao_merge_up_flag of the first CTU (top left CTU) of the NxN group is set equal to 1 and the sao_merge_left_flag to 0 (1515).
- the sao_merge_left_flag of the first CTU (Top left CTU) of the NxN group is set equal to 1 (1516). Then the sao_merge_left_flag and sao_merge_up_flag are set correctly for the other CTUs of the NxN group in order to form the SAO parameters for the current NxN group (1517).
- Figure 36 illustrates this setting for a 3x3 SAO group.
- the top left CTU is set equal to the SAO parameters determined in step 1508 to 1516.
- the sao_merge_left_flag is set equal to 1.
- the sao_merge_left_flag is the first flag encoded or decoded and, as it is set to 1, there is no need to set the sao_merge_up_flag to 0.
- the sao_merge_left_flag is set equal to 0 and sao_merge_up_flag is set equal to 1.
- the sao_merge_left_flag is set equal to 1.
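One plausible reading of the flag settings above for an NxN group is sketched here. The assignment of each setting to a CTU position (top row after the first CTU merges left, first CTU of each lower row merges up, all remaining CTUs merge left) is an assumption, since the text above does not name the positions; the function names are hypothetical.

```python
def nxn_merge_pattern(n):
    """Return an n x n grid describing where each CTU gets its parameters.

    "group": top left CTU, carries the parameters chosen for the group
    "left":  sao_merge_left_flag = 1
    "up":    sao_merge_left_flag = 0 and sao_merge_up_flag = 1
    """
    pattern = []
    for row in range(n):
        line = []
        for col in range(n):
            if row == 0 and col == 0:
                line.append("group")
            elif col > 0:
                line.append("left")
            else:
                line.append("up")
        pattern.append(line)
    return pattern


def resolves_to_top_left(pattern):
    """Check that following the merge flags always ends at the top left CTU."""
    n = len(pattern)
    for row in range(n):
        for col in range(n):
            r, c = row, col
            while pattern[r][c] != "group":
                if pattern[r][c] == "left":
                    c -= 1
                else:
                    r -= 1
            if (r, c) != (0, 0):
                return False
    return True


for line in nxn_merge_pattern(3):
    print(line)
print(resolves_to_top_left(nxn_merge_pattern(3)))
```

With this pattern every CTU of the group inherits the parameters of the top left CTU through ordinary merge flags, which is why such a grouping needs no decoder change.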
- The advantage of the NxN CTU groupings is to create several RD compromises for SAO. As for the other groupings, these groupings can be HEVC-compliant if merge flags within the groups are used. As for the other groupings, the test of Merge left and Merge up between groups can be dispensed with in Figure 35. So steps 1509, 1510, 1512, 1513, 1515 and 1516 can be removed, especially when N is high.
- the value N depends on the size of the frame/slice.
- N = 2 and N = 3 are evaluated. This offers an efficient compromise.
- the possible groupings are in competition with one another for the grouping (level of SAO parameters) to be selected for the current slice.
- Figure 37 illustrates an example of how to select the SAO parameter derivation using a rate-distortion compromise comparison.
- the first method of sharing SAO parameters among the CTUs of a group is used. Accordingly, merge flags are used within groups. If applied to HEVC, the resulting bitstream can be decoded by an HEVC-compliant decoder.
- the current slice/frame 1701 is used to set the CTUStats table (1703) for each CTU (1702).
- This table (1703) is used to evaluate the CTU level (1704), the frame/slice grouping (1705), the column grouping (1706), the line grouping (1707), the 2x2 CTUs grouping (1708), the 3x3 CTUs grouping (1709) or any of the other CTU groupings described previously.
- the best CTUs grouping is selected according to the rate distortion criterion computed for each grouping (1710).
- the SAO parameters sets for each CTU are set (1711) according to the grouping selected in step 1710. These SAO parameters are then used to apply the SAO filtering (1713) in order to obtain the filtered frame/slice.
- the second method of sharing SAO parameters among the CTUs of the CTU grouping may be used instead of the first method. Both methods have the advantage of offering a coding efficiency increase.
- a second advantage, obtained when the first method is used but not when the second method is used, is that this competition method does not require any additional SAO filtering or classification. Indeed, the main impacts on encoder complexity are step 1702, which needs SAO classification for all possible SAO types, and step 1713, which filters the samples. All other CTU grouping evaluations are only additions of values already obtained during the CTU level encoding choice (set in the table CTUStats).
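The competition between groupings can be sketched as a minimum search over rate-distortion costs. The cost form J = D + lambda * R is the usual Lagrangian formulation and is assumed here rather than taken from the text; the grouping names, the numbers and the function names are illustrative only.

```python
def rd_cost(distortion, rate, lam):
    """Lagrangian rate-distortion cost J = D + lambda * R (assumed form)."""
    return distortion + lam * rate


def select_grouping(candidates, lam):
    """Return the grouping name with the lowest RD cost.

    candidates maps a grouping name to a (distortion, rate) pair that the
    encoder is assumed to have derived from the CTUStats table.
    """
    return min(candidates, key=lambda name: rd_cost(*candidates[name], lam))


candidates = {
    "ctu": (100, 50),    # many parameter sets: low distortion, high rate
    "line": (120, 20),
    "frame": (140, 5),   # one parameter set: higher distortion, low rate
}
print(select_grouping(candidates, lam=1.0))
```

Raising lambda favours the larger groupings, which spend fewer bits on SAO parameters; lowering it favours the CTU level.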
- the encoder signals in the bitstream which grouping of the SAO parameters is selected (CTU level, frame level, column, line, 2x2 CTUs, 3x3 CTUs).
- a possible indexing scheme is shown in Table 1 below:
- Figure 38 is a flow chart illustrating a decoding process when the CTU grouping is signaled in the slice header according to the second method of sharing SAO parameters among the CTUs of the group. It is similar to Figure 31 and uses the same reference numerals for corresponding steps but Figure 38 has further groupings in addition to the first and second groupings 1201 and 1202 of Figure 31.
- First the flag SaoEnabledFlag is extracted from the bitstream (1801). If SAO is not enabled, the next slice header syntax element is decoded (1807) and SAO will not be applied to the current slice. Otherwise the decoder extracts N bits from the slice header (1803 A), instead of 1 bit in 1803 in Figure 29. N depends on the number of available CTUs groupings.
- the number of CTUs groupings should be equal to 2 to the power of N.
- the corresponding CTUs grouping index (1804) is used to select the CTUs grouping method (1805). This grouping method will be applied to extract the SAO syntax and to determine the SAO parameters set for each CTU (1806). Then the next slice header syntax element is decoded.
- the CTUs grouping index uses a unary max code in the slice header. In that case, the CTUs groupings are ordered according to their probabilities of occurrences (highest to lowest).
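The two ways of coding the grouping index can be sketched as follows. The unary max code is implemented here as a truncated unary code (a run of ones ended by a zero, with the zero omitted for the largest index), which is an assumption about the exact binarization; the function names are hypothetical.

```python
def fixed_length_bits(num_groupings):
    """Number of bits N needed so that 2**N >= num_groupings."""
    return max(1, (num_groupings - 1).bit_length())


def encode_unary_max(index, max_index):
    """Truncated unary: `index` ones, then a zero unless index == max_index."""
    bits = [1] * index
    if index < max_index:
        bits.append(0)
    return bits


def decode_unary_max(bits, max_index):
    """Return (index, number of bits consumed) from the start of `bits`."""
    index = 0
    while index < max_index and bits[index] == 1:
        index += 1
    consumed = index + (1 if index < max_index else 0)
    return index, consumed


# Six groupings ordered from most to least probable (indices 0..5).
print(fixed_length_bits(6))
for i in range(6):
    print(i, encode_unary_max(i, max_index=5))
```

With the groupings ordered by decreasing probability, the most frequent grouping costs a single bit, whereas the fixed-length code always costs N bits.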
- At least one grouping is an intermediate level grouping (CTU groupings 1203-1206, e.g. columns of CTUs, lines of CTUs, NxN CTUs, etc.).
- SAO parameters not at CTU level or at slice/frame level.
- Each intermediate image area is made up of two or more image parts (CTUs).
- the advantage of the intermediate level grouping(s) is introduction of one or more effective rate-distortion compromises.
- the intermediate level grouping(s) can be used without the CTU-level grouping or without the frame-level grouping or without either of those two groupings.
- the scheme for classifying samples at the frame level was the scheme of Figure 13 having 7 categories and 6 offsets.
- the scheme for the CTU level was the HEVC scheme of Figures 7A and 7B having 5 categories and 4 offsets.
- the edge scheme in this contribution uses the same four directions as the HEVC scheme.
- categories 0, 1, 3 and 4 of the HEVC scheme are kept, each with an offset, but instead of the single remaining category (category 2) in the HEVC scheme there are 5 further categories in this scheme.
- Four of the 5 further categories apply to “two-step edges” (edges where the target pixel is lower than one neighbouring pixel in the direction of interest and higher than the other neighbouring pixel in the direction of interest), as illustrated schematically in Figure 39.
- categories 1 and 3 apply only to one-step edges (edges where the target pixel is different from one neighbouring pixel but the same as the other).
- categories 1 and 3 do not distinguish which side of the target pixel the step is on.
- the step could be on the left or on the right.
- the 4 two-step edge categories enable checking of forward and reverse orders of the neighbouring pixels in each direction. For example, in the 0-degree direction, the left and right neighbours in the forward order are Cn1 and Cn2 respectively but are Cn2 and Cn1 respectively in the reverse order. Then, for the forward order it is determined if the target pixel c is closer to Cn1 than to Cn2 (first further category) or if c is closer to Cn2 than to Cn1 (second further category).
- This scheme may be used for the largest grouping(s), for example the frame-level, slice-level and/or largest intermediate grouping(s).
- this scheme (10 offsets) is used for the whole slice or frame grouping 1202 and the scheme of Figure 39 (8 offsets) is used for an intermediate grouping such as the grouping 1204 (line of CTUs) or 1206 (3x3 CTUs).
Seventeenth Embodiment
- a smaller intermediate grouping such as the grouping 1203 (column of CTUs) or the grouping 1205 (2x2 CTUs) uses the scheme of Figure 13 (6 offsets).
- Figure 13 scheme is used for the smaller intermediate grouping and the HEVC scheme of Figures 7A and 7B (4 offsets) is used for the CTU-level grouping. These two groupings may be the only two groupings or there may be other groupings as well.
- the Figure 13 scheme is used for the smaller intermediate grouping and the scheme of Figure 39 is used for a larger grouping (a larger intermediate grouping or the slice/frame-level grouping).
- the smallest grouping is the first grouping 1201 in which there is one set of SAO parameters per CTU.
- a set of SAO parameters can be applied to a smaller block than the CTU.
- the SAO parameters are not at the CTU level, frame level or an intermediate level between the CTU and frame levels but at a sub-CTU level (a level smaller than an image part).
- Figure 40 illustrates schematically a scheme for classifying samples suitable for a sub-CTU level of SAO parameters.
- This scheme has the same 4 directions as the HEVC scheme of Figures 7A and 7B but only 3 categories are used: the “C < 2 neighbouring pixels”, the “C > 2 neighbouring pixels” (the two extremum classifications), and “none of the above” categories.
- the“none of the above” category has no offset but each of the other two categories has an offset, i.e. two offsets in total.
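The three-category classification can be sketched for one direction as below. The sketch works on a single row of samples in the horizontal (0-degree) direction, leaves the two border samples unfiltered and clips to an 8-bit range; those choices, the category numbering and the function names are assumptions made for the illustration.

```python
def classify(c, n1, n2):
    """Return 1 for a local minimum, 2 for a local maximum, 0 otherwise."""
    if c < n1 and c < n2:
        return 1  # C lower than both neighbouring pixels
    if c > n1 and c > n2:
        return 2  # C higher than both neighbouring pixels
    return 0      # none of the above: no offset


def filter_row(row, offset_min, offset_max, max_value=255):
    """Apply the two offsets along the horizontal direction of one row."""
    out = list(row)
    for x in range(1, len(row) - 1):  # border samples lack one neighbour
        category = classify(row[x], row[x - 1], row[x + 1])
        if category == 1:
            out[x] = min(max(row[x] + offset_min, 0), max_value)
        elif category == 2:
            out[x] = min(max(row[x] + offset_max, 0), max_value)
    return out


print(filter_row([10, 8, 10, 12, 10], offset_min=1, offset_max=-1))
```

Only two offsets have to be signalled per set of SAO parameters, which keeps the signalling cost low for small blocks.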
- the advantage of this scheme is a coding efficiency increase of the SAO parameter signalling when the number of samples associated with the set of SAO parameters is low enough to work satisfactorily with only the 2 classification outcomes.
- each CTU is divided into 16 blocks and each may have its own SAO parameters.
- each CTU is divided into 4 blocks, again each having its own SAO parameters.
- both sub-CTU levels are available and the encoder selects one of them and signals the selected sub-CTU level to the decoder in the bitstream.
- the scheme of Figure 40 may be used for one or both of these sub-CTU levels.
- the levels also include the CTU level or a higher level.
- the higher level may be an intermediate level between the CTU level and the slice/frame level or may be the slice/frame level.
- index 0 means that each CTU is divided into 16 blocks and each may have its own SAO parameters.
- Index 1 means that each CTU is divided into 4 blocks, again each having its own SAO parameters.
- a first set of levels (or groupings) is available for a first image area and a second set of levels (or groupings) is available for a second image area.
- the first image area is a luma image area and the second image area is a chroma image area.
- the first set of levels contains at least one level lower than any of the levels of the second set of levels.
- the available luma levels are lower than the available chroma levels (although one or more levels may be present in both sets). This increases the coding efficiency and reduces the complexity at the encoder side.
- the first image area is a higher-quality image area and the second image area is a lower-quality image area.
- the first set of levels may contain at least one level lower than any of the levels of the second set of levels. In this way, the available levels for a higher-quality image area are lower than the available levels for a lower-quality image area (although one or more levels may be present in both sets). This increases the coding efficiency and reduces the complexity at the encoder side.
- the first image area is a higher-content image area and the second image area is a lower-content image area. Again, the first set of levels may contain at least one level lower than any of the levels of the second set of levels. In this way, the available levels for a higher-content image area are lower than the available levels for a lower-content image area (although one or more levels may be present in both sets). This increases the coding efficiency and reduces the complexity at the encoder side.
- a list of available schemes was signalled by the encoder for at least one grouping.
- the encoder determines two or more available schemes for each slice and signals the available schemes to the decoder, for example in the slice header. No grouping index is signalled in this case but a syntax element sao_scheme_index is signalled for each CTU. This may also be in the slice header. As described previously, the syntax element sao_scheme_index may be shorter when the number of available schemes is fewer than the number in the wider menu. Incidentally, instead of using the single parameter sao_scheme_index it is possible to signal the scheme using one or several parameters characterizing the SAO scheme, such as the direction and/or the number of offsets.
- Figure 41 illustrates an example of the decoding process for a slice in the twenty-second embodiment.
- the list of available schemes is read from the slice header in step 3212. Then, for each CTU of the slice in turn, it is checked if the merge_left flag is available (step 3213). This flag is not available if the current CTU is the first CTU of a line. If it is available, it is checked (step 3203). If set, for all color components, the SAO parameters of the left CTU are re-used as the SAO parameters of the current CTU (step 3204). If the merge_left flag is not available (“No” in step 3213) it is checked if the merge_up flag is available (step 3214).
- This flag is not available if the current CTU is in the first line of the slice. If it is available (“Yes” in step 3214), it is checked (step 3205). If set, for all color components the SAO parameters of the upper CTU are re-used as the SAO parameters of the current CTU (step 3206).
- the processing reaches step 3201.
- the luma component Y is processed first.
- the SAO parameters for the luma component of the current CTU are read from the bitstream in step 3207.
- the syntax element sao_scheme_index for the current CTU is used to read the SAO parameters.
- the two chroma components U and V are processed together.
- the SAO parameters for the chroma components of the current CTU are read from the bitstream in step 3207.
- the syntax element sao_scheme_index for the current CTU is used to read the SAO parameters.
- the available schemes for luma may be different from the available schemes for chroma. Accordingly, the sao_scheme_index for luma may be different from the sao_scheme_index for chroma.
- in step 3211 it is checked whether the SAO parameters for all components of the current CTU have been read and, if so, the components are filtered using the SAO parameters concerned (step 3208). Processing then moves to the next CTU of the slice.
- the type of SAO filtering was edge filtering. However, all of the preceding embodiments are equally applicable when the type of filtering is band filtering.
- Band-type filtering divides a full range of sample values (e.g. 0 to 255) into bands and classifies pixels using the bands.
- the band-type filtering in HEVC uses a scheme for classifying samples which has 32 bands and selects a group of 4 successive bands for each image area to be filtered. The range covered by the group is one-eighth of the full range. There is one offset value for each of the 4 bands of the selected group.
- An HEVC encoder evaluates all possible 4-band groups and selects one group, for example the group which provides the best filtering result in terms of a rate-distortion criterion. The position of the first band of the selected group is signalled in the bitstream using an SAO parameter sao_band_position.
- the encoder signals four offset values in the bitstream. These are the offset values for the four bands of the selected group respectively. No offset value is signalled for any band outside the selected group (offset value assumed to be 0).
- the decoder uses the same scheme as the encoder to classify reconstructed pixels into bands. The group of 4 bands is determined by the received position parameter. If the reconstructed pixel is in one of the 4 bands of the selected group the decoder applies to the reconstructed pixel the appropriate one of the 4 offset values for the band of the reconstructed pixel concerned.
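The decoder-side band filtering described above can be sketched as follows. The sketch assumes 8-bit samples by default, derives the band from the five most significant bits, clips the result to the sample range and does not model any wrap-around of the group past the last band; the function name and argument layout are hypothetical.

```python
def apply_band_offset(samples, band_position, offsets, bit_depth=8):
    """Add an offset to every sample that falls in the selected group of bands.

    band_position: index (0..31) of the first band of the group
    offsets:       one offset per band of the group (4 in the HEVC scheme)
    """
    shift = bit_depth - 5               # 32 bands over the full range
    max_value = (1 << bit_depth) - 1
    out = []
    for value in samples:
        k = (value >> shift) - band_position  # position inside the group
        if 0 <= k < len(offsets):
            value = min(max(value + offsets[k], 0), max_value)
        out.append(value)               # samples outside the group are unchanged
    return out


# Group starting at band 8 covers sample values 64..95 for 8-bit video.
print(apply_band_offset([0, 64, 71, 72, 95, 96, 255], 8, [1, 2, 3, 4]))
```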
- the next step involves finding the best position of the SAO band position of Figure 8.
- Test 1008 checks whether or not the loop on the 32 positions has ended. If not, the process continues in step 1002, otherwise the encoding process returns the best band position as being the current value of sao_band_position (1009).
- Two groups of group range one-quarter of the full range (i.e. group range 64 sample values when the full range is from 0 to 255) were proposed, one with 8 bands of band size 8 sample values, and the other with 16 bands of band size 4 sample values.
- Three groups of group range one-eighth of the full range (i.e. group range 32 sample values when the full range is from 0 to 255) were proposed, one with 4 bands of band size 8 sample values, another with 8 bands of band size 4 sample values, and the last with 16 bands of band size 2 sample values.
- Three groups of group range one-sixteenth of the full range (i.e. group range 16 sample values when the full range is from 0 to 255) were proposed, one with 2 bands of band size 8 sample values, another with 4 bands of band size 4 sample values, and the last with 8 bands of band size 2 sample values.
- the encoder selects a best group for band filtering an image part (LCU), the group having one of the four group ranges, one of the band sizes (for groups with group ranges smaller than one-half), and one of the 32, 64 or 128 positions according to the band size.
- the selection is made using a rate-distortion criterion.
- the selected group is signalled using an SAO type index parameter sao_type_idx having different values corresponding to the different group range and band size combinations.
- the number of offsets is 2, 4 or 8 depending on the combination (one offset per band in the group). These offsets are also determined by the encoder and signalled to the decoder. The selected position is also signalled in the bitstream as in the HEVC band scheme.
- the decoder identifies the group selected by the encoder using the received SAO type index parameter and, if the target pixel is in one of the bands of the group, applies the offset value for the band concerned to the target pixel.
- the schemes correspond to the levels (or groupings) as set out in Table 4 below.
- Figure 43 shows 16 different groups. Each group has an associated group index from 1 to 16.
- Groups with index values from 1 to 4 have 2, 4, 8 and 16 bands respectively and the full range is divided into 32 possible bands, each of band size 8. 32 different positions are available for each group.
- Groups with index values from 5 to 8 have 2, 4, 8 and 16 bands respectively and the full range is divided into 64 possible bands, each of band size 4. 64 different positions are available for each group.
- Groups with index values from 9 to 12 have 2, 4, 8 and 16 bands respectively and the full range is divided into 128 possible bands, each of band size 2. 128 different positions are available for each group.
- Groups with index values from 13 to 16 have 2, 4, 8 and 16 bands respectively and the full range is divided into 256 possible bands, each of band size 1. 256 different positions are available for each group.
- the groups for index values 1 to 4, 6 to 8, 10 and 11 are illustrated schematically in the final column in Figure 43. The groups for other index values are not shown but can easily be understood from the illustrated examples.
- the scheme is “1 subdivision 4 offsets”. This corresponds to group index 2 only. Incidentally, this is the HEVC band scheme.
- the scheme is “2 subdivisions Max 8 offsets”. This scheme corresponds to group indices 1, 2, 3, 5, 6, and 7. Group index 4 is excluded because it has 16 offsets and 8 is the maximum in this scheme.
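The relationship between a group index of Figure 43 and its number of bands, band size and number of positions can be sketched as below, together with the filtering that yields the group indices of a scheme. Reading “2 subdivisions” as “the two coarsest band sizes (8 and 4)” is an interpretation that reproduces the indices listed above, not something the text states; the function names are hypothetical and a full range of 256 sample values is assumed.

```python
def group_parameters(group_index, full_range=256):
    """Return (number of bands, band size, number of positions) for index 1..16."""
    if not 1 <= group_index <= 16:
        raise ValueError("group index must be between 1 and 16")
    num_bands = 2 << ((group_index - 1) % 4)    # 2, 4, 8, 16
    band_size = 8 >> ((group_index - 1) // 4)   # 8, 4, 2, 1
    positions = full_range // band_size         # 32, 64, 128, 256
    return num_bands, band_size, positions


def groups_for_scheme(band_sizes, max_offsets):
    """List the group indices whose band size and offset count fit a scheme."""
    selected = []
    for index in range(1, 17):
        num_bands, band_size, _ = group_parameters(index)
        if band_size in band_sizes and num_bands <= max_offsets:
            selected.append(index)
    return selected


print(group_parameters(2))                 # the HEVC band scheme
print(groups_for_scheme({8, 4}, max_offsets=8))
```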
- the different possible positions for that one group are compared based on a suitable evaluation criterion, such as the rate-distortion criterion.
- One position is selected. If there is more than one group the groups and the different positions for each group are compared based on the evaluation criterion. One group and one position are selected as a combination.
- the number of positions evaluated is modified from 32 depending on the number of positions for a given group (32, 64, 128 or 256). Also, the number of bands for a given group (step 1004) is modified from 4 to 2, 4, 8 or 16 as appropriate. Further details of the evaluation are given with reference to Figures 44 to 46.
- a scheme for one level (generally a lower level such as 1/16 CTU, 1/4 CTU or 1 CTU) has just one group available (of course different positions may be available for that group), whereas for another level (generally a higher level such as NxN CTUs, line, column, slice or frame) two or more groups are available for the scheme. It is also possible in other embodiments to have two or more schemes for a given level.
- predetermined positions and non-predetermined positions for a group of bands e.g. just a fixed position at the centre of the full range versus a floating position
- overlapping spacings and non-overlapping spacings for a group of bands e.g. just a fixed position at the centre of the full range versus a floating position
- In the twenty-third embodiment only band schemes were used. In the twenty-fourth embodiment, a band scheme is in competition with an edge scheme for each different grouping (level of SAO parameters).
- Table 5 below shows an example of the competing band and edge schemes in this embodiment.
- the RD evaluation encompasses the different groups available in the band scheme for the level concerned.
- the best outcome from the band scheme is compared with the best outcome from the edge scheme.
- Figure 44 corresponds to Figure 24 but steps 4406 and 4409 are modified compared to the corresponding steps 1106 and 1109 in Figure 24 to include evaluation of the band schemes.
- Figure 46 also corresponds to Figure 27 but steps 4606 and 4609 are modified compared to the corresponding steps 1306 and 1309 in Figure 27 to include evaluation of the band scheme.
- the evaluation also includes each available edge scheme if there is more than one edge scheme for the level concerned and each available band scheme if there is more than one band scheme for the level concerned.
- step 4501 corresponds to step 801 in Figure 25 but the maximum value of j is 32, 64, 128 or 256 in the case of the band scheme evaluation (assuming 32, 64, 128 or 256 bands) and is 4 in the case of the edge scheme evaluation (assuming 4 directions).
- step 4510 corresponds to step 810 in Figure 25 but the number“4” is changed as necessary to reflect the number of offsets in the edge scheme (e.g. 6, 8 or 10).
- the edge and band schemes for a given level should be chosen to be good competitors, so that one scheme does not dominate in terms of selection. It is advantageous, but not essential, for the numbers of offsets to be the same or similar between the competing edge and band schemes for a given level.
- Figure 47 shows a system 191, 195 comprising at least one of an encoder 150 or a decoder 100 and a communication network 199 according to embodiments of the present invention.
- the system 195 is for processing and providing a content (for example, a video and audio content for displaying/outputting or streaming video/audio content) to a user, who has access to the decoder 100, for example through a user interface of a user terminal comprising the decoder 100 or a user terminal that is communicable with the decoder 100.
- a user terminal may be a computer, a mobile phone, a tablet or any other type of a device capable of providing/displaying the (provided/streamed) content to the user.
- the system 195 obtains/receives a bitstream 101 (in the form of a continuous stream or a signal - e.g. while earlier video/audio are being displayed/output) via the communication network 199.
- the system 191 is for processing a content and storing the processed content, for example a video and audio content processed for displaying/outputting/streaming at a later time.
- the system 191 obtains/receives a content comprising an original sequence of images 151, which is received and processed (including filtering with a deblocking filter according to the present invention) by the encoder 150, and the encoder 150 generates a bitstream 101 that is to be communicated to the decoder 100 via a communication network 199.
- the bitstream 101 is then communicated to the decoder 100 in a number of ways, for example it may be generated in advance by the encoder 150 and stored as data in a storage apparatus in the communication network 199 (e.g. on a server or a cloud storage) until a user requests the content (i.e. the bitstream data) from the storage apparatus, at which point the data is communicated/streamed to the decoder 100 from the storage apparatus.
- the system 191 may also comprise a content providing apparatus for providing/streaming, to the user (e.g. by communicating data for a user interface to be displayed on a user terminal), content information for the content stored in the storage apparatus (e.g.
- the encoder 150 generates the bitstream 101 and communicates/streams it directly to the decoder 100 as and when the user requests the content.
- the decoder 100 then receives the bitstream 101 (or a signal) and performs filtering with a deblocking filter according to the invention to obtain/generate a video signal 109 and/or audio signal, which is then used by a user terminal to provide the requested content to the user.
- the functions described may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software, the functions may be stored on or transmitted over, as one or more instructions or code, a computer-readable medium and executed by a hardware-based processing unit.
- Computer-readable media may include computer-readable storage media, which corresponds to a tangible medium such as data storage media, or communication media including any medium that facilitates transfer of a computer program from one place to another, e.g., according to a communication protocol.
- computer-readable media generally may correspond to (1) tangible computer-readable storage media which is non-transitory or (2) a communication medium such as a signal or carrier wave.
- Data storage media may be any available media that can be accessed by one or more computers or one or more processors to retrieve instructions, code and/or data structures for implementation of the techniques described in this disclosure.
- a computer program product may include a computer-readable medium.
- such computer-readable storage media can comprise RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage, or other magnetic storage devices, flash memory, or any other medium that can be used to store desired program code in the form of instructions or data structures and that can be accessed by a computer.
- any connection is properly termed a computer-readable medium.
- For example, if instructions are transmitted from a website, server, or other remote source using a coaxial cable, fiber optic cable, twisted pair, digital subscriber line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of medium.
- Disk and disc includes compact disc (CD), laser disc, optical disc, digital versatile disc (DVD), floppy disk and Blu-ray disc, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.
- processors such as one or more digital signal processors (DSPs), general purpose microprocessors, application specific integrated circuits (ASICs), field programmable logic arrays (FPGAs), or other equivalent integrated or discrete logic circuitry.
- processors may refer to any of the foregoing structure or any other structure suitable for implementation of the techniques described herein.
- The functionality described herein may be provided within dedicated hardware and/or software modules configured for encoding and decoding, or incorporated in a combined codec. Also, the techniques could be fully implemented in one or more circuits or logic elements.
- The techniques of this disclosure may be implemented in a wide variety of devices or apparatuses, including a wireless handset, an integrated circuit (IC) or a set of ICs (e.g., a chip set).
- Various components, modules, or units are described in this disclosure to emphasize functional aspects of devices configured to perform the disclosed techniques, but do not necessarily require realization by different hardware units. Rather, as described above, various units may be combined in a codec hardware unit or provided by a collection of interoperative hardware units, including one or more processors as described above, in conjunction with suitable software and/or firmware.
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Computing Systems (AREA)
- Theoretical Computer Science (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
Abstract
An image comprising a plurality of image parts is subjected to sample adaptive offset (SAO) filtering. The SAO filtering comprises applying, to an image area to be filtered, a scheme for classifying samples of the image area. The scheme may be an edge scheme or a band scheme, or may include both edge and band schemes in competition with one another. The scheme is adapted based on a type of the image area to be filtered. For example, image areas IA1 and IA2 having different numbers of samples in the image area may be different types of image area. Another SAO filtering method (Figure 17) comprises selecting different levels of SAO parameters for luma and chroma components respectively. Another SAO filtering method comprises adapting a scheme for classifying samples from one image area to another image area of the same slice or frame (Figure 19). The invention also relates to a particular edge scheme for SAO filtering with 6 offsets (Figure 13).
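To make the edge and band schemes mentioned in the abstract concrete, the following is a minimal sketch of the conventional HEVC-style classifications (four edge categories along one direction, 32 equal-width bands). It is not the adapted scheme or the 6-offset edge scheme claimed in this application, which the abstract does not specify; all function names and the one-dimensional row layout are assumptions made for illustration.

```python
def sign(x):
    """Return -1, 0 or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def edge_category(left, cur, right):
    """HEVC-style edge category of `cur` against its two neighbours.

    1 = local minimum, 2 = concave corner, 3 = convex corner,
    4 = local maximum, 0 = none (sample is left unfiltered).
    """
    s = sign(cur - left) + sign(cur - right)
    return {-2: 1, -1: 2, 0: 0, 1: 3, 2: 4}[s]


def band_index(sample, bit_depth=8):
    """HEVC-style band index: the sample range is split into 32 equal bands."""
    return sample >> (bit_depth - 5)


def clip(value, bit_depth=8):
    """Clamp a filtered sample to the valid range for the bit depth."""
    return max(0, min((1 << bit_depth) - 1, value))


def apply_edge_offsets(row, offsets, bit_depth=8):
    """Apply edge offsets along a 1-D row (hypothetical helper).

    `offsets` maps category (1..4) to an offset. Classification uses the
    unfiltered samples; the two border samples are left unchanged.
    """
    out = list(row)
    for i in range(1, len(row) - 1):
        cat = edge_category(row[i - 1], row[i], row[i + 1])
        out[i] = clip(row[i] + offsets.get(cat, 0), bit_depth)
    return out


def apply_band_offsets(row, band_position, offsets, bit_depth=8):
    """Apply band offsets (hypothetical helper).

    `offsets` lists the offsets for consecutive bands starting at
    `band_position`; samples in other bands are left unchanged.
    """
    out = []
    for s in row:
        k = band_index(s, bit_depth) - band_position
        out.append(clip(s + offsets[k], bit_depth) if 0 <= k < len(offsets) else s)
    return out
```

For instance, `apply_edge_offsets([10, 8, 10, 10, 12, 10], {1: 2, 2: 1, 3: -1, 4: -2})` raises the local minimum and lowers the local maximum, which is the smoothing behaviour an edge scheme is meant to provide. An encoder in which edge and band schemes compete would typically evaluate both and keep whichever gives the better rate-distortion trade-off; that selection step is not shown here.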
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
GB1809234.6A GB2574423A (en) | 2018-06-05 | 2018-06-05 | Video coding and decoding |
GB1809234.6 | 2018-06-05 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2019233999A1 (fr) | 2019-12-12 |
Family
ID=62975650
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/EP2019/064456 WO2019233999A1 (fr) | 2018-06-05 | 2019-06-04 | Video coding and decoding |
Country Status (3)
Country | Link |
---|---|
GB (1) | GB2574423A (fr) |
TW (1) | TW202005369A (fr) |
WO (1) | WO2019233999A1 (fr) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113099230A (zh) * | 2021-02-22 | 2021-07-09 | 浙江大华技术股份有限公司 | Encoding method, apparatus, electronic device, and computer-readable storage medium |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20120207227A1 (en) * | 2011-02-16 | 2012-08-16 | Mediatek Inc. | Method and Apparatus for Slice Common Information Sharing |
GB2496213A (en) * | 2011-11-07 | 2013-05-08 | Canon Kk | Providing Compensation Offsets for a Set of Reconstructed Image Samples |
EP2723073A2 (fr) * | 2011-06-14 | 2014-04-23 | LG Electronics Inc. | Method for encoding and decoding image information |
EP2767087A1 (fr) * | 2011-10-13 | 2014-08-20 | Qualcomm Incorporated | Sample adaptive offset merged with adaptive loop filter in video coding |
US20140348222A1 (en) * | 2013-05-23 | 2014-11-27 | Mediatek Inc. | Method of Sample Adaptive Offset Processing for Video Coding and Inter-Layer Scalable Coding |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP5464435B2 (ja) * | 2010-04-09 | 2014-04-09 | ソニー株式会社 | Image decoding device and method |
2018
- 2018-06-05: GB application GB1809234.6A, published as GB2574423A; status: not active (withdrawn)
2019
- 2019-05-28: TW application TW108118388A, published as TW202005369A; status: unknown
- 2019-06-04: WO application PCT/EP2019/064456, published as WO2019233999A1; status: active (application filing)
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20120207227A1 (en) * | 2011-02-16 | 2012-08-16 | Mediatek Inc. | Method and Apparatus for Slice Common Information Sharing |
EP2723073A2 (fr) * | 2011-06-14 | 2014-04-23 | LG Electronics Inc. | Method for encoding and decoding image information |
EP2767087A1 (fr) * | 2011-10-13 | 2014-08-20 | Qualcomm Incorporated | Sample adaptive offset merged with adaptive loop filter in video coding |
GB2496213A (en) * | 2011-11-07 | 2013-05-08 | Canon Kk | Providing Compensation Offsets for a Set of Reconstructed Image Samples |
WO2013068427A2 (fr) | 2011-11-07 | 2013-05-16 | Canon Kabushiki Kaisha | Method and device for providing compensation offsets for a set of reconstructed samples of an image |
US20140348222A1 (en) * | 2013-05-23 | 2014-11-27 | Mediatek Inc. | Method of Sample Adaptive Offset Processing for Video Coding and Inter-Layer Scalable Coding |
Non-Patent Citations (6)
Title |
---|
CHIH-MING FU ET AL: "Sample Adaptive Offset in the HEVC Standard", IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, INSTITUTE OF ELECTRICAL AND ELECTRONICS ENGINEERS, US, vol. 22, no. 12, 1 December 2012 (2012-12-01), pages 1755 - 1764, XP011487153, ISSN: 1051-8215, DOI: 10.1109/TCSVT.2012.2221529 * |
ESENLIK S ET AL: "Non-CE8: Low-delay support for APS", 99. MPEG MEETING; 6-2-2012 - 10-2-2012; SAN JOSÉ; (MOTION PICTURE EXPERT GROUP OR ISO/IEC JTC1/SC29/WG11),, no. m23251, 6 February 2012 (2012-02-06), XP030051776 * |
ESENLIK S ET AL: "Syntax refinements for SAO and ALF", 7. JCT-VC MEETING; 98. MPEG MEETING; 21-11-2011 - 30-11-2011; GENEVA; (JOINT COLLABORATIVE TEAM ON VIDEO CODING OF ISO/IEC JTC1/SC29/WG11 AND ITU-T SG.16 ); URL: HTTP://WFTP3.ITU.INT/AV-ARCH/JCTVC-SITE/,, no. JCTVC-G566, 8 November 2011 (2011-11-08), XP030110550 * |
LEE T ET AL: "Simplification on SAO syntax", 9. JCT-VC MEETING; 100. MPEG MEETING; 27-4-2012 - 7-5-2012; GENEVA; (JOINT COLLABORATIVE TEAM ON VIDEO CODING OF ISO/IEC JTC1/SC29/WG11 AND ITU-T SG.16 ); URL: HTTP://WFTP3.ITU.INT/AV-ARCH/JCTVC-SITE/,, no. JCTVC-I0130, 26 April 2012 (2012-04-26), XP030111893 * |
LUO BINJI ET AL: "A new SAO based on histogram analysis in HEVC", 2013 PICTURE CODING SYMPOSIUM (PCS), IEEE, 8 December 2013 (2013-12-08), pages 49 - 52, XP032566993, DOI: 10.1109/PCS.2013.6737680 * |
OUEDRAOGO N ET AL: "On APS referring and updating", 100. MPEG MEETING; 30-4-2012 - 4-5-2012; GENEVA; (MOTION PICTURE EXPERT GROUP OR ISO/IEC JTC1/SC29/WG11),, no. m24428, 28 April 2012 (2012-04-28), XP030052773 * |
Also Published As
Publication number | Publication date |
---|---|
GB2574423A (en) | 2019-12-11 |
GB201809234D0 (en) | 2018-07-25 |
TW202005369A (zh) | 2020-01-16 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11601687B2 (en) | Method and device for providing compensation offsets for a set of reconstructed samples of an image | |
TWI782904B (zh) | Merging filters for multiple classes of blocks for video coding | |
WO2020002117A2 (fr) | Methods and devices for performing sample adaptive offset (SAO) filtering | |
WO2019233999A1 (fr) | Video coding and decoding | |
WO2019233997A1 (fr) | Prediction of SAO parameters | |
WO2019234000A1 (fr) | Prediction of SAO parameters | |
WO2019234002A1 (fr) | Video coding and decoding | |
WO2019234001A1 (fr) | Video coding and decoding | |
WO2019233998A1 (fr) | Video coding and decoding |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 19728940; Country of ref document: EP; Kind code of ref document: A1 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | EP: PCT application non-entry in European phase | Ref document number: 19728940; Country of ref document: EP; Kind code of ref document: A1 |