CN115426499A - Image decoding method and image encoding method - Google Patents

Image decoding method and image encoding method

Info

Publication number: CN115426499A
Application number: CN202211233463.3A
Authority: CN (China)
Prior art keywords: block, motion information, prediction, module, mode
Legal status: Pending (assumed; not a legal conclusion)
Other languages: Chinese (zh)
Inventors: 金旲衍, 郑旭帝, 姜晶媛, 李河贤, 林成昶, 李镇浩, 金晖容
Assignees: Electronics and Telecommunications Research Institute (ETRI); Chips and Media Inc
Application filed by Electronics and Telecommunications Research Institute (ETRI) and Chips and Media Inc


Classifications

    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/436: implementation details or hardware specially adapted for video compression or decompression, using parallelised computational arrangements
    • H04N19/119: adaptive coding; adaptive subdivision aspects, e.g. subdivision of a picture into rectangular or non-rectangular coding blocks
    • H04N19/172: adaptive coding characterised by the coding unit, the unit being an image region that is a picture, frame or field
    • H04N19/176: adaptive coding characterised by the coding unit, the unit being an image region that is a block, e.g. a macroblock
    • H04N19/44: decoders specially adapted therefor, e.g. video decoders which are asymmetric with respect to the encoder
    • H04N19/513: predictive coding involving temporal prediction; processing of motion vectors
    • H04N19/52: processing of motion vectors by predictive encoding
    • H04N19/70: syntax aspects related to video coding, e.g. related to compression standards

Abstract

An image decoding method and an image encoding method are provided. According to one embodiment of the present invention, a method for decoding an image includes the steps of: identifying a parallel motion information prediction unit index of a current block whose motion information is to be decoded; obtaining motion information of at least one neighboring block of the current block, excluding neighboring blocks that belong to a previously indexed parallel motion information prediction unit; and processing prediction decoding of the motion information of the current block based on the obtained motion information.

Description

Image decoding method and image encoding method
The present application is a divisional application of the invention patent application filed on February 27, 2018, with application number 201880023458.5, entitled "image processing method for processing motion information for parallel processing, method for decoding and encoding using the image processing method, and apparatus for the method".
Technical Field
The present invention relates to an image processing method, a method of decoding and encoding an image using the image processing method, and an apparatus for the method. More particularly, the present invention relates to an image processing method of processing motion information for parallel processing, a method of decoding and encoding an image using the image processing method, and an apparatus for the method.
Background
Digital video technology may be used in an integrated manner in a wide variety of digital video devices including, for example, digital televisions, digital direct broadcast systems, wireless broadcast systems, Personal Digital Assistants (PDAs), laptop or desktop computers, digital cameras, digital recording devices, video gaming devices, video game consoles, mobile telephones, satellite radio telephones, and the like. Digital video devices may implement video compression techniques such as MPEG-2, MPEG-4, ITU-T H.264/MPEG-4 Part 10 Advanced Video Coding (AVC), or H.265/High Efficiency Video Coding (HEVC) to transmit and receive digital video information more efficiently. Video compression techniques perform spatial prediction and temporal prediction to eliminate or reduce redundancy inherent in video sequences.
Such image compression techniques include an inter prediction technique that predicts pixel values in the current picture from a previous or subsequent picture, an intra prediction technique that predicts pixel values in the current picture using pixel information within the current picture, and an entropy coding technique that assigns short codes to frequently occurring values and long codes to rarely occurring values. Using these techniques, image data can be efficiently compressed and then transmitted or stored.
To cope economically and efficiently with the various resolutions, frame rates, and other requirements of these applications, a video decoding apparatus is needed that can be easily controlled according to the performance and functions required by the application.
For example, in an image compression method, a picture is partitioned into a plurality of blocks each having a predetermined size to perform encoding. In addition, an inter prediction technique and an intra prediction technique for removing redundancy between pictures are used to improve compression efficiency.
In this case, a residual signal is generated by using intra prediction and inter prediction. A residual signal is used because encoding it requires less data, which raises the compression rate; and the better the prediction, the smaller the residual signal becomes.
The intra prediction method predicts data of the current block by using pixels around the current block. The difference between the actual value and the predicted value is called a residual signal block. In the case of HEVC, the number of prediction modes increased from the 9 modes used in the existing H.264/AVC to 35 modes, so intra prediction is performed at a finer granularity.
In the case of the inter prediction method, the current block is compared with blocks in adjacent pictures to find the most similar block. Here, the position information (Vx, Vy) of the found block is referred to as a motion vector. The difference in pixel values between the current block and the prediction block predicted by the motion vector is referred to as a residual signal block (motion compensation residual block).
In this way, although intra prediction and inter prediction have been subdivided into finer modes so that the data amount of the residual signal is reduced, the amount of computation for processing the video has greatly increased.
In particular, the increased complexity of determining the partition structure within a picture for image encoding and decoding, together with the existing block partitioning methods, makes pipelined execution difficult, and the existing block partitioning methods and the block sizes they produce may not be suitable for encoding high resolution images.
For example, in parallel processing of motion information for inter prediction, a current block can be processed only after encoding or decoding of its left, upper, and upper-left neighboring blocks is completed. When a parallel pipeline is implemented, it must therefore wait until the block mode and partition size of a neighboring block being processed in another pipeline stage are determined, which causes a pipeline stall.
To address this problem, HEVC introduced a merging method that merges motion information over some prediction units within a block of arbitrary size. However, calculating the motion information of the first prediction unit in such a block still requires the motion information of neighboring blocks, so the same problem recurs and coding efficiency is significantly reduced.
Disclosure of Invention
Technical problem
The present invention has been made to solve the above-mentioned problems, and it is an object of the present invention to provide a method and apparatus for decoding and encoding an image, which improve encoding efficiency by performing parallel processing on motion information predictive decoding and encoding of a high resolution image.
Solution scheme
To achieve the above object, a method of decoding motion information according to an embodiment of the present invention includes: identifying a parallel motion information prediction unit index of a current block whose motion information is to be decoded; obtaining motion information of at least one neighboring block among the neighboring blocks of the current block that remain after excluding blocks belonging to a previously indexed parallel motion information prediction unit; and processing prediction decoding of the motion information of the current block based on the obtained motion information.
To achieve the above object, a method of encoding motion information according to an embodiment of the present invention includes: identifying a parallel motion information prediction unit index of a current block whose motion information is to be encoded; obtaining motion information of at least one neighboring block among the neighboring blocks of the current block that remain after excluding blocks belonging to a previously indexed parallel motion information prediction unit; and processing motion information prediction encoding of the current block based on the obtained motion information.
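As an editorial illustration (not part of the patent text), the following Python sketch shows the neighbor-filtering idea common to both methods; the `Block` type, the `pmipu_index` field, and the candidate handling are hypothetical names invented here.

```python
from collections import namedtuple

# Hypothetical data type; the patent does not define concrete structures.
Block = namedtuple("Block", ["pmipu_index", "motion_info"])

def motion_candidates(current, neighbors):
    """Collect motion-information candidates for the current block.

    Neighboring blocks that belong to the previously indexed parallel
    motion information prediction unit (PMIPU) are excluded, so that each
    PMIPU can be processed in an independent pipeline stage without
    waiting on the results of the previous one.
    """
    return [
        nb.motion_info
        for nb in neighbors                            # e.g., left, upper, upper-left
        if nb.pmipu_index != current.pmipu_index - 1   # skip the previous PMIPU
        and nb.motion_info is not None
    ]

# The left neighbor belongs to PMIPU 2 (the previous unit) and is excluded
# when predicting a block in PMIPU 3; the upper neighbor survives.
cur = Block(pmipu_index=3, motion_info=None)
left = Block(pmipu_index=2, motion_info=(4, -1))
upper = Block(pmipu_index=3, motion_info=(2, 0))
assert motion_candidates(cur, [left, upper]) == [(2, 0)]
```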
Further, to achieve the above object, the method according to an embodiment of the present invention may be implemented as a computer program for executing the method and as a non-volatile, computer-readable recording medium storing the program.
Advantageous effects
According to an embodiment of the present invention, blocks to be decoded may be sequentially grouped into predetermined parallel motion information prediction units.
According to an embodiment of the present invention, motion information decoding may be performed using the motion information of the remaining neighboring blocks of the current block, excluding those belonging to the previous parallel motion information prediction unit.
Accordingly, pipeline processing for each parallel motion information prediction unit can be independently performed, and pipeline stall can be prevented in advance, thereby improving encoding and decoding efficiency.
Drawings
Fig. 1 is a block diagram showing a configuration of an image encoding apparatus according to an embodiment of the present invention.
Fig. 2 to 5 are diagrams illustrating a first embodiment of a method of partitioning an image into block units and processing the image.
Fig. 6 is a block diagram illustrating an embodiment of a method of performing inter prediction in an image encoding apparatus.
Fig. 7 is a block diagram showing a configuration of an image decoding apparatus according to an embodiment of the present invention.
Fig. 8 is a block diagram illustrating an embodiment of a method of performing inter prediction in an image decoding apparatus.
Fig. 9 is a diagram illustrating a second embodiment of a method of partitioning an image into block units and processing the image.
Fig. 10 is a diagram illustrating a third embodiment of a method of partitioning an image into block units and processing the image.
Fig. 11 is a diagram illustrating an embodiment of a method of partitioning a coding unit using a binary tree structure to construct a transform unit.
Fig. 12 is a diagram illustrating a fourth embodiment of a method of partitioning an image into block units and processing the image.
Fig. 13 to 14 are diagrams illustrating still another embodiment of a method of partitioning an image into block units and processing the image.
Fig. 15 and 16 are diagrams illustrating an embodiment of a method of determining a partition structure of a transform unit by performing Rate Distortion Optimization (RDO).
Fig. 17 and 18 are flowcharts illustrating an image processing method of processing motion information for parallel processing according to an embodiment of the present invention.
Fig. 19 to 22 are diagrams illustrating a motion information processing method according to an embodiment of the present invention for each case.
Detailed Description
Hereinafter, embodiments of the present invention will be described in detail with reference to the accompanying drawings. In the following description of the embodiments of the present invention, a detailed description of known functions and configurations incorporated herein will be omitted when it may make the subject matter of the present disclosure rather unclear.
It will be understood that when an element is referred to as being "connected" or "coupled" to another element, it can be directly connected or directly coupled to the other element, or other elements may be present between the two. Further, a description that the present invention "includes" a specific configuration does not exclude other configurations; rather, additional configurations may be included within the practice of the present invention or within its technical scope.
The terms first, second, etc. may be used to describe various components, but the components should not be limited by the terms. The terms are used only for the purpose of distinguishing one component from another. For example, a first component could be termed a second component, and, similarly, a second component could be termed a first component, without departing from the scope of the present invention.
Further, the constituent elements shown in the embodiments of the present invention are shown individually to indicate different characteristic functions; this does not mean that each constituent element is implemented as separate hardware or as a single software unit. That is, the constituent units are listed separately for convenience of explanation, and at least two of them may be combined into one unit, or one unit may be divided into a plurality of units that each perform part of the function. Integrated and separated embodiments of such constituent units are also included within the scope of the present invention unless they depart from its essence.
Further, some components are not essential for performing the essential functions of the present invention but are optional components for improving performance. The present invention can be implemented with only the components necessary for realizing its essence, excluding the components used merely for performance improvement, and a structure that includes the necessary components while omitting the optional performance-improving components is also within the scope of the present invention.
Fig. 1 is a block diagram showing a configuration of an image encoding apparatus according to an embodiment of the present invention. The image encoding apparatus 10 includes a picture partitioning module 110, a transform module 120, a quantization module 130, a scanning module 131, an entropy encoding module 140, an intra prediction module 150, an inter prediction module 160, an inverse quantization module 135, an inverse transform module 125, a post-processing module 170, a picture storage module 180, a subtractor 190, and an adder 195.
Referring to fig. 1, a picture partitioning module 110 analyzes an input video signal and partitions a picture into coding units to determine a prediction mode and determine a prediction unit size for each coding unit.
Also, the picture partitioning module 110 transmits a prediction unit to be encoded to the intra prediction module 150 or the inter prediction module 160 according to the prediction mode (or prediction method). In addition, the picture partitioning module 110 transmits the prediction unit to be encoded to the subtractor 190.
Here, a picture of an image is composed of a plurality of slices, and the slices may be partitioned into a plurality of Coding Tree Units (CTUs) which are basic units of picture partitioning.
The coding tree unit may be partitioned into one or at least two Coding Units (CUs) that are basic units of inter prediction or intra prediction.
Here, the maximum sizes of the coding tree unit and the coding unit may be different from each other, and signaling information regarding the maximum sizes may be transmitted to the decoding apparatus 20. This will be described in more detail later with reference to fig. 17.
A Coding Unit (CU) may be partitioned into one or at least two Prediction Units (PUs) as a basic unit of prediction.
In this case, the encoding apparatus 10 determines either inter prediction or intra prediction as the prediction method for each Coding Unit (CU) resulting from the partitioning operation, but prediction blocks may be generated differently for each Prediction Unit (PU).
In addition, a Coding Unit (CU) may be partitioned into one or two or more Transform Units (TUs), which are basic units performing transform on a residual block.
In this case, the picture partitioning module 110 may transmit the image data to the subtractor 190 in units of blocks (e.g., prediction Units (PUs) or Transform Units (TUs)) resulting from the partitioning operation as described above.
Referring to fig. 2, a Coding Tree Unit (CTU) having a maximum size of 256 × 256 pixels is partitioned into four Coding Units (CUs) each having a square shape using a quad tree structure.
Each of the four Coding Units (CUs) having a square shape may be further partitioned using a quadtree structure. The depth of a Coding Unit (CU) has any one integer of 0 to 3.
A Coding Unit (CU) may be partitioned into one or at least two Prediction Units (PUs) according to prediction modes.
In the case of the intra prediction mode, when the size of the Coding Unit (CU) is 2N × 2N, the Prediction Unit (PU) has a size of 2N × 2N shown in fig. 3 (a) or a size of N × N shown in fig. 3 (b).
In addition, in the case of the inter prediction mode, when the size of the Coding Unit (CU) is 2N × 2N, the Prediction Unit (PU) has any one of the following sizes: 2N × 2N shown in fig. 4 (a), 2N × N shown in fig. 4 (b), N × 2N shown in fig. 4 (c), N × N shown in fig. 4 (d), 2N × nU shown in fig. 4 (e), 2N × nD shown in fig. 4 (f), nL × 2N shown in fig. 4 (g), and nR × 2N shown in fig. 4 (h).
Referring to fig. 5, a Coding Unit (CU) may be partitioned into four Transform Units (TUs) each having a square shape using a quadtree structure.
Each of the four Transform Units (TUs) having a square shape may be further partitioned using a quadtree structure. The depth of a Transform Unit (TU) resulting from the quadtree partitioning operation may have any one integer value from 0 to 3.
Here, when a Coding Unit (CU) is an inter prediction mode, a Prediction Unit (PU) and a Transform Unit (TU) obtained from partitioning the corresponding Coding Unit (CU) may have partition structures independent of each other.
When the Coding Unit (CU) is an intra prediction mode, a size of a Transform Unit (TU) resulting from partitioning the Coding Unit (CU) may not be greater than a size of the Prediction Unit (PU).
Also, the Transform Unit (TU) resulting from the partitioning operation as described above may have a maximum size of 64 × 64 pixels.
The transform module 120 transforms a residual block, which is a residual signal between an original block of an input Prediction Unit (PU) and a prediction block generated by the intra prediction module 150 or the inter prediction module 160, wherein the transform may be performed in the transform module 120 by using a Transform Unit (TU) as a basic unit.
In the transform process, different transform matrices may be determined according to prediction modes (intra or inter), and residual signals of intra prediction have directionality according to the intra prediction modes, so that the transform matrices may be adaptively determined according to the intra prediction modes.
The transform basic unit may be transformed by two one-dimensional transform matrices (horizontal and vertical). For example, in the case of inter prediction, one predetermined transform matrix may be determined.
In addition, in the case of intra prediction, when the intra prediction mode is horizontal, the probability that the residual block has directivity in the vertical direction is high. Thus, the DCT-based integer matrix is applied in the vertical direction and the DST-based or KLT-based integer matrix is applied in the horizontal direction. When the intra prediction mode is vertical, the DST-based or KLT-based integer matrix is applied in the vertical direction, and the DCT-based integer matrix is applied in the horizontal direction.
Furthermore, in case of the DC mode, the DCT-based integer matrix is applied in both directions.
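The mode-dependent selection described in the two preceding paragraphs can be summarized by the following sketch (an editorial illustration under simplified mode names, not the patent's normative rule):

```python
def select_transforms(pred_type, intra_mode=None):
    """Return the (vertical, horizontal) 1-D transforms for a residual block.

    For an intra block predicted horizontally, the residual tends to vary
    in the vertical direction, so DCT is applied vertically and a DST- or
    KLT-based matrix horizontally; the vertical mode is the mirror case.
    DC mode and inter blocks use DCT in both directions.
    """
    if pred_type == "inter":
        return ("DCT", "DCT")      # predetermined matrix for inter prediction
    if intra_mode == "horizontal":
        return ("DCT", "DST")      # (vertical, horizontal)
    if intra_mode == "vertical":
        return ("DST", "DCT")
    return ("DCT", "DCT")          # DC mode (other modes simplified away)

assert select_transforms("intra", "horizontal") == ("DCT", "DST")
```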
In the case of intra prediction, a transform matrix may be adaptively determined based on the size of a Transform Unit (TU).
The quantization module 130 determines a quantization step size for quantizing coefficients of the residual block transformed by the transform matrix, wherein the quantization step size may be determined in the quantization module 130 for each quantization unit having a predetermined size or more.
The quantization unit may be 8 × 8 or 16 × 16 in size, and the quantization module 130 quantizes the coefficients of the transform block using a quantization matrix determined according to a quantization step size and a prediction mode.
In addition, the quantization module 130 may use a quantization step size of a quantization unit adjacent to the current quantization unit as a quantization step size predictor of the current quantization unit.
The quantization module 130 searches for one or two valid quantization step sizes in the order of the left, upper, and upper-left quantization units of the current quantization unit, and generates the quantization step size predictor of the current quantization unit using the valid step sizes found.
For example, the quantization module 130 may determine the first valid quantization step size found in the above order as the predictor, or may determine the average of the first two valid quantization step sizes found in that order as the predictor; when only one quantization step size is valid, that step size is determined as the predictor.
When the quantization step predictor is determined, the quantization module 130 transmits a difference value between the quantization step of the current quantization unit and the quantization step predictor to the entropy coding module 140.
In addition, the left, upper, and upper-left coding units of the current coding unit may not exist, while a coding unit that precedes the current one in coding order within the maximum coding unit may exist.
Therefore, the quantization step sizes of the quantization units adjacent to the current coding unit within the maximum coding unit, and of the quantization unit immediately preceding in coding order, may serve as candidates.
In this case, the priorities may be set in the following order: 1) a left-side quantization unit of the current coding unit, 2) an upper-side quantization unit of the current coding unit, 3) an upper-left side quantization unit of the current coding unit, and 4) an immediately preceding quantization unit in coding order. The order may be changed and the upper left quantization unit may be omitted.
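A minimal sketch of this predictor search follows (an editorial illustration; the candidate ordering and averaging policy restate the two preceding paragraphs, and all names are invented here):

```python
def qstep_predictor(candidates):
    """Predict the quantization step size of the current quantization unit.

    `candidates` lists neighboring step sizes in the priority order given
    above: left, upper, upper-left, then the unit immediately preceding in
    coding order; None marks an invalid candidate. This sketch averages
    the first two valid step sizes and falls back to a single valid one.
    """
    valid = [q for q in candidates if q is not None][:2]
    if not valid:
        return None                      # caller must fall back to a default
    if len(valid) == 1:
        return valid[0]                  # only one valid step size
    return (valid[0] + valid[1]) // 2    # average of the first two valid steps

# Left unit missing, upper = 26, upper-left = 28: predictor is (26 + 28) // 2.
assert qstep_predictor([None, 26, 28, 30]) == 27
```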
In addition, the quantized transform block is sent to the inverse quantization module 135 and the scan module 131.
The scanning module 131 scans the coefficients of the quantized transform block and transforms the coefficients into one-dimensional quantized coefficients. In this case, since the coefficient distribution of the transform block after quantization may depend on the intra prediction mode, the scanning method may be determined according to the intra prediction mode.
Further, the coefficient scanning method may be determined according to the size of the transform basic unit, and the scan pattern may vary according to the directional intra prediction mode. In this case, the quantized coefficients may also be scanned in the reverse order.
When quantized coefficients are divided into a plurality of subsets, the same scanning pattern may be applied to the quantized coefficients in each subset, and zigzag scanning or diagonal scanning may be applied to the scanning pattern between the subsets.
In addition, the scan pattern is preferably applied in the forward direction starting from the main subset including DC to the remaining subset, but may be applied in the reverse direction.
Further, the scan pattern between the subsets may be set in the same manner as the scan pattern of the quantized coefficients in the subsets, and the scan pattern between the subsets may be determined according to the intra prediction mode.
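For concreteness, the following sketch (an editorial illustration, not normative) generates the diagonal scan order mentioned above for an n x n transform block; in practice the pattern is applied per coefficient subset:

```python
def diagonal_scan_order(n):
    """Up-right diagonal scan positions (x, y) for an n x n block.

    Each anti-diagonal x + y == s is walked from its bottom-left element
    to its top-right element, starting from the DC position (0, 0).
    """
    order = []
    for s in range(2 * n - 1):      # anti-diagonal index
        y = min(s, n - 1)
        x = s - y
        while y >= 0 and x < n:     # bottom-left to top-right
            order.append((x, y))
            x += 1
            y -= 1
    return order

assert diagonal_scan_order(4)[:6] == [(0, 0), (0, 1), (1, 0), (0, 2), (1, 1), (2, 0)]
```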
In addition, the encoding apparatus 10 is configured such that information indicating the position of the last non-zero quantized coefficient in the Transform Unit (TU) and the position of the last non-zero quantized coefficient in each subset is included in the bitstream and transmitted to the decoding apparatus 20.
The inverse quantization module 135 performs inverse quantization on the quantized coefficients as described above, and the inverse transform module 125 performs inverse transform on a per Transform Unit (TU) basis to reconstruct the transform coefficients resulting from the inverse quantization operation into a residual block of a spatial domain.
The adder 195 may generate a reconstructed block by adding the residual block reconstructed by the inverse transform module 125 to the prediction block received from the intra prediction module 150 or the inter prediction module 160.
Also, the post-processing module 170 performs a deblocking filtering process for removing blocking artifacts generated in a reconstructed picture, a Sample Adaptive Offset (SAO) application process for compensating for a difference value with respect to an original image on a per-pixel basis, and an Adaptive Loop Filtering (ALF) process for compensating for a difference value with respect to an original image in coding units.
The deblocking filtering process may be applied to a boundary of a Prediction Unit (PU) or a Transform Unit (TU) having a predetermined size or more.
For example, the deblocking filtering process may include: the method includes determining a boundary to be filtered, determining a boundary filtering strength to be applied to the boundary, determining whether a deblocking filter is applied, and selecting a filter to be applied to the boundary when it is determined to apply the deblocking filter.
In addition, whether to apply the deblocking filter is determined based on the following factors: i) whether the boundary filtering strength is greater than 0, and ii) whether a value indicating the degree of variation of pixel values at the boundary of the two blocks (P block and Q block) adjacent to the boundary to be filtered is less than a first reference value determined by a quantization parameter.
Preferably at least two filters are used. When an absolute difference between two pixels located at a block boundary is greater than or equal to a second reference value, a filter that performs relatively weak filtering is selected.
The second reference value is determined by the quantization parameter and the boundary filtering strength.
A Sample Adaptive Offset (SAO) application process is to reduce distortion between pixels in an image to which a deblocking filter is applied and original pixels. It may be determined whether to perform Sample Adaptive Offset (SAO) application processing on a per picture or slice basis.
A picture or slice may be partitioned into multiple offset regions, and an offset type may be determined for each offset region. The offset types include a predetermined number (e.g., four) of edge offset types and two band offset types.
For example, when the offset type is an edge offset type, the edge type to which each pixel belongs is determined so that the corresponding offset is applied. The edge type is determined based on a distribution of values of two pixels adjacent to the current pixel.
In the Adaptive Loop Filtering (ALF) process, filtering may be performed based on a value obtained by comparing an image reconstructed through a deblocking filtering process or a subsequent adaptive offset application process with an original image.
The picture storage module 180 receives the post-processed image data from the post-processing module 170, and reconstructs and stores the image on a per-picture basis. The picture may be an image on a per-frame basis or an image on a per-field basis.
The inter prediction module 160 may perform motion estimation using at least one reference picture stored in the picture storage module 180, and may determine a motion vector and a reference picture index indicating the reference picture.
In this case, a prediction block corresponding to a prediction unit to be encoded is selected from reference pictures for motion estimation among a plurality of reference pictures stored in the picture storage module 180 according to the determined reference picture index and motion vector.
The intra prediction module 150 may perform intra prediction encoding using reconstructed pixel values in a picture including the current prediction unit.
The intra prediction module 150 receives a current prediction unit to be prediction encoded and performs intra prediction by selecting one of a predetermined number of intra prediction modes according to the size of the current block.
Intra-prediction module 150 may adaptively filter the reference pixels to generate intra-predicted blocks and use the available reference pixels to generate reference pixels when the reference pixels are unavailable.
The entropy encoding module 140 may perform entropy encoding on the quantized coefficients quantized by the quantization module 130, intra prediction information received from the intra prediction module 150, motion information received from the inter prediction module 160, and the like.
Fig. 6 is a block diagram of an embodiment of a configuration for performing inter prediction in the encoding apparatus 10. The inter prediction encoder shown in fig. 6 includes a motion information determination module 161, a motion information encoding mode determination module 162, a motion information encoding module 163, a prediction block generation module 164, a residual block generation module 165, a residual block encoding module 166, and a multiplexer 167.
Referring to fig. 6, the motion information determination module 161 determines motion information of the current block, wherein the motion information includes a reference picture index and a motion vector, and the reference picture index represents any one of previously encoded and reconstructed pictures.
When the current block is uni-directionally inter-prediction encoded, a reference picture index indicating one of the reference pictures belonging to list 0 (L0) may be included. When the current block is bi-directionally prediction encoded, a reference picture index indicating one of the reference pictures of list 0 (L0) and a reference picture index indicating one of the reference pictures of list 1 (L1) may be included.
Also, when the current block is bidirectionally predictive encoded, an index indicating one or two pictures of the reference pictures of the synthesized list LC generated by combining the list 0 and the list 1 may be included.
The motion vector indicates a position of a prediction block in a picture indicated by each reference picture index, and the motion vector may be a pixel unit (integer unit) or a sub-pixel unit.
For example, the motion vector may have a resolution of 1/2, 1/4, 1/8, or 1/16 of a pixel. When the motion vector is not an integer unit, the prediction block may be generated from pixels of the integer unit.
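The following sketch (an editorial illustration, assuming quarter-pel resolution) shows how such a sub-pixel motion vector component decomposes into an integer-pel offset and a fractional phase used to pick the interpolation filter:

```python
def split_mv_component(mv_q4):
    """Split a motion vector component stored in quarter-pel units into its
    integer-pel offset and fractional phase (0..3 quarter-pels)."""
    int_part = mv_q4 >> 2   # floor division by 4 (works for negatives too)
    frac = mv_q4 & 3        # fractional phase selects the interpolation filter
    return int_part, frac

# -5 quarter-pels = -2 integer pels + 3/4 pel.
assert split_mv_component(-5) == (-2, 3)
assert split_mv_component(8) == (2, 0)   # phase 0: no interpolation needed
```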
The motion information encoding mode determination module 162 may determine an encoding mode for motion information of the current block as any one of a skip mode, a merge mode, and an AMVP mode.
When there is a skip candidate having the same motion information as that of the current block and the residual signal is 0, the skip mode is applied. The skip mode is applied when the current block, which is a Prediction Unit (PU), has the same size as the Coding Unit (CU).
When there is a merge candidate having the same motion information as that of the current block, the merge mode is applied. The merge mode is applied when the current block differs in size from the Coding Unit (CU), or when a residual signal exists even though the current block has the same size as the Coding Unit (CU). In addition, the merge candidates and the skip candidates may be the same.
The AMVP mode is applied when the skip mode and the merge mode are not applied, and an AMVP candidate having a motion vector most similar to that of the current block may be selected as the AMVP predictor.
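An editorial sketch of this encoder-side mode choice follows (field names are invented here; a real encoder would additionally compare rate-distortion costs):

```python
def choose_motion_coding_mode(blk, skip_candidates, merge_candidates):
    """Pick skip, merge, or AMVP per the conditions described above.

    `blk` is a dict with hypothetical keys describing the current block;
    the skip and merge candidate lists may be identical.
    """
    matches_skip = blk["motion_info"] in skip_candidates
    if matches_skip and blk["residual_is_zero"] and blk["pu_size"] == blk["cu_size"]:
        return "skip"    # transmit only the skip candidate index
    if blk["motion_info"] in merge_candidates:
        return "merge"   # transmit only the merge index
    return "amvp"        # transmit ref index, AMVP index, and MV difference

blk = {"motion_info": (1, 0), "residual_is_zero": False,
       "pu_size": (16, 8), "cu_size": (16, 16)}
assert choose_motion_coding_mode(blk, [(1, 0)], [(1, 0)]) == "merge"
```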
The motion information encoding module 163 may encode the motion information according to the method determined by the motion information encoding mode determination module 162.
For example, when the motion information encoding mode is the skip mode or the merge mode, the motion information encoding module 163 performs the merge motion vector encoding process, and when the motion information encoding mode is the AMVP mode, the motion information encoding module 163 performs the AMVP encoding process.
The prediction block generation module 164 generates a prediction block using the motion information of the current block and generates the prediction block of the current block by copying a block corresponding to a position indicated by the motion vector in a picture indicated by the reference picture index when the motion vector is an integer unit.
In addition, when the motion vector is not an integer unit, the prediction block generation module 164 may generate pixels of a prediction block from pixels of an integer unit in a picture indicated by the reference picture index.
In this case, the prediction pixel is generated using an 8-tap interpolation filter for the luminance pixel, and the prediction pixel is generated using a 4-tap interpolation filter for the chrominance pixel.
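For illustration, here is an 8-tap half-pel luma interpolation in the style of HEVC (the tap values are HEVC's half-pel filter, shown only as a concrete example; the patent text does not specify coefficients):

```python
# HEVC-style 8-tap half-pel luma filter taps; the taps sum to 64.
LUMA_HALFPEL_TAPS = (-1, 4, -11, 40, 40, -11, 4, -1)

def interp_halfpel(row, x):
    """Horizontal half-pel luma sample between row[x] and row[x + 1].

    `row` must provide 3 integer samples to the left of x and 4 to the
    right; rounding offset 32 and shift 6 normalize by the tap sum of 64.
    """
    acc = sum(t * row[x - 3 + i] for i, t in enumerate(LUMA_HALFPEL_TAPS))
    return (acc + 32) >> 6

# On a flat signal the filter reproduces the input value.
assert interp_halfpel([100] * 8, 3) == 100
```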
The residual block generation module 165 generates a residual block using the current block and the prediction block of the current block. When the size of the current block is 2N × 2N, the residual block generation module 165 generates a residual block using the current block and a prediction block having a size of 2N × 2N corresponding to the current block.
In addition, when the size of the current block for prediction is 2N × N or N × 2N, a prediction block is obtained for each of the two blocks constituting the 2N × 2N block, and the resulting prediction block of size 2N × 2N may then be generated from the two partial prediction blocks.
Also, a residual block of size 2N × 2N may be generated using a prediction block of size 2N × 2N. To address the discontinuity of the boundary of two prediction blocks of size 2N × N, overlap smoothing may be applied to pixels at the boundary.
The residual block encoding module 166 divides the residual block into one or at least two Transform Units (TUs), and each Transform Unit (TU) may be transform-coded, quantized, and entropy-coded.
The residual block encoding module 166 may transform the residual block generated by the inter prediction method using an integer-based transform matrix, and the transform matrix may be an integer-based DCT matrix.
In addition, the residual block encoding module 166 uses a quantization matrix to quantize coefficients of the residual block transformed by the transform matrix, and the quantization matrix may be determined by a quantization parameter.
The quantization parameter is determined for each Coding Unit (CU) having a predetermined size or more. When the current Coding Unit (CU) is smaller than the predetermined size, only the quantization parameter of the first Coding Unit (CU) in coding order among the coding units smaller than the predetermined size is encoded; the quantization parameters of the remaining Coding Units (CUs) are the same as that of the first one and thus are not encoded.
In addition, the coefficients of the transform block may be encoded using a quantization matrix determined according to a quantization parameter and a prediction mode.
The quantization parameter determined for each Coding Unit (CU) having the predetermined size or more may be prediction-encoded using quantization parameters of Coding Units (CUs) adjacent to the current Coding Unit (CU).
The quantization parameter predictor of the current Coding Unit (CU) is generated by searching for one or two valid quantization parameters in the order of a left Coding Unit (CU) and an upper Coding Unit (CU) of the current Coding Unit (CU).
For example, the first effective quantization parameter searched in the above order may be determined as a quantization parameter predictor. The search is performed in the order of the left-side Coding Unit (CU) and the Coding Unit (CU) immediately preceding in the coding order, thereby determining the first effective quantization parameter as the quantization parameter predictor.
The coefficients of the quantized transform block are scanned and transformed into one-dimensional quantized coefficients, and the scanning method may be differently set according to the entropy encoding mode.
For example, the quantized coefficients subjected to inter prediction encoding may be scanned in a predetermined manner (raster scanning in a zigzag or diagonal direction) when encoded in CABAC, and scanned in a manner different from the predetermined manner when encoded in CAVLC.
For example, the scanning method may be determined according to a zigzag mode in the case of an inter frame, the scanning method may be determined according to an intra prediction mode in the case of an intra frame, and the coefficient scanning method may be differently determined according to the size of a transformed basic unit.
In addition, the scan pattern may be changed according to the directional intra prediction mode, and the quantized coefficients may also be scanned in the reverse order.
The multiplexer 167 multiplexes the motion information encoded by the motion information encoding module 163 and the residual signal encoded by the residual block encoding module 166.
The motion information may be different depending on the encoding mode, and for example, in the case of skipping or merging, the motion information may include only an index indicating a predictor, and in the case of AMVP, the motion information may include a reference picture index, a differential motion vector, and an AMVP index of the current block.
Hereinafter, an embodiment of the operation of the intra prediction module 150 shown in fig. 1 will be described in detail.
First, the intra prediction module 150 receives prediction mode information and the size of a Prediction Unit (PU) from the picture partition module 110, and reads reference pixels from the picture storage module 180 to determine an intra prediction mode of the Prediction Unit (PU).
The intra prediction module 150 determines whether a reference pixel is generated by checking whether there is an unavailable reference pixel, and the reference pixel may be used to determine an intra prediction mode of the current block.
When the current block is located at the upper boundary of the current picture, the pixels adjacent to the upper side of the current block are not defined; when the current block is located at the left boundary of the current picture, the pixels adjacent to the left side of the current block are not defined. Such pixels may be determined to be unavailable pixels.
Furthermore, even when the current block is located at the slice boundary such that pixels adjacent to the upper side or the left side of the slice are not previously encoded and reconstructed pixels, these pixels may be determined not to be usable pixels.
When there is no pixel adjacent to the left or upper side of the current block or no pixel that has been previously encoded and reconstructed, the intra prediction mode of the current block may be determined using only available pixels.
In addition, reference pixels at unavailable positions may be generated using available reference pixels of the current block. For example, when an upper pixel is unavailable, some or all of the left-side pixels may be used to generate it, and vice versa.
That is, the reference pixel may be generated by copying an available reference pixel at a position closest to the reference pixel at the unavailable position in a predetermined direction, or when there is no available reference pixel in the predetermined direction, the reference pixel may be generated by copying an available reference pixel at a closest position in an opposite direction.
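A sketch of this padding rule (an editorial illustration over a 1-D reference sample array; the layout and names are invented here):

```python
def pad_reference_pixels(ref, available):
    """Fill unavailable reference samples by copying the nearest available one.

    `ref` holds the reference samples in a fixed scan order (for example,
    left column followed by top row) and `available[i]` says whether
    ref[i] was actually reconstructed. Each unavailable sample is copied
    from the nearest available sample in the backward direction, falling
    back to the forward direction when none exists.
    """
    n = len(ref)
    for i in range(n):
        if available[i]:
            continue
        src = next((j for j in range(i - 1, -1, -1) if available[j]), None)
        if src is None:
            src = next((j for j in range(i + 1, n) if available[j]), None)
        if src is not None:
            ref[i] = ref[src]
            available[i] = True
    return ref

# ref[0], ref[3] and ref[4] are unavailable and get filled from neighbors.
assert pad_reference_pixels([0, 7, 9, 0, 0, 5],
                            [False, True, True, False, False, True]) == [7, 7, 9, 9, 9, 5]
```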
In addition, even when an upper pixel or a left pixel of the current block exists, the reference pixel may be determined as an unavailable reference pixel according to an encoding mode of a block to which the upper pixel or the left pixel belongs.
For example, when a block to which reference pixels adjacent to an upper side of the current block belong is a block that is inter-encoded and thus reconstructed, the reference pixels may be determined as unavailable pixels.
Here, when blocks adjacent to the current block are intra-coded, available reference pixels may be generated using pixels belonging to the reconstructed blocks, and information indicating that the encoding apparatus 10 determines the available reference pixels according to the encoding mode is transmitted to the decoding apparatus 20.
The intra prediction module 150 determines an intra prediction mode of the current block using the reference pixels, and the number of acceptable intra prediction modes in the current block may vary according to the size of the block.
For example, when the size of the current block is 8 × 8, 16 × 16, or 32 × 32, there may be 34 intra prediction modes, and when the size of the current block is 4 × 4, there may be 17 intra prediction modes.
The 34 or 17 intra prediction modes may be configured with at least one non-directional mode and a plurality of directional modes.
The at least one non-directional mode may be a DC mode and/or a planar mode. When the DC mode and the planar mode are included in the non-directional mode, there may be 35 intra prediction modes regardless of the size of the current block.
Here, two non-directional modes (DC mode and planar mode) and 33 directional modes may be included.
In case of the planar mode, a prediction block of the current block is generated by using at least one pixel value (or a prediction value of the pixel value, hereinafter referred to as a first reference value) located at the lower right of the current block and a reference pixel.
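The embodiment above uses a first reference value located at the lower right of the current block; for orientation, the sketch below shows the familiar HEVC-style planar formula (an editorial illustration only, not the patent's exact rule):

```python
def planar_predict(top, left, n):
    """HEVC-style planar prediction for an n x n block (n a power of two).

    `top` and `left` each hold n + 1 reference samples, with top[n] the
    top-right and left[n] the bottom-left reference sample; each predicted
    pixel averages a horizontal and a vertical linear ramp.
    """
    shift = n.bit_length()  # equals log2(n) + 1 for power-of-two n
    pred = [[0] * n for _ in range(n)]
    for y in range(n):
        for x in range(n):
            h = (n - 1 - x) * left[y] + (x + 1) * top[n]   # horizontal ramp
            v = (n - 1 - y) * top[x] + (y + 1) * left[n]   # vertical ramp
            pred[y][x] = (h + v + n) >> shift
    return pred

# A flat set of references yields a flat prediction.
assert planar_predict([50] * 5, [50] * 5, 4)[0][0] == 50
```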
The configuration of the image decoding apparatus according to the embodiment of the present invention can be derived from the configuration of the image encoding apparatus 10 described with reference to fig. 1 to 6. For example, the image may be decoded by inversely performing the process of the image encoding method described above with reference to fig. 1 to 6.
Fig. 7 is a block diagram illustrating a configuration of a video decoding apparatus according to an embodiment of the present invention. The decoding apparatus 20 includes an entropy decoding module 210, an inverse quantization/inverse transformation module 220, an adder 270, a deblocking filter 250, a picture storage module 260, an intra prediction module 230, a motion compensation prediction module 240, and an intra/inter selection switch 280.
The entropy decoding module 210 receives a bitstream encoded by the image encoding apparatus 10 and decodes the bitstream such that the bitstream is divided into an intra prediction mode index, motion information, a quantized coefficient sequence, etc., and transmits the decoded motion information to the motion compensation prediction module 240.
Further, the entropy decoding module 210 sends the intra prediction mode index to the intra prediction module 230 and the inverse quantization/inverse transform module 220, and sends the quantized coefficient sequence to the inverse quantization/inverse transform module 220.
The inverse quantization/inverse transform module 220 transforms the sequence of quantized coefficients into a two-dimensional array of inverse quantized coefficients and may select one of a plurality of scan modes for the transform, and for example, select a scan mode based on an intra prediction mode and a prediction mode (e.g., intra prediction or inter prediction) of the current block.
The inverse quantization/inverse transform module 220 applies a quantization matrix selected from a plurality of quantization matrices to inverse quantized coefficients of the two-dimensional array to reconstruct the quantized coefficients.
In addition, quantization matrices different from each other may be selected according to the size of a current block to be reconstructed, and the quantization matrices may be selected based on at least one of an intra prediction mode and a prediction mode of the current block for blocks of the same size.
The inverse quantization/inverse transform module 220 inverse-transforms the reconstructed quantized coefficients to reconstruct a residual block, and may perform an inverse transform process using a Transform Unit (TU) as a basic unit.
The adder 270 reconstructs an image block by adding the residual block reconstructed by the inverse quantization/inverse transformation module 220 to the prediction block generated by the intra prediction module 230 or the motion compensation prediction module 240.
The deblocking filter 250 may perform a deblocking filtering process on the reconstructed image generated by the adder 270 to reduce deblocking artifacts due to image loss according to the quantization process.
The picture storage module 260 is a frame memory for storing a locally decoded image on which the deblocking filter 250 performs the deblocking filtering process.
The intra prediction module 230 reconstructs an intra prediction mode of the current block based on the intra prediction mode index received from the entropy decoding module 210, and generates a prediction block according to the reconstructed intra prediction mode.
The motion compensation prediction module 240 generates a prediction block of the current block from the picture stored in the picture storage module 260 based on the motion vector information, and when motion compensation of fractional precision is applied, the motion compensation prediction module 240 applies the selected interpolation filter to generate the prediction block.
The intra/inter selection switch 280 may provide the prediction block generated in the intra prediction module 230 or the motion compensation prediction module 240 to the adder 270 based on the encoding mode.
Fig. 8 is a block diagram illustrating an embodiment of a configuration for performing inter prediction in the image decoding apparatus 20. The inter prediction decoder includes a demultiplexer 241, a motion information encoding mode determination module 242, a merge mode motion information decoding module 243, an AMVP mode motion information decoding module 244, a prediction block generation module 245, a residual block decoding module 246, and a reconstructed block generation module 247. Here, the merge mode motion information decoding module 243 and the AMVP mode motion information decoding module 244 may be included in the motion information decoding module 248.
Referring to fig. 8, the demultiplexer 241 demultiplexes currently encoded motion information and an encoded residual signal from a received bitstream, transmits the demultiplexed motion information to the motion information encoding mode determining module 242, and transmits the demultiplexed residual signal to the residual block decoding module 246.
The motion information encoding mode determining module 242 determines the motion information encoding mode of the current block, and when the skip flag of the received bitstream has a value of 1, the motion information encoding mode determining module 242 determines that the motion information encoding mode of the current block is encoded in the skip encoding mode.
When the skip flag of the received bitstream has a value of 0 and the motion information received from the demultiplexer 241 has only the merge index, the motion information encoding mode determining module 242 determines that the motion information encoding mode of the current block is encoded in the merge mode.
Also, when the skip flag of the received bitstream has a value of 0 and the motion information received from the demultiplexer 241 includes a reference picture index, a differential motion vector, and an AMVP index, the motion information encoding mode determining module 242 determines that the motion information encoding mode of the current block is encoded in the AMVP mode.
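These three cases map directly onto a small decision procedure; the sketch below is an editorial illustration with invented syntax-element names:

```python
def parse_motion_coding_mode(skip_flag, motion_fields):
    """Determine the motion information encoding mode of the current block.

    `motion_fields` is the set of motion syntax elements demultiplexed
    from the bitstream for the current block (names are illustrative).
    """
    if skip_flag == 1:
        return "skip"
    if motion_fields == {"merge_index"}:
        return "merge"   # skip_flag == 0 and only a merge index present
    return "amvp"        # reference index, differential MV, and AMVP index

assert parse_motion_coding_mode(1, set()) == "skip"
assert parse_motion_coding_mode(0, {"merge_index"}) == "merge"
assert parse_motion_coding_mode(0, {"ref_idx", "mvd", "amvp_idx"}) == "amvp"
```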
The merge mode motion information decoding module 243 is activated when the motion information encoding mode determination module 242 determines that the motion information encoding mode of the current block is the skip mode or the merge mode. When the motion information encoding mode determination module 242 determines that the motion information encoding mode of the current block is the AMVP mode, the AMVP mode motion information decoding module 244 is activated.
The prediction block generation module 245 generates a prediction block for the current block using the motion information reconstructed by the merge mode motion information decoding module 243 or the AMVP mode motion information decoding module 244.
When the motion vector has integer precision, the prediction block of the current block may be generated by copying the block at the position indicated by the motion vector in the picture indicated by the reference picture index.
In addition, when the motion vector does not have integer precision, the pixels of the prediction block are generated by interpolating the integer-position pixels in the picture indicated by the reference picture index. Here, the prediction pixels may be generated by using an 8-tap interpolation filter for luminance pixels and a 4-tap interpolation filter for chrominance pixels.
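For illustration, the following is a minimal Python sketch of half-pel interpolation. The embodiment above fixes only the tap counts, not the coefficients, so HEVC-style half-pel taps are assumed here, and the helper name and edge clamping are illustrative.

```python
# A minimal sketch of horizontal half-pel interpolation, assuming
# HEVC-style filter taps; the embodiment does not specify coefficients.
LUMA_HALF_PEL = [-1, 4, -11, 40, 40, -11, 4, -1]    # 8-tap, sum = 64
CHROMA_HALF_PEL = [-4, 36, 36, -4]                  # 4-tap, sum = 64

def interp_half_pel(row, x, taps):
    """Interpolate the half-pel sample between row[x] and row[x + 1]."""
    n = len(taps)
    off = n // 2 - 1                                 # left extent of the filter
    acc = 0
    for k, c in enumerate(taps):
        # Clamp to the row edges so the sketch is self-contained.
        idx = min(max(x - off + k, 0), len(row) - 1)
        acc += c * row[idx]
    return min(max((acc + 32) >> 6, 0), 255)         # round, normalize, clip

row = [10, 20, 30, 40, 50, 60, 70, 80]
print(interp_half_pel(row, 3, LUMA_HALF_PEL))        # luma half-pel -> 45
print(interp_half_pel(row, 3, CHROMA_HALF_PEL))      # chroma half-pel -> 45
```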
The residual block decoding module 246 performs entropy decoding on the residual signal and inverse-scans the entropy-decoded coefficients to generate a two-dimensional quantized coefficient block. The inverse scanning method may vary according to the entropy decoding method.
For example, when decoding is performed based on CABAC, a diagonal inverse scan may be applied, and when decoding is performed based on CAVLC, a zigzag inverse scan may be applied. Also, the inverse scanning method may be determined differently according to the size of the prediction block.
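The following minimal sketch illustrates inverse scanning into a two-dimensional coefficient block. An up-right diagonal order and a standard zigzag order are assumed for illustration; the exact scan patterns are not specified above.

```python
# A minimal sketch of inverse scanning; both scan orders are assumptions.
def diagonal_order(n):
    """Visit an n x n block along anti-diagonals."""
    order = []
    for d in range(2 * n - 1):
        for y in range(n):
            x = d - y
            if 0 <= x < n:
                order.append((y, x))
    return order

def zigzag_order(n):
    """Like the diagonal order, but alternating direction per diagonal."""
    order = []
    for d in range(2 * n - 1):
        diag = [(y, d - y) for y in range(n) if 0 <= d - y < n]
        order.extend(diag if d % 2 else diag[::-1])
    return order

def inverse_scan(coeffs, n, order):
    """Place a 1-D decoded coefficient list back into a 2-D block."""
    block = [[0] * n for _ in range(n)]
    for (y, x), c in zip(order, coeffs):
        block[y][x] = c
    return block

coeffs = list(range(16, 0, -1))                 # decoded coefficient list
print(inverse_scan(coeffs, 4, diagonal_order(4)))
print(inverse_scan(coeffs, 4, zigzag_order(4)))
```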
The residual block decoding module 246 reconstructs the quantization parameter to derive the inverse quantization matrix, and may inverse quantize the generated coefficient block using it. Here, the quantization step size may be reconstructed for each coding unit equal to or greater than a predetermined size.
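As a rough illustration of the relation between the quantization parameter and the step size, the sketch below assumes an HEVC-style scale table in which the step roughly doubles every 6 QP; normalization shifts are omitted for brevity, so the numbers are illustrative only.

```python
# A minimal sketch of inverse quantization, assuming an HEVC-style
# scale table; final normalization shifts are intentionally omitted.
LEVEL_SCALE = [40, 45, 51, 57, 64, 72]          # per (qp % 6), HEVC-style

def dequantize(block, qp):
    """Scale quantized levels back toward transform-coefficient magnitude."""
    scale = LEVEL_SCALE[qp % 6] << (qp // 6)
    return [[level * scale for level in row] for row in block]

print(dequantize([[2, 0], [1, -1]], qp=28))     # scale = 64 << 4 = 1024
```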
The residual block decoding module 246 reconstructs a residual block by inverse transforming the inverse-quantized coefficient block.
The reconstructed block generation module 247 generates a reconstructed block by adding the prediction block generated by the prediction block generation module 245 to the residual block generated by the residual block decoding module 246.
Hereinafter, an embodiment of a process of reconstructing a current block through intra prediction will be described with reference to fig. 7 again.
First, an intra prediction mode of a current block is decoded from a received bitstream. To this end, the entropy decoding module 210 refers to one of a plurality of intra prediction mode tables to reconstruct a first intra prediction mode index of the current block.
The plurality of intra prediction mode tables are tables shared by the encoding device 10 and the decoding device 20, and any one table selected according to the distribution of intra prediction modes for a plurality of blocks adjacent to the current block may be applied.
For example, when the intra prediction mode of the left block of the current block and the intra prediction mode of the upper block of the current block are identical to each other, the first intra prediction mode table is applied to reconstruct the first intra prediction mode index of the current block, and otherwise, the second intra prediction mode table may be applied to reconstruct the first intra prediction mode index of the current block.
As another example, in a case where the intra prediction modes of the upper block and the left block of the current block are both directional intra prediction modes, when the direction of the intra prediction mode of the upper block and the direction of the intra prediction mode of the left block are within a predetermined angle, the first intra prediction mode index of the current block is reconstructed by applying the first intra prediction mode table, and when the direction of the intra prediction mode of the upper block and the direction of the intra prediction mode of the left block exceed the predetermined angle, the first intra prediction mode index of the current block may be reconstructed by applying the second intra prediction mode table.
The entropy decoding module 210 sends the reconstructed first intra prediction mode index of the current block to the intra prediction module 230.
When the index has the minimum value (i.e., 0), the intra prediction module 230 receiving the first intra prediction mode index determines the most probable mode of the current block as the intra prediction mode of the current block.
In addition, when the index has a value other than 0, the intra prediction module 230 compares the first intra prediction mode index with the index indicated by the most probable mode of the current block. When the first intra prediction mode index is not less than the index indicated by the most probable mode, the intra prediction module 230 determines the intra prediction mode corresponding to a second intra prediction mode index, obtained by adding 1 to the first intra prediction mode index, as the intra prediction mode of the current block; otherwise, it determines the intra prediction mode corresponding to the first intra prediction mode index as the intra prediction mode of the current block.
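The mode reconstruction rule above can be summarized in a short sketch; the function and variable names are illustrative, and mpm denotes the index of the most probable mode.

```python
# A minimal sketch of the mode-index reconstruction described above.
def reconstruct_intra_mode(first_index, mpm):
    if first_index == 0:
        return mpm                       # index 0 always signals the MPM
    if first_index >= mpm:
        return first_index + 1           # skip over the MPM's own index
    return first_index

print(reconstruct_intra_mode(0, 5))      # -> 5 (the MPM itself)
print(reconstruct_intra_mode(3, 5))      # -> 3 (below the MPM index)
print(reconstruct_intra_mode(7, 5))      # -> 8 (shifted past the MPM)
```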
The allowable intra prediction modes for the current block may be configured with at least one non-directional mode and a plurality of directional modes.
The at least one non-directional mode may be a DC mode and/or a planar mode. Further, the DC mode or the planar mode may be adaptively included in the allowable intra prediction mode set.
For this purpose, information specifying a non-directional mode included in the allowable intra prediction mode set may be included in a picture header or a slice header.
Next, the intra prediction module 230 reads the reference pixels from the picture storage module 260 and determines whether there are unavailable reference pixels in order to generate the intra prediction block.
This determination may be made according to the decoded intra prediction mode of the current block, depending on whether the reference pixels needed to generate the intra prediction block exist.
Next, when reference pixels need to be generated, the intra prediction module 230 may generate reference pixels at unavailable locations by using available reference pixels that have been previously reconstructed.
The definition of the unavailable reference pixel and the method of generating the reference pixel may be the same as the operation in the intra prediction module 150 according to fig. 1. However, the reference pixels used to generate the intra prediction block may be selectively reconstructed according to the decoded intra prediction mode of the current block.
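A minimal sketch of one common generation rule, padding each unavailable reference pixel from the nearest available one, is given below; the exact rule of the embodiment is defined by the encoder-side operation, so this padding strategy and the mid-gray default are assumptions.

```python
# A minimal sketch: propagate the nearest available reference sample.
def fill_reference_pixels(ref, available, default=128):
    out = list(ref)
    # First available sample seeds a leading run of unavailable ones.
    first = next((ref[i] for i in range(len(ref)) if available[i]), default)
    last = first
    for i in range(len(out)):
        if available[i]:
            last = out[i]
        else:
            out[i] = last          # copy the nearest preceding available pixel
    return out

print(fill_reference_pixels([0, 0, 50, 60, 0],
                            [False, False, True, True, False]))
# -> [50, 50, 50, 60, 60]
```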
In addition, the intra-prediction module 230 determines whether to apply filtering to the reference pixels to generate the prediction block. That is, it is determined whether to apply filtering to the reference pixels based on the size of the current block and the decoded intra prediction mode in order to generate the intra prediction block of the current block.
As the size of the block increases, the blocking effect increases. Therefore, as the size of the block increases, the number of prediction modes used to filter the reference pixels may increase. However, when the block is larger than the predetermined size, since the block is determined to be a flat area, the reference pixels may not be filtered to reduce complexity.
When it is determined that filtering needs to be applied to the reference pixels, the intra prediction module 230 filters the reference pixels.
At least two filters may be adaptively applied according to the degree of the step difference between the reference pixels. The filter coefficients of each filter are preferably symmetric.
Further, the two or more filters may be adaptively applied according to the size of the current block. When the filter is applied, a filter having a narrow bandwidth is applied to a small-sized block, and a filter having a wide bandwidth is applied to a large-sized block.
In case of the DC mode, since the prediction block is generated as an average value of the reference pixels, it is not necessary to apply a filter. In the vertical mode in which the image has correlation in the vertical direction, it is not necessary to apply a filter to the reference pixel, and similarly, in the horizontal mode in which the image has correlation in the horizontal direction, it is not necessary to apply a filter to the reference pixel.
Since whether filtering is applied is related to an intra prediction mode of the current block, the reference pixel may be adaptively filtered based on the size of the prediction block of the current block and the intra prediction mode.
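The adaptive filtering decision can be sketched as follows, assuming a symmetric [1, 2, 1]/4 smoothing filter and HEVC-style mode numbers; the actual mode- and size-dependent rules of the embodiment are simplified to a single condition here.

```python
# A minimal sketch of adaptive reference-pixel smoothing; the filter and
# the decision rule are simplified stand-ins for the rules above.
DC, HORIZONTAL, VERTICAL = 1, 10, 26     # illustrative, HEVC-style numbers

def maybe_filter_reference(ref, mode, block_size):
    # DC, pure-horizontal and pure-vertical modes skip filtering.
    if mode in (DC, HORIZONTAL, VERTICAL) or block_size <= 4:
        return list(ref)
    return [ref[0]] + [
        (ref[i - 1] + 2 * ref[i] + ref[i + 1] + 2) >> 2
        for i in range(1, len(ref) - 1)
    ] + [ref[-1]]

print(maybe_filter_reference([10, 40, 10, 40, 10], 18, 16))
# -> [10, 25, 25, 25, 10]
```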
Next, the intra prediction module 230 generates a prediction block using the reference pixels or the filtered reference pixels according to the reconstructed intra prediction mode, and the generation of the prediction block is the same as the operation of the encoding apparatus 10, so that a detailed description thereof will be omitted.
The intra prediction module 230 determines whether to filter the generated prediction block, and may determine whether to filter based on information included in a slice header or a coding unit header or according to an intra prediction mode of the current block.
When it is determined that the generated prediction block is to be filtered, the intra prediction module 230 filters pixels at a specific position of the prediction block generated using available reference pixels adjacent to the current block to generate new pixels.
For example, in the DC mode, prediction pixels adjacent to a reference pixel among the prediction pixels may be filtered using the reference pixels adjacent to the prediction pixels.
Accordingly, the prediction pixels are filtered using one or two reference pixels according to the positions of the prediction pixels, and the filtering of the prediction pixels in the DC mode may be applied to prediction blocks of all sizes.
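A minimal sketch of such DC-mode boundary filtering follows, assuming HEVC-style weights (two reference pixels for the corner pixel, one for the remaining edge pixels); the weights are an assumption, not taken from the embodiment.

```python
# A minimal sketch of DC-mode boundary filtering with assumed weights.
def filter_dc_prediction(pred, top_ref, left_ref):
    n = len(pred)
    out = [row[:] for row in pred]
    # Corner pixel: filtered with both adjacent reference pixels.
    out[0][0] = (left_ref[0] + 2 * pred[0][0] + top_ref[0] + 2) >> 2
    for x in range(1, n):                  # first row: one top reference pixel
        out[0][x] = (top_ref[x] + 3 * pred[0][x] + 2) >> 2
    for y in range(1, n):                  # first column: one left reference pixel
        out[y][0] = (left_ref[y] + 3 * pred[y][0] + 2) >> 2
    return out

pred = [[32] * 4 for _ in range(4)]        # DC prediction (average = 32)
print(filter_dc_prediction(pred, [40, 42, 44, 46], [30, 28, 26, 24])[0])
# -> [34, 35, 35, 36]
```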
In addition, in the vertical mode, prediction pixels adjacent to a left reference pixel among prediction pixels of the prediction block may be changed using reference pixels other than the upper pixels used to generate the prediction block.
Also, in the horizontal mode, prediction pixels adjacent to an upper reference pixel among the generated prediction pixels may be changed using reference pixels other than the left-side pixel, which are used to generate the prediction block.
The current block may be reconstructed by using the prediction block of the current block reconstructed in this manner and the residual block of the current block that has been decoded.
Fig. 9 is a diagram illustrating a second embodiment of a method of partitioning an image into block units and processing the image.
Referring to fig. 9, a Coding Tree Unit (CTU) having a maximum size of 256 × 256 pixels is partitioned into four Coding Units (CUs) each having a square shape using a quad tree structure.
At least one of the coding units resulting from the quadtree partitioning operation may be further partitioned into two Coding Units (CUs) each having a rectangular shape using a binary tree structure.
In addition, at least one of the coding units resulting from the quadtree partitioning operation may be further partitioned into four Coding Units (CUs) each having a square shape using a quadtree structure.
At least one of the coding units resulting from the binary tree partitioning operation may be further partitioned into two Coding Units (CUs) each having a square shape or a rectangular shape using a binary tree structure.
In addition, at least one of the coding units resulting from the quadtree partitioning operation may be further partitioned into Coding Units (CUs) each having a square shape or a rectangular shape using a quadtree structure or a binary tree structure.
The Coding Blocks (CBs) resulting from the binary tree partitioning operations described above may be used for prediction and transformation without further partitioning. That is, the sizes of the Prediction Unit (PU) and the Transform Unit (TU) belonging to the Coding Block (CB), as shown in fig. 9, may be the same as the size of the Coding Block (CB).
As described above, the coding unit resulting from the quadtree partitioning operation may be partitioned into one or at least two Prediction Units (PUs) using the method described with reference to fig. 3 and 4.
The coding unit resulting from the quadtree partitioning operation as described above may be partitioned into one or at least two Transform Units (TUs) by using the method described with reference to fig. 5, and the Transform Units (TUs) resulting from the partitioning operation may have a maximum size of 64 × 64 pixels.
The syntax structure used for partitioning and processing the image on a per-block basis may use a flag to indicate the partition information. For example, whether or not a Coding Unit (CU) is partitioned may be indicated by using split_cu_flag, and the depth of a Coding Unit (CU) partitioned by a binary tree may be indicated by using binary_depth. Whether a Coding Unit (CU) is partitioned using a binary tree structure may be represented by a separate binary_split_flag.
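A minimal sketch of parsing such flags recursively is shown below; read_flag() is a hypothetical stand-in for the entropy decoder, and the minimum size, split probabilities, and horizontal-only binary split are illustrative simplifications.

```python
# A minimal sketch of recursive partition parsing with the flags named above.
import random

def parse_partition(x, y, w, h, read_flag, min_size=8):
    """read_flag(name) is a hypothetical entropy-decoder stand-in."""
    if w > min_size and read_flag("split_cu_flag"):
        hw, hh = w // 2, h // 2                  # quadtree: four sub-units
        for dx, dy in ((0, 0), (hw, 0), (0, hh), (hw, hh)):
            parse_partition(x + dx, y + dy, hw, hh, read_flag, min_size)
    elif h > min_size and read_flag("binary_split_flag"):
        # Binary split (horizontal shown; a direction flag could be added,
        # and binary_depth would track the accumulated split depth).
        parse_partition(x, y, w, h // 2, read_flag, min_size)
        parse_partition(x, y + h // 2, w, h // 2, read_flag, min_size)
    else:
        print(f"leaf CU at ({x},{y}) size {w}x{h}")

random.seed(1)
parse_partition(0, 0, 64, 64, lambda name: random.random() < 0.4)
```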
The methods described above with reference to fig. 1 to 8 are applied to blocks (e.g., Coding Units (CUs), Prediction Units (PUs), and Transform Units (TUs)) partitioned by the method described above with reference to fig. 9, so that encoding and decoding of images can be performed.
Hereinafter, referring to fig. 10 to 15, another embodiment of a method of partitioning a Coding Unit (CU) into one or at least two Transform Units (TUs) will be described.
According to an embodiment of the present invention, a Coding Unit (CU) may be partitioned into Transform Units (TUs) using a binary tree structure, wherein the transform units are basic units for transforming a residual block.
Referring to fig. 10, at least one of rectangular coding blocks CB0 and CB1, which are obtained from a binary tree partitioning operation and have a size of N × 2N or 2N × N, is further partitioned into square transform units TU0 and TU1, each having a size of N × N, using a binary tree structure.
As described above, the block-based image encoding method may perform prediction, transform, quantization, and entropy encoding steps.
In the prediction step, a prediction signal is generated by referring to a current encoding block and an existing encoded picture or a surrounding picture, and thus a differential signal with respect to the current block can be calculated.
In addition, in the transformation step, transformation is performed using various transform functions with the differential signal as an input. The transformed signal is separated into a DC coefficient and AC coefficients, thereby achieving energy compaction and improving coding efficiency.
Further, in the quantization step, quantization is performed with the transform coefficient as an input, and then entropy encoding is performed on the quantized signal, so that an image can be encoded.
In addition, the image decoding method is performed in the reverse order of the above-described encoding process, and a quality distortion phenomenon of an image may occur in the quantization step.
In order to reduce image quality distortion while improving coding efficiency, the size or shape of the Transform Unit (TU) and the type of transform function to be applied may vary depending on the distribution of differential signals input in the transform step and the characteristics of an image.
For example, when a block similar to the current block is searched for through a block-based motion estimation process in the prediction step, a cost measure such as the Sum of Absolute Differences (SAD) or the Mean Square Error (MSE) may be used, and the distribution of the resulting differential signal may take various forms depending on the characteristics of the image.
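The two cost measures can be stated directly, as in the sketch below; the block contents are illustrative.

```python
# A minimal sketch of the two cost measures named above.
def sad(a, b):
    """Sum of Absolute Differences between two equally sized 2-D blocks."""
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def mse(a, b):
    """Mean Square Error between two equally sized 2-D blocks."""
    n = len(a) * len(a[0])
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)) / n

cur  = [[10, 20], [30, 40]]
cand = [[12, 18], [33, 37]]
print(sad(cur, cand), mse(cur, cand))    # -> 10 6.5
```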
Accordingly, efficient encoding may be performed by selectively determining the size or shape of the Transform Unit (TU) based on the various distributions of the differential signal.
For example, when a differential signal is generated in a particular coding block CBx, the coding block CBx may be partitioned into two Transform Units (TUs) using a binary tree structure. Since the DC value generally corresponds to the average value of the input signal, when the differential signal is received as the input of the transform process, the DC values can be indicated more effectively by partitioning the coding block CBx into two Transform Units (TUs).
Referring to fig. 11, a square coding unit CU0 having a size of 2N × 2N is partitioned into rectangular transform units TU0 and TU1 having sizes of N × 2N or 2N × N using a binary tree structure.
According to another embodiment of the present invention, as described above, the step of partitioning a Coding Unit (CU) using a binary tree structure is repeated two or more times, thereby obtaining a plurality of Transform Units (TUs).
Referring to fig. 12, a rectangular coding block CB1 of size N × 2N is partitioned using a binary tree structure, and the blocks of size N × N resulting from the partitioning operation are further partitioned using a binary tree structure, thereby creating rectangular blocks of size N/2 × N or N × N/2. Subsequently, the blocks of size N/2 × N or N × N/2 are further partitioned into square transform units TU1, TU2, TU4, and TU5 of size N/2 × N/2 using a binary tree structure.
Referring to fig. 13, a square coding block CB0 of size 2N × 2N is partitioned using a binary tree structure, and a block of size N × 2N resulting from the partitioning operation is further partitioned using a binary tree structure, thereby constructing a square block of size N × N, and then the block of size N × N may be further partitioned into rectangular transform units TU1 and TU2 of size N/2 × N using a binary tree structure.
Referring to fig. 14, a rectangular coding block CB0 of size 2N × N is partitioned using a binary tree structure, and blocks of size N × N resulting from the partitioning operation are further partitioned using a quad tree structure, thereby obtaining square transform units TU1, TU2, TU3, and TU4 of size N/2 × N/2.
The methods described with reference to fig. 1 to 8 are applied to blocks partitioned by the methods described with reference to fig. 10 to 14, such as a Coding Unit (CU), a Prediction Unit (PU), and a Transform Unit (TU), so that encoding and decoding can be performed on an image.
Hereinafter, an embodiment of a method of determining a block partition structure by the encoding apparatus 10 according to the present invention will be described.
The picture partitioning module 110 provided in the image encoding apparatus 10 performs Rate Distortion Optimization (RDO) according to a preset sequence and determines the partition structures of the Coding Unit (CU), Prediction Unit (PU), and Transform Unit (TU) that can be partitioned as described above.
For example, to determine the block partition structure, the picture partition module 110 performs rate-distortion optimization-quantization (RDO-Q) to determine an optimal block partition structure in terms of bit rate and distortion.
Referring to fig. 15, when a Coding Unit (CU) has a 2N × 2N pixel size, RDO is performed in the Prediction Unit (PU) partition order of the 2N × 2N pixel size shown in (a), the N × N pixel size shown in (b), the N × 2N pixel size shown in (c), and the 2N × N pixel size shown in (d), thereby determining the optimal partition structure of the Prediction Unit (PU).
Referring to fig. 16, when a Coding Unit (CU) has an N × 2N or 2N × N pixel size, RDO is sequentially performed on the Prediction Unit (PU) partition structures in the order of the N × 2N (or 2N × N) pixel size shown in (a), the N × N pixel size shown in (b), the N/2 × N (or N × N/2) and N × N pixel sizes shown in (c), the N/2 × N/2, N/2 × N, and N × N pixel sizes shown in (d), and the N/2 × N pixel size shown in (e), thereby determining the optimal partition structure of the Prediction Unit (PU).
In the above description, the block partitioning method of the present invention has been described using Rate Distortion Optimization (RDO) as the example criterion for determining the block partition structure. However, the picture partitioning module 110 may instead use the Sum of Absolute Differences (SAD) or the Mean Square Error (MSE) to determine the block partition structure, thereby reducing complexity while maintaining efficiency.
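A minimal sketch of the selection step itself is given below, assuming precomputed distortion and rate values per candidate; the candidate names, cost values, and lambda are illustrative.

```python
# A minimal sketch of choosing a partition by rate-distortion cost
# J = D + lambda * R; the callbacks and candidates are assumptions.
def best_partition(candidates, distortion, rate, lam):
    """Return the candidate with the smallest J = D + lambda * R."""
    return min(candidates, key=lambda c: distortion(c) + lam * rate(c))

# Toy example: three partition candidates with precomputed (D, R) pairs.
costs = {"2Nx2N": (100.0, 10), "NxN": (70.0, 40), "Nx2N": (80.0, 25)}
pick = best_partition(costs, lambda c: costs[c][0], lambda c: costs[c][1], 1.0)
print(pick)    # -> 'Nx2N' (J = 105, vs 110 for the other two)
```

A SAD- or MSE-based variant simply swaps the distortion callback, which is why the cost function is kept pluggable in this sketch.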
Hereinafter, an image processing method and encoding and decoding methods thereof according to embodiments of the present invention will be described in more detail.
Fig. 17 and 18 are flowcharts illustrating an image processing method of processing motion information for parallel processing according to an embodiment of the present invention. Fig. 19 to 22 are diagrams illustrating a method of processing motion information for each case.
As described above, when the motion compensation module 160 of the encoding apparatus 10 and the motion information decoding module 248 of the decoding apparatus 20 encode and decode motion information in the merge mode and the AMVP mode, processes such as motion compensation, mode determination, and entropy encoding may be implemented as a parallel pipeline in hardware modules.
Therefore, when the mode decision of the previous block completes later than the processing time of the current block, the motion compensation processing of the current block cannot proceed, which may cause a pipeline stall.
To solve this problem, according to an embodiment of the present invention, the encoding apparatus 10 or the decoding apparatus 20 sequentially groups a plurality of blocks into parallel motion information prediction units of a predetermined size (S101), obtains motion information of at least one available block among the neighboring blocks, excluding the blocks belonging to the parallel motion information prediction unit immediately preceding that of the current block (S103), constructs a motion information prediction list based on the obtained motion information (S105), and performs the motion information prediction decoding process for each mode based on the motion information prediction list (S107). However, when no available block exists in step S103, the motion information prediction list of step S105 is constructed using a predetermined zero motion vector (zero MV) or the motion vector at the same position in the previous frame (co-located MV).
More specifically, blocks to be encoded and decoded may be sequentially grouped into a parallel motion information prediction unit (PMU) having a predetermined size according to an encoding and decoding order.
In the case where the nth parallel motion information prediction unit to which the current block to be decoded belongs is PMU (n), when encoding and decoding processes, such as AMVP or merge, requiring motion vector information of neighboring blocks are performed in the motion compensation module 160 or the motion information decoding module 248, blocks included in PMU (n-1) that have been encoded or decoded immediately before the current block may be excluded from the neighboring blocks.
That is, in the motion compensation module 160 or the motion information decoding module 248, when constructing a Motion Vector Prediction (MVP) candidate list of the current block, motion information of blocks included in PMU (n-1) that has been encoded/decoded immediately before PMU (n) to which the current block belongs may be excluded. Accordingly, an MVP candidate list corresponding to the current block may be determined from motion information of blocks not included in PMU (n-1) among neighboring blocks.
However, since the number of motion vectors of available neighboring blocks may be reduced, as shown in fig. 19, the motion information of additional neighboring blocks F to I, beyond the existing neighboring blocks A to E, may be used to construct the Motion Vector Prediction (MVP) candidate list in order to prevent a reduction in coding efficiency.
Accordingly, at least one of the motion vectors of the available blocks (e.g., blocks in the same picture/parallel block/slice) among the neighboring blocks not included in PMU(n-1) may be used to construct the resulting Motion Vector Prediction (MVP) candidate list.
In addition, referring to fig. 18, when decoding motion information using a neighboring block at a predetermined position, the motion compensation module 160 or the motion information decoding module 248 identifies the neighboring block to be used for motion estimation of the current block belonging to PMU(n) (S201); when that neighboring block is included in PMU(n-1) (S203), it obtains substitute information for the motion information of the neighboring block (S205) and processes the motion information decoding of the current block using the substitute information instead of the motion information of the neighboring block included in PMU(n-1) (S207).
Here, the substitute information may be a predetermined default value that can be derived even when the actual motion vector information at the corresponding position is unknown, or a value derived from a neighboring block included in PMU(n). For example, the substitute information for the neighboring blocks included in PMU(n-1) may be constructed with a zero motion vector (zero MV) or the motion vector at the same position in the previous frame (co-located MV).
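Putting the exclusion and substitution rules together, the following minimal sketch constructs a candidate list for a block in PMU(n); the neighbor layout, PMU indices, motion vector values, and list size are illustrative assumptions.

```python
# A minimal sketch of candidate construction with PMU(n-1) exclusion and
# zero-MV / co-located-MV substitution, per the rules described above.
ZERO_MV = (0, 0)

def build_mvp_candidates(neighbors, cur_pmu, colocated_mv=None, max_cands=2):
    """neighbors: list of (pmu_index, motion_vector) pairs for blocks A..I;
    motion_vector is None when the neighbor is not inter-coded."""
    cands = []
    for pmu_idx, mv in neighbors:
        if mv is None:
            continue                      # neighbor unavailable
        if pmu_idx == cur_pmu - 1:
            continue                      # block lies in PMU(n-1): exclude it
        if mv not in cands:
            cands.append(mv)
    if not cands:                         # no usable neighbor: substitute
        cands.append(colocated_mv if colocated_mv is not None else ZERO_MV)
    return cands[:max_cands]

neighbors = [(4, (3, -1)), (4, (3, -1)), (3, (8, 2)), (4, None), (4, (0, 5))]
print(build_mvp_candidates(neighbors, cur_pmu=4, colocated_mv=(1, 1)))
# -> [(3, -1), (0, 5)]; the (8, 2) vector from PMU(3) is excluded
```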
For such processing, the encoding apparatus 10 may include size information of PMU in syntax form in sequence header information, picture header information, or slice header information for transmission. PMU size information may also be explicitly or implicitly signaled as a value related to the CTU size and the minimum and maximum sizes of the CU (or PU).
Further, the encoding apparatus 10 may sequentially group PMUs according to a predetermined size and allocate a parallel motion information prediction unit index to each unit; the allocated index information may be transmitted explicitly, or the index information may be derived in the decoding apparatus 20.
For example, the size of the PMU may always be equal to the minimum PU size, or the size of the PMU may be equal to the size of the PU or CU currently being encoded/decoded. In this case, PMU size information may not be explicitly sent, and the index may be identified according to a sequential grouping operation.
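When the grouping is implicit, the PMU index can be derived from the block position alone, as in the following sketch; the raster ordering and the sizes used are assumptions.

```python
# A minimal sketch of implicit PMU index derivation in raster order.
def pmu_index(x, y, pmu_w, pmu_h, pic_w):
    """Derive the sequential PMU index of the block at (x, y)."""
    pmus_per_row = (pic_w + pmu_w - 1) // pmu_w      # ceil division
    return (y // pmu_h) * pmus_per_row + (x // pmu_w)

print(pmu_index(x=96, y=64, pmu_w=64, pmu_h=64, pic_w=1920))   # -> 31
```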
Referring to fig. 20, a thick line indicates a CTU, a solid line indicates a PU, and a red dotted line indicates a PMU.
According to an embodiment of the present invention, among the neighboring motion vectors of the current PU, the motion vector at the lower-left end may be a motion vector included in the immediately preceding PMU(n-1) in encoding and decoding order. Accordingly, when motion vector prediction encoding and decoding are performed in the encoding apparatus 10 and the decoding apparatus 20, the motion vectors of neighboring blocks included in the immediately preceding PMU(n-1) may be excluded when constructing the motion vector list. As substitute information for the blocks included in the immediately preceding PMU(n-1), a zero motion vector (zero MV) or the motion vector at the same position in the previous frame (co-located MV) may be used to construct the motion vector candidate list.
In addition, referring to fig. 21, since the neighboring blocks at the upper right end are included in the immediately preceding PMU (n-1), they are excluded from candidate motion vectors in motion vector prediction encoding and decoding of the current block. Alternatively, a zero motion vector (zero MV) or a motion vector at the same position of the previous frame (co-located MV) may be used as the candidate motion vector.
Referring to fig. 22, consider the case where all neighboring blocks on the left side are included in the previous PMU(n-1). As shown in fig. 22, the encoding apparatus 10 and the decoding apparatus 20 exclude the motion vectors of all left-side neighboring blocks from the candidate motion vectors in the motion vector prediction encoding and decoding processes; instead, the motion vectors of the upper (upper-left and upper-right) neighboring blocks may be used to construct the candidate list for motion vector prediction.
According to this construction, the dependency between parallel motion information prediction units can be removed, and motion information decoding can proceed without waiting for the processing of a neighboring block in the previous pipeline stage to complete, whereby pipeline stalls can be prevented in advance and encoding and decoding efficiency can be improved accordingly.
The above-described method according to the present invention can be stored in a computer-readable recording medium. The computer-readable recording medium may be a ROM, a RAM, a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, etc., and may also be implemented in the form of a carrier wave (e.g., transmitted through the internet).
The computer readable recording medium can be distributed over network coupled computer systems so that the computer readable code is stored and executed in a distributed fashion. In addition, functional programs, codes, and code segments for implementing the above-described methods can be easily inferred by programmers skilled in the art to which the present invention pertains.
Although exemplary embodiments of the present invention have been illustrated and described above, the present invention is not limited to the foregoing specific embodiments, and those skilled in the art can make various modifications to the present invention without departing from the gist of the present invention defined in the claims. Such modifications should not be construed as departing from the technical idea or idea of the present invention.

Claims (2)

1. A method of image decoding, the method comprising:
deriving a first motion information candidate from a predefined neighboring block of the current block;
deriving a second motion information candidate from at least one block included in a motion information prediction unit including the current block;
generating a candidate list using at least one of the first motion information candidate and the second motion information candidate; and
generating a prediction block for the current block based on the candidate list,
wherein the motion information prediction unit is a coding tree unit, and
wherein the at least one block included in the motion information prediction unit is determined by excluding a block of a motion information prediction unit belonging to a different row from a motion information prediction unit including the current block.
2. A method of image encoding, the method comprising:
deriving a first motion information candidate from a predefined neighboring block of the current block;
deriving a second motion information candidate from at least one block included in a motion information prediction unit including the current block;
generating a candidate list using at least one of the first motion information candidate and the second motion information candidate; and
encoding motion information of the current block based on the candidate list,
wherein the motion information prediction unit is a coding tree unit, and
wherein the at least one block is determined by excluding a block of a motion information prediction unit belonging to a different row from a motion information prediction unit including the current block.
CN202211233463.3A 2017-03-31 2018-02-27 Image decoding method and image encoding method Pending CN115426499A (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
KR10-2017-0042271 2017-03-31
KR1020170042271A KR20180111378A (en) 2017-03-31 2017-03-31 A method of video processing providing independent properties between coding tree units and coding units, a method and appratus for decoding and encoding video using the processing.
CN201880023458.5A CN110495175B (en) 2017-03-31 2018-02-27 Image decoding method and image encoding method
PCT/KR2018/002417 WO2018182185A1 (en) 2017-03-31 2018-02-27 Image processing method for processing motion information for parallel processing, method for decoding and encoding using same, and apparatus for same

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
CN201880023458.5A Division CN110495175B (en) 2017-03-31 2018-02-27 Image decoding method and image encoding method

Publications (1)

Publication Number Publication Date
CN115426499A true CN115426499A (en) 2022-12-02

Family

ID=63677777

Family Applications (4)

Application Number Title Priority Date Filing Date
CN202211233463.3A Pending CN115426499A (en) 2017-03-31 2018-02-27 Image decoding method and image encoding method
CN202211233595.6A Pending CN115426500A (en) 2017-03-31 2018-02-27 Image decoding method and image encoding method
CN202211233516.1A Pending CN115604484A (en) 2017-03-31 2018-02-27 Image decoding method and image encoding method
CN201880023458.5A Active CN110495175B (en) 2017-03-31 2018-02-27 Image decoding method and image encoding method

Family Applications After (3)

Application Number Title Priority Date Filing Date
CN202211233595.6A Pending CN115426500A (en) 2017-03-31 2018-02-27 Image decoding method and image encoding method
CN202211233516.1A Pending CN115604484A (en) 2017-03-31 2018-02-27 Image decoding method and image encoding method
CN201880023458.5A Active CN110495175B (en) 2017-03-31 2018-02-27 Image decoding method and image encoding method

Country Status (3)

Country Link
KR (4) KR20180111378A (en)
CN (4) CN115426499A (en)
WO (1) WO2018182185A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117376551A (en) * 2023-12-04 2024-01-09 淘宝(中国)软件有限公司 Video coding acceleration method and electronic equipment

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11611742B2 (en) 2019-01-28 2023-03-21 Apple Inc. Image signal encoding/decoding method and device therefor

Family Cites Families (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3879741B2 (en) * 2004-02-25 2007-02-14 Sony Corporation Image information encoding apparatus and image information encoding method
US20090110077A1 (en) * 2006-05-24 2009-04-30 Hiroshi Amano Image coding device, image coding method, and image coding integrated circuit
CN107105282B (en) * 2010-12-14 2019-11-12 M&K控股株式会社 Equipment for decoding moving pictures
US9596467B2 (en) * 2010-12-21 2017-03-14 Nec Corporation Motion estimation device for predicting a vector by referring to motion vectors of adjacent blocks, motion estimation method and storage medium of motion estimation program
US9509995B2 (en) * 2010-12-21 2016-11-29 Intel Corporation System and method for enhanced DMVD processing
CN102651814B (en) * 2011-02-25 2015-11-25 华为技术有限公司 Video encoding/decoding method, coding method and terminal
JP5979405B2 (en) * 2011-03-11 2016-08-24 Sony Corporation Image processing apparatus and method
BR112014004914B1 (en) * 2011-08-29 2022-04-12 Ibex Pt Holdings Co., Ltd Method of encoding an image in an amvp mode
KR101197176B1 (en) * 2011-09-23 2012-11-05 주식회사 케이티 Methods of derivation of merge candidate block and apparatuses for using the same
GB2561514B (en) * 2011-11-08 2019-01-16 Kt Corp Method and apparatus for encoding image, and method and apparatus for decoding image
KR20140081682A (en) * 2012-12-14 2014-07-01 한국전자통신연구원 Method and apparatus for image encoding/decoding
WO2014141899A1 (en) * 2013-03-12 2014-09-18 Sony Corporation Image processing device and method
KR101676788B1 (en) * 2014-10-17 2016-11-16 삼성전자주식회사 Method and apparatus for parallel video decoding based on multi-core system

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117376551A (en) * 2023-12-04 2024-01-09 淘宝(中国)软件有限公司 Video coding acceleration method and electronic equipment
CN117376551B (en) * 2023-12-04 2024-02-23 淘宝(中国)软件有限公司 Video coding acceleration method and electronic equipment

Also Published As

Publication number Publication date
CN115604484A (en) 2023-01-13
KR102510696B1 (en) 2023-03-16
KR20230038687A (en) 2023-03-21
WO2018182185A1 (en) 2018-10-04
KR102657392B1 (en) 2024-04-15
KR20220120539A (en) 2022-08-30
CN110495175B (en) 2022-10-18
KR20180111378A (en) 2018-10-11
KR102437729B1 (en) 2022-08-29
CN115426500A (en) 2022-12-02
KR20220005101A (en) 2022-01-12
CN110495175A (en) 2019-11-22

Similar Documents

Publication Publication Date Title
US11107253B2 (en) Image processing method, and image decoding and encoding method using same
CN107087193B (en) Method for decoding video signal
CN110495173B (en) Image processing method for performing processing of coding tree unit and coding unit, image decoding and encoding method using the same, and apparatus therefor
KR20180058224A (en) Modeling-based image decoding method and apparatus in video coding system
KR102657392B1 (en) A method of video processing providing independent properties between coding tree units and coding units, a method and appratus for decoding and encoding video using the processing
KR20190093172A (en) A method of video processing for moving information, a method and appratus for decoding and encoding video using the processing.
KR20230113661A (en) Method and device for image encoding/decoding based on effective transmission of differential quantization parameter
KR101659343B1 (en) Method and apparatus for processing moving image
KR101914667B1 (en) Method and apparatus for processing moving image
KR20200071302A (en) A method for encoding and decoding video processing using motion prediction
KR102610188B1 (en) Method of video processing providing high-throughput arithmetic coding and method and appratus for decoding and encoding video using the processing
KR20240052921A (en) A method of video processing providing independent properties between coding tree units and coding units, a method and appratus for decoding and encoding video using the processing
KR20140129632A (en) Method and apparatus for processing moving image
KR20200114601A (en) A video encoding method, a video decoding method and an apparatus for processing intra block copy
KR20200005207A (en) A method for encoding and decoding video using a motion compensation and filtering
KR20140129629A (en) Method and apparatus for processing moving image

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination