CN116567210A - Video encoding/decoding apparatus and method, and non-transitory recording medium - Google Patents


Info

Publication number
CN116567210A
Authority
CN
China
Prior art keywords
motion vector
resolution
vector resolution
current block
current
Prior art date
Legal status
Pending
Application number
CN202310713804.5A
Other languages
Chinese (zh)
Inventor
林晶娟
李善英
孙世勋
申在燮
金炯德
李京泽
Current Assignee
SK Telecom Co Ltd
Original Assignee
SK Telecom Co Ltd
Priority date
Filing date
Publication date
Priority claimed from KR1020170025673A external-priority patent/KR20180043151A/en
Application filed by SK Telecom Co Ltd filed Critical SK Telecom Co Ltd
Publication of CN116567210A publication Critical patent/CN116567210A/en


Classifications

    All classifications fall under H — Electricity; H04 — Electric communication technique; H04N — Pictorial communication, e.g. television; H04N 19/00 — Methods or arrangements for coding, decoding, compressing or decompressing digital video signals:

    • H04N 19/105: Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
    • H04N 19/109: Selection of coding mode or of prediction mode among a plurality of temporal predictive coding modes
    • H04N 19/119: Adaptive subdivision aspects, e.g. subdivision of a picture into rectangular or non-rectangular coding blocks
    • H04N 19/13: Adaptive entropy coding, e.g. adaptive variable length coding [AVLC] or context adaptive binary arithmetic coding [CABAC]
    • H04N 19/132: Sampling, masking or truncation of coding units, e.g. adaptive resampling, frame skipping, frame interpolation or high-frequency transform coefficient masking
    • H04N 19/139: Analysis of motion vectors, e.g. their magnitude, direction, variance or reliability
    • H04N 19/174: Adaptive coding where the coding unit is an image region, the region being a slice, e.g. a line of blocks or a group of blocks
    • H04N 19/44: Decoders specially adapted therefor, e.g. video decoders which are asymmetric with respect to the encoder
    • H04N 19/46: Embedding additional information in the video signal during the compression process
    • H04N 19/513: Processing of motion vectors
    • H04N 19/70: Syntax aspects related to video coding, e.g. related to compression standards
    • H04N 19/91: Entropy coding, e.g. variable length coding [VLC] or arithmetic coding
    • H04N 19/96: Tree coding, e.g. quad-tree coding

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The present disclosure relates to a video encoding/decoding apparatus and method, and a non-transitory recording medium. The present invention provides a method for encoding video, the method comprising: determining a motion vector resolution of a current block; determining a motion vector of the current block according to the motion vector resolution; predicting the current block using the motion vector and encoding the current block; and encoding information on the motion vector resolution.

Description

Video encoding/decoding apparatus and method, and non-transitory recording medium
This application is a divisional application of the invention patent application with original application No. 201780064071.X (international application No. PCT/KR 2017/01484, filed on October 17, 2017, entitled "Apparatus and Method for Encoding or Decoding an Image").
Technical Field
The present invention relates to efficient video encoding or decoding.
Background
The information disclosed in this Background section is provided only to enhance understanding of the background of the invention, and it may therefore contain information that does not constitute prior art.
Video data requires a large amount of data compared to audio data, still image data, and the like, so storing or transmitting the video data itself without compression demands substantial hardware resources, including memory. Therefore, when video data is stored or transmitted, it is generally compressed by an encoding apparatus, and a decoding apparatus receives, decompresses, and reproduces the compressed video data. Such video compression techniques include H.264/AVC and High Efficiency Video Coding (HEVC), which was standardized in early 2013 and improves coding efficiency by about 40% over H.264/AVC.
In inter prediction encoding, a prediction method used for encoding and decoding, information about the residual block obtained by predicting the current block, together with the motion information used for that prediction, is signaled to the decoding apparatus. Here, the motion information includes information on a reference picture used to predict the current block and information on a motion vector; in the conventional HEVC standard, the motion vector is represented in units of 1/4 pixel.
However, image sizes, resolutions, and frame rates are gradually increasing, and the amount of data to be decoded increases accordingly. A compression technique with higher efficiency than conventional techniques is therefore required.
Disclosure of Invention
Technical problem
Accordingly, the present invention has been made in view of the above problems, and an object of the present invention is to provide a video encoding or decoding technique that encodes video efficiently by adjusting the motion vector resolution according to image characteristics, such as changes in the image or in block size.
Technical proposal
According to an aspect of the present invention, there is provided a method of encoding video, the method comprising the steps of: determining a motion vector resolution of the current block; determining a motion vector of the current block according to the motion vector resolution of the current block; predicting the current block using the motion vector of the current block and encoding the current block; and encoding information about the motion vector resolution of the current block.
According to another aspect of the present invention, a video decoding method of adaptively determining a motion vector resolution of a current block and decoding the current block includes the steps of: extracting information on the motion vector resolution of the current block from a bitstream, and determining the motion vector resolution of the current block based on the information on the motion vector resolution of the current block; and predicting the current block using a motion vector of the current block determined according to the motion vector resolution of the current block and decoding the current block.
According to another aspect of the present invention, a video decoding apparatus for adaptively determining a motion vector resolution of a current block and decoding the current block includes: a motion vector resolution decoder configured to extract information on the motion vector resolution of the current block from a bitstream and determine the motion vector resolution of the current block based on the information on the motion vector resolution of the current block; and a video decoder configured to predict the current block using a motion vector of the current block determined according to the motion vector resolution of the current block and decode the current block.
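The scheme above can be sketched in code. The following is a hypothetical illustration only, not the syntax defined in this disclosure: it assumes resolutions are expressed in quarter-pel units (1 = quarter-pel, 2 = half-pel, 4 = integer-pel), and all function names are invented for the example.

```python
# Illustrative sketch of adaptive motion-vector-resolution signaling.
# Assumption: motion vectors are stored in quarter-pel units, and a coarser
# resolution lets the encoder transmit smaller-magnitude values.

def quantize_mv(mv, resolution):
    """Round a quarter-pel motion vector (x, y) down to the given resolution."""
    return tuple((c // resolution) * resolution for c in mv)

def encode_mv(mv, resolution):
    """Encoder side: emit the resolution plus the reduced-magnitude MV."""
    qmv = quantize_mv(mv, resolution)
    # Dividing by the resolution shrinks the values to be entropy-coded.
    return resolution, (qmv[0] // resolution, qmv[1] // resolution)

def decode_mv(resolution, coded_mv):
    """Decoder side: scale the coded MV back to quarter-pel units."""
    return (coded_mv[0] * resolution, coded_mv[1] * resolution)

resolution, coded = encode_mv((14, -7), 2)   # half-pel resolution
print(decode_mv(resolution, coded))          # -> (14, -8)
```

The rounding loss at coarse resolutions is the trade-off the encoder weighs against the bit savings when it selects the resolution per block.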
Drawings
Fig. 1 is a schematic block diagram illustrating a conventional video encoding apparatus.
Fig. 2 is a diagram illustrating an example of block separation using a quadtree plus binary tree (QTBT) structure.
Fig. 3 is a diagram showing an example of neighboring blocks of the current block.
Fig. 4 is a diagram illustrating a video encoding apparatus according to an embodiment of the present invention.
Fig. 5 is a diagram illustrating the interpolation and motion estimation processes performed by the inter predictor 124 and the resolution of the reference picture.
Fig. 6 is a graph comparing the degree of motion of two frames.
Fig. 7 is a diagram illustrating an example of a resolution determiner 410 according to an embodiment.
Fig. 8 is a diagram showing an example of the resolution encoder 430 when the motion vector resolution information of the current CU is encoded as a resolution difference value.
Fig. 9 is a diagram illustrating an example of a resolution encoder 430 that represents the motion vector resolution of the current CU as a resolution scale factor instead of a resolution difference value.
Fig. 10 is a schematic diagram of a conventional video decoding apparatus.
Fig. 11 is a diagram illustrating a video decoding apparatus 1100 according to an embodiment of the present invention.
Fig. 12 is a flowchart illustrating a method of decoding video at the video decoding apparatus 1100 according to a first embodiment of the present invention.
Fig. 13 is a diagram showing an example of adaptive determination of resolution.
Fig. 14 is a flowchart showing some operations added to the flowchart of Fig. 12.
Fig. 15 is a flowchart illustrating a method of decoding video at the video decoding apparatus 1100 according to a second embodiment of the present invention.
Fig. 16 is a diagram illustrating another example of adaptive determination of resolution.
Fig. 17 is a flowchart showing some operations added to the flowchart of Fig. 15.
Fig. 18 is a flowchart illustrating a method of decoding video at the video decoding apparatus 1100 according to a third embodiment of the present invention.
Fig. 19 is a flowchart showing some operations added to the flowchart of Fig. 18.
Detailed Description
Hereinafter, some embodiments of the present invention will be described in detail with reference to the accompanying drawings. With respect to the reference numerals for the elements in the drawings, although the elements are shown in different drawings, the same reference numerals designate the same elements if possible. In addition, in the following description of the present invention, a detailed description of known functions and configurations incorporated herein will be omitted for clarity and conciseness.
Fig. 1 is a schematic block diagram illustrating a general video encoding apparatus.
The video encoding apparatus 100 may include a block separator 110, a predictor 120, a subtractor 130, a transformer 140, a quantizer 145, an encoder 150, an inverse quantizer 160, an inverse transformer 165, an adder 170, a filter unit 180, and a memory 190. Each component of the video encoding apparatus 100 may be implemented in the form of a hardware chip or may be implemented in the form of software such that one or more microprocessors perform the functions of the software corresponding to each component.
The block separator 110 may separate each picture constituting a video into a plurality of Coding Tree Units (CTUs), and then may recursively separate the CTUs using a tree structure. The leaf nodes in the tree structure are Coding Units (CUs), which are the basic units of coding. The tree structure may be a Quadtree (QT) structure that separates a node into four lower nodes, or a quadtree plus binary tree (QTBT) structure that combines the QT structure with a Binary Tree (BT) structure that separates a node into two lower nodes.
In a quadtree plus binary tree (QTBT) structure, first, CTUs may be separated by QT structure. Then, leaf nodes of QT can be further separated in terms of BT structure. The separation information generated by the block separator 110 separating CTUs in the QTBT structure may be encoded by the encoder 150 and may be transmitted to a decoding apparatus.
In QT, a first flag (qt_split_flag) indicating whether to separate the blocks of the corresponding node is encoded. When the first flag is 1, the blocks of the corresponding node are separated into four blocks having the same size, and when the first flag is 0, the corresponding node is not further separated by QT.
In BT, a second flag (bt_split_flag) indicating whether to separate the blocks of the corresponding node is encoded. There may be a plurality of separation types in BT. For example, there may be two types, a type of horizontally separating a block of a corresponding node and a type of vertically separating a block into two blocks of the same size. Alternatively, there may be another type of separating the blocks of the corresponding node into two blocks having asymmetric shapes. The asymmetric shape may be formed by separating the blocks of the corresponding nodes into two rectangular blocks having a size ratio of 1:3 or by separating the blocks of the corresponding nodes in the diagonal direction. In the case where BT has a plurality of separation types, if the second flag indicating that the block is separated is encoded, separation type information indicating the separation type of the corresponding block may be additionally encoded.
Fig. 2 is a diagram showing an example of block separation using a QTBT structure. Fig. 2(a) shows an example of blocks separated by the QTBT structure, and Fig. 2(b) shows the corresponding tree structure. In Fig. 2, solid lines represent separation by the QT structure, and broken lines represent separation by the BT structure. In Fig. 2(b), layer labels without parentheses denote QT layers, and labels in parentheses denote BT layers. The numbers on the broken lines of the BT structure indicate the separation type information.
In Fig. 2, the CTU, which is the uppermost layer (root node) of the QT, is separated into four nodes of layer 1. Accordingly, the block separator 110 generates qt_split_flag=1 indicating that the CTU is separated. The block corresponding to the first node of layer 1 is not further separated by QT, so the block separator 110 generates qt_split_flag=0.
Then, the block corresponding to the first node of layer 1 of the QT may be further split by BT. In this embodiment, BT is assumed to have two separation types: one that horizontally splits the block of a node into two blocks of the same size, and one that vertically splits it into two blocks of the same size. The first node of layer 1 of the QT becomes the root node (layer 0) of the BT. The block corresponding to the root node of the BT is further split into blocks of layer 1, so the block separator 110 generates bt_split_flag=1 indicating splitting by BT. The block separator 110 then generates separation type information indicating whether the block is split horizontally or vertically. In Fig. 2, the block corresponding to the root node of the BT is split vertically, so 1, indicating vertical separation, is generated as the separation type information. The first of the two layer-1 blocks split from the root node is further split according to the vertical separation type, so bt_split_flag=1 and separation type information 1 are generated. The second layer-1 block split from the root node of the BT is not further split, so bt_split_flag=0 is generated.
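The flag-emission order described above can be sketched as a pair of recursive routines: QT decisions first, then BT decisions at each QT leaf. This is an illustrative sketch, not the codec's actual implementation; the decision callbacks stand in for the encoder's rate-distortion search, and all names are invented.

```python
# Sketch of QTBT split-flag emission: QT flags first, then BT flags on
# QT leaf nodes. `want_qt_split`/`want_bt_split` are hypothetical stand-ins
# for the encoder's RD decisions.

def encode_qt(w, h, flags, want_qt_split, want_bt_split):
    if want_qt_split(w, h):
        flags.append(("qt_split_flag", 1))
        for _ in range(4):                       # four equal sub-blocks
            encode_qt(w // 2, h // 2, flags, want_qt_split, want_bt_split)
    else:
        flags.append(("qt_split_flag", 0))
        encode_bt(w, h, flags, want_bt_split)    # a QT leaf may continue as BT

def encode_bt(w, h, flags, want_bt_split):
    split = want_bt_split(w, h)                  # None, "hor", or "ver"
    if split is None:
        flags.append(("bt_split_flag", 0))
    else:
        flags.append(("bt_split_flag", 1))
        # Separation type: 1 = vertical, 0 = horizontal (as in Fig. 2).
        flags.append(("bt_split_type", 1 if split == "ver" else 0))
        cw, ch = (w // 2, h) if split == "ver" else (w, h // 2)
        for _ in range(2):
            encode_bt(cw, ch, flags, want_bt_split)

flags = []
encode_qt(128, 128, flags,
          want_qt_split=lambda w, h: w > 64,
          want_bt_split=lambda w, h: "ver" if w == 64 else None)
print(len(flags))   # -> 21 flag/type decisions for this toy split pattern
```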
In order to signal information about blocks separated in QTBT structure to a decoding apparatus efficiently, the following information may be additionally encoded. Such information may be encoded as header information of the video, for example, as a Sequence Parameter Set (SPS) or a Picture Parameter Set (PPS).
CTU size: the block size of the uppermost layer (i.e., root node) of QTBT;
-MinQTSize: minimum block size of leaf nodes allowed in QT;
MaxBTSize: maximum block size of root node allowed in BT;
MaxBTDepth: maximum depth allowed in BT;
-MinBTSize: minimum block size of leaf nodes allowed in BT.
A QT block having the same size as MinQTSize is not further separated, so the QT separation information (first flag) for that block is not encoded. In addition, a block larger than MaxBTSize in the QT is not split by BT, so the BT separation information (second flag, separation type information) for that block is not encoded. When the depth of a BT node reaches MaxBTDepth, the block of that node is not further separated, and its BT separation information (second flag, separation type information) is not encoded. Likewise, a BT block having the same size as MinBTSize is not further separated, and its BT separation information (second flag, separation type information) is not encoded. By defining the maximum and minimum block sizes of the root and leaf nodes of the QT and BT at a high level, such as the Sequence Parameter Set (SPS) or Picture Parameter Set (PPS), the amount of coded information indicating whether CTUs are separated or indicating the separation type can be reduced.
The same QTBT structure may be used to separate the luminance and chrominance components of the CTU. However, the present invention is not limited thereto, and separate QTBT structures may be used to separate the luminance component and the chrominance component, respectively. For example, in the case of intra (I) slices, different QTBT structures may be used to separate the luminance and chrominance components.
Hereinafter, a block corresponding to a CU to be encoded or decoded is referred to as a "current block".
The predictor 120 predicts the current block to generate a predicted block. Predictor 120 may include an intra predictor 122 and an inter predictor 124.
The intra predictor 122 predicts pixels in the current block using pixels (reference pixels) located around the current block in the current picture including the current block. There are a plurality of intra prediction modes according to a prediction direction, and neighboring pixels and calculation formulas to be used are defined differently according to each prediction mode.
The inter predictor 124 searches for a block most similar to the current block within a reference picture encoded and decoded earlier than the current picture, and generates a predicted block of the current block using the searched block. In addition, the inter predictor 124 generates a motion vector corresponding to a displacement between a current block in the current picture and a predicted block in the reference picture. Motion information including information on a reference picture for predicting a current block and information on a motion vector is encoded by the encoder 150 and transmitted to a video decoding apparatus.
Various methods may be used to minimize the number of bits required to encode motion information.
In one example, when the reference picture and the motion vector of the current block are identical to those of the neighboring block, the motion information about the current block may be transmitted to the decoding apparatus by encoding information that can be used to identify the neighboring block. This approach is referred to as "merge mode".
In the merge mode, the inter predictor 124 may select a predetermined number of merge candidate blocks (hereinafter referred to as "merge candidates") from neighboring blocks of the current block.
As shown in fig. 3, some or all of a left block L, an upper block a, an upper right block AR, a lower left block BL, and an upper left block AL adjacent to the current block in the current picture may be used as neighboring blocks for deriving the merge candidates. In addition, a block located in a reference picture (which may be the same as or different from a reference picture used to predict the current block) other than the current picture in which the current block is located may be used as a merge candidate. For example, a co-located block of the current block in the reference picture or a block adjacent to the co-located block may be further used as a merging candidate.
The inter predictor 124 constructs a merge list including a predetermined number of merge candidates using the neighboring blocks. From the merge candidates included in the merge list, the inter predictor 124 selects the merge candidate to be used as the motion information of the current block and generates merge index information for identifying the selected candidate. The generated merge index information is encoded by the encoder 150 and transmitted to the video decoding apparatus.
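The merge-list construction just described can be sketched as follows. This is a simplified illustration, not the normative derivation: the candidate order, list size, and motion data are hypothetical, and duplicate candidates are pruned as is common in merge-list designs.

```python
# Sketch of building a merge candidate list from the spatial neighbour
# positions of Fig. 3 (L, A, AR, BL, AL) plus one temporal candidate.
# List size and pruning behaviour are illustrative assumptions.

MAX_MERGE_CANDIDATES = 5

def build_merge_list(neighbours, temporal=None):
    """neighbours: dict mapping position name -> motion info (None if unavailable)."""
    merge_list = []
    for pos in ("L", "A", "AR", "BL", "AL"):
        mi = neighbours.get(pos)
        # Skip unavailable neighbours and duplicates of earlier candidates.
        if mi is not None and mi not in merge_list:
            merge_list.append(mi)
        if len(merge_list) == MAX_MERGE_CANDIDATES:
            return merge_list
    if temporal is not None and temporal not in merge_list:
        merge_list.append(temporal)
    return merge_list[:MAX_MERGE_CANDIDATES]

neighbours = {"L": (4, 0), "A": (4, 0), "AR": (0, -4), "BL": None, "AL": (2, 2)}
print(build_merge_list(neighbours, temporal=(1, 1)))
# -> [(4, 0), (0, -4), (2, 2), (1, 1)]
```

The encoder then signals only the index of the chosen entry, which is what makes merge mode cheap when a neighbour's motion already fits the current block.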
As another method of encoding motion information, a motion vector difference value (MVD) may be encoded.
In this method, the inter predictor 124 derives motion vector predictor (MVP) candidates for the motion vector of the current block using neighboring blocks of the current block. Some or all of the left block L, upper block A, upper-right block AR, lower-left block BL, and upper-left block AL in the current picture illustrated in Fig. 3 may be used as neighboring blocks for deriving the MVP candidates. In addition, a block located in a reference picture (which may be the same as or different from the reference picture used to predict the current block) other than the current picture may be used as a neighboring block for deriving the MVP candidates. For example, the co-located block of the current block in the reference picture, or a block adjacent to the co-located block, may be used.
The inter predictor 124 derives the MVP candidates using the motion vectors of the neighboring blocks and determines the MVP for the motion vector of the current block using the MVP candidates. The inter predictor then calculates a motion vector difference value (MVD) by subtracting the MVP from the motion vector of the current block.
The MVP may be obtained by applying a predefined function (e.g., the median or mean) to the MVP candidates, in which case the video decoding apparatus also knows the predefined function. Since the neighboring blocks used to derive the MVP candidates have already been encoded and decoded, the video decoding apparatus already knows their motion vectors. The video encoding apparatus 100 therefore does not need to encode information for identifying the MVP candidates; in this case, only the information on the MVD and the information on the reference picture used to predict the current block are encoded.
In another embodiment, the MVP may be determined by selecting any of the MVP candidates. In this case, information for identifying the selected MVP candidate is additionally encoded together with information on MVDs and information on reference pictures for predicting the current block.
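The MVD path can be sketched as follows, using the component-wise median of the MVP candidates as the predefined function mentioned above. The candidate values are hypothetical, and the median is taken over an odd number of candidates for simplicity.

```python
# Sketch of MVP/MVD derivation: MVP = component-wise median of the
# candidates, MVD = actual MV minus MVP. Vectors are in quarter-pel units;
# the candidate values below are made up for the example.

def median_mvp(candidates):
    """Component-wise median of an odd number of candidate motion vectors."""
    xs = sorted(c[0] for c in candidates)
    ys = sorted(c[1] for c in candidates)
    mid = len(candidates) // 2
    return (xs[mid], ys[mid])

def compute_mvd(mv, mvp):
    """The encoder transmits this difference instead of the full vector."""
    return (mv[0] - mvp[0], mv[1] - mvp[1])

candidates = [(8, 4), (6, 4), (10, 2)]
mvp = median_mvp(candidates)        # (8, 4)
mvd = compute_mvd((9, 3), mvp)      # (1, -1)
print(mvp, mvd)
```

The decoder reverses the last step, adding the received MVD to the MVP it derives from the same (already decoded) neighbours.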
The subtractor 130 subtracts the prediction block generated by the intra predictor 122 or the inter predictor 124 from the current block to generate a residual block.
The transformer 140 transforms a residual signal, which is a value in the spatial domain, in the residual block into transform coefficients in the frequency domain. The transformer 140 may transform the residual signal in the residual block using the size of the current block as a transform unit, or may separate the residual block into a plurality of smaller sub-blocks and transform the residual signal in a transform unit corresponding to the size of the sub-blocks. There may be various methods of separating the residual block into smaller sub-blocks. For example, the residual block may be separated into sub-blocks of the same predefined size, or the residual block may be separated in a Quadtree (QT) having the residual block as a root node.
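The sub-block transform idea above can be sketched with a separable 2-D DCT-II standing in for the codec's integer transform. This is purely illustrative: the real transform, sub-block sizes, and separation rules are defined by the codec, not by this sketch.

```python
# Illustrative sketch: split a square residual block into fixed-size
# transform units and apply a separable 2-D DCT-II to each one. A floating-
# point DCT is used here as a stand-in for the codec's integer transform.
import math

def dct_1d(v):
    n = len(v)
    return [sum(v[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
            for k in range(n)]

def dct_2d(block):
    rows = [dct_1d(r) for r in block]            # transform each row
    cols = [dct_1d(c) for c in zip(*rows)]       # then each column
    return [list(r) for r in zip(*cols)]

def transform_residual(residual, tu_size):
    """Split `residual` (square list-of-lists) into tu_size x tu_size
    sub-blocks and transform each independently."""
    n = len(residual)
    out = {}
    for y in range(0, n, tu_size):
        for x in range(0, n, tu_size):
            sub = [row[x:x + tu_size] for row in residual[y:y + tu_size]]
            out[(y, x)] = dct_2d(sub)
    return out

residual = [[1, 1, 2, 2], [1, 1, 2, 2], [3, 3, 4, 4], [3, 3, 4, 4]]
coeffs = transform_residual(residual, 2)   # four 2x2 transform units
```

For each sub-block, the DC coefficient (top-left) is the sum of its residual samples, which is why flat residual regions compact their energy into a single coefficient.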
The quantizer 145 quantizes the transform coefficient output from the transformer 140 and outputs the quantized transform coefficient to the encoder 150.
The encoder 150 encodes the quantized transform coefficients using a coding scheme such as CABAC to generate a bitstream. The encoder 150 may encode information related to block separation such as CTU size, minQTSize, maxBTSize, maxBTDepth, minBTSize, QT _split_flag, bt_split_flag, and separation type, and the decoding apparatus may separate blocks in the same manner as the encoding apparatus.
The encoder 150 encodes information on a prediction type indicating whether the current block is encoded via intra prediction or inter prediction, and encodes intra prediction information or inter prediction information according to the prediction type.
When inter-predicting the current block, the encoder 150 encodes syntax elements of inter prediction information. The syntax element of the inter prediction mode may include the following information.
(1) Mode information indicating whether motion information of a current block is encoded in a merge mode or an MVD encoded mode
(2) Syntax element of motion information
When encoding motion information in the merge mode, the encoder 150 encodes merge index information indicating which one of the merge candidates is selected as a candidate for extracting motion information of the current block as a syntax element of the motion information.
On the other hand, when motion information is encoded in a mode for encoding MVDs, information on MVDs and information on reference pictures are encoded as syntax elements of the motion information. When the MVP is determined using a method of selecting any one of a plurality of MVP candidates, the syntax element of the motion information may further include MVP identification information for identifying the selected candidate.
The inverse quantizer 160 inversely quantizes the quantized transform coefficient output from the quantizer 145 to generate a transform coefficient. The inverse transformer 165 transforms the transform coefficients output from the inverse quantizer 160 from the frequency domain to the spatial domain, thus reconstructing the residual block.
The adder 170 adds the reconstructed residual block and the prediction block generated by the predictor 120 to reconstruct the current block. When a subsequent block is intra-predicted, pixels in the reconstructed current block may be used as reference pixels.
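The adder's operation is element-wise addition of the residual and prediction blocks. A minimal sketch (the sample values are hypothetical, chosen only for illustration):

```python
def reconstruct(residual, prediction):
    """Adder 170: reconstructed block = inverse-transformed residual + prediction block."""
    return [[r + p for r, p in zip(r_row, p_row)]
            for r_row, p_row in zip(residual, prediction)]

# Hypothetical 2x2 residual and prediction samples:
print(reconstruct([[1, -2], [0, 3]], [[100, 102], [98, 97]]))
# [[101, 100], [98, 100]]
```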
The filter unit 180 applies deblocking filtering to boundaries between reconstructed blocks to remove blocking artifacts caused by block-by-block encoding/decoding, and stores the deblocking-filtered blocks in the memory 190. When all blocks in one picture have been reconstructed, the reconstructed picture may be used as a reference picture for inter-predicting blocks in subsequently encoded pictures.
For reference, the video encoding apparatus may encode the current block using a skip mode. In the skip mode, only motion information of the current block is encoded, and any other information about the current block, such as information about a residual block, is not encoded. The above-mentioned merge index information may be used as motion information of the current block. When the current block has been encoded in the skip mode, the video decoding apparatus sets motion information of a merge candidate indicated by merge index information decoded from the bitstream as motion information of the current block. In the skip mode, a predicted block predicted based on motion information of a current block is reconstructed as the current block.
The skip mode may be different from a merge mode for encoding information about a residual block and motion information of a current block in that no other information than the motion information of the current block is encoded in the skip mode. However, the method for encoding motion information of the current block in the skip mode and the merge mode is the same, and thus, all the following descriptions of the merge mode may be applied to the skip mode in the same manner.
Fig. 4 is a diagram illustrating a video encoding apparatus 400 according to an embodiment of the present invention.
The video encoding apparatus 400 according to an embodiment of the present invention may include a resolution determiner 410, a video encoder 420, and a resolution encoder 430.
The resolution determiner 410 determines a motion vector resolution for motion estimation of the current CU. The motion vector resolution is the minimum unit used to represent a motion vector. It may also indicate the precision in the reference picture used for compensating for the motion of the current CU, that is, the precision to which the reference picture is interpolated. For example, when the motion vector resolution is 1/4 pixel, the reference picture may be interpolated to 1/4-pixel positions, and the motion vector may be expressed in units of 1/4 pixel. Here, the minimum unit for determining the motion vector may be a fractional pixel unit such as 1/4 pixel or 1/2 pixel, or an integer pixel unit such as 1 pixel, 2 pixels, 3 pixels, or 4 pixels.
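The effect of the motion vector resolution as a minimum unit can be sketched by rounding a motion vector to multiples of the chosen resolution. This is an illustration only (the rounding rule and values below are assumptions, not the patent's implementation):

```python
def round_to_resolution(mv, resolution):
    """Round each MV component to the nearest multiple of the MV resolution.

    `resolution` is the minimum unit in pixels, e.g. 0.25 for 1/4 pixel,
    1 for integer pixel, 4 for 4-pixel resolution.
    """
    return tuple(round(c / resolution) * resolution for c in mv)

print(round_to_resolution((3.3, -1.7), 0.25))  # (3.25, -1.75): 1/4-pixel units
print(round_to_resolution((3.3, -1.7), 1))     # (3, -2): integer-pixel units
```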
The video encoder 420 estimates a motion in units of blocks (i.e., in units of CUs) according to the determined motion vector resolution to determine a motion vector of the CU, and predicts and encodes the CU using the determined motion vector.
The resolution encoder 430 encodes information on a motion vector resolution for predicting a motion vector of a CU.
Here, the video encoder 420 may be implemented as the video encoding apparatus 100 described above with reference to fig. 1.
The function of the resolution determiner 410 may be included in the above-mentioned function of the predictor 120 in the video encoding device 100, and may be integrated in the predictor 120.
The functions of the resolution encoder 430 may be included in the above-mentioned functions of the encoder 150 of the video encoding apparatus 100, and may be integrated in the encoder 150.
Fig. 5 is a diagram illustrating interpolation and motion compensation processes performed by the inter predictor 124 and resolution of motion vectors.
Fig. 5 illustrates pixels of a reference picture stored in the memory 190 and sub-pixels formed by interpolating the integer pixels of the reference picture. As shown in fig. 5, when the integer pixels A1 to F6 of the reference picture are filtered by the interpolation filter, sub-pixels "a" to "s" may be generated, for example; with such interpolation, the resolution of motion estimation and motion compensation may be increased by 2 times, 4 times, or more compared to the integer-pixel resolution.
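The way interpolation doubles the sample grid can be sketched with a one-dimensional half-pel example. Note this is a deliberately simplified illustration: real codecs use longer FIR filters (e.g., 8-tap luma filters in HEVC), whereas the 2-tap average below only demonstrates the principle.

```python
def half_pel_interpolate(row):
    """Insert a half-pel sample between each pair of integer pixels.

    Uses a 2-tap average purely for illustration; the output grid has
    twice the resolution of the input.
    """
    out = []
    for a, b in zip(row, row[1:]):
        out += [a, (a + b) // 2]   # integer pixel, then interpolated sub-pixel
    out.append(row[-1])
    return out

print(half_pel_interpolate([10, 20, 40]))  # [10, 15, 20, 30, 40]
```

Applying the same idea again on the half-pel grid would yield quarter-pel positions, increasing the motion-compensation resolution by 4 times.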
Motion estimation is a process of searching the interpolated reference picture for the portion most similar to the current CU, and outputting the block of the corresponding portion and a motion vector indicating that block. The motion vector generated in this process is encoded by the encoder 150.
During motion estimation and motion compensation, a motion vector may be expressed in units of fractional pixels (1/2 pixel, 1/4 pixel, 1/8 pixel, or 1/16 pixel) when predicting a slowly moving image region, and in units of one or more integer pixels (1 pixel, 2 pixels, 3 pixels, or 4 pixels) when predicting an image region with large motion.
Fig. 6 is a diagram for comparing the degrees of motion of two frames.
In fig. 6, comparing the reference frame and the current frame with respect to the motion of the objects included therein: between the two frames, the object corresponding to the circle moves only slightly, and thus its motion can be estimated in units of fractional pixels, whereas the object corresponding to the triangle has a relatively large motion, and thus its motion can be estimated in units of integer pixels.
When the inter prediction mode of the CU as the encoding target is the merge mode, the motion information is not directly signaled, but an index value corresponding to a motion information candidate selected from a plurality of motion information candidates is signaled. Therefore, no information on the motion vector resolution of the selected motion information candidate is transmitted.
On the other hand, when the inter prediction mode of the CU is a mode of encoding the MVD, the MVD information is signaled, and thus, the MVD information can be more effectively expressed in units of fractional pixels or in units of integer pixels according to the resolution of the MVD.
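The efficiency argument above can be made concrete by expressing an MVD as an integer count of resolution units. This sketch is illustrative only (the MVD values are hypothetical): a coarse resolution yields small magnitudes to entropy-code for large motion, while a fine resolution represents small motion exactly.

```python
def mvd_in_resolution_units(mvd_pixels, resolution):
    """Express an MVD (given in pixels) as integer counts of the resolution unit."""
    return tuple(int(round(c / resolution)) for c in mvd_pixels)

# The same 8-pixel MVD costs far fewer bits at a coarse resolution:
print(mvd_in_resolution_units((8.0, -4.0), 0.25))  # (32, -16): 1/4-pixel units
print(mvd_in_resolution_units((8.0, -4.0), 4))     # (2, -1): 4-pixel units
```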
Fig. 7 is a diagram showing an example of the resolution determiner 410 according to an embodiment.
The resolution determiner 410 may include a resolution mode determiner 710, a substitute resolution determiner 720, and an adaptive resolution determiner 730. In some embodiments, the resolution determiner 410 may be implemented in a form omitting at least one of the components of the resolution mode determiner 710, the alternative resolution determiner 720, and the adaptive resolution determiner 730.
The resolution mode determiner 710 determines whether the adaptive motion vector resolution mode is activated. For example, whether to adaptively enable the motion vector resolution may be selected for an upper layer image unit including a plurality of CUs. Here, the upper layer image unit may be an image sequence, a picture, a slice, a CTU, or the like. When the motion vector resolution is not adaptively enabled for the upper layer image unit (i.e., when the corresponding mode is not the adaptive motion vector resolution mode), the default motion vector resolution is applied to all CUs in the upper layer image unit. That is, a fixed default motion vector resolution, such as 1/4 pixel, may be applied to all CUs in the upper layer image unit. Here, the default motion vector resolution may be a predetermined motion vector resolution shared by the video encoding apparatus and the video decoding apparatus, or may be a value determined per upper layer image area by the video encoding apparatus and signaled to the video decoding apparatus. On the other hand, when the adaptive motion vector resolution mode is enabled, the alternative resolution determiner 720 and the adaptive resolution determiner 730, which will be described below, adaptively determine the motion vector resolution of each CU to be inter-predicted.
When the adaptive motion vector resolution mode is enabled, the alternative resolution determiner 720 determines an alternative resolution in addition to the default motion vector resolution. The alternative resolution may be determined for each unit of any one of a sequence, picture, slice, CTU, and CU. The alternative resolution may be determined for the same image units as the image unit to which the adaptive motion vector resolution mode is applied. For example, when the adaptive motion vector resolution mode is applied in units of the SPS, the alternative resolution may also be determined in units of the SPS, and when the adaptive motion vector resolution mode is applied in units of the PPS or slice, the alternative resolution may also be determined in units of the PPS or slice. Alternatively, the alternative resolution may be determined for each image unit (e.g., slice, CTU, or CU) smaller than the image unit to which the adaptive motion vector resolution mode is applied. For example, when the adaptive motion vector resolution mode is applied in units of the SPS, the alternative resolution may be determined in units of any one of the PPS, slice, and CTU, which are image units of a layer lower than the SPS, and when the adaptive motion vector resolution mode is applied in units of the PPS, the alternative resolution may be determined in units of either the slice or the CTU, which are image units of a layer lower than the PPS. Alternatively, the alternative resolution may be determined in units of the CUs to be encoded.
The alternative resolution determiner 720 may select one of the plurality of motion vector resolution candidates as the alternative resolution.
The adaptive resolution determiner 730 may determine a motion vector resolution of the current CU. For example, the adaptive resolution determiner 730 may determine any one of the default motion vector resolution and the alternative resolution as the motion vector resolution of the current CU.
The resolution encoder 430 may generate and encode information about the resolution of the motion vector based on the information determined by the resolution determiner 410. Hereinafter, in the case where the motion vector resolution of the current CU is determined from among the default motion vector resolution and the alternative resolution, a method in which the resolution encoder 430 encodes motion vector resolution information is exemplified for the first and second embodiments.
First embodiment
In the first embodiment, an image unit for selecting whether to adaptively determine a motion vector resolution between a default motion vector resolution and an alternative resolution is the same as an image unit for determining an alternative resolution, and both the image units are larger than the CU.
When whether to adaptively enable the motion vector resolution is selected for each image sequence unit by the resolution mode determiner 710, the resolution encoder 430 may insert an adaptive_mv_resolution_enabled_flag (i.e., first identification information) as a flag indicating whether to adaptively enable the motion vector resolution into a Sequence Parameter Set (SPS).
When whether to adaptively enable the motion vector resolution is selected for each picture unit by the resolution mode determiner 710, the resolution encoder 430 may insert an adaptive_mv_resolution_enabled_flag, which is a flag indicating whether to adaptively enable the motion vector resolution, into a Picture Parameter Set (PPS).
When whether to adaptively enable motion vector resolution is selected for each slice (or CTU) unit by the resolution mode determiner 710, the resolution encoder 430 may insert an adaptive_mv_resolution_enabled_flag, which is a flag indicating whether to adaptively enable motion vector resolution, into a slice (or CTU) header.
When the resolution mode determiner 710 selects the adaptive motion vector resolution mode, which adaptively enables the motion vector resolution, adaptive_mv_resolution_enabled_flag=ON (e.g., ON=1) may be set. Otherwise, adaptive_mv_resolution_enabled_flag=OFF (e.g., OFF=0 when ON=1) may be set.
According to the first embodiment, the unit for which the alternative resolution determiner 720 determines the alternative resolution is the same as the unit for which the adaptive_mv_resolution_enabled_flag is set. In the case of adaptive_mv_resolution_enabled_flag=ON, the alternative resolution determiner 720 determines an alternative resolution, and the resolution encoder 430 generates alternative_mv_resolution information indicating the determined alternative resolution. The resolution encoder 430 encodes the alternative_mv_resolution information for the same respective units as those for which the adaptive_mv_resolution_enabled_flag is set.
When one of a plurality of predefined motion vector resolution candidates is selected as the alternative resolution, the resolution encoder 430 may generate and encode information for identifying the alternative resolution selected from the plurality of predefined motion vector resolution candidates as the alternative_mv_resolution information.
When the adaptive resolution determiner 730 determines that the default motion vector resolution is to be used as the motion vector resolution of the current CU, the resolution encoder 430 may generate and encode an mv_resolution_flag (i.e., second identification information) indicating that the default motion vector resolution is used as the motion vector resolution of the current CU.
When the adaptive resolution determiner 730 determines that the alternative resolution, rather than the default motion vector resolution, is to be used as the motion vector resolution of the current CU, the resolution encoder 430 may generate and encode an mv_resolution_flag indicating that the alternative resolution is used as the motion vector resolution of the current CU.
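The first embodiment's signaling decisions can be sketched as follows. The flag names follow the text, but the bitstream-writer callback, its signature, and the sample values are assumptions made purely for illustration.

```python
def encode_resolution_info_first_embodiment(write, adaptive_enabled,
                                            alt_resolution, cu_uses_default):
    """Emit the first embodiment's resolution syntax elements in order.

    `write(name, value)` stands in for a real bitstream writer.
    """
    write("adaptive_mv_resolution_enabled_flag", 1 if adaptive_enabled else 0)
    if adaptive_enabled:
        # The alternative resolution is signaled in the same image unit
        # (SPS/PPS/slice/CTU) as the enable flag.
        write("alternative_mv_resolution", alt_resolution)
        # Per CU: 0 = use the default resolution, 1 = use the alternative.
        write("mv_resolution_flag", 0 if cu_uses_default else 1)

log = []
encode_resolution_info_first_embodiment(
    lambda name, v: log.append((name, v)),
    adaptive_enabled=True, alt_resolution="1-pel", cu_uses_default=False)
print(log)
# [('adaptive_mv_resolution_enabled_flag', 1),
#  ('alternative_mv_resolution', '1-pel'),
#  ('mv_resolution_flag', 1)]
```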
Second embodiment
In the second embodiment, the image unit for selecting whether to adaptively determine the motion vector resolution between the default motion vector resolution and the alternative resolution is an image unit of a higher layer than the image unit for determining the alternative resolution, and the image unit for determining the alternative resolution is an image unit of a higher layer than the CU.
For comparison, in the first embodiment, the image unit for selecting whether to adaptively determine the motion vector resolution between the default motion vector resolution and the alternative resolution is the same as the image unit for determining the alternative resolution, and both image units are larger than the CU.
When whether to adaptively enable the motion vector resolution is selected for each image sequence unit by the resolution mode determiner 710, the resolution encoder 430 may insert an adaptive_mv_resolution_enabled_flag, which is a flag indicating whether to adaptively enable the motion vector resolution, into the SPS.
When whether to adaptively enable the motion vector resolution is selected for each picture unit by the resolution mode determiner 710, the resolution encoder 430 may insert an adaptive_mv_resolution_enabled_flag, which is a flag indicating whether to adaptively enable the motion vector resolution, into the PPS.
Here, when the resolution mode determiner 710 selects the adaptive motion vector resolution mode, which adaptively enables the motion vector resolution, adaptive_mv_resolution_enabled_flag=ON may be set. Otherwise, adaptive_mv_resolution_enabled_flag=OFF may be set.
The unit that determines the alternative resolution at the alternative resolution determiner 720 may be determined as an image unit (e.g., a slice or CTU) that is smaller than an image unit for which the adaptive_mv_resolution_enabled_flag is set and larger than the CU. In this case, the resolution encoder 430 may generate and encode the alternative_mv_resolution information indicating the alternative resolution of each slice or CTU.
When one of a plurality of predefined motion vector resolution candidates is selected as the alternative resolution, the resolution encoder 430 may generate and encode information for identifying the alternative resolution selected from the plurality of predefined motion vector resolution candidates as the alternative_mv_resolution information.
When the value of the alternative_mv_resolution information indicates 0, the motion vector resolution of all CUs in the corresponding slice or CTU is not adaptively enabled.
When the motion vector resolution is adaptively enabled, the resolution encoder 430 may generate, for each slice or CTU, alternate_enabled_flag information as a flag indicating whether to use alternate resolution, and may encode the alternate_enabled_flag information.
Here, when the alternative resolution is used, the alternative_mv_resolution information indicating the alternative resolution may be generated and encoded for each slice or CTU.
When the adaptive resolution determiner 730 determines to use the default motion vector resolution as the motion vector resolution of the current CU, the resolution encoder 430 may generate and encode mv_resolution_flag indicating that the default motion vector resolution is used as the motion vector resolution of the current CU.
When the adaptive resolution determiner 730 determines to use the alternative resolution, rather than the default motion vector resolution, as the motion vector resolution of the current CU, the resolution encoder 430 may generate an mv_resolution_flag indicating that the alternative resolution is used as the motion vector resolution of the current CU and may encode the mv_resolution_flag.
Third embodiment
In the third embodiment, the image unit for selecting whether to adaptively determine the motion vector resolution between the default motion vector resolution and the alternative resolution is an image unit of a higher layer than the image unit for determining the alternative resolution, and the image unit for determining the alternative resolution is the CU unit.
When whether to adaptively enable the motion vector resolution is selected for each picture sequence unit, each picture unit, each slice unit, or each CTU unit by the resolution mode determiner 710, the resolution encoder 430 may insert an adaptive_mv_resolution_enabled_flag, which is a flag indicating whether to adaptively enable the motion vector resolution, into the SPS, PPS, slice header, or CTU header.
As in the first and second embodiments, the adaptive_mv_resolution_enabled_flag may be set to ON or OFF according to a result of selection of whether to adaptively enable motion vector resolution.
When the resolution mode determiner 710 selects the adaptive motion vector resolution mode adaptively enabling the motion vector resolution, if a default motion vector resolution is selected as the motion vector resolution of the current CU, the resolution encoder 430 may generate and encode mv_resolution_flag indicating that the default motion vector resolution is used as the motion vector resolution of the current CU.
According to the third embodiment, the unit for determining the alternative resolution by the alternative resolution determiner 720 may be the same CU unit as the unit for determining the motion vector resolution.
When the alternative resolution, rather than the default motion vector resolution, is used as the motion vector resolution of the current CU, the resolution encoder 430 may generate an mv_resolution_flag indicating that the alternative resolution is used as the motion vector resolution of the current CU, may generate alternative_mv_resolution information on the alternative resolution determined by the alternative resolution determiner 720, and may encode the mv_resolution_flag and the alternative_mv_resolution information.
When one of a plurality of predefined motion vector resolution candidates is selected as the alternative resolution, the resolution encoder 430 may encode information for identifying the alternative resolution selected from the plurality of predefined motion vector resolution candidates as the alternative_mv_resolution information.
The adaptive resolution determiner 730 may select any one of the plurality of motion vector resolution candidates as the motion vector resolution of the current CU instead of selecting any one of the default motion vector resolution and the alternative resolution as the motion vector resolution of the current CU. In this case, in order to efficiently encode the determined motion vector resolution information of the current CU, the resolution encoder 430 may encode the motion vector resolution information of the current CU as a difference between the motion vector resolution of the current CU and the alternative resolution or a difference between the motion vector resolution of the current CU and the motion vector resolution of the previous CU, instead of encoding the motion vector resolution of the current CU itself.
Fig. 8 is a diagram illustrating an example of the resolution encoder 430 when motion vector resolution information of a current CU is encoded as a resolution difference value.
As shown in fig. 8, the resolution encoder 430 may include an encoding information generator 810 and a resolution difference calculator 820. The detailed operation will be described below with respect to the fourth embodiment.
Fourth embodiment
When the resolution mode determiner 710 adaptively enables the motion vector resolution in units of a sequence, a picture, a slice, or a CTU, the encoding information generator 810 may set the adaptive_mv_resolution_enabled_flag to ON for each such upper layer image unit (each sequence, picture, slice, or CTU).
The encoding information generator 810 may check, for each CU in the higher image unit, whether the default motion vector resolution is used as the motion vector resolution of the corresponding CU. When the default motion vector resolution is used as the motion vector resolution of the current CU, the encoding information generator 810 may set mv_resolution_flag corresponding to the current CU to OFF. When the default motion vector resolution is not used as the motion vector resolution of the current CU and any one selected from the plurality of motion vector resolution candidates is used as the motion vector resolution of the current CU, the encoding information generator 810 may set mv_resolution_flag corresponding to the current CU to ON.
When the alternative resolution determiner 720 determines alternative resolution values of the same respective image units as the unit for setting adaptive_mv_resolution_enabled_flag, the encoding information generator 810 may encode the determined alternative resolution as alternative_mv_resolution information of the same respective image units as the unit for setting adaptive_mv_resolution_enabled_flag.
As described above, the unit for determining the alternative resolution is not limited to the same image units as the unit for setting the adaptive_mv_resolution_enabled_flag; it may instead be each CU, or each image unit smaller than the unit for setting the adaptive_mv_resolution_enabled_flag and larger than the CU. A description thereof has been given above, and thus a detailed description is omitted here.
When any one selected from the plurality of motion vector resolution candidates is used as the motion vector resolution of the current CU, the resolution difference calculator 820 may calculate, as an element included in the information on the motion vector resolution of the current CU, a resolution difference between the motion vector resolution of the current CU and that of the previous CU (e.g., the value obtained by subtracting the motion vector resolution of the previous CU from the motion vector resolution of the current CU).
However, when the current CU is the first CU in coding order among the CUs in the upper layer picture unit, the resolution difference calculator 820 may determine the motion vector resolution of the current CU as resolution difference information or may determine the alternative resolution determined by the alternative resolution determiner 720 as resolution difference information.
Alternatively, when any one selected from the plurality of motion vector resolution candidates is used as the motion vector resolution of the current CU, the resolution difference calculator 820 may calculate, as an element included in the information on the motion vector resolution of the current CU, a resolution difference between the motion vector resolution of the current CU and the alternative_mv_resolution (e.g., the value obtained by subtracting the alternative_mv_resolution from the motion vector resolution of the current CU).
The resolution difference information determined by the resolution difference calculator 820 may be stored as mv_resolution_delta.
When the motion vector resolution of the current CU is expressed as the mv_resolution_delta value, the alternative resolution need not be generated, and thus the operation of the alternative resolution determiner 720 may be omitted.
The encoding information generator 810 may generate and encode information about the motion vector resolution of the current CU based on the results of the resolution mode determiner 710 and the adaptive resolution determiner 730. The encoding information generator 810 may encode adaptive_mv_resolution_enabled_flag as ON or OFF according to whether the resolution mode determiner 710 adaptively enables the motion vector resolution of the CU included in the upper layer picture unit.
When the adaptive_mv_resolution_enabled_flag is ON, the adaptive resolution determiner 730 selects whether to use the default motion vector resolution as the motion vector resolution of the current CU or to use any one selected from the plurality of motion vector resolution candidates. The encoding information generator 810 encodes the mv_resolution_flag according to this selection. When the adaptive_mv_resolution_enabled_flag is OFF, the encoding information generator 810 does not encode the mv_resolution_flag.
When the mv_resolution_flag is ON (i.e., when any one selected from a plurality of motion vector resolution candidates is used as the motion vector resolution of the current CU), the encoding information generator 810 may encode the mv_resolution_delta, which is information ON the resolution difference value calculated by the resolution difference value calculator 820. When the mv_resolution_flag is OFF (i.e., when the default motion vector resolution is used as the motion vector resolution of the current CU), the encoding information generator 810 may not encode the mv_resolution_delta.
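The per-CU delta signaling of the fourth embodiment can be sketched as below. This is an illustrative sketch under stated assumptions: the per-CU resolution values are hypothetical, and the variant shown signals the first CU's resolution itself as its delta, which is one of the two options described above.

```python
def resolution_delta(curr_resolution, prev_resolution, is_first_cu):
    """mv_resolution_delta for one CU: the difference from the previous CU's
    resolution, or (for the first CU in the upper layer unit) the current
    CU's resolution itself."""
    if is_first_cu:
        return curr_resolution
    return curr_resolution - prev_resolution

cus = [0.25, 0.25, 1.0, 0.25]  # hypothetical per-CU resolutions, in pixels
deltas = [resolution_delta(r, cus[i - 1] if i else None, i == 0)
          for i, r in enumerate(cus)]
print(deltas)  # [0.25, 0.0, 0.75, -0.75]
```

The decoder can recover each CU's resolution by accumulating the deltas in coding order, which is why runs of CUs sharing one resolution encode cheaply as zeros.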
The resolution encoder 430 may represent the resolution difference as a resolution scale factor obtained via a division operation, instead of a value obtained via a subtraction operation. Furthermore, the scale factor may be expressed on a logarithmic scale; detailed operations thereof are described in the fifth embodiment.
Fifth embodiment
Fig. 9 is a diagram illustrating an example of a resolution encoder 430 that represents the motion vector resolution of the current CU as a resolution scale factor instead of a resolution difference value.
As shown in fig. 9, the resolution encoder 430 may include an encoding information generator 910 and a resolution ratio information generator 920. The resolution ratio information generator 920 of fig. 9 may replace the resolution difference calculator 820 of fig. 8, and the encoding information generator 910 of fig. 9 may replace the encoding information generator 810 of fig. 8.
The resolution difference calculator 820 and the resolution ratio information generator 920 are different from each other in that the resolution difference calculator 820 calculates a resolution difference and generates mv_resolution_delta as information about the resolution difference, and the resolution ratio information generator 920 calculates a resolution ratio factor and generates mv_resolution_scale as information about the resolution ratio factor.
For example, the resolution ratio information generator 920 may calculate a ratio between the motion vector resolution of the current CU and the motion vector resolution of the previous CU (e.g., a value obtained by dividing the motion vector resolution of the current CU by the motion vector resolution of the previous CU) as an element included in the information about the motion vector resolution of the current CU.
However, when the current CU is the first CU in coding order among the CUs in the upper layer picture unit, the resolution scale information generator 920 may determine the motion vector resolution of the current CU as resolution scale factor information or may determine the alternative resolution determined by the alternative resolution determiner 720 as resolution scale factor information.
The resolution scale information generator 920 may calculate, as an element included in the information on the motion vector resolution of the current CU, a resolution scale factor that is the ratio between the motion vector resolution of the current CU and the alternative_mv_resolution (i.e., the value obtained by dividing the motion vector resolution of the current CU by the alternative_mv_resolution).
The resolution scale factor information determined by the resolution scale information generator 920 may be stored as mv_resolution_scale.
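The ratio-based signaling of the fifth embodiment, including the logarithmic-scale expression mentioned earlier, can be sketched as follows. This is illustrative only: the resolutions are assumed to be powers of two so that the log2 of the ratio is a small integer, which is an assumption, not a requirement stated in the text.

```python
import math

def mv_resolution_scale(curr, ref):
    """Scale factor between the current CU's resolution and a reference
    resolution (previous CU's or the alternative), on a log2 scale."""
    return int(math.log2(curr / ref))

print(mv_resolution_scale(1.0, 0.25))   # 2  (4x coarser than the reference)
print(mv_resolution_scale(0.25, 0.25))  # 0  (same resolution)
print(mv_resolution_scale(0.25, 1.0))   # -2 (4x finer than the reference)
```

Encoding the small signed integer rather than the ratio itself keeps mv_resolution_scale compact for the common case of neighboring CUs sharing a resolution.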
The operation of the encoding information generator 910 may be the same as that of the encoding information generator 810 of fig. 8, except that the encoding information generator 910 encodes information on the resolution scale factor (mv_resolution_scale) instead of mv_resolution_delta.
In the above embodiments of the video encoding apparatus, the unit of the block for determining the motion vector resolution is described as a CU, but is not limited thereto, and may be a CTU in some embodiments. When the unit of the block for determining the motion vector resolution is a CTU, all CUs included in the CTU may have the same motion vector resolution value. In addition, all CUs encoded in mvp mode among the CUs included in a corresponding CTU may have the same motion vector resolution value as that of the corresponding CTU.
Hereinafter, a video decoding apparatus will be described.
Fig. 10 is a schematic diagram of a conventional video decoding apparatus.
The video decoding apparatus 1000 may include a decoder 1010, an inverse quantizer 1020, an inverse transformer 1030, a predictor 1040, an adder 1050, a filter unit 1060, and a memory 1070. As in the video encoding apparatus of fig. 1, each component of the video decoding apparatus may be implemented in the form of a hardware chip or may be implemented in the form of software so that a microprocessor performs the function of the software corresponding to each component.
The decoder 1010 decodes the bitstream received from the video encoding apparatus, extracts information related to block separation to determine the current block to be decoded, and extracts the prediction information and the information about the residual block required to reconstruct the current block.
The decoder 1010 may extract information about the size of CTUs from a Sequence Parameter Set (SPS) or a Picture Parameter Set (PPS) to determine the size of CTUs, and may separate pictures into CTUs having the determined size. In addition, the decoder 1010 may set the CTU to the uppermost layer (i.e., root node) of the tree structure, and may extract separation information of the CTU to separate the CTU using the tree structure. For example, when separating CTUs using the QTBT structure, a first flag (qt_split_flag) related to separation of QT may be extracted, and then each node may be separated into four nodes of the lower layer. In addition, as for the node corresponding to the leaf node of QT, a second flag (bt_split_flag) and split type information related to the splitting of BT may be extracted, and the corresponding leaf node may be split in the BT structure.
As an example of the block separation structure of fig. 2, qt_split_flag corresponding to a node of the uppermost layer of the QTBT structure is extracted. The value of qt_split_flag extracted is 1, and thus the node of the uppermost layer is separated into four nodes of the lower layer (layer 1 of QT). In addition, qt_split_flag of the first node of layer 1 is extracted. The value of qt_split_flag extracted is 0, and thus the first node of layer 1 is no longer split into QT structures.
Since the first node of layer 1 of QT is a leaf node of QT, BT parsing proceeds with the first node of layer 1 of QT as the root node of BT. The bt_split_flag corresponding to the root node of BT, i.e., "(layer 0)", is extracted. Since bt_split_flag is 1, the root node of BT is separated into two nodes of "(layer 1)". Since the root node of BT is separated, separation type information indicating whether the block corresponding to the root node of BT is vertically or horizontally separated is extracted. Since the separation type information is 1, the block corresponding to the root node of BT is vertically separated. Then, the decoder 1010 extracts the bt_split_flag of the first node of "(layer 1)" separated from the root node of BT. Since bt_split_flag is 1, separation type information about the block of the first node of "(layer 1)" is extracted. Since the separation type information on the block of the first node of "(layer 1)" is 1, the block of the first node of "(layer 1)" is vertically separated. Then, the bt_split_flag of the second node of "(layer 1)" separated from the root node of BT is extracted. Since bt_split_flag is 0, the node is not further separated by BT.
In this way, the decoder 1010 recursively extracts qt_split_flag and separates the CTU by the QT structure. The decoder extracts the bt_split_flag of each leaf node of QT. When bt_split_flag indicates separation, the separation type information is extracted. In this way, the decoder 1010 may recognize that the CTU is separated into the structure shown in fig. 2A.
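The recursive QTBT parsing walked through above can be sketched as follows. This is an illustrative, non-normative sketch: the bitstream is modeled as a flat list of flag values consumed in order, and real entropy decoding (e.g., CABAC) is omitted.

```python
def parse_qt(flags):
    """Recursively consume qt_split_flag values; QT leaves fall through to BT."""
    qt_split_flag = flags.pop(0)
    if qt_split_flag == 1:
        # The node is separated into four nodes of the lower layer.
        return [parse_qt(flags) for _ in range(4)]
    # A QT leaf node becomes the root node of a BT.
    return parse_bt(flags)


def parse_bt(flags):
    """Consume bt_split_flag (and, if separated, separation type info)."""
    bt_split_flag = flags.pop(0)
    if bt_split_flag == 0:
        return "leaf"            # not further separated by BT
    # Separation type: 1 = vertical, 0 = horizontal (as in the example).
    split_type = flags.pop(0)
    kind = "vert" if split_type == 1 else "horz"
    return (kind, parse_bt(flags), parse_bt(flags))
```

For instance, the flag sequence [0, 1, 1, 0, 0] describes a CTU that is not QT-separated and whose BT root is vertically separated into two BT leaves.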
When information such as MinQTSize, maxBTSize, maxBTDepth and MinBTSize are additionally defined in the SPS or PPS, the decoder 1010 may extract the additional information and may use the additional information to extract separate information about QT and BT.
For example, a block in QT having the same size as MinQTSize is not further separated. Therefore, the decoder 1010 does not extract the separation information (qt_split_flag) related to the QT of the corresponding block from the bitstream (i.e., qt_split_flag of the corresponding block does not exist in the bitstream) and automatically sets its value to 0. In QT, a block larger than MaxBTSize does not have BT. Therefore, the decoder 1010 does not extract the BT split flag of a leaf node of QT having a block larger than MaxBTSize, and automatically sets the BT split flag to 0. In addition, when the depth of the corresponding node of BT reaches MaxBTDepth, the block of the node is not further separated. Therefore, the BT split flag of the node is not extracted from the bitstream, and the value of the BT split flag is automatically set to 0. In addition, a block of BT having the same size as MinBTSize is not further separated. Therefore, the decoder 1010 does not extract the BT split flag of a block having the same size as MinBTSize from the bitstream, and automatically sets the value of the flag to 0.
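The flag-inference rules above can be summarized as predicates deciding whether a split flag is present in the bitstream; when a constraint (MinQTSize, MaxBTSize, MaxBTDepth, MinBTSize) already forbids the split, the flag is absent and inferred to be 0. A hedged sketch (function names and the exact comparison conventions are assumptions):

```python
def qt_split_flag_present(block_size, min_qt_size):
    # A QT block already at MinQTSize cannot split further in QT,
    # so qt_split_flag is not sent and is inferred to be 0.
    return block_size > min_qt_size


def bt_split_flag_present(block_size, bt_depth,
                          max_bt_size, max_bt_depth, min_bt_size):
    if block_size > max_bt_size:   # too large to have BT at all
        return False
    if bt_depth >= max_bt_depth:   # BT depth budget exhausted
        return False
    if block_size <= min_bt_size:  # already at the minimum BT size
        return False
    return True                    # flag is actually read from the bitstream
```

Whenever a predicate returns False, the decoder skips the parse and treats the flag value as 0.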
When the current block to be decoded is determined through the split tree structure, the decoder 1010 extracts information on a prediction type indicating whether the current block is intra-predicted or inter-predicted.
When the prediction type information indicates inter prediction, the decoder 1010 may extract syntax elements of the inter prediction information. First, mode information indicating a mode employed when motion information on the current block among a plurality of encoding modes is encoded is extracted. Here, the plurality of coding modes may include a merge mode and a Motion Vector Difference (MVD) coding mode. When the mode information indicates the merge mode, the decoder 1010 extracts merge index information indicating the candidate from which the motion vector of the current block is derived among the merge candidates as a syntax element of the motion information. On the other hand, when the mode information indicates the MVD encoding mode, the decoder 1010 extracts information on the MVD and information on the reference picture referenced by the motion vector of the current block as syntax elements of the motion vector. When the video encoding apparatus uses any one of a plurality of Motion Vector Prediction (MVP) candidates as the MVP of the current block, MVP identification information may be included in the bitstream. Therefore, in this case, the MVP identification information is extracted together with the information on the MVD and the information on the reference picture as syntax elements of the motion vector.
The decoder 1010 extracts information on the quantized transform coefficients of the current block as information on the residual signal.
The inverse quantizer 1020 inversely quantizes the quantized transform coefficients, and the inverse transformer 1030 inversely transforms the inversely quantized transform coefficients from the frequency domain to the spatial domain to reconstruct the residual signal, thereby generating a residual block of the current block.
The predictor 1040 may include an intra predictor 1042 and an inter predictor 1044. In the case where the prediction type of the current block is intra prediction, the intra predictor 1042 is started, and in the case where the prediction type of the current block is inter prediction, the inter predictor 1044 is started.
The intra predictor 1042 determines an intra prediction mode of the current block among the plurality of intra prediction modes according to the syntax element of the intra prediction mode extracted from the decoder 1010, and predicts the current block using neighboring reference pixels of the current block according to the intra prediction mode.
The inter predictor 1044 determines motion information about the current block using syntax elements of the inter prediction information extracted from the decoder 1010, and predicts the current block using the determined motion information.
First, the inter predictor 1044 may check mode information of inter prediction extracted from the decoder 1010. When the mode information indicates a merge mode, the inter predictor 1044 constructs a merge list including a predetermined number of merge candidates using neighboring blocks of the current block. The method of the inter predictor 1044 constructing the merge list is the same as that of the inter predictor 124 of the video encoding apparatus. In addition, one merge candidate is selected among the merge candidates in the merge list using the merge index information received from the decoder 1010. The motion information of the selected merge candidate (i.e., the motion vector and the reference picture of the merge candidate) is set as the motion vector and the reference picture of the current block.
On the other hand, when the mode information indicates the MVD encoding mode, the inter predictor 1044 derives MVP candidates using motion vectors of neighboring blocks of the current block and determines MVPs of the motion vectors of the current block using the MVP candidates. The method of deriving MVP candidates by the inter predictor 1044 is the same as that of the inter predictor 124 of the video encoding device. When the video encoding apparatus uses any one of a plurality of MVP candidates as the MVP of the current block, the syntax element of the motion information includes MVP identification information. Accordingly, in this case, the inter predictor 1044 may select a candidate indicated by the MVP identification information from among the MVP candidates as the MVP of the current block. However, when the video encoding apparatus determines MVP by applying a predefined function to a plurality of MVP candidates, the inter predictor 1044 may apply the same function as the video encoding apparatus to determine MVP. Once the MVP of the current block is determined, the inter predictor 1044 adds the MVP and the MVD extracted from the decoder 1010 to derive a motion vector of the current block. In addition, the inter predictor 1044 determines a reference picture referred to by a motion vector of the current block using information on the reference picture extracted from the decoder 1010.
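The MVD-mode motion vector reconstruction described above amounts to selecting an MVP and adding the parsed MVD. A minimal sketch (candidate derivation itself is omitted; names are illustrative, not the patent's syntax):

```python
def reconstruct_mv(mvp_candidates, mvp_idx, mvd):
    """Derive the current block's motion vector in MVD coding mode.

    mvp_candidates: list of (x, y) motion vector predictors
    mvp_idx:        index parsed as MVP identification information
    mvd:            (dx, dy) motion vector difference parsed from the bitstream
    """
    mvp = mvp_candidates[mvp_idx]
    # MV = MVP + MVD, component-wise.
    return (mvp[0] + mvd[0], mvp[1] + mvd[1])
```

When the encoder instead derives the MVP by a predefined function over the candidates, the decoder would apply the same function in place of the index lookup.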
When determining a motion vector of a current block and a reference picture in a merge mode or an MVD encoding mode, the inter predictor 1044 generates a predicted block of the current block using a block at a position indicated by the motion vector in the reference picture.
The adder 1050 adds the residual block output from the inverse transformer and the prediction block output from the inter predictor or the intra predictor to reconstruct the current block. Pixels in the reconstructed current block may be used as reference pixels for intra prediction of a block to be decoded later.
The filter unit 1060 performs deblocking filtering on boundaries between reconstructed blocks to remove blocking artifacts caused by block-by-block encoding and stores the deblocking filtered blocks in the memory 1070. When all blocks in one picture are reconstructed, the reconstructed picture may be used as a reference picture for inter prediction of blocks in a subsequent picture to be encoded.
Fig. 11 is a diagram illustrating a video decoding apparatus 1100 according to an embodiment of the present invention.
The video decoding apparatus 1100 according to an embodiment of the present invention may include a motion vector resolution decoder 1110 and a video decoder 1120.
The motion vector resolution decoder 1110 parses information on the motion vector resolution of the current CU from the bitstream, and determines the motion vector resolution for estimating the motion of the current CU based on the parsed information on the motion vector resolution.
The video decoder 1120 predicts and decodes the current CU using the determined motion vector of the current CU according to the motion vector resolution of the current CU.
Here, the video decoder 1120 may be implemented as the video decoding apparatus 1000 described above with reference to fig. 10.
The function of the motion vector resolution decoder 1110 may be included in the above-mentioned functions of the decoder 1010 in the video decoding apparatus 1000, and may be integrated in the decoder 1010.
Fig. 12 is a flowchart illustrating a method of decoding video by the video decoding apparatus 1100 according to the first embodiment of the present invention.
As shown in fig. 12, in the video decoding apparatus 1100 according to the first embodiment of the present invention, the motion vector resolution decoder 1110 parses adaptive_mv_resolution_enabled_flag (i.e., first identification information) from the bitstream (S1210). The adaptive_mv_resolution_enabled_flag is identification information indicating whether the motion vector resolution is adaptively enabled, and may be determined in at least one picture unit among an image sequence, a picture, a slice, and a CTU, which are upper layer picture units. The adaptive_mv_resolution_enabled_flag may be parsed from a bitstream header of at least one upper layer picture unit among an image sequence, a picture, a slice, and a CTU.
After parsing the adaptive_mv_resolution_enabled_flag, the motion vector resolution decoder 1110 checks whether the adaptive_mv_resolution_enabled_flag indicates that the motion vector resolution of the CUs in the upper layer picture unit is adaptively enabled (i.e., the adaptive_mv_resolution_enabled_flag is ON) or indicates that the default motion vector resolution is used as the motion vector resolution of the CUs in the upper layer picture unit (i.e., the adaptive_mv_resolution_enabled_flag is OFF) (S1220).
As a result of the checking of operation S1220, when the adaptive_mv_resolution_enabled_flag is ON, the motion vector resolution decoder 1110 parses alternative_mv_resolution, which is information on the alternative resolution, from the bitstream (S1230). Here, the alternative_mv_resolution may be parsed for the same picture unit as the picture unit for the adaptive_mv_resolution_enabled_flag, or may be parsed from the bitstream of each picture unit smaller than the picture unit for the adaptive_mv_resolution_enabled_flag. In addition, the alternative_mv_resolution may be parsed for each CU in the picture unit for the adaptive_mv_resolution_enabled_flag.
After operation S1230, the motion vector resolution decoder 1110 determines the motion vector resolution of the current CU as an encoding target according to whether the encoding mode of the current CU is a mode for encoding the MVD (S1240). When information on adaptive_mv_resolution_enabled_flag and alternative resolution is transmitted in a bitstream for each upper layer picture unit, a motion vector resolution may also be adaptively determined for each CU in the upper layer picture unit.
When information on adaptive_mv_resolution_enabled_flag is transmitted for each upper layer picture unit of one of the picture sequence and the picture and adaptive_mv_resolution is transmitted in a bitstream for each slice (or CTU) of a picture unit smaller than the upper layer picture unit, a motion vector resolution may also be adaptively determined for each CU in the slice (or CTU).
When it is checked in operation S1220 that the adaptive_mv_resolution_enabled_flag is OFF, the motion vector resolution decoder 1110 performs operation S1240 of determining the motion vector resolution of the current CU according to whether the encoding mode of the current CU is a mode for encoding MVDs.
Here, operation S1240 may include operations S1241 to S1246.
After operation S1230, the motion vector resolution decoder 1110 may parse the encoding mode of the current CU from the bitstream, and may check whether the encoding mode of the current CU is a mode (i.e., MVP mode) encoding the MVD using the MVP (S1241).
When it is checked in operation S1241 that the encoding mode of the current CU is mvp mode, the motion vector resolution decoder 1110 may check whether the adaptive_mv_resolution_enabled_flag indicates that the motion vector resolution of the CUs in the upper layer picture unit is adaptively enabled (i.e., the adaptive_mv_resolution_enabled_flag is ON) or indicates that the default motion vector resolution is used as the motion vector resolution of the CUs in the upper layer picture unit (i.e., the adaptive_mv_resolution_enabled_flag is OFF) (S1242).
When it is checked in operation S1242 that the adaptive_mv_resolution_enabled_flag is ON, the motion vector resolution decoder 1110 may parse, from the bitstream, mv_resolution_flag (i.e., second identification information), which is identification information indicating which one of the default motion vector resolution and the alternative resolution is used as the motion vector resolution of the current CU (S1243), and then may perform the subsequent operation S1244.
Here, although the mv_resolution_flag is described as being transmitted in units of CUs, in some embodiments, the mv_resolution_flag may be parsed in units of CTUs, whether or not a CU is mvp mode may be checked for each CU in a corresponding CTU, and when the corresponding CU is mvp mode, the subsequent operation of S1244 may be performed.
After parsing the mv_resolution_flag, the motion vector resolution decoder 1110 may check the value of the mv_resolution_flag (S1244).
When it is checked in operation S1244 that the mv_resolution_flag indicates that the alternative resolution is used as the motion vector resolution of the current CU (i.e., when the mv_resolution_flag is ON), the motion vector resolution decoder 1110 may determine the alternative resolution as the motion vector resolution of the current CU (S1245).
When it is checked in operation S1244 that the mv_resolution_flag indicates that the default motion vector resolution is used as the motion vector resolution of the current CU (i.e., when the mv_resolution_flag is OFF), the motion vector resolution decoder 1110 may determine that the default motion vector resolution is the motion vector resolution of the current CU (S1246).
When it is checked in operation S1242 that the adaptive_mv_resolution_enabled_flag is OFF, the motion vector resolution decoder 1110 may determine the default motion vector resolution as the motion vector resolution of the current CU (S1246).
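The decision flow of operations S1241 to S1246 above can be sketched as a small function. This is a non-normative illustration; the parameter names mirror the parsed syntax elements, and resolutions are given in pixels (e.g. 0.25 for 1/4 pixel).

```python
def decode_cu_resolution(is_mvp_mode, enabled_flag,
                         mv_resolution_flag, default_res, alternative_res):
    """Determine the motion vector resolution of the current CU."""
    if not is_mvp_mode:
        # S1241: a CU not coded in mvp mode does not carry per-CU
        # resolution information in this flow.
        return default_res
    if not enabled_flag:            # S1242: adaptive mode is OFF
        return default_res          # S1246: use the default resolution
    if mv_resolution_flag:          # S1244: mv_resolution_flag is ON
        return alternative_res      # S1245: use the alternative resolution
    return default_res              # S1246: use the default resolution
```

With a 1/4-pixel default and a 4-pixel alternative, only an mvp-mode CU under an enabled adaptive mode with mv_resolution_flag ON receives the 4-pixel resolution.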
For example, when the adaptive_mv_resolution_enabled_flag indicating whether the motion vector resolution is adaptively enabled and the alternative_mv_resolution syntax indicating the alternative resolution are located in the SPS, whether to apply the method according to the present invention may be determined for each image sequence unit.
For example, when the adaptive_mv_resolution_enabled_flag in the SPS is ON, the alternative_mv_resolution value (i.e., the alternative resolution) of the SPS is 4 pixels, and the default motion vector resolution is 1/4 pixel, the motion vector resolution of every CU as an encoding target in the image sequence referring to the SPS may be determined to be 1/4 pixel or 4 pixels. That is, when the mv_resolution_flag, which is header information of the current CU, is OFF, the motion vector resolution of the current CU may be determined to be 1/4 pixel corresponding to the default motion vector resolution, and when the mv_resolution_flag of the current CU is ON, the motion vector resolution of the current CU may be determined to be 4 pixels corresponding to the alternative resolution.
When the adaptive_mv_resolution_enabled_flag and the alternative_mv_resolution syntax are located in the PPS (or slice header), whether to apply the method according to the present invention may be determined in units of pictures (or in units of slices).
For example, when the adaptive_mv_resolution_enabled_flag in the PPS is ON, the alternative_mv_resolution value (i.e., the alternative resolution) of the PPS is 4 pixels, and the default motion vector resolution is 1/4 pixel, the motion vector resolution of every CU as an encoding target in a picture (or slice) referring to the PPS (or slice header) may be determined to be 1/4 pixel or 4 pixels. That is, when the mv_resolution_flag, which is header information of the current CU, is OFF, the motion vector resolution of the current CU may be determined to be 1/4 pixel corresponding to the default motion vector resolution, and when the mv_resolution_flag of the current CU is ON, the motion vector resolution of the current CU may be determined to be 4 pixels corresponding to the alternative resolution.
When the adaptive_mv_resolution_enabled_flag is ON in the SPS (or PPS), the alternative_mv_resolution value in the header of a slice (or CTU), which is an image unit smaller than an image sequence (or picture), is 4 pixels, and the default motion vector resolution is 1/4 pixel, the motion vector resolution of every CU as an encoding target in the corresponding slice (or CTU) may be determined to be 1/4 pixel corresponding to the default motion vector resolution or 4 pixels corresponding to the alternative resolution, and thus the motion vector of the current CU may be represented at a resolution of 1/4 pixel or 4 pixels. That is, when the mv_resolution_flag, which is header information of the current CU, is 0 (i.e., OFF), the motion vector of the current CU is represented with a resolution of 1/4 pixel, and when the mv_resolution_flag of the current CU is ON, the motion vector of the current CU is represented with a resolution of 4 pixels.
When the adaptive_mv_resolution_enabled_flag is ON in the SPS (or PPS), the alternative_mv_resolution value in the header of a slice (or CTU), which is an image unit smaller than an image sequence (or picture), is 0, and the default motion vector resolution is 1/4 pixel, the motion vector resolution of every CU as an encoding target in the corresponding slice (or CTU) may be determined to be 1/4 pixel corresponding to the default motion vector resolution. Here, the mv_resolution_flag, which is header information of the CU, may not be required.
When the adaptive_mv_resolution_enabled_flag is ON in the SPS (or PPS), the motion vector resolution of every CU as an encoding target in the corresponding slice (or CTU) may be determined according to the value of the alternative_enabled_flag in the header of the slice (or CTU), which is an image unit smaller than an image sequence (or picture). For example, when the alternative_enabled_flag is OFF and the default motion vector resolution is 1/4 pixel, the motion vectors of all CUs in the corresponding slice (or CTU) may be represented at 1/4 pixel corresponding to the default motion vector resolution, and the mv_resolution_flag, which is header information of the CU as an encoding target, may not be required. On the other hand, when the alternative_enabled_flag is ON, the alternative_mv_resolution value is 4 pixels, and the default motion vector resolution is 1/4 pixel, the motion vector resolution of every CU in the corresponding slice (or CTU) may be determined to be 1/4 pixel corresponding to the default motion vector resolution or 4 pixels corresponding to the alternative resolution. That is, when the mv_resolution_flag of the current CU is 0 (i.e., OFF), the motion vector of the current CU may be represented with a resolution of 1/4 pixel, and when the mv_resolution_flag of the current CU is ON, the motion vector of the current CU may be represented with a resolution of 4 pixels.
Fig. 13 is a diagram showing an example of adaptive determination of resolution.
When the adaptive_mv_resolution_enabled_flag of the SPS header is ON, the alternative_mv_resolution of the header of slice #0 is 2 pixels, and the default motion vector resolution (default MV resolution) is 1/4 pixel, the motion vector resolution of the current CU marked with the circle in slice #0 may be determined to be 1/4 pixel corresponding to the default motion vector resolution or 2 pixels corresponding to the alternative resolution. In this case, when the mv_resolution_flag, which is header information of the current CU, is OFF, the motion vector of the current CU is represented in 1/4-pixel units corresponding to the default motion vector resolution, and when the mv_resolution_flag of the current CU is ON, the motion vector of the current CU is represented in 2-pixel units corresponding to the alternative_mv_resolution.
When the alternative_mv_resolution of the header of slice #1 is 4 pixels, the motion vector resolution of the current CU marked with the triangle in slice #1 may be determined to be 1/4 pixel corresponding to the default motion vector resolution or 4 pixels corresponding to the alternative resolution. In this case, when the mv_resolution_flag, which is header information of the current CU, is OFF, the motion vector of the current CU is represented in 1/4-pixel units corresponding to the default motion vector resolution, and when the mv_resolution_flag of the current CU is ON, the motion vector of the current CU is represented in 4-pixel units corresponding to the alternative resolution.
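Working through the fig. 13 numbers above as a sketch (illustrative only, not normative syntax): each slice header carries its own alternative resolution, and the per-CU mv_resolution_flag selects between it and the sequence-level default.

```python
DEFAULT_MV_RESOLUTION = 0.25            # 1/4 pixel, from the SPS-level default

# Per-slice alternative_mv_resolution values from the fig. 13 example.
slice_alternative = {0: 2, 1: 4}        # slice #0: 2 pixels, slice #1: 4 pixels


def cu_resolution(slice_id, mv_resolution_flag):
    """Resolve a CU's motion vector resolution from its slice and flag."""
    if mv_resolution_flag:              # flag ON -> the slice's alternative
        return slice_alternative[slice_id]
    return DEFAULT_MV_RESOLUTION        # flag OFF -> the default resolution
```

Thus a CU in slice #0 with the flag ON uses 2-pixel units, while a CU in slice #1 with the flag OFF stays at 1/4-pixel units.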
After operation S1245 or S1246, the video decoder 1120 may derive a motion vector of the current block using the MVP (S1250).
Here, operation S1250 may include operations S1251 to S1255.
In operation S1251, the video decoder 1120 may derive the MVP candidate, and may parse information (mvp_idx) for identifying the MVP of the current CU from the MVP candidate from the bitstream. Here, the neighboring blocks for MVP candidates may use some or all of a left block L, an upper block a, an upper right block AR, a lower left block BL, and an upper left block AL adjacent to the current CU in the current picture shown in fig. 3.
The video decoder 1120 may check whether the motion vector resolution of the block corresponding to mvp_idx is the same as the motion vector resolution of the current CU (S1252).
When it is checked that the motion vector resolution of the block corresponding to mvp_idx is the same as the motion vector resolution of the current CU, the video decoder 1120 may decode the MVD from the bitstream (S1254). When it is detected that the motion vector resolution of the block corresponding to mvp_idx is different from the motion vector resolution of the current CU, the video decoder 1120 may scale the MVP of the current block such that the resolution of the MVP is the same as the motion vector resolution of the current CU (S1253), and may decode the MVD from the bitstream (S1254).
The video decoder 1120 adds the MVD parsed from the bitstream to the MVP to generate a motion vector of the current CU (S1255).
For example, when the motion vector resolution of the current CU is 2 pixels, the motion vector resolution of the block corresponding to mvp_idx is 1/4 pixel, and the motion vector value of the block corresponding to mvp_idx is 3, the actual motion vector of the block corresponding to mvp_idx corresponds to 0.75 (= 3 × 1/4). When scaling is performed according to the 2-pixel motion vector resolution corresponding to the motion vector resolution of the current CU, the motion vector of the block corresponding to mvp_idx is converted to 0. This conversion can be expressed according to the following Equation 1.
[Equation 1]
MV' = Round(MV × neighbor_MV_Resol / curr_MV_Resol)
Here, MV denotes the motion vector of the block corresponding to mvp_idx, neighbor_MV_Resol denotes the motion vector resolution of the block corresponding to mvp_idx, curr_MV_Resol denotes the motion vector resolution of the current CU, MV' is the scaled motion vector, and Round denotes a rounding operation.
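Equation 1 and the worked example above can be sketched as follows. One assumption is made explicit: Round is taken here as round-half-up, since the text does not pin down the tie-breaking rule.

```python
import math


def scale_mv(mv, neighbor_resol, curr_resol):
    """MV' = Round(MV * neighbor_MV_Resol / curr_MV_Resol).

    Round is implemented as round-half-up (an assumption; the text only
    says "a rounding operation"). Resolutions are in pixels.
    """
    return int(math.floor(mv * neighbor_resol / curr_resol + 0.5))


# The example above: an MVP value of 3 at 1/4-pixel resolution (0.75 pixels)
# rescaled to the current CU's 2-pixel resolution becomes Round(0.375) = 0.
```

Note that information is lost when rescaling to a coarser resolution, which is why the MVP must be brought to the current CU's resolution before the MVD is added.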
The adaptive_mv_resolution_enabled_flag, mv_resolution_flag, alternative_mv_resolution, and the like may each be signaled separately for each of the x and y components of the motion vector, and the motion vector resolution of the CU may also be calculated for each of the x and y components according to the following Equation 2.
[Equation 2]
MVx' = Round(MVx × neighbor_MVx_Resol / curr_MVx_Resol)
MVy' = Round(MVy × neighbor_MVy_Resol / curr_MVy_Resol)
As a result of the checking of operation S1241, when it is checked that the encoding mode of the current CU is not the mvp mode (e.g., merge mode), the video decoder 1120 may derive a motion vector of the current CU from the motion vectors (i.e., merge candidates) of the temporal or spatial neighboring blocks (S1260).
Here, operation S1260 may include operations S1261 to S1264.
As a result of the checking of operation S1241, when it is checked that the encoding mode of the current CU is not mvp mode, the video decoder 1120 parses information (candid_idx) for identifying a motion vector of the current CU from among merging candidates of the current CU from the bitstream (S1261). The merge candidate of the current CU may use some or all of the left block L, the upper block a, the upper right block AR, the lower left block BL, and the upper left block AL adjacent to the current CU in the current picture shown in fig. 3. In addition, a block located in a reference picture (which is the same as or different from a reference picture used to predict the current CU) other than the current picture in which the current block is located may be used as a motion vector candidate (i.e., a merge candidate). For example, a co-located block of the current block in the reference picture or a block adjacent to the co-located block may be further used as a merging candidate.
The video decoder 1120 may check whether the motion vector resolution of the block corresponding to the candid_idx parsed from the bitstream is the same as the motion vector resolution predefined for the merge mode (S1262). Here, the motion vector resolution predefined for the merge mode may be a motion vector resolution defined in any one unit of an image sequence, a picture, and a slice.
When the motion vector resolution of the block corresponding to the candid_idx is the same as the motion vector resolution predefined for the merge mode, the video decoder 1120 may set the motion vector of the block corresponding to the candid_idx as the motion vector of the current CU (S1264). When the motion vector resolution of the block corresponding to candid_idx is different from the predefined motion vector resolution, the video decoder 1120 may scale the motion vector of the block corresponding to candid_idx such that the motion vector of the block corresponding to candid_idx is the same as the predefined motion vector resolution (S1263), and may set the scaled motion vector as the motion vector of the current CU (S1264).
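The merge-mode steps S1261 to S1264 above can be sketched as follows: the selected candidate's motion vector is used as-is when its resolution matches the resolution predefined for the merge mode, and rescaled otherwise. A non-normative sketch, again assuming round-half-up for the rounding.

```python
import math


def merge_mv(candidates, candid_idx, merge_resol):
    """Derive the current CU's motion vector in merge mode.

    candidates:  list of (mv, resolution) pairs for the merge candidates
    candid_idx:  index parsed from the bitstream (S1261)
    merge_resol: motion vector resolution predefined for the merge mode
    """
    mv, resol = candidates[candid_idx]
    if resol == merge_resol:
        return mv                     # S1262 -> S1264: use the MV as-is
    # S1263: scale so the MV is expressed at the predefined resolution,
    # then S1264: set the scaled MV as the current CU's motion vector.
    return int(math.floor(mv * resol / merge_resol + 0.5))
```

For example, a candidate MV of 6 stored at 1-pixel resolution becomes 24 when the merge-mode resolution is 1/4 pixel.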
For reference, operations S1240, S1250, and S1260 may be sequentially and repeatedly performed for each CU.
Fig. 14 is a flowchart showing a case where some operations are added in fig. 12.
In fig. 14, operation S1240 may include operations S1241, S1242_1, S1242_2, S1243, S1244, S1245, and S1246, and operation S1250 may include operations S1251, S1252, S1253, and S1255.
In contrast to fig. 12, fig. 14 includes operations S1242_1 and S1242_2 in operation S1240 instead of operation S1242 of fig. 12, and does not include operation S1254 in operation S1250 of fig. 12.
For reference, among the operations of fig. 14, an operation having the same reference numeral as an operation of fig. 12 performs the same operation as in fig. 12 unless the context clearly indicates otherwise. For example, operation S1243 of fig. 14 is the same as operation S1243 of fig. 12.
In fig. 14, when it is checked in operation S1241 that the encoding mode of the current CU is a mode of encoding the MVD using the MVP (i.e., MVP mode), the video decoder 1120 may parse the motion vector difference information (MVD) from the bitstream (S1242_1).
After parsing out the MVD in operation S1242_1, the motion vector resolution decoder 1110 may check whether the MVD is not 0 and whether the adaptive_mv_resolution_enabled_flag is ON (S1242_2). When the MVD is not 0 and the adaptive_mv_resolution_enabled_flag is ON, the motion vector resolution decoder 1110 may perform operation S1243. When the MVD is 0 or the adaptive_mv_resolution_enabled_flag is not ON, the motion vector resolution decoder 1110 may perform operation S1246.
In operation S1252 of fig. 14, when it is checked that the motion vector resolution of the block corresponding to mvp_idx is the same as the motion vector resolution of the current CU, the video decoder 1120 may add the MVD to the MVP to calculate the motion vector of the current CU (S1255). In operation S1252 of fig. 14, when it is checked that the motion vector resolution of the block corresponding to mvp_idx is not the same as the motion vector resolution of the current CU, the video decoder 1120 may perform operation S1253 of scaling the MVP such that the resolution of the MVP is the same as the motion vector resolution of the current CU. After operation S1253, the video decoder 1120 may perform operation S1255 of adding the MVD and the MVP to calculate the motion vector of the current CU.
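The MVP-mode derivation in operations S1252, S1253, and S1255 can be sketched as below. This is an illustrative reading of the steps, not the normative decoding process; the helper names `scale_component` and `derive_mv_mvp_mode` and the nearest-integer rounding are assumptions:

```python
from fractions import Fraction

def scale_component(c, src_res, dst_res):
    # Re-express one motion vector component from src_res units into dst_res units.
    return round(c * Fraction(src_res) / Fraction(dst_res))

def derive_mv_mvp_mode(mvp, mvp_res, mvd, cur_res):
    # S1252/S1253: scale the MVP when its resolution differs from the current CU's.
    if mvp_res != cur_res:
        mvp = tuple(scale_component(c, mvp_res, cur_res) for c in mvp)
    # S1255: add the MVD (expressed at the current CU's resolution) to the MVP.
    return (mvp[0] + mvd[0], mvp[1] + mvd[1])
```

For example, an MVP of (8, -4) stored at 1/4-pel resolution, combined with an MVD of (1, 1) at a current resolution of 1 pel, yields the motion vector (3, 0).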
The operation S1251 may be performed before the operation S1242_1 or may be performed between the operations S1242_1 and S1242_2.
Fig. 15 is a flowchart illustrating a method of decoding video at a video decoding apparatus 1100 according to a second embodiment of the present invention.
As a detailed example of the case of fig. 15, when the adaptive_mv_resolution_enabled_flag is located in an SPS (or a PPS, a slice header, or a CTU header) and the mv_resolution_flag and alternative_mv_resolution information are located in the header of a CU as an encoding target, whether to enable the adaptive motion vector resolution mode according to the present invention may be determined in units of image sequences (or in units of pictures, slices, or CTUs) according to the adaptive_mv_resolution_enabled_flag value, and the motion vector resolution may be selected on a CU-by-CU basis according to the mv_resolution_flag value.
For example, when the adaptive_mv_resolution_enabled_flag in the SPS header (or PPS header, slice header, or CTU header) is ON, the default motion vector resolution is 1/4 pixel, and the mv_resolution_flag, which is header information of the current CU, is OFF, no alternative_mv_resolution information is needed to determine the motion vector resolution of the current CU. In this case, the motion vector resolution of the current CU may be determined to be 1/4 pixel corresponding to the default motion vector resolution, and the motion vector of the current CU may be expressed in 1/4 pixel units.
On the other hand, when the mv_resolution_flag is ON and the alternative_mv_resolution value of the CU header is 4 pixels, the motion vector resolution of the current CU may be determined as 4 pixels corresponding to the alternative resolution, and the motion vector of the current CU may be represented at a resolution of 4 pixels.
As shown in fig. 15, in the video decoding apparatus 1100 according to the second embodiment of the present invention, the motion vector resolution decoder 1110 may parse the adaptive_mv_resolution_enabled_flag from the bitstream (S1510). The adaptive_mv_resolution_enabled_flag is identification information indicating whether the motion vector resolution is adaptively enabled, and may be determined for at least one image unit (an image unit higher than the CU) among an image sequence, a picture, a slice, and a CTU. The adaptive_mv_resolution_enabled_flag may be parsed from the bitstream header of at least one image unit among an image sequence, a picture, a slice, and a CTU.
After parsing the adaptive_mv_resolution_enabled_flag of each image unit of at least one of the image sequence, the picture, the slice, or the CTU, the motion vector resolution decoder 1110 may determine a motion vector resolution of each block according to whether an encoding mode of each block in the image unit in which the adaptive_mv_resolution_enabled_flag is parsed is a mode for encoding the MVD (i.e., mvp mode) (S1540).
Here, operation S1540 may include operations S1541 to S1547.
After operation S1510, the motion vector resolution decoder 1110 may parse the encoding mode of the current CU from the bitstream, and may check whether the encoding mode of the current CU is a mode (i.e., MVP mode) in which MVDs are encoded using MVPs (S1541).
When the encoding mode of the current CU is mvp mode, it is checked whether adaptive_mv_resolution_enabled_flag indicates that the motion vector resolution of the CU in the upper layer picture unit is adaptively enabled (i.e., adaptive_mv_resolution_enabled_flag is ON) or indicates that the default motion vector resolution is used as the motion vector resolution in the CU in the upper layer picture unit (i.e., adaptive_mv_resolution_enabled_flag is OFF) (S1542).
When the adaptive_mv_resolution_enabled_flag is OFF, the motion vector resolution decoder 1110 may use the default motion vector resolution as the motion vector resolution of the current CU (S1547).
When the adaptive_mv_resolution_enabled_flag is ON, the motion vector resolution decoder 1110 may parse mv_resolution_flag, which is identification information indicating which of the default motion vector resolution and the alternative resolution is used as the motion vector resolution of the current CU, from the bitstream (S1543).
After parsing out the mv_resolution_flag, the motion vector resolution decoder 1110 may check the value of the mv_resolution_flag (S1544).
When the mv_resolution_flag indicates that the alternative resolution is used as the motion vector resolution of the current CU (i.e., when the mv_resolution_flag is ON), the motion vector resolution decoder 1110 may parse the alternative_mv_resolution, which is information on the alternative resolution of the current CU, from the bitstream (S1545), and may set the parsed alternative resolution as the motion vector resolution of the current CU (S1546).
When the mv_resolution_flag indicates that the default motion vector resolution is used as the motion vector resolution of the current CU (i.e., when the mv_resolution_flag is OFF), the motion vector resolution decoder 1110 may set the default motion vector resolution to the motion vector resolution of the current CU (S1547).
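The decision chain of operations S1542 to S1547 can be condensed into a short sketch. This is illustrative only, under assumed boolean and numeric representations of the signaled values; the function name `determine_cu_resolution` is not from the patent:

```python
from fractions import Fraction

def determine_cu_resolution(enabled_flag, mv_resolution_flag, alternative_res, default_res):
    # S1542: adaptive mode disabled in the upper layer unit -> default resolution (S1547).
    if not enabled_flag:
        return default_res
    # S1544: per-CU flag ON -> parsed alternative resolution (S1545/S1546).
    if mv_resolution_flag:
        return alternative_res
    # S1544: per-CU flag OFF -> default resolution (S1547).
    return default_res

# Mirroring the example above: 1/4-pel default, 4-pel alternative.
res_off = determine_cu_resolution(True, False, 4, Fraction(1, 4))  # 1/4 pel
res_on = determine_cu_resolution(True, True, 4, Fraction(1, 4))    # 4 pels
```

A CU with mv_resolution_flag OFF keeps the 1/4-pel default, while a CU with the flag ON takes the signaled 4-pel alternative.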
Although the mv_resolution_flag and/or the alternative_mv_resolution information have been described above as being transmitted in units of CUs, in some embodiments the mv_resolution_flag and/or the alternative_mv_resolution may be encoded in units of CTUs. In that case, it is checked whether the encoding mode of each CU in the corresponding CTU is the MVP mode, and the subsequent operations of operation S1540 may then be performed on each CU whose encoding mode is the MVP mode.
Fig. 16 is a diagram showing another example of the adaptive determination of the resolution.
For example, when the adaptive_mv_resolution_enabled_flag in the SPS (or PPS, slice header, or CTU header) is ON, the default motion vector resolution is 1/4 pixel, and the mv_resolution_flag in the header of the CU marked with a circle in the corresponding image sequence is ON with an alternative_mv_resolution of 1 pixel, the motion vector resolution of the corresponding CU may be determined to be 1 pixel corresponding to the alternative resolution, and the motion vector of the CU may be represented at a resolution of 1 pixel.
When the mv_resolution_flag is ON and the alternative_mv_resolution is 4 pixels in the header of the CU marked with a triangle in the corresponding image sequence, the motion vector resolution of the corresponding CU may be determined as 4 pixels corresponding to the alternative resolution, and the motion vector of the CU may be represented at a resolution of 4 pixels. In the case of a CU whose mv_resolution_flag value is set to OFF, the motion vector resolution of the CU may be determined to be 1/4 pixel corresponding to the default motion vector resolution, and the motion vector of the CU may be represented at a resolution of 1/4 pixel.
After operation S1546 or S1547, the video decoder 1120 may derive a motion vector of the current block using the MVP (S1550).
In fig. 15, operation S1550 may include operations S1551 to S1555.
Operations S1551 to S1555 are similar to operations S1251 to S1255, respectively, and thus, detailed descriptions of operations S1551 to S1555 are omitted.
When it is detected in operation S1541 that the encoding mode of the corresponding CU is not the mvp mode (e.g., merge mode), the video decoder 1120 may derive a motion vector of the current CU from motion vectors of temporal or spatial neighboring blocks (i.e., merge candidates) (S1560).
Here, operation S1560 may include operations S1561 through S1564.
Operations S1561 to S1564 are similar to operations S1261 to S1264, respectively, and thus, detailed descriptions of operations S1561 to S1564 are omitted.
Fig. 17 is a flowchart showing a case where some operations are added in fig. 15.
In fig. 17, the operation S1540 may include operations S1541, S1542_1, S1542_2, S1543, S1544, S1545, S1546, and S1547. In fig. 17, operation S1550 may include operations S1551, S1552, S1553, and S1555.
For reference, among the functional blocks of fig. 17, the functional block having the same reference numeral as that of the block of fig. 15 performs the same operation as the block of fig. 15 unless it has a significantly different meaning in context. For example, operation S1543 of fig. 17 is the same as operation S1543 of fig. 15.
When it is checked in operation S1541 of fig. 17 that the encoding mode of the current CU is a mode of encoding the MVD using the MVP (i.e., MVP mode), the video decoder 1120 may parse the motion vector difference information (MVD) from the bitstream (S1542_1).
After parsing out the MVD in operation S1542_1, the motion vector resolution decoder 1110 may check whether the MVD is not 0 and whether the adaptive_mv_resolution_enabled_flag is ON (S1542_2). When MVD is not 0 and adaptive_mv_resolution_enabled_flag is ON, the motion vector resolution decoder 1110 may perform operation S1543. When the MVD is 0 or the adaptive_mv_resolution_enabled_flag is not ON, the motion vector resolution decoder 1110 may perform operation S1547.
The operation S1551 may be performed before the operation S1542_1 or may be performed between the operations S1542_1 and S1542_2.
Operations S1551, S1552, S1553, and S1555 of fig. 17 are similar to operations S1251, S1252, S1253, and S1255 of fig. 14, respectively, and thus detailed descriptions of operations S1551, S1552, S1553, and S1555 of fig. 17 are omitted.
Fig. 18 is a flowchart illustrating a method of decoding video at a video decoding apparatus 1100 according to a third embodiment of the present invention.
As shown in fig. 18, in the video decoding apparatus 1100 according to the third embodiment of the present invention, the motion vector resolution decoder 1110 may parse the adaptive_mv_resolution_enabled_flag from the bitstream (S1810). The adaptive_mv_resolution_enabled_flag is identification information indicating whether the motion vector resolution is adaptively enabled, and may be determined for at least one image unit among an image sequence, a picture, a slice, and a CTU, which are upper layer image units. The adaptive_mv_resolution_enabled_flag may be parsed from the bitstream header of at least one image unit among an image sequence, a picture, a slice, and a CTU.
The motion vector resolution decoder 1110 may check whether the adaptive_mv_resolution_enabled_flag indicates that the motion vector resolution is adaptively enabled in an upper layer picture unit of at least one of an image sequence, a picture, a slice, and a CTU (i.e., in all CUs in the upper layer picture unit; the ON case) or that the default motion vector resolution is used as the motion vector resolution of all CUs in the upper layer picture unit (the OFF case) (S1820).
When it is checked in operation S1820 that the adaptive_mv_resolution_enabled_flag is OFF, the motion vector resolution decoder 1110 may use the default motion vector resolution as the motion vector resolution of the corresponding CU in the corresponding picture unit (S1822).
When it is checked in operation S1820 that the adaptive_mv_resolution_enabled_flag is ON, the motion vector resolution decoder 1110 may parse the alternative_mv_resolution, which is information on the alternative resolution, from the bitstream (S1830). Here, the alternative_mv_resolution may be parsed from the bitstream for the same picture units as the picture units in which the adaptive_mv_resolution_enabled_flag is transmitted.
Alternatively, the alternative_mv_resolution may be parsed from the bitstream for each picture unit that is smaller than the picture unit in which the adaptive_mv_resolution_enabled_flag is transmitted and larger than the CU, which is the block unit for which the motion vector resolution is determined. For example, when the image unit transmitting the adaptive_mv_resolution_enabled_flag is an image sequence or a picture, the unit in which the alternative_mv_resolution is parsed may be a slice (or a CTU).
Alternatively, the alternative_mv_resolution may be parsed for each CU in the picture unit of the adaptive_mv_resolution_enabled_flag.
Alternatively, the alternative_mv_resolution may not be parsed at any location; that is, parsing of the alternative_mv_resolution may be omitted.
After operation S1830, the motion vector resolution decoder 1110 may determine the motion vector resolution of the current CU according to whether the coding mode of the current CU is a mode of coding the MVD (S1840).
Here, operation S1840 may include operations S1841 to S1848.
After operation S1830, the motion vector resolution decoder 1110 may parse the coding mode of the current CU from the bitstream and may check whether the coding mode of the current CU is a mode (i.e., MVP mode) in which MVDs are encoded using MVPs (S1841).
When the encoding mode of the current CU is the MVP mode, the motion vector resolution decoder 1110 may check whether the adaptive_mv_resolution_enabled_flag indicates that the motion vector resolution of the CUs in the upper layer picture unit is adaptively enabled (i.e., the adaptive_mv_resolution_enabled_flag is ON) or that the default motion vector resolution is used as the motion vector resolution of the CUs in the upper layer picture unit (i.e., the adaptive_mv_resolution_enabled_flag is OFF) (S1842).
When the adaptive_mv_resolution_enabled_flag is ON, the motion vector resolution decoder 1110 may parse mv_resolution_flag, which is identification information indicating which one of the default motion vector resolution and the alternative resolution is used as the motion vector resolution of the current CU, from the bitstream (S1843).
On the other hand, when the adaptive_mv_resolution_enabled_flag is OFF, the motion vector resolution decoder 1110 may determine the default motion vector resolution as the motion vector resolution of the current CU (S1848).
After parsing the mv_resolution_flag of the current CU in operation S1843, the motion vector resolution decoder 1110 may check the value of the mv_resolution_flag (S1844).
When it is checked in operation S1844 that the mv_resolution_flag indicates that the motion vector resolution of the current CU is determined using the resolution difference value, which is the difference between the motion vector resolution of the current CU and the motion vector resolution of the previous CU (i.e., when the mv_resolution_flag is ON), the motion vector resolution decoder 1110 may parse mv_resolution_delta, which is information on the difference between the motion vector resolution of the current CU and the motion vector resolution of the previous CU, from the bitstream (S1845). Here, when there is no alternative_mv_resolution value and the current CU is the first CU in decoding order among the CUs in an upper layer picture unit, mv_resolution_delta may represent information indicating the motion vector resolution of the current CU.
The motion vector resolution decoder 1110 may calculate a motion vector resolution of the current CU using mv_resolution_delta (S1846), and may set the calculation result as the motion vector resolution of the current CU (S1847).
When it is checked in operation S1844 that the mv_resolution_flag indicates that the default motion vector resolution is used as the motion vector resolution of the current CU (i.e., when the mv_resolution_flag is OFF), the motion vector resolution decoder 1110 may determine the default motion vector resolution as the motion vector resolution of the current CU (S1848).
In operation S1846, when the current CU is the first CU in decoding order among the CUs in the upper layer picture unit, the motion vector resolution decoder 1110 may determine mv_resolution_delta as the motion vector resolution of the current CU. When the current CU is a CU subsequent to the first CU in decoding order, the motion vector resolution decoder 1110 may add mv_resolution_delta to the motion vector resolution of a CU that has been encoded immediately prior to the current CU in encoding order, generating the motion vector resolution of the current CU. In this case, the motion vector resolution decoder 1110 does not need the alternative_mv_resolution to generate the motion vector resolution of the current CU, and thus, an operation of parsing the alternative_mv_resolution from the bitstream may be omitted.
For example, when the adaptive_mv_resolution_enabled_flag in the slice header is ON, the default motion vector resolution is 1/4 pixel, and the mv_resolution_flag, which is header information of the current CU, is OFF, mv_resolution_delta information may not be needed to determine the motion vector resolution of the current CU. In this case, the motion vector resolution of the current CU may be set to 1/4 pixel as the default motion vector resolution, and the motion vector of the current CU may be represented at a resolution of 1/4 pixel.
On the other hand, when the mv_resolution_flag is ON, the current CU is the first CU in decoding order, and the mv_resolution_delta of the current CU is 4 pixels, the motion vector resolution of the current CU may be determined to be 4 pixels corresponding to the mv_resolution_delta, and the motion vector of the current CU may be represented at a resolution of 4 pixels. When the current CU is not the first CU in decoding order, in the case that the mv_resolution_flag of the current CU is ON and the mv_resolution_delta of the current CU is 0, the motion vector resolution of the current CU may be set to 4 pixels, obtained by adding the mv_resolution_delta of the current CU to the motion vector resolution of the previous CU, and the motion vector of the current CU may be represented at a resolution of 4 pixels.
When the current CU is not the first CU in decoding order, in the case that the mv_resolution_flag of the current CU is ON and the mv_resolution_delta of the current CU is -2 pixels, the motion vector resolution of the current CU may be set to 2 pixels, obtained by adding the -2 pixels corresponding to the mv_resolution_delta to the 4 pixels that are the motion vector resolution of the previous CU, and the motion vector of the current CU may be expressed in 2 pixel units.
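The delta-based restoration in these examples can be sketched as follows. This is a hedged illustration: the function name `restore_resolution_delta` is an assumption, resolutions are modeled as plain numbers (fractions of a pixel), and treating every decoded CU, including flag-OFF CUs at the default resolution, as the "previous CU" for the next delta is one possible reading of the text:

```python
from fractions import Fraction

def restore_resolution_delta(default_res, cus):
    """cus: per-CU (mv_resolution_flag, mv_resolution_delta) in decoding order.
    Returns the restored motion vector resolution of each CU."""
    resolutions = []
    prev = None  # resolution of the immediately preceding CU, once one exists
    for flag, delta in cus:
        if not flag:
            res = default_res   # flag OFF: default resolution (S1848)
        elif prev is None:
            res = delta         # first CU: the delta itself is the resolution
        else:
            res = prev + delta  # later CUs: previous CU's resolution + delta
        resolutions.append(res)
        prev = res              # every decoded CU becomes the "previous CU"
    return resolutions
```

Running this on the example sequence above (deltas 4, 0, and -2, all flags ON) restores resolutions of 4, 4, and 2 pixels.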
In the embodiment described above with reference to fig. 18, the motion vector resolution decoder 1110 adds mv_resolution_delta to the motion vector resolution of the previous CU of the current CU to restore the motion vector resolution of the current CU, but the present invention is not limited thereto. The motion vector resolution decoder 1110 may add mv_resolution_delta of the current CU to the alternative resolution to decode the motion vector resolution of the current CU.
As an example of the embodiment of fig. 18, when the adaptive_mv_resolution_enabled_flag and the alternative_mv_resolution are located in a slice (tile) header (or an SPS, PPS, or CTU header), and the mv_resolution_flag and mv_resolution_delta information are located in the header of a CU in the slice (tile) (or image sequence, picture, or CTU) unit, whether to apply the adaptive motion vector resolution mode and the alternative resolution according to the present invention may be determined according to the adaptive_mv_resolution_enabled_flag, the motion vector resolution may be adaptively determined for each CU in the slice (tile) (or image sequence, picture, or CTU) unit according to the mv_resolution_flag of each CU, and the motion vector resolution of the current CU may be adjusted using the mv_resolution_delta.
In this case, the mv_resolution_delta of the current CU may be represented as the difference between the alternative_mv_resolution (i.e., the alternative resolution) in the slice (tile) header (or image sequence, picture, or CTU header) and the motion vector resolution of the current CU. The motion vector resolution decoder 1110 may calculate the motion vector resolution of the current CU by adding the alternative_mv_resolution to the mv_resolution_delta.
For example, when the adaptive_mv_resolution_enabled_flag of the slice header is ON, the alternative_mv_resolution value, which is the alternative resolution, is 4 pixels, the default motion vector resolution is 1/4 pixel, and the mv_resolution_flag, which is header information of the current CU, is OFF, mv_resolution_delta information may not be required to determine the motion vector resolution of the current CU. In this case, the motion vector resolution of the current CU may be determined to be 1/4 pixel corresponding to the default motion vector resolution, and the motion vector of the current CU may be represented at a resolution of 1/4 pixel.
When the mv_resolution_flag of the current CU is ON and the mv_resolution_delta of the current CU is 0, the motion vector resolution of the current CU may be set to 4 pixels, corresponding to the value obtained by adding the alternative_mv_resolution and the mv_resolution_delta, and the motion vector of the current CU may be represented at a resolution of 4 pixels. When the mv_resolution_flag of the current CU is ON and the mv_resolution_delta of the current CU is -2 pixels, the motion vector resolution of the current CU may be set to 2 pixels, obtained by adding the 4 pixels of the alternative_mv_resolution in the slice header to the -2 pixels corresponding to the mv_resolution_delta value, and the motion vector of the current CU may be represented at a resolution of 2 pixels.
As another example of the embodiment of fig. 18, when the adaptive_mv_resolution_enabled_flag and the alternative_mv_resolution, which is the alternative resolution, are located in a slice (tile) header (or an SPS, PPS, or CTU header), whether to apply the adaptive motion vector resolution mode and the alternative resolution according to the present invention may be determined in units of slices (tiles) (or image sequences, pictures, or CTUs) according to the adaptive_mv_resolution_enabled_flag. When the mv_resolution_flag and mv_resolution_delta information are located in the header of a CU in the slice (tile) (or image sequence, picture, or CTU) unit, the motion vector resolution may be adaptively determined for each CU in the slice (tile) (or image sequence, picture, or CTU) unit according to the mv_resolution_flag, and the motion vector resolution of the current CU may be adjusted using the mv_resolution_delta.
Here, mv_resolution_delta of the current CU may be expressed as a difference between an alternative_mv_resolution in a slice (tile) (SPS, PPS, or CTU) and a motion vector resolution of the current CU, or may be expressed as a difference between a motion vector resolution of a CU encoded immediately before the current CU and a motion vector resolution of the current CU.
For example, when the current CU is the first CU in decoding order, the mv_resolution_delta of the current CU may indicate the difference between the alternative_mv_resolution in the slice (tile) (or SPS, PPS, or CTU) header and the motion vector resolution of the current CU. When the current CU is not the first CU, the mv_resolution_delta of the current CU may be represented as the difference between the motion vector resolution of the CU encoded immediately before the current CU and the motion vector resolution of the current CU.
For example, when adaptive_mv_resolution_enabled_flag of a slice header is ON, the alternative resolution is 4 pixels, the default motion vector resolution is 1/4 pixel, and mv_resolution_flag, which is header information of the current CU, is OFF, mv_resolution_delta of the current CU may not be required to determine the motion vector resolution of the current CU. In this case, the motion vector resolution of the current CU may be set to 1/4 pixel corresponding to the default motion vector resolution.
On the other hand, when the mv_resolution_flag, which is header information of the current CU, is ON, the current CU is the first CU, and the mv_resolution_delta of the current CU is +2 pixels, indicating the resolution difference between the 4 pixels of the alternative_mv_resolution in the slice (tile) header and the motion vector resolution of the current CU, the motion vector resolution of the current CU may be 6 pixels, obtained by adding the alternative_mv_resolution of 4 pixels and the resolution difference of +2 pixels. In this case, when the mv_resolution_flag of the next CU after the first CU is ON and the mv_resolution_delta of that next CU is -2 pixels, the motion vector resolution of that next CU may be 4 pixels, obtained by adding the 6 pixels that are the motion vector resolution of the previous CU to the -2 pixels that are the mv_resolution_delta of that next CU.
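The variant just illustrated, in which the first CU's delta is anchored to the alternative resolution and subsequent CUs chain to the previous CU, can be sketched as follows. `restore_resolution_hybrid` is an assumed name, and, as before, the handling of flag-OFF CUs within the chain is one possible reading of the text:

```python
from fractions import Fraction

def restore_resolution_hybrid(default_res, alternative_res, cus):
    """cus: per-CU (mv_resolution_flag, mv_resolution_delta) in decoding order."""
    resolutions = []
    prev = None
    for flag, delta in cus:
        if not flag:
            res = default_res              # flag OFF: default resolution
        elif prev is None:
            res = alternative_res + delta  # first CU: delta relative to the alternative
        else:
            res = prev + delta             # later CUs: delta relative to the previous CU
        resolutions.append(res)
        prev = res
    return resolutions
```

With a 4-pixel alternative resolution and deltas of +2 and -2 pixels (both flags ON), this restores resolutions of 6 and 4 pixels, matching the example above.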
After operation S1847 or S1848 of fig. 18, the video decoder 1120 may derive a motion vector of the current CU using the MVP (S1850).
Here, operation S1850 may include operations S1851 to S1855.
Operations S1851 to S1855 of fig. 18 are similar to operations S1251 to S1255 of fig. 12, respectively, and thus, detailed descriptions of operations S1851 to S1855 are omitted.
When it is detected in operation S1841 that the encoding mode of the corresponding CU is not mvp mode, the video decoder 1120 may derive a motion vector of the current CU from motion vectors of temporal or spatial neighboring blocks (i.e., merging candidates) (S1860).
Here, operation S1860 may include operations S1861 to S1864.
Operations S1861 to S1864 of fig. 18 are similar to operations S1261 to S1264 of fig. 12, respectively, and thus, detailed descriptions of operations S1861 to S1864 are omitted.
The flowchart of fig. 18 also serves to explain the operation of the video decoding apparatus 1100 according to a fourth embodiment of the present invention, in addition to the operation of the video decoding apparatus 1100 according to the third embodiment of the present invention.
That is, according to the fourth embodiment, the video decoder 1120 may parse mv_resolution_scale instead of mv_resolution_delta from the bitstream, and may restore the motion vector resolution of the current CU using the parsed mv_resolution_scale.
In the third embodiment and the fourth embodiment, operations corresponding to operations S1845 to S1846 among the operations of fig. 18 are different from each other, and the remaining operations are the same.
According to the fourth embodiment, operation S1845 may be implemented to parse mv_resolution_scale from the bitstream, and operation S1846 may be implemented to restore the motion vector resolution of the current CU using mv_resolution_scale.
When it is checked in operation S1844 that the mv_resolution_flag indicates that the motion vector resolution of the current CU is determined using the resolution scaling factor, which is a value obtained by dividing the motion vector resolution of the current CU by the motion vector resolution of the previous CU (i.e., when the mv_resolution_flag is ON), the motion vector resolution decoder 1110 may parse mv_resolution_scale, which is information indicating a value obtained by dividing the motion vector resolution of the current CU by the motion vector resolution of the previous CU, from the bitstream (S1845). Alternatively, mv_resolution_scale may be information indicating a value obtained by dividing the motion vector resolution of the current CU by the alternative resolution.
The motion vector resolution decoder 1110 may calculate the motion vector resolution of the current CU using the mv_resolution_scale (S1846).
In operation S1846, when the current CU is the first CU in decoding order among the CUs in the upper layer picture unit, the motion vector resolution decoder 1110 may set the motion vector resolution corresponding to the mv_resolution_scale as the motion vector resolution of the current CU. When the current CU is a CU subsequent to the first CU in decoding order, the motion vector resolution decoder 1110 may calculate the motion vector resolution of the current CU by multiplying the motion vector resolution of the CU encoded immediately before the current CU by the mv_resolution_scale of the current CU. In this case, since the alternative resolution is not required to calculate the motion vector resolution of the current CU, the motion vector resolution decoder 1110 may omit the operation of parsing the alternative_mv_resolution from the bitstream.
As another embodiment of calculating the motion vector resolution of the current CU with the mv_resolution_scale, the motion vector resolution decoder 1110 may restore, as the motion vector resolution of the current CU, the result obtained by multiplying the mv_resolution_scale of the current CU in the upper layer picture unit by the alternative resolution.
As another embodiment of calculating the motion vector resolution of the current CU with the mv_resolution_scale, when the current CU is the first CU in decoding order among CUs in an upper layer picture unit, the motion vector resolution decoder 1110 may decode the motion vector resolution of the current CU by multiplying the mv_resolution_scale with the alternative resolution. On the other hand, when the current CU is a CU subsequent to the first CU in decoding order, the motion vector resolution decoder 1110 may calculate the motion vector resolution of the current CU by multiplying the motion vector resolution of the CU encoded immediately prior to the current CU by the mv_resolution_scale of the current CU.
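The scale-based restoration can be sketched analogously to the delta-based case. As before, this is an illustrative reading rather than the normative process; `restore_resolution_scale` is an assumed name, and the variant shown derives each resolution from the immediately preceding CU (the other variants described above anchor the scale to the alternative resolution instead):

```python
from fractions import Fraction

def restore_resolution_scale(default_res, cus):
    """cus: per-CU (mv_resolution_flag, mv_resolution_scale) in decoding order.
    Returns the restored motion vector resolution of each CU."""
    resolutions = []
    prev = None
    for flag, scale in cus:
        if not flag:
            res = default_res   # flag OFF: default resolution (S1848)
        elif prev is None:
            res = scale         # first CU: the scale value is the resolution itself
        else:
            res = prev * scale  # later CUs: previous resolution times the scale factor
        resolutions.append(res)
        prev = res
    return resolutions
```

With a first CU of scale 4 and a following CU of scale 1/2 (both flags ON), this restores resolutions of 4 pixels and 2 pixels, matching the example given below.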
As an example of the operation of the video decoding apparatus 1100 according to the fourth embodiment of the present invention, when the adaptive_mv_resolution_enabled_flag and the alternative_mv_resolution are located in a slice (tile) header (or an SPS, PPS, or CTU header) and the mv_resolution_flag and mv_resolution_scale information are located in the header of the current CU as an encoding target, the motion vector resolution decoder 1110 may determine whether to apply the adaptive motion vector resolution mode and the alternative resolution according to the present invention in units of slices (tiles) (or image sequences, pictures, or CTUs). The motion vector resolution decoder 1110 may adaptively enable the motion vector resolution in units of blocks in the slice (tile) (or image sequence, picture, or CTU), and may adjust the motion vector resolution of the current CU using the mv_resolution_scale value.
In this case, when the current CU is the first CU in the slice (tile) (or image sequence, picture, or CTU) that is the upper layer image unit, the mv_resolution_scale may be represented as a value corresponding to the motion vector resolution of the current CU itself, and when the current CU is a CU subsequent to the first CU, the mv_resolution_scale may be represented as the ratio obtained by dividing the motion vector resolution of the current CU by the motion vector resolution of the CU encoded immediately before the current CU.
For example, when adaptive_mv_resolution_enabled_flag of the slice header is ON, the default motion vector resolution is 1/4 pixel, and mv_resolution_flag, which is header information of the current CU, is OFF, the mv_resolution_scale information of the current CU may not be parsed, and the motion vector resolution of the current CU may be set to 1/4 pixel, corresponding to the default motion vector resolution.
When the current CU is the first CU in decoding order, its mv_resolution_flag is ON, and its mv_resolution_scale is 4, the motion vector resolution of the current CU may be set to 4 pixels, corresponding to the mv_resolution_scale itself. When the current CU is the CU following the first CU, its mv_resolution_flag is ON, and its mv_resolution_scale is 1/2, the motion vector resolution of the current CU may be set to 2 pixels, obtained by multiplying the 4-pixel motion vector resolution of the previous CU by the mv_resolution_scale of 1/2.
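This example, together with the flag-OFF default case, can be traced with a short decode-order walk. This is an illustrative sketch only; the text does not specify how a default-resolution CU affects the "previous resolution", so the sketch assumes only flag-ON CUs act as predecessors:

```python
from fractions import Fraction

DEFAULT_RES = Fraction(1, 4)   # default motion vector resolution (1/4-pel)

def walk_resolutions(cus):
    """cus: list of (mv_resolution_flag, mv_resolution_scale) in decode
    order. Flag OFF -> default resolution (no scale parsed); the first
    flag-ON CU's scale is its resolution; later flag-ON CUs scale the
    previous flag-ON CU's resolution."""
    out, prev = [], None
    for flag, scale in cus:
        if not flag:
            out.append(DEFAULT_RES)
            continue
        res = scale if prev is None else prev * scale
        out.append(res)
        prev = res
    return out
```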
Alternatively, the mv_resolution_scale of each CU may be represented as the ratio of the motion vector resolution of the current CU to the alternative_mv_resolution included in a slice (tile) header (or an SPS, PPS, or CTU).
For example, when the adaptive_mv_resolution_enabled_flag of the slice header is ON, the alternative_mv_resolution value as the alternative resolution is 4 pixels, the default motion vector resolution is 1/4 pixel, and the mv_resolution_flag as the header information of the current CU is OFF, the mv_resolution_scale information of the current CU may not be parsed, and the motion vector resolution of the current CU may be set to 1/4 pixel corresponding to the default motion vector resolution.
On the other hand, when the mv_resolution_flag of the current CU is ON and the mv_resolution_scale of the current CU is 1, the motion vector resolution of the current CU may be set to 4 pixels, obtained by multiplying the 4-pixel alternative_mv_resolution by the mv_resolution_scale of 1. When the mv_resolution_flag of the next CU is ON and its mv_resolution_scale is 1/2, the motion vector resolution of that CU may be set to 2 pixels, obtained by multiplying the 4-pixel alternative resolution by the mv_resolution_scale of 1/2.
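In this variant, each CU's scale applies to the header-level alternative resolution independently of earlier CUs. A minimal sketch with illustrative names:

```python
from fractions import Fraction

def cu_resolution_from_header(scale, alternative_res=Fraction(4)):
    # Each CU's mv_resolution_scale multiplies the slice-header
    # alternative_mv_resolution, independently of earlier CUs.
    return alternative_res * scale
```

With the numbers from the passage, a scale of 1 yields a 4-pixel resolution and a scale of 1/2 yields 2 pixels.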
In another embodiment, the mv_resolution_scale of each CU may be represented either by the ratio of the motion vector resolution of the current CU to the alternative_mv_resolution value in a slice (tile) header (or an SPS or PPS), or by the ratio of the motion vector resolution of the current CU to the motion vector resolution of the CU decoded immediately before the current CU. That is, when the current CU is the first CU in decoding order, the mv_resolution_scale of the current CU may indicate the ratio of the motion vector resolution of the current CU to the alternative_mv_resolution in the slice (tile) header (or the SPS or PPS), and when the current CU is a CU following the first CU, the mv_resolution_scale of the current CU may be represented by the ratio of the motion vector resolution of the current CU to the motion vector resolution of the CU decoded immediately before the current CU.
For example, when adaptive_mv_resolution_enabled_flag of the slice header is ON, alternative_mv_resolution is 4 pixels, the default motion vector resolution is 1/4 pixel, and mv_resolution_flag, which is header information of the current CU, is OFF, no mv_resolution_scale information is needed for the current CU. Accordingly, the motion vector resolution of the current CU may be set to 1/4 pixel, corresponding to the default motion vector resolution, and the motion vector of the current CU may be represented at that 1/4-pixel resolution.
On the other hand, when the current CU is the first CU, its mv_resolution_flag is ON, and its mv_resolution_scale is 1, the motion vector resolution of the current CU may be set to 4 pixels, obtained by multiplying the 4-pixel alternative_mv_resolution of the slice header by the mv_resolution_scale of 1. When the current CU is the CU following the first CU, its mv_resolution_flag is ON, and its mv_resolution_scale is 1/2, the motion vector resolution of the current CU may be set to 2 pixels, obtained by multiplying the 4-pixel motion vector resolution of the CU decoded immediately before the current CU by the mv_resolution_scale of 1/2.
Fig. 19 is a flowchart showing a case where some operations are added in the flowchart of fig. 18.
In fig. 19, the operation S1840 may include operations S1841, S1842_1, S1842_2, S1843, S1844, S1845, S1846, S1847, and S1848. In addition, operation S1850 of fig. 19 may include operations S1851, S1852, S1853, and S1855.
In comparison with fig. 18, fig. 19 includes operations S1842_1 and S1842_2 in operation S1840 in place of operation S1842 of fig. 18, and operation S1854 of fig. 18 is excluded from operation S1850.
For reference, among the operations of fig. 19, an operation having the same reference numeral as an operation of fig. 18 performs the same operation as in fig. 18 unless the context clearly indicates otherwise. For example, operation S1843 of fig. 19 is the same as operation S1843 of fig. 18.
When it is determined in operation S1841 of fig. 19 that the encoding mode of the current CU is a mode in which the MVD is encoded using an MVP (i.e., the MVP mode), the video decoder 1120 may decode information about the MVD from the bitstream (S1842_1).
After decoding the MVD, the motion vector resolution decoder 1110 may check whether the MVD is not 0 and whether adaptive_mv_resolution_enabled_flag is ON (S1842_2). As a result of the check in operation S1842_2, when MVD is not 0 and adaptive_mv_resolution_enabled_flag is ON, the motion vector resolution decoder 1110 may perform operation S1843. When the MVD is 0 or the adaptive_mv_resolution_enabled_flag is not ON, the motion vector resolution decoder 1110 may perform operation S1848.
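The condition checked in operation S1842_2 can be sketched as follows. Here, `parse_resolution` is a hypothetical stand-in for the parsing performed in operations S1843 to S1847, and falling back to the default resolution in S1848 is an assumption based on the examples above:

```python
DEFAULT_RES = 0.25  # default 1/4-pel resolution (assumed, as in the examples)

def decode_cu_resolution(mvd, adaptive_enabled, parse_resolution):
    # S1842_2: resolution syntax is parsed only when the decoded MVD is
    # non-zero AND adaptive_mv_resolution_enabled_flag is ON.
    if mvd != (0, 0) and adaptive_enabled:
        return parse_resolution()   # S1843..S1847: read flag/scale syntax
    return DEFAULT_RES              # S1848: assumed default-resolution path
```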
The operation S1851 may be performed before the operation S1842_1 or may be performed between the operations S1842_1 and S1842_2.
Operations S1851, S1852, S1853, and S1855 of fig. 19 are similar to operations S1251, S1252, S1253, and S1255 of fig. 14, respectively, and thus detailed descriptions of operations S1851, S1852, S1853, and S1855 of fig. 19 are omitted.
The embodiments of the present invention above illustrate the case in which the motion vector resolution decoder 1110 parses adaptive_mv_resolution_enabled_flag from the bitstream; however, in some embodiments, the motion vector resolution decoder 1110 of the video decoding apparatus 1100 may omit the operation of parsing adaptive_mv_resolution_enabled_flag from the bitstream. In this case, the motion vector resolution decoder 1110 may perform the same operation as that performed when adaptive_mv_resolution_enabled_flag is ON, or may perform the same operation as that performed when adaptive_mv_resolution_enabled_flag is OFF. Accordingly, the video encoding apparatus 400 may omit the operation of encoding adaptive_mv_resolution_enabled_flag.
The above embodiments of the video decoding apparatus are not limited to parsing the mv_resolution_flag in units of CUs; the mv_resolution_flag may instead be parsed in units of CTUs to set either the default motion vector resolution or the alternative resolution as the motion vector resolution. When the motion vector resolution is determined in units of CTUs, all CUs included in one CTU may have the same motion vector resolution. In this case, the upper layer image unit of the CTU may be one of an image sequence, a picture, or a slice.
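When the flag is parsed per CTU as described, every CU in the CTU inherits the same resolution. A minimal sketch under that assumption, with illustrative names and example default/alternative values:

```python
from fractions import Fraction

def ctu_resolutions(flag_on, num_cus,
                    default_res=Fraction(1, 4), alt_res=Fraction(4)):
    # One mv_resolution_flag per CTU selects default vs. alternative
    # resolution, and every CU in the CTU inherits the same choice.
    res = alt_res if flag_on else default_res
    return [res] * num_cus
```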
While the present invention has been particularly shown and described with reference to exemplary embodiments thereof, it will be understood by those of ordinary skill in the art that various changes in form and details may be made therein without departing from the spirit and scope of the present invention as defined by the following claims. The exemplary embodiments should be considered in descriptive sense only and not for purposes of limitation. Therefore, the scope of the invention is defined not by the detailed description of the invention but by the appended claims, and all differences within the scope will be construed as being included in the present invention.
Cross Reference to Related Applications
The present application is based on, and claims priority under 35 U.S.C. §119(a) to, Korean Patent Application Nos. 10-2016-0136066 and 10-2017-0025673, filed in the Korean Intellectual Property Office on October 19, 2016 and February 27, 2017, respectively, the disclosures of which are incorporated herein by reference in their entireties. In addition, this non-provisional application claims priority in countries other than the United States for the same reasons based on the Korean patent applications, the disclosures of which are also incorporated herein by reference in their entireties.

Claims (7)

1. A method of encoding video, the method comprising the steps of:
encoding an enable flag into a bitstream indicating whether to adaptively determine a motion vector resolution, wherein the enable flag is encoded into a sequence parameter set or a picture parameter set of the bitstream;
determining a motion vector resolution of the current block;
determining a motion vector of the current block according to the motion vector resolution of the current block;
generating a motion vector difference value of the current block as a difference between the motion vector of the current block and a motion vector predictor derived from a neighboring block of the current block, and encoding information on the motion vector difference value of the current block into the bitstream; and
when the motion vector resolution is adaptively determined and the motion vector difference value of the current block is non-zero, information about the motion vector resolution of the current block is encoded into the bitstream,
wherein the step of encoding information on the motion vector resolution of the current block comprises the steps of:
encoding an mv resolution flag indicating whether the motion vector resolution of the current block is a default motion vector resolution as a 1/4 pixel unit into the bitstream; and
when the motion vector resolution of the current block is not the default motion vector resolution, encoding information on a substitute resolution indicating one of a plurality of motion vector resolution candidates including 1/2 pixel units, 1 pixel units, and 4 pixel units into the bitstream,
wherein when the motion vector resolution is not adaptively determined, the motion vector resolution of the current block is determined as the default motion vector resolution,
wherein the motion vector resolution of the current block is determined as the default motion vector resolution when the motion vector difference of the current block is zero.
2. The method of claim 1, wherein the step of encoding information about the motion vector resolution comprises the steps of: the difference between the motion vector resolution of the current block and the motion vector resolution of a block encoded before the current block is encoded as one element of information about the motion vector resolution of the current block.
3. The method of claim 1, wherein the step of encoding information about the motion vector resolution comprises the steps of: information on a ratio between the motion vector resolution of the current block and a motion vector resolution of a block encoded before the current block is encoded as one element of information on the motion vector resolution of the current block.
4. A video decoding method that adaptively determines a motion vector resolution of a current block and decodes the current block, the video decoding method comprising the steps of:
extracting an enable flag indicating whether to adaptively determine a motion vector resolution from a bitstream, wherein the enable flag is extracted from a sequence parameter set or a picture parameter set of the bitstream;
extracting information on a motion vector difference value of the current block from the bitstream;
when the enable flag indicates that the motion vector resolution is adaptively determined and the motion vector difference value of the current block is non-zero, extracting information on the motion vector resolution of the current block from a bitstream, and determining the motion vector resolution of the current block based on the information on the motion vector resolution of the current block; and
determining a motion vector of the current block according to the motion vector resolution of the current block, the motion vector difference value of the current block, and a motion vector predictor derived from neighboring blocks of the current block;
wherein the step of extracting information on the motion vector resolution of the current block comprises the steps of:
extracting an mv resolution flag indicating whether the motion vector resolution of the current block is a default motion vector resolution as a 1/4 pixel unit from the bitstream; and
when the mv resolution flag indicates that the motion vector resolution of the current block is not the default motion vector resolution, extracting information on a substitute resolution for indicating one of a plurality of motion vector resolution candidates including 1/2 pixel units, 1 pixel units, and 4 pixel units from the bitstream,
wherein the motion vector resolution of the current block is set to the default motion vector resolution when the enable flag indicates that the motion vector resolution is not adaptively determined,
wherein the motion vector resolution of the current block is set to the default motion vector resolution when the motion vector difference of the current block is zero.
5. The video decoding method of claim 4, wherein the step of determining the motion vector resolution of the current block comprises the steps of: information on a difference between the motion vector resolution of the current block and a motion vector resolution of a block encoded before the current block is extracted from the bitstream as one element of the information on the motion vector resolution of the current block.
6. The video decoding method of claim 4, wherein the step of determining the motion vector resolution of the current block comprises the steps of: information on a ratio between the motion vector resolution of the current block and the motion vector resolution of a block encoded before the current block is extracted from the bitstream as one element of the information on the motion vector resolution of the current block.
7. A non-transitory recording medium storing a bitstream to be decoded by a video decoding method for adaptively determining a motion vector resolution of a current block and decoding the current block, the video decoding method comprising the steps of:
extracting an enable flag indicating whether to adaptively determine a motion vector resolution from a bitstream, wherein the enable flag is extracted from a sequence parameter set or a picture parameter set of the bitstream;
extracting information on a motion vector difference value of the current block from the bitstream;
when the enable flag indicates that the motion vector resolution is adaptively determined and the motion vector difference value of the current block is non-zero, extracting information on the motion vector resolution of the current block from a bitstream, and determining the motion vector resolution of the current block based on the information on the motion vector resolution of the current block; and
determining a motion vector of the current block according to the motion vector resolution of the current block, the motion vector difference value of the current block, and a motion vector predictor derived from neighboring blocks of the current block,
wherein the step of extracting information on the motion vector resolution of the current block comprises the steps of:
extracting an mv resolution flag indicating whether the motion vector resolution of the current block is a default motion vector resolution as a 1/4 pixel unit from the bitstream; and
when the mv resolution flag indicates that the motion vector resolution of the current block is not the default motion vector resolution, extracting information on a substitute resolution for indicating one of a plurality of motion vector resolution candidates including 1/2 pixel units, 1 pixel units, and 4 pixel units from the bitstream,
wherein the motion vector resolution of the current block is set to the default motion vector resolution when the enable flag indicates that the motion vector resolution is not adaptively determined,
wherein the motion vector resolution of the current block is set to the default motion vector resolution when the motion vector difference of the current block is zero.
CN202310713804.5A 2016-10-19 2017-10-17 Video encoding/decoding apparatus and method, and non-transitory recording medium Pending CN116567210A (en)

Applications Claiming Priority (6)

Application Number Priority Date Filing Date Title
KR20160136066 2016-10-19
KR10-2016-0136066 2016-10-19
KR1020170025673A KR20180043151A (en) 2016-10-19 2017-02-27 Apparatus and Method for Video Encoding or Decoding
KR10-2017-0025673 2017-02-27
PCT/KR2017/011484 WO2018074825A1 (en) 2016-10-19 2017-10-17 Device and method for encoding or decoding image
CN201780064071.XA CN109845258B (en) 2016-10-19 2017-10-17 Apparatus and method for encoding or decoding image

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
CN201780064071.XA Division CN109845258B (en) 2016-10-19 2017-10-17 Apparatus and method for encoding or decoding image

Publications (1)

Publication Number Publication Date
CN116567210A true CN116567210A (en) 2023-08-08

Family

ID=62019191

Family Applications (4)

Application Number Title Priority Date Filing Date
CN202310713804.5A Pending CN116567210A (en) 2016-10-19 2017-10-17 Video encoding/decoding apparatus and method, and non-transitory recording medium
CN202310710814.3A Pending CN116567208A (en) 2016-10-19 2017-10-17 Video encoding/decoding apparatus and method, and non-transitory recording medium
CN202310713793.0A Pending CN116567209A (en) 2016-10-19 2017-10-17 Video encoding/decoding apparatus and method, and non-transitory recording medium
CN202310713813.4A Pending CN116567211A (en) 2016-10-19 2017-10-17 Video encoding/decoding apparatus and method, and non-transitory recording medium

Family Applications After (3)

Application Number Title Priority Date Filing Date
CN202310710814.3A Pending CN116567208A (en) 2016-10-19 2017-10-17 Video encoding/decoding apparatus and method, and non-transitory recording medium
CN202310713793.0A Pending CN116567209A (en) 2016-10-19 2017-10-17 Video encoding/decoding apparatus and method, and non-transitory recording medium
CN202310713813.4A Pending CN116567211A (en) 2016-10-19 2017-10-17 Video encoding/decoding apparatus and method, and non-transitory recording medium

Country Status (2)

Country Link
CN (4) CN116567210A (en)
WO (1) WO2018074825A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
MX2021007166A (en) 2018-12-21 2021-08-19 Samsung Electronics Co Ltd Image encoding device and image decoding device using triangular prediction mode, and image encoding method and image decoding method performed thereby.

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR100704626B1 (en) * 2005-02-07 2007-04-09 삼성전자주식회사 Method and apparatus for compressing multi-layered motion vectors
KR101441874B1 (en) * 2009-08-21 2014-09-25 에스케이텔레콤 주식회사 Video Coding Method and Apparatus by Using Adaptive Motion Vector Resolution
US9288490B2 (en) * 2010-09-30 2016-03-15 Panasonic Intellectual Property Corporation Of America Image decoding method, image coding method, image decoding apparatus, image coding apparatus, program, and integrated circuit
EP2952003B1 (en) * 2013-01-30 2019-07-17 Intel Corporation Content adaptive partitioning for prediction and coding for next generation video
US10531116B2 (en) * 2014-01-09 2020-01-07 Qualcomm Incorporated Adaptive motion vector resolution signaling for video coding

Also Published As

Publication number Publication date
WO2018074825A1 (en) 2018-04-26
CN116567208A (en) 2023-08-08
CN116567211A (en) 2023-08-08
CN116567209A (en) 2023-08-08

Similar Documents

Publication Publication Date Title
CN109845258B (en) Apparatus and method for encoding or decoding image
US20230088154A1 (en) Effective wedgelet partition coding using spatial prediction
US11425367B2 (en) Effective wedgelet partition coding
US20230412822A1 (en) Effective prediction using partition coding
JP2022123085A (en) Partial cost calculation
EP2777285B1 (en) Adaptive partition coding
KR102450863B1 (en) Method and Apparatus for Encoding and Decoding Motion Vector
KR101924088B1 (en) Apparatus and method for video encoding and decoding using adaptive prediction block filtering
US11671584B2 (en) Inter-prediction method and video decoding apparatus using the same
US11962764B2 (en) Inter-prediction method and video decoding apparatus using the same
CN116567208A (en) Video encoding/decoding apparatus and method, and non-transitory recording medium
WO2021219144A1 (en) Entropy coding for partition syntax
US11997270B2 (en) Entropy coding for partition syntax
CN113455000B (en) Bidirectional prediction method and video decoding apparatus
US20220182604A1 (en) Video encoding and decoding using intra block copy
CN117461312A (en) Video encoding and decoding method and device
KR20200081186A (en) Method for deriving motion vector of temporal candidate and apparatus using the same

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination