CN117044200A - Method and apparatus for video encoding and decoding using spiral scan order

Info

Publication number: CN117044200A
Application number: CN202280019848.1A
Authority: CN (China)
Prior art keywords: block, reference sample, current, information, scanning
Legal status: Pending
Other languages: Chinese (zh)
Inventors: 安镕照, 李钟石, 朴胜煜
Assignees: Hyundai Motor Co, Kia Corp, DigitalInsights Inc
Application filed by Hyundai Motor Co, Kia Corp, and DigitalInsights Inc
Priority claimed from PCT/KR2022/003103 (WO2022191525A1)


Classifications

    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/129 Scanning of coding units, e.g. zig-zag scan of transform coefficients or flexible macroblock ordering [FMO]
    • H04N19/103 Selection of coding mode or of prediction mode
    • H04N19/105 Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
    • H04N19/124 Quantisation
    • H04N19/13 Adaptive entropy coding, e.g. adaptive variable length coding [AVLC] or context adaptive binary arithmetic coding [CABAC]
    • H04N19/176 Adaptive coding characterised by the coding unit, the unit being an image region, e.g. an object, the region being a block, e.g. a macroblock
    • H04N19/423 Implementation details or hardware specially adapted for video compression or decompression, characterised by memory arrangements
    • H04N19/82 Details of filtering operations specially adapted for video compression, e.g. for pixel interpolation, involving filtering within a prediction loop


Abstract

A video encoding and decoding method and apparatus using a spiral scan order are disclosed. The present embodiments provide a video encoding and decoding method and apparatus that encode/decode target blocks in a spiral scan order so that blocks located at the center of a video frame can use more previously reconstructed neighboring blocks.

Description

Method and apparatus for video encoding and decoding using spiral scan order
Technical Field
The present invention relates to a video encoding and decoding method and apparatus using a spiral scan order.
Background
The statements in this section merely provide background information related to the present disclosure and may not necessarily constitute prior art.
Since video data is much larger in volume than audio or still-image data, storing or transmitting uncompressed video requires a large amount of hardware resources, including memory.
Accordingly, an encoder is typically used to compress video data before it is stored or transmitted. A decoder receives the compressed video data, decompresses it, and plays it back. Video compression techniques include H.264/AVC, High Efficiency Video Coding (HEVC), and Versatile Video Coding (VVC), which improves coding efficiency over HEVC by about 30% or more.
However, as image size, resolution, and frame rate gradually increase, the amount of data to be encoded also increases. Accordingly, a new compression technique is needed that provides higher coding efficiency and a greater image-enhancement effect than existing compression techniques.
Conventional video codec techniques perform encoding/decoding after hierarchically partitioning individual frames, as shown in the example of fig. 6. A video includes a plurality of frames; the video encoding device encodes one frame at a time, and the video decoding device decodes one frame at a time in the same order. A frame may be hierarchically partitioned into a plurality of blocks. Each layer may be partitioned into blocks of the same size or blocks of different sizes. Alternatively, each layer may be partitioned into four, three, or two blocks by using a tree structure. A block in each layer may serve as a prediction unit, a transform unit, a quantization unit, or a filtering unit.
On the other hand, a conventional video codec device encodes/decodes the blocks of each layer in a predetermined order. For example, to encode/decode blocks partitioned in a tree structure, conventional video encoding/decoding apparatuses selectively use the raster scan order and the reverse raster scan order. An advantage of the raster and reverse raster scan schemes is that most blocks can use information from two previously reconstructed neighboring blocks. The number of previously reconstructed neighboring blocks is one of the important factors that greatly affect video coding performance. Therefore, to improve video coding efficiency, a scanning order that makes a greater number of reconstructed neighboring blocks available needs to be considered.
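For illustration only, the following sketch (not part of the patent text; all names are assumptions) shows a raster scan over same-size blocks and how many previously reconstructed left/above neighbors each block can draw on under that order:

```python
def raster_order(cols, rows):
    """Raster scan: left to right within a row, rows from top to bottom."""
    return [(x, y) for y in range(rows) for x in range(cols)]

def reconstructed_neighbors(x, y):
    """Under raster order, only the left and above blocks of the current
    block are guaranteed to have been reconstructed already."""
    neighbors = []
    if x > 0:
        neighbors.append((x - 1, y))  # left neighbor
    if y > 0:
        neighbors.append((x, y - 1))  # above neighbor
    return neighbors

# The first block has no reconstructed neighbors; interior blocks have two.
assert len(reconstructed_neighbors(0, 0)) == 0
assert len(reconstructed_neighbors(2, 2)) == 2
```

This ceiling of roughly two available neighbors per block is what the spiral scan order described below is meant to raise for blocks near the frame center.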
Disclosure of Invention
Technical problem
In some implementations, the present invention seeks to provide a video encoding/decoding method and apparatus that encode/decode target blocks in a spiral scan order so that blocks located at the center of a video frame can utilize more previously reconstructed neighboring blocks.
Technical proposal
At least one aspect of the present invention provides a method performed by a video encoding device for encoding a current layer. The method includes determining a block scanning scheme for the current layer, the block scanning scheme including the position of a start block of the current layer, which is divided into blocks of the same size, and a scanning order based on the start block. The scanning order is one of a horizontal scanning order, a vertical scanning order, a clockwise spiral scanning order, or a counterclockwise spiral scanning order. The method further includes encoding a current block, which represents each of the blocks, following the block scanning scheme. The method further includes determining a reference sample line of the current block after encoding. The method further includes determining a line buffer for storing the reference sample line. The method further includes storing information about the reference sample line in the line buffer.
Another aspect of the present invention provides a video encoding apparatus. The apparatus includes a scan determination unit configured to determine a block scanning scheme for a current layer, the block scanning scheme including the position of a start block of the current layer, which is divided into blocks of the same size, and a scanning order based on the start block. The scanning order is one of a horizontal scanning order, a vertical scanning order, a clockwise spiral scanning order, or a counterclockwise spiral scanning order. The apparatus also includes a block encoder configured to encode a current block, which represents each of the blocks, following the block scanning scheme. The apparatus further comprises a reference sample determination unit configured to determine a reference sample line of the current block that has been encoded and to determine a line buffer for storing the reference sample line. The apparatus further includes a reference sample storage unit configured to store information about the reference sample line in the line buffer.
Yet another aspect of the present invention provides a method performed by a video decoding apparatus for decoding a current layer. The method includes determining block scan information for the current layer, the block scan information including the position of a start block of the current layer, which is divided into blocks of the same size, and a scanning order based on the start block. The scanning order is one of a horizontal scanning order, a vertical scanning order, a clockwise spiral scanning order, or a counterclockwise spiral scanning order. The method further includes decoding a current block, which represents each of the blocks, following the block scan information. The method further includes determining a reference sample line of the current block after decoding. The method further includes determining a line buffer for storing the reference sample line. The method further includes storing information about the reference sample line in the line buffer.
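As a rough illustration of the reference-sample bookkeeping in these aspects, the sketch below (illustrative only; the choice of keeping a block's bottom row and right column, and all names, are assumptions rather than the patent's normative definition) stores reconstructed samples of a finished block into line buffers for later prediction:

```python
import numpy as np

def store_reference_lines(block, x0, y0, row_buffer, col_buffer):
    """After the current block is encoded/decoded, store the reference
    sample lines that later blocks may predict from: here (an assumed
    choice), the block's bottom row and right column of samples."""
    h, w = block.shape
    row_buffer[x0:x0 + w] = block[h - 1, :]  # bottom row -> horizontal line buffer
    col_buffer[y0:y0 + h] = block[:, w - 1]  # right column -> vertical line buffer

frame_w = frame_h = 64
row_buffer = np.zeros(frame_w)   # sized to the frame width
col_buffer = np.zeros(frame_h)   # sized to the frame height
block = np.random.rand(8, 8)     # stand-in for a reconstructed 8x8 block
store_reference_lines(block, x0=8, y0=16, row_buffer=row_buffer, col_buffer=col_buffer)
```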
Advantageous effects
As described above, the present invention provides a video encoding/decoding method and apparatus that encodes/decodes a target block in a spiral scan order, thereby enabling a block located at the center of a video frame to utilize more neighboring blocks that have been reconstructed. Therefore, the codec efficiency can be improved.
Drawings
Fig. 1 is a block diagram of a video encoding device according to an embodiment of the present invention.
Fig. 2 illustrates a method of partitioning a block using a quadtree plus binary tree and ternary tree (QTBTTT) structure.
Fig. 3a and 3b illustrate a plurality of intra prediction modes including a wide-angle intra prediction mode.
Fig. 4 shows neighboring blocks of the current block.
Fig. 5 is a block diagram of a video decoding apparatus according to an embodiment of the present invention.
Fig. 6 is a schematic diagram showing hierarchical partitioning of a frame.
Fig. 7 is a block diagram conceptually illustrating a video encoding apparatus using a spiral scan order according to at least one embodiment of the present invention.
Fig. 8a to 8d are schematic diagrams showing the position of a start block in the current layer and showing a scan order based on the start block.
Fig. 9 is a schematic diagram illustrating slice partitioning in accordance with at least one embodiment of the present invention.
Fig. 10 is a schematic diagram illustrating a line buffer for encoding/decoding and a location of a reference sample line stored in the line buffer according to at least one embodiment of the present invention.
Fig. 11 is a schematic diagram illustrating the locations of line buffers and reference sample lines stored following a clockwise spiral scan sequence in accordance with at least one embodiment of the present invention.
Fig. 12 is a block diagram conceptually illustrating a video decoding apparatus using a spiral scan order according to at least one embodiment of the present invention.
Fig. 13 is a flowchart of a video encoding method using a spiral scan order in accordance with at least one embodiment of the present invention.
Fig. 14 is a flowchart of a video decoding method using a spiral scan order in accordance with at least one embodiment of the present invention.
Detailed Description
Hereinafter, some embodiments of the present invention are described in detail with reference to the accompanying illustrative drawings. In the following description, like reference numerals denote like elements, even when the elements are shown in different drawings. Further, in the following description of some embodiments, detailed descriptions of related known components and functions are omitted for clarity and conciseness when they might obscure the subject matter of the present invention.
Fig. 1 is a block diagram of a video encoding device in one embodiment of the present invention. Hereinafter, a video encoding apparatus and components of the apparatus are described with reference to the diagram of fig. 1.
The encoding apparatus may include: an image divider 110, a predictor 120, a subtractor 130, a transformer 140, a quantizer 145, a reordering unit 150, an entropy encoder 155, an inverse quantizer 160, an inverse transformer 165, an adder 170, a loop filtering unit 180, and a memory 190.
Each component of the encoding apparatus may be implemented as hardware or software, or as a combination of hardware and software. In addition, the function of each component may be implemented as software, and the microprocessor may also be implemented to execute the function of the software corresponding to each component.
A video is made up of one or more sequences comprising a plurality of images. Each image is divided into a plurality of regions, and encoding is performed on each region. For example, an image is segmented into one or more tiles or/and slices. Here, one or more tiles may be defined as a tile group. Each tile or/and slice is partitioned into one or more Coding Tree Units (CTUs). In addition, each CTU is partitioned into one or more Coding Units (CUs) by a tree structure. Information applied to each CU is encoded as the syntax of the CU, and information commonly applied to the CUs included in one CTU is encoded as the syntax of the CTU. In addition, information commonly applied to all blocks in one slice is encoded as the syntax of a slice header, and information applied to all blocks constituting one or more images is encoded as a Picture Parameter Set (PPS) or a picture header. Furthermore, information commonly referred to by a plurality of images is encoded as a Sequence Parameter Set (SPS). In addition, information commonly referenced by one or more SPSs is encoded as a Video Parameter Set (VPS). Furthermore, information commonly applied to one tile or tile group may also be encoded as the syntax of a tile or tile group header. The syntaxes included in the SPS, the PPS, the slice header, and the tile or tile group header may be referred to as high-level syntaxes.
The image divider 110 determines the size of a Coding Tree Unit (CTU). Information about the size of the CTU (CTU size) is encoded as a syntax of the SPS or PPS and transmitted to the video decoding apparatus.
The image divider 110 divides each image constituting a video into a plurality of coding tree units CTUs having a predetermined size, and then recursively divides the CTUs by using a tree structure. Leaf nodes in the tree structure become Coding Units (CUs), which are the basic units of coding.
The tree structure may be a Quadtree (QT), in which a higher node (or parent node) is partitioned into four lower nodes (or child nodes) of the same size. The tree structure may also be a Binary Tree (BT), in which a higher node is split into two lower nodes. The tree structure may also be a Ternary Tree (TT), in which a higher node is split into three lower nodes at a ratio of 1:2:1. The tree structure may also be a structure in which two or more of the QT, BT, and TT structures are mixed. For example, a quadtree plus binary tree (QTBT) structure may be used, or a quadtree plus binary tree and ternary tree (QTBTTT) structure may be used. Here, the BT and TT added to the tree structure are collectively referred to as a multiple-type tree (MTT).
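A minimal sketch of these split types may make the structures concrete (illustrative only; the names and the fixed 128x128 CTU size are assumptions):

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Node:
    x: int
    y: int
    w: int
    h: int
    children: Optional[List["Node"]] = None

def split_qt(n):
    """QT split: four lower nodes of the same size."""
    hw, hh = n.w // 2, n.h // 2
    n.children = [Node(n.x, n.y, hw, hh), Node(n.x + hw, n.y, hw, hh),
                  Node(n.x, n.y + hh, hw, hh), Node(n.x + hw, n.y + hh, hw, hh)]

def split_bt_horizontal(n):
    """BT split in the horizontal direction: two lower nodes of equal size."""
    hh = n.h // 2
    n.children = [Node(n.x, n.y, n.w, hh), Node(n.x, n.y + hh, n.w, hh)]

def split_tt_vertical(n):
    """TT split in the vertical direction at a 1:2:1 ratio."""
    q = n.w // 4
    n.children = [Node(n.x, n.y, q, n.h),
                  Node(n.x + q, n.y, 2 * q, n.h),
                  Node(n.x + 3 * q, n.y, q, n.h)]

ctu = Node(0, 0, 128, 128)
split_qt(ctu)                         # a CTU is first partitioned as a QT
split_tt_vertical(ctu.children[0])    # a QT leaf may then use BT/TT splits
split_bt_horizontal(ctu.children[1])
```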
Fig. 2 is a schematic diagram for describing a method of dividing a block by using the QTBTTT structure.
As shown in fig. 2, the CTU may first be partitioned into the QT structure. The quadtree partitioning may be recursive until the size of the partitioned block reaches the minimum block size (MinQTSize) allowed for leaf nodes in the QT. A first flag (qt_split_flag) indicating whether each node of the QT structure is partitioned into four lower-layer nodes is encoded by the entropy encoder 155 and signaled to the video decoding apparatus. When a leaf node of the QT is not larger than the maximum block size (MaxBTSize) allowed for the root node in the BT, the leaf node may be further partitioned into at least one of the BT structure or the TT structure. There may be multiple split directions in the BT structure and/or the TT structure. For example, there may be two directions, i.e., a direction of splitting the block of the corresponding node horizontally and a direction of splitting the block of the corresponding node vertically. As shown in fig. 2, when MTT splitting starts, a second flag (mtt_split_flag) indicating whether a node is split, and, if the node is split, a flag additionally indicating the split direction (vertical or horizontal) and/or a flag indicating the split type (binary or ternary) are encoded by the entropy encoder 155 and signaled to the video decoding apparatus.
Alternatively, before the first flag (qt_split_flag) indicating whether each node is partitioned into four lower-layer nodes is encoded, a CU split flag (split_cu_flag) indicating whether the node is split may be encoded. When the value of the CU split flag (split_cu_flag) indicates that a node is not split, the block of the corresponding node becomes a leaf node in the partition tree structure and becomes a CU, the basic unit of encoding. When the value of the CU split flag (split_cu_flag) indicates that a node is split, the video encoding apparatus starts with encoding the first flag in the above-described scheme.
When QTBT is used as another example of the tree structure, there may be two split types, i.e., a type of horizontally splitting the block of the corresponding node into two blocks of the same size (i.e., symmetric horizontal splitting) and a type of vertically splitting the block of the corresponding node into two blocks of the same size (i.e., symmetric vertical splitting). A split flag (split_flag) indicating whether each node of the BT structure is split into lower-layer blocks and split type information indicating the split type are encoded by the entropy encoder 155 and transmitted to the video decoding apparatus. On the other hand, there may additionally be a type in which the block of the corresponding node is split into two mutually asymmetric blocks. The asymmetric form may include a form in which the block of the corresponding node is split into two rectangular blocks with a size ratio of 1:3, or a form in which the block of the corresponding node is split in a diagonal direction.
A CU may have various sizes according to the QTBT or QTBTTT splitting from the CTU. Hereinafter, a block corresponding to a CU to be encoded or decoded (i.e., a leaf node of the QTBTTT) is referred to as a "current block". When QTBTTT splitting is employed, the current block may have a rectangular shape in addition to a square shape.
The predictor 120 predicts the current block to generate a predicted block. Predictor 120 includes an intra predictor 122 and an inter predictor 124.
In general, each of the current blocks in the image may be predictively encoded. In general, prediction of a current block may be performed by using an intra prediction technique using data from an image including the current block or an inter prediction technique using data from an image encoded before the image including the current block. Inter prediction includes both unidirectional prediction and bi-directional prediction.
The intra predictor 122 predicts pixels in the current block by using pixels (reference pixels) located adjacent to the current block in the current image including the current block. Depending on the prediction direction, there are multiple intra prediction modes. For example, as shown in fig. 3a, the plurality of intra prediction modes may include two non-directional modes including a Planar (Planar) mode and a DC mode, and may include 65 directional modes. The neighboring pixels and algorithm equations to be used are defined differently according to each prediction mode.
For efficient directional prediction of a rectangular current block, directional modes (#67 to #80 and #-1 to #-14), indicated by dotted arrows in fig. 3b, may additionally be used. These directional modes may be referred to as "wide-angle intra prediction modes". In fig. 3b, the arrows indicate the corresponding reference samples used for prediction rather than the prediction directions; the prediction direction is opposite to the direction indicated by the arrow. A wide-angle intra prediction mode performs prediction in the direction opposite to a specific directional mode without additional bit transmission when the current block is rectangular. In this case, among the wide-angle intra prediction modes, the modes available for the current block may be determined by the ratio of the width to the height of the rectangular current block. For example, when the current block is rectangular with a height smaller than its width, wide-angle intra prediction modes having angles smaller than 45 degrees (intra prediction modes #67 to #80) are available. When the current block is rectangular with a height greater than its width, wide-angle intra prediction modes having angles greater than -135 degrees (intra prediction modes #-1 to #-14) are available.
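A simplified sketch of this aspect-ratio rule follows (illustrative only; the mode numbering mirrors the #-14..#80 labels above, and the function name is an assumption):

```python
def extra_wide_angle_modes(width, height):
    """Which additional wide-angle intra modes a rectangular block may use,
    per the aspect-ratio rule described above (simplified sketch)."""
    if width > height:
        return list(range(67, 81))   # modes #67..#80, angles < 45 degrees
    if height > width:
        return list(range(-14, 0))   # modes #-14..#-1, angles > -135 degrees
    return []                        # square blocks: no extra wide-angle modes

print(extra_wide_angle_modes(16, 8))  # flat block -> [67, ..., 80]
print(extra_wide_angle_modes(8, 16))  # tall block -> [-14, ..., -1]
```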
The intra predictor 122 may determine an intra prediction mode to be used for encoding the current block. In some examples, the intra predictor 122 may encode the current block by using multiple intra prediction modes and select an appropriate intra prediction mode to use from among the tested modes. For example, the intra predictor 122 may calculate rate-distortion values using rate-distortion analysis of the multiple tested intra prediction modes and select the intra prediction mode with the best rate-distortion characteristics among the tested modes.
The intra predictor 122 selects one intra prediction mode among a plurality of intra prediction modes, and predicts the current block by using neighboring pixels (reference pixels) determined according to the selected intra prediction mode and an algorithm equation. Information about the selected intra prediction mode is encoded by the entropy encoder 155 and transmitted to a video decoding device.
The inter predictor 124 generates a prediction block of the current block by using a motion compensation process. The inter predictor 124 searches for a block most similar to the current block in a reference picture that has been encoded and decoded earlier than the current picture, and generates a predicted block of the current block by using the searched block. In addition, a Motion Vector (MV) is generated, which corresponds to a displacement (displacement) between a current block in the current image and a prediction block in the reference image. In general, motion estimation is performed on a luminance (luma) component, and a motion vector calculated based on the luminance component is used for both the luminance component and the chrominance component. Motion information including information of the reference picture and information on a motion vector for predicting the current block is encoded by the entropy encoder 155 and transmitted to a video decoding device.
The inter predictor 124 may also perform interpolation on the reference picture or reference block to increase the prediction accuracy. In other words, sub-samples between two consecutive integer samples are interpolated by applying filter coefficients to a plurality of consecutive integer samples including the two integer samples. When the process of searching for the block most similar to the current block is performed on the interpolated reference image, the motion vector can be expressed at fractional-sample precision rather than integer-sample precision. The precision or resolution of the motion vector may be set differently for each target region to be encoded, e.g., for units such as slices, tiles, CTUs, and CUs. When such adaptive motion vector resolution (AMVR) is applied, information on the motion vector resolution to be applied to each target region should be signaled for each target region. For example, when the target region is a CU, information on the motion vector resolution applied to each CU is signaled. The information on the motion vector resolution may be information representing the precision of the motion vector difference described below.
On the other hand, the inter predictor 124 may perform inter prediction by using bi-directional prediction. In the case of bi-prediction, two reference pictures and two motion vectors representing block positions most similar to the current block in each reference picture are used. The inter predictor 124 selects a first reference picture and a second reference picture from the reference picture list0 (RefPicList 0) and the reference picture list1 (RefPicList 1), respectively. The inter predictor 124 also searches for a block most similar to the current block in the corresponding reference picture to generate a first reference block and a second reference block. Further, a prediction block of the current block is generated by averaging or weighted-averaging the first reference block and the second reference block. Further, motion information including information on two reference pictures for predicting the current block and information on two motion vectors is transmitted to the entropy encoder 155. Here, the reference image list0 may be constituted by an image preceding the current image in display order among the pre-restored images, and the reference image list1 may be constituted by an image following the current image in display order among the pre-restored images. However, although not particularly limited thereto, a pre-restored image following the current image in the display order may be additionally included in the reference image list 0. Conversely, a pre-restored image preceding the current image may be additionally included in the reference image list 1.
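A minimal sketch of the averaging step just described (the weights and names are illustrative; real codecs signal the weights when weighted averaging is used):

```python
import numpy as np

def bi_prediction(ref_block0, ref_block1, w0=0.5, w1=0.5):
    """Bi-directional prediction: average (or weighted-average) the two
    reference blocks found in reference picture list 0 and list 1."""
    return w0 * ref_block0 + w1 * ref_block1

pred = bi_prediction(np.full((8, 8), 100.0), np.full((8, 8), 120.0))
assert pred[0, 0] == 110.0  # plain average of the two reference blocks
```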
In order to minimize the amount of bits consumed for encoding motion information, various methods may be used.
For example, when the reference image and motion vector of the current block are identical to those of a neighboring block, information capable of identifying that neighboring block is encoded to convey the motion information of the current block to the video decoding apparatus. This method is referred to as merge mode.
In the merge mode, the inter predictor 124 selects a predetermined number of merge candidate blocks (hereinafter, referred to as "merge candidates") from neighboring blocks of the current block.
As the neighboring blocks used to derive the merge candidates, all or some of the left block A0, the lower left block A1, the upper block B0, the upper right block B1, and the upper left block B2 adjacent to the current block in the current image may be used, as shown in fig. 4. In addition, in addition to the current picture in which the current block is located, a block located within a reference picture (which may be the same as or different from the reference picture used to predict the current block) may also be used as a merging candidate. For example, a co-located block (co-located block) of a current block within a reference picture or a block adjacent to the co-located block may additionally be used as a merging candidate. If the number of merging candidates selected by the above method is less than a preset number, a zero vector is added to the merging candidates.
The inter predictor 124 configures a merge list including a predetermined number of merge candidates by using neighboring blocks. A merge candidate to be used as motion information of the current block is selected from among the merge candidates included in the merge list, and merge index information for identifying the selected candidate is generated. The generated merging index information is encoded by the entropy encoder 155 and transmitted to a video decoding apparatus.
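For illustration, a simplified sketch of the merge-list construction described above (it omits duplicate pruning, and all names are assumptions):

```python
def build_merge_list(spatial_candidates, temporal_candidate, max_candidates):
    """Collect available spatial candidates (e.g., from A0/A1/B0/B1/B2),
    then the co-located temporal candidate, then pad with zero vectors."""
    merge_list = [mv for mv in spatial_candidates if mv is not None]
    if temporal_candidate is not None and len(merge_list) < max_candidates:
        merge_list.append(temporal_candidate)
    while len(merge_list) < max_candidates:
        merge_list.append((0, 0))  # zero-vector padding
    return merge_list[:max_candidates]

# Two spatial neighbors unavailable (None); the merge index selects one entry.
print(build_merge_list([(3, 1), None, (2, 1), None, (0, 2)], (5, 5), 6))
```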
The merge skip mode is a special case of the merge mode. After quantization, when all transform coefficients to be entropy-coded are near zero, only the neighboring-block selection information is transmitted, without transmitting a residual signal. By using the merge skip mode, relatively high encoding efficiency can be achieved for images with slight motion, still images, screen-content images, and the like.
Hereinafter, the merge mode and the merge skip mode are collectively referred to as a merge/skip mode.
Another method for encoding motion information is advanced motion vector prediction (advanced motion vector prediction, AMVP) mode.
In the AMVP mode, the inter predictor 124 derives a motion vector prediction candidate for a motion vector of a current block by using neighboring blocks of the current block. As the neighboring blocks used to derive the motion vector prediction candidates, all or some of the left block A0, the lower left block A1, the upper side block B0, the upper right block B1, and the upper left block B2 adjacent to the current block in the current image shown in fig. 4 may be used. In addition, in addition to the current picture in which the current block is located, a block located within a reference picture (which may be the same as or different from a reference picture used to predict the current block) may also be used as a neighboring block used to derive a motion vector prediction candidate. For example, a co-located block of the current block within the reference picture or a block adjacent to the co-located block may be used. If the number of motion vector candidates selected by the above method is less than a preset number, a zero vector is added to the motion vector candidates.
The inter predictor 124 derives a motion vector prediction candidate by using the motion vector of the neighboring block, and determines a motion vector prediction of the motion vector of the current block by using the motion vector prediction candidate. In addition, a motion vector difference is calculated by subtracting a motion vector prediction from a motion vector of the current block.
Motion vector prediction may be obtained by applying a predefined function (e.g., median and average calculations, etc.) to the motion vector prediction candidates. In this case, the video decoding device is also aware of the predefined function. Further, since the neighboring block used to derive the motion vector prediction candidates is a block for which encoding and decoding have been completed, the video decoding apparatus may also already know the motion vector of the neighboring block. Therefore, the video encoding device does not need to encode information for identifying motion vector prediction candidates. Accordingly, in this case, information on a motion vector difference and information on a reference image for predicting a current block are encoded.
On the other hand, motion vector prediction may also be determined by selecting a scheme of any one of the motion vector prediction candidates. In this case, the information for identifying the selected motion vector prediction candidates is additionally encoded together with the information about the motion vector difference and the information about the reference picture for predicting the current block.
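The relation between the motion vector, its predictor, and the signaled difference can be sketched as follows (the minimum-difference selection criterion here is an illustrative assumption; as noted above, the predictor may instead be derived with a predefined function such as the median):

```python
def amvp_encode(mv, mvp_candidates):
    """Pick a motion vector predictor (MVP) from the candidate list and
    signal its index together with the motion vector difference (MVD)."""
    def cost(mvp):
        return abs(mv[0] - mvp[0]) + abs(mv[1] - mvp[1])
    idx = min(range(len(mvp_candidates)), key=lambda i: cost(mvp_candidates[i]))
    mvp = mvp_candidates[idx]
    mvd = (mv[0] - mvp[0], mv[1] - mvp[1])     # MVD = MV - MVP
    return idx, mvd

def amvp_decode(idx, mvd, mvp_candidates):
    mvp = mvp_candidates[idx]
    return (mvp[0] + mvd[0], mvp[1] + mvd[1])  # MV = MVP + MVD

candidates = [(6, -2), (0, 0)]
idx, mvd = amvp_encode((7, -3), candidates)
assert amvp_decode(idx, mvd, candidates) == (7, -3)
```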
The subtractor 130 generates a residual block by subtracting the current block from the prediction block generated by the intra predictor 122 or the inter predictor 124.
The transformer 140 transforms the residual signal in a residual block having pixel values in the spatial domain into transform coefficients in the frequency domain. The transformer 140 may transform the residual signal in the residual block by using the entire size of the residual block as the transform unit, or may divide the residual block into a plurality of sub-blocks and perform the transform by using each sub-block as a transform unit. Alternatively, the residual block may be divided into two sub-blocks, i.e., a transform region and a non-transform region, and the residual signal may be transformed by using only the transform-region sub-block as the transform unit. Here, the transform-region sub-block may be one of two rectangular blocks having a size ratio of 1:1 along the horizontal axis (or the vertical axis). In this case, a flag (cu_sbt_flag) indicating that only the sub-block is transformed, and direction (vertical/horizontal) information (cu_sbt_horizontal_flag) and/or position information (cu_sbt_pos_flag) are encoded by the entropy encoder 155 and signaled to the video decoding apparatus. In addition, the transform-region sub-block may have a size ratio of 1:3 along the horizontal axis (or the vertical axis). In this case, a flag (cu_sbt_quad_flag) indicating the corresponding split is additionally encoded by the entropy encoder 155 and signaled to the video decoding apparatus.
On the other hand, the transformer 140 may transform the residual block separately in the horizontal and vertical directions. Various types of transform functions or transform matrices may be used for the transform. For example, pairs of transform functions for the horizontal and vertical transforms may be defined as a multiple transform set (MTS). The transformer 140 may select the pair of transform functions with the highest transform efficiency in the MTS and transform the residual block in the horizontal and vertical directions, respectively. Information (mts_idx) on the selected transform function pair in the MTS is encoded by the entropy encoder 155 and signaled to the video decoding apparatus.
The quantizer 145 quantizes the transform coefficient output from the transformer 140 using a quantization parameter, and outputs the quantized transform coefficient to the entropy encoder 155. The quantizer 145 may also immediately quantize the relevant residual block without transforming any block or frame. The quantizer 145 may also apply different quantization coefficients (scaling values) according to the positions of the transform coefficients in the transform block. A quantization matrix applied to quantized transform coefficients arranged in two dimensions may be encoded and signaled to a video decoding apparatus.
The reordering unit 150 may rearrange the coefficient values of the quantized residual values.
The reordering unit 150 may change the 2D coefficient array into a 1D coefficient sequence by using coefficient scanning. For example, the reordering unit 150 may scan from the DC coefficient to coefficients in the high-frequency region using a zig-zag scan or a diagonal scan to output a 1D coefficient sequence. Instead of the zig-zag scan, a vertical scan that scans the 2D coefficient array in the column direction or a horizontal scan that scans the 2D block-type coefficients in the row direction may also be used, depending on the size of the transform unit and the intra prediction mode. In other words, the scanning method to be used may be determined among the zig-zag scan, the diagonal scan, the vertical scan, and the horizontal scan according to the size of the transform unit and the intra prediction mode.
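A compact sketch of such a coefficient scan (illustrative only; this implements the diagonal variant, and the commented line shows where a zig-zag scan would alternate direction per diagonal):

```python
def diagonal_scan_order(n):
    """Visit an n x n coefficient block from the DC position toward the
    high-frequency corner along anti-diagonals."""
    order = []
    for s in range(2 * n - 1):  # s = row + col on each anti-diagonal
        diag = [(r, s - r) for r in range(max(0, s - n + 1), min(s, n - 1) + 1)]
        # diag.reverse() on every other s would give a zig-zag scan instead
        order += diag
    return order

def scan_2d_to_1d(coeffs, order):
    return [coeffs[r][c] for r, c in order]

block = [[9, 5, 1, 0], [4, 2, 0, 0], [1, 0, 0, 0], [0, 0, 0, 0]]
print(scan_2d_to_1d(block, diagonal_scan_order(4)))  # 1D sequence for entropy coding
```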
The entropy encoder 155 encodes the sequence of the 1D quantized transform coefficients output from the rearrangement unit 150 by using various encoding schemes including Context-based adaptive binary arithmetic coding (Context-based Adaptive Binary Arithmetic Code, CABAC), exponential golomb (Exponential Golomb), and the like to generate a bitstream.
Further, the entropy encoder 155 encodes information related to block splitting (e.g., CTU size, CTU split flag, QT split flag, MTT split type, and MTT split direction) so that the video decoding apparatus can split the block in the same way as the video encoding apparatus. Further, the entropy encoder 155 encodes information on the prediction type indicating whether the current block is encoded by intra prediction or inter prediction. The entropy encoder 155 encodes intra prediction information (i.e., information on the intra prediction mode) or inter prediction information (the merge index in the case of the merge mode, and information on the reference picture index and the motion vector difference in the case of the AMVP mode) according to the prediction type. Further, the entropy encoder 155 encodes information related to quantization (i.e., information on quantization parameters and information on quantization matrices).
The inverse quantizer 160 inversely quantizes the quantized transform coefficient output from the quantizer 145 to generate a transform coefficient. The inverse transformer 165 transforms the transform coefficients output from the inverse quantizer 160 from the frequency domain to the spatial domain to restore a residual block.
The adder 170 adds the restored residual block and the prediction block generated by the predictor 120 to restore the current block. The pixels in the restored current block may be used as reference pixels when intra-predicting the next block.
The loop filtering unit 180 performs filtering on the restored pixels to reduce block artifacts (blocking artifacts), ringing artifacts (ringing artifacts), blurring artifacts (blurring artifacts), etc., which occur due to block-based prediction and transform/quantization. The loop filtering unit 180 as an in-loop filter may include all or some of a deblocking filter 182, a sample adaptive offset (sample adaptive offset, SAO) filter 184, and an adaptive loop filter (adaptive loop filter, ALF) 186.
The deblocking filter 182 filters the boundaries between restored blocks to remove blocking artifacts that occur due to block-unit encoding/decoding, and the SAO filter 184 and the ALF 186 additionally filter the deblocking-filtered video. The SAO filter 184 and the ALF 186 are filters used to compensate for the differences between restored pixels and original pixels that occur due to lossy coding. The SAO filter 184 applies an offset in CTU units to enhance subjective image quality and coding efficiency. In contrast, the ALF 186 performs block-unit filtering and compensates for distortion by applying different filters according to the boundary of each block and the degree of variation. Information on the filter coefficients to be used for the ALF may be encoded and signaled to the video decoding apparatus.
The restored blocks filtered by the deblocking filter 182, the SAO filter 184, and the ALF 186 are stored in the memory 190. When all blocks in one image are restored, the restored image may be used as a reference image for inter-predicting blocks within a picture to be subsequently encoded.
Fig. 5 is a functional block diagram of a video decoding device in one embodiment of the present invention. Hereinafter, with reference to fig. 5, a video decoding apparatus and components of the apparatus are described.
The video decoding apparatus may include an entropy decoder 510, a reordering unit 515, an inverse quantizer 520, an inverse transformer 530, a predictor 540, an adder 550, a loop filtering unit 560, and a memory 570.
Similar to the video encoding apparatus of fig. 1, each component of the video decoding apparatus may be implemented as hardware or software, or as a combination of hardware and software. In addition, the function of each component may be implemented as software, and the microprocessor may also be implemented to execute the function of the software corresponding to each component.
The entropy decoder 510 extracts information related to block segmentation by decoding a bitstream generated by a video encoding apparatus to determine a current block to be decoded, and extracts prediction information required to restore the current block and information on a residual signal.
The entropy decoder 510 determines the size of CTUs by extracting information about the CTU size from a Sequence Parameter Set (SPS) or a Picture Parameter Set (PPS), and partitions a picture into CTUs having the determined size. Further, the CTU is determined as the highest layer (i.e., root node) of the tree structure, and the division information of the CTU may be extracted to divide the CTU by using the tree structure.
For example, when the CTU is split by using the QTBTTT structure, a first flag (qt_split_flag) related to QT splitting is first extracted to split each node into four lower-layer nodes. Then, for a node corresponding to a leaf node of the QT, a second flag (mtt_split_flag), a split direction (vertical/horizontal), and/or a split type (binary/ternary) related to MTT splitting are extracted to split the corresponding leaf node into the MTT structure. As a result, each node below the leaf node of the QT is recursively split into the BT or TT structure.
As another example, when the CTU is split by using the QTBTTT structure, a CU split flag (split_cu_flag) indicating whether the CU is split is first extracted. When the corresponding block is split, the first flag (qt_split_flag) may also be extracted. During the splitting process, zero or more recursive MTT splits may occur after zero or more recursive QT splits for each node. For example, for the CTU, MTT splitting may occur immediately, or conversely, only QT splitting may occur multiple times.
As another example, when dividing the CTU by using the QTBT structure, a first flag (qt_split_flag) related to the division of QT is extracted to divide each node into four nodes of the lower layer. In addition, a split flag (split_flag) indicating whether or not a node corresponding to a leaf node of QT is further split into BT and split direction information are extracted.
On the other hand, when the entropy decoder 510 determines the current block to be decoded by using the partition of the tree structure, the entropy decoder 510 extracts information on a prediction type indicating whether the current block is intra-predicted or inter-predicted. When the prediction type information indicates intra prediction, the entropy decoder 510 extracts syntax elements for intra prediction information (intra prediction mode) of the current block. When the prediction type information indicates inter prediction, the entropy decoder 510 extracts information representing syntax elements of the inter prediction information, i.e., a motion vector and a reference picture to which the motion vector refers.
Further, the entropy decoder 510 extracts quantization-related information and extracts information on transform coefficients of the quantized current block as information on a residual signal.
The reordering unit 515 may change the sequence of the 1D quantized transform coefficients entropy-decoded by the entropy decoder 510 into a 2D coefficient array (i.e., block) again in the reverse order of the coefficient scan order performed by the video encoding device.
The inverse quantizer 520 inversely quantizes the quantized transform coefficients by using the quantization parameter. The inverse quantizer 520 may also apply different quantization coefficients (scaling values) to the quantized transform coefficients arranged in 2D. The inverse quantizer 520 may perform inverse quantization by applying a matrix of quantization coefficients (scaling values) from the video encoding apparatus to the 2D array of quantized transform coefficients.
The inverse transformer 530 restores a residual signal by inversely transforming the inversely quantized transform coefficients from the frequency domain to the spatial domain to generate a residual block of the current block.
Further, when the inverse transformer 530 inversely transforms only a partial region (sub-block) of the transform block, the inverse transformer 530 extracts a flag (cu_sbt_flag) indicating that only the sub-block of the transform block has been transformed, direction (vertical/horizontal) information (cu_sbt_horizontal_flag) of the sub-block, and/or position information (cu_sbt_pos_flag) of the sub-block. The inverse transformer 530 also inversely transforms the transform coefficients of the corresponding sub-block from the frequency domain to the spatial domain to restore the residual signal and fills the region that is not inversely transformed with the value "0" as the residual signal to generate the final residual block of the current block.
Further, when applying MTS, the inverse transformer 530 determines a transform index or a transform matrix to be applied in each of the horizontal direction and the vertical direction by using MTS information (mts_idx) signaled from the video encoding apparatus. The inverse transformer 530 also performs inverse transformation on the transform coefficients in the transform block in the horizontal direction and the vertical direction by using the determined transform function.
The predictor 540 may include an intra predictor 542 and an inter predictor 544. The intra predictor 542 is activated when the prediction type of the current block is intra prediction, and the inter predictor 544 is activated when the prediction type of the current block is inter prediction.
The intra predictor 542 determines an intra prediction mode of the current block among the plurality of intra prediction modes according to syntax elements of the intra prediction mode extracted from the entropy decoder 510. The intra predictor 542 also predicts the current block by using neighboring reference pixels of the current block according to an intra prediction mode.
The inter predictor 544 determines a motion vector of the current block and a reference picture to which the motion vector refers by using syntax elements of the inter prediction mode extracted from the entropy decoder 510.
The adder 550 restores the current block by adding the residual block output from the inverse transformer 530 to the prediction block output from the inter predictor 544 or the intra predictor 542. In intra prediction of a block to be decoded later, pixels within the restored current block are used as reference pixels.
The loop filtering unit 560, which is an in-loop filter, may include a deblocking filter 562, an SAO filter 564, and an ALF 566. Deblocking filter 562 performs deblocking filtering on boundaries between restored blocks to remove block artifacts occurring due to block unit decoding. The SAO filter 564 and ALF 566 perform additional filtering on the restored block after deblocking filtering to compensate for differences between restored pixels and original pixels that occur due to lossy encoding. The filter coefficients of the ALF are determined by using information on the filter coefficients decoded from the bitstream.
The restored blocks filtered by the deblocking filter 562, the SAO filter 564, and the ALF 566 are stored in the memory 570. When all blocks in one image are restored, the restored image may be used as a reference image for inter-predicting blocks within a picture to be subsequently decoded.
In some embodiments, the invention relates to encoding and decoding video imagery as described above. More particularly, the present invention provides a video encoding/decoding method and apparatus that encodes/decodes a target block in a spiral scan order so that a block located at the center of a video frame can utilize more previously reconstructed neighboring blocks.
The following embodiments may be applied to the image divider 110, the predictor 120, the transformer 140, the quantizer 145, the inverse quantizer 160, the inverse transformer 165, the loop filtering unit 180, and the entropy encoder 155 in the video encoding device. The following embodiments may also be applied to the entropy decoder 510, the inverse quantizer 520, the inverse transformer 530, the predictor 540, and the loop filtering unit 560 in the video decoding device.
In the following description, the term "target block" to be encoded/decoded may be used interchangeably with the current block or Coding Unit (CU) as described above, or the term "target block" may refer to some region of the coding unit.
Fig. 7 is a block diagram conceptually illustrating a video encoding apparatus using a spiral scan order according to at least one embodiment of the present invention.
A video encoding apparatus according to at least one embodiment of the present invention utilizes a spiral scan order to encode a target block. The video encoding apparatus may include all or part of a block divider 702, a first scan determination unit 704, a slice divider 706, a second scan determination unit 708, a block encoder 710, a reference sample determination unit 712, and a reference sample storage unit 714. In the video encoding apparatus, the block divider 702, the first scan determination unit 704, the slice divider 706, and the second scan determination unit 708 correspond to block partition steps for encoding. Although described separately for convenience, the block divider 702, the first scan determination unit 704, the slice divider 706, and the second scan determination unit 708 perform the functions of the image divider 110. In addition, the block encoder 710, the reference sample determination unit 712, and the reference sample storage unit 714 correspond to encoding and post-processing steps of a target block. The block encoder 710, the reference sample determining unit 712, and the reference sample storage unit 714 perform the functions of the predictor 120, the transformer 140, the quantizer 145, the inverse quantizer 160, the inverse transformer 165, the loop filtering unit 180, and the entropy encoder 155.
The block divider 702 divides the input video frame into blocks of the same size.
The first scan determination unit 704 obtains the position of a start block for encoding the partitioned blocks and a scan order based on the start block. Hereinafter, the scheme obtained by the first scan determination unit 704 is referred to as the first block scanning scheme. The video encoding apparatus may determine, from the viewpoint of rate-distortion optimization, the position of the start block for encoding the frame and the scan order based on the start block. The first scan determination unit 704 may obtain, as the first block scanning scheme, the position of the start block for encoding the frame and the scan order based on the start block determined at a high level. The video encoding apparatus may encode the position information of the start block and the scanning scheme to generate a bitstream and transmit the bitstream to the video decoding apparatus.
Fig. 8a to 8d are schematic diagrams showing the position of a start block in the current layer and showing the scanning order according to the start block.
The start block of the current layer may be the block at one of the upper-left, upper-right, lower-left, and lower-right corners. In the examples of figs. 8a to 8d, the block marked 0 (zero) represents the start block. Further, the numbers and arrows within each block indicate the encoding/decoding order in the current layer.
The first scan determination unit 704 may use one of a horizontal scan order, a vertical scan order, a clockwise spiral scan order, or a counterclockwise spiral scan order as a scan scheme.
When horizontal scanning or vertical scanning is used, each block may use information about one or two previously reconstructed neighboring blocks, as shown in the examples of figs. 8a and 8b. In contrast, when a spiral scan is used, the number of neighboring blocks whose information is available to the target block increases as the scan approaches the center of the current layer, as shown in the examples of figs. 8c and 8d. For example, as the target block moves from the edge of the current layer toward the center, the number of neighboring blocks with available information may increase up to four.
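A minimal sketch of generating such a spiral block order follows (illustrative only; the start corner and direction are signaled parameters per the text, and this sketch fixes the upper-left start for the clockwise case):

```python
def clockwise_spiral_order(cols, rows):
    """Clockwise spiral scan over a layer divided into same-size blocks,
    starting at the upper-left block and walking the outer ring inward,
    so blocks near the layer center are coded last."""
    top, bottom, left, right = 0, rows - 1, 0, cols - 1
    order = []
    while top <= bottom and left <= right:
        order += [(x, top) for x in range(left, right + 1)]            # right along top row
        order += [(right, y) for y in range(top + 1, bottom + 1)]      # down right column
        if top < bottom:
            order += [(x, bottom) for x in range(right - 1, left - 1, -1)]  # left along bottom
        if left < right:
            order += [(left, y) for y in range(bottom - 1, top, -1)]        # up left column
        top, bottom, left, right = top + 1, bottom - 1, left + 1, right - 1
    return order

def counterclockwise_spiral_order(cols, rows):
    """One counterclockwise variant, obtained by mirroring the x coordinate
    (it starts at the upper-right block)."""
    return [(cols - 1 - x, y) for x, y in clockwise_spiral_order(cols, rows)]

print(clockwise_spiral_order(4, 4))  # the central 2x2 region is visited last
```

Blocks emitted late in this order sit near the layer center and are surrounded by already reconstructed blocks on up to four sides, which is exactly the property the embodiment exploits.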
On the other hand, the same scanning orders as in the examples of figs. 8a to 8d may be used by the video decoding apparatus in decoding the blocks.
The slice divider 706 determines a slice partitioning scheme from the position of the start block determined for the frame and the type of scanning scheme (i.e., the first block scanning scheme), and then partitions the frame into slices using the determined partitioning scheme.
Fig. 9 is a schematic diagram illustrating slice partitioning in accordance with at least one embodiment of the present invention.
A slice represents an individually encodable unit. A slice may include one or more blocks. When the block scanning scheme in the current layer is the clockwise spiral scan order, the slice divider 706 may generate slices by bundling consecutive blocks in the clockwise spiral scan order, as in the example of fig. 9. In this case, the video encoding apparatus may encode, at a layer higher than the current layer, the start position of each slice and the number of blocks included in the slice, and may transmit them to the video decoding apparatus.
On the other hand, the slice partitioning according to the example of fig. 9 may be used by the video decoding apparatus in decoding the blocks.
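As a hedged illustration of the bundling just described, the following Python sketch (reusing spiral_scan_order from the earlier sketch) splits the block scan order into slices of given sizes. The slice sizes and helper names are illustrative assumptions; per the text, the encoder would signal each slice's start position and block count at a higher layer.

```python
def partition_into_slices(scan_order, blocks_per_slice):
    """Split the block scan order into slices of the given sizes.
    Returns a list of (start_index, blocks) pairs, where start_index
    is the slice's starting position expressed in scan order."""
    slices, pos = [], 0
    for count in blocks_per_slice:
        slices.append((pos, scan_order[pos:pos + count]))
        pos += count
    assert pos == len(scan_order), "slice sizes must cover the whole frame"
    return slices

# e.g. a 4x4 block grid split into slices of 6, 6, and 4 blocks
for start, blocks in partition_into_slices(spiral_scan_order(4, 4), [6, 6, 4]):
    print(start, len(blocks))
```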
Following the first block scanning scheme, the second scan determination unit 708 determines a start block position and a scanning scheme for the blocks of each slice. Hereinafter, the scheme determined by the second scan determination unit 708 is referred to as the second block scanning scheme. In this case, the start block position and scanning order used in the first block scanning scheme may differ from those used in the second block scanning scheme. The video encoding apparatus may calculate, from the viewpoint of rate-distortion optimization, a start block position for encoding the slice and a scanning order according to the start block. The video encoding apparatus may encode the position information of the start block in the slice and the scanning scheme to generate a bitstream and transfer the bitstream to the video decoding apparatus.
The second scan determination unit 708 may express the start position of a slice as a block index in the scan order. Alternatively, the start position of a slice may be expressed as the order of the block in the x-axis direction and the order of the block in the y-axis direction. In this case, the reference points from which the x-axis and y-axis start may differ according to the block scanning scheme. For example, in the examples of figs. 8a to 8d, if the start block position is the upper-right block, the video encoding apparatus may set the coordinates of the upper-right block to (0, 0) and may represent the positions of the other blocks relative to the upper-right block.
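The two representations just described may be sketched as follows; the function names and corner labels are illustrative assumptions, and the corner-dependent origin mirrors the upper-right example above.

```python
def start_position_as_index(scan_order, block):
    """Slice start expressed as the block's index in the scan order."""
    return scan_order.index(block)

def start_position_as_offsets(block, start_corner, grid_w, grid_h):
    """Slice start expressed as (x, y) block offsets from the start
    corner, so the start block itself maps to (0, 0)."""
    x, y = block
    if start_corner == "upper_right":
        return (grid_w - 1 - x, y)
    if start_corner == "lower_left":
        return (x, grid_h - 1 - y)
    if start_corner == "lower_right":
        return (grid_w - 1 - x, grid_h - 1 - y)
    return (x, y)  # upper_left

# For a 4x4 grid with the upper-right block as start block,
# the upper-right block at absolute position (3, 0) is reported as (0, 0):
print(start_position_as_offsets((3, 0), "upper_right", 4, 4))  # -> (0, 0)
```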
In the embodiment shown in fig. 7, the frame unit corresponds to a higher layer, and the slice unit corresponds to the current layer. The video encoding apparatus may transfer the position of the start block and the scanning scheme to the video decoding apparatus for each layer. Alternatively, the video encoding apparatus may use the start block position and scanning scheme of the higher layer as the start block position and scanning scheme of the current layer. The video encoding apparatus may additionally transfer, to the video decoding apparatus, flag information indicating whether the same scanning scheme as in the higher layer is used. Alternatively, the video encoding apparatus and the video decoding apparatus may use the same predetermined scanning scheme for each layer.
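A minimal sketch of this per-layer choice, assuming hypothetical names for the inheritance flag and the scheme container, might look as follows.

```python
from dataclasses import dataclass

@dataclass
class BlockScanScheme:
    start_corner: str  # e.g. "upper_left" -- illustrative label
    scan_order: str    # e.g. "spiral_cw"  -- illustrative label

def resolve_scan_scheme(inherit_flag, higher_layer_scheme, signaled_scheme=None):
    """Return the scheme the current layer should use: the higher
    layer's scheme when the flag is set, else the explicitly signaled one."""
    if inherit_flag:
        return higher_layer_scheme
    assert signaled_scheme is not None, "scheme must be signaled when not inherited"
    return signaled_scheme

frame_scheme = BlockScanScheme("upper_left", "spiral_cw")
slice_scheme = resolve_scan_scheme(True, frame_scheme)  # inherit from the frame layer
```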
The block encoder 710 encodes each block in an order determined by the second block scanning scheme. As described above, the process for encoding each block performed by the video encoding apparatus may include all or part of prediction, transformation, quantization, loop filtering, and entropy encoding.
The reference sample determination unit 712 determines, from the encoded current block, a reference sample line to be stored for future use, and determines a line buffer for storing the determined reference sample line.
Fig. 10 is a schematic diagram illustrating a line buffer for encoding/decoding and a location of a reference sample line stored in the line buffer according to at least one embodiment of the present invention.
After the current block is encoded, some information about the current block needs to be stored in line buffers for encoding subsequent blocks. For this purpose, four or more line buffers may be required. The video encoding apparatus may use four lines corresponding to the boundaries of the current block as reference sample lines, as shown in the example of fig. 10. Here, the reference sample lines and line buffers are named as in fig. 10.
Information about a reference sample line may be stored in at least one line buffer. The line buffer used to store the information may be chosen to lie near the boundaries of subsequently encoded blocks so that those blocks can readily use it. For example, as shown in the example of fig. 10, the video encoding apparatus may store information about the first reference sample line, at the top boundary of the current block, in the third line buffer. Similarly, the video encoding apparatus may store information about the second reference sample line, at the right boundary of the current block, in the fourth line buffer. Further, information about the third reference sample line may be stored in the first line buffer, and information about the fourth reference sample line may be stored in the second line buffer.
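The buffer assignment described above may be sketched as a simple mapping. The 1-based buffer indices follow the text; the assumption that the third and fourth reference sample lines lie at the bottom and left boundaries of the current block, and all names, are illustrative.

```python
# reference sample line -> line buffer index, following the fig. 10 description
LINE_BUFFER_FOR = {
    "first": 3,   # top-boundary line -> third line buffer (per the text)
    "second": 4,  # right-boundary line -> fourth line buffer (per the text)
    "third": 1,   # assumed bottom-boundary line -> first line buffer
    "fourth": 2,  # assumed left-boundary line -> second line buffer
}

def store_reference_lines(block_info, line_buffers):
    """Copy each reference sample line of the coded block into the
    buffer assigned to it; line_buffers maps buffer index -> payload."""
    for line_name, buf_index in LINE_BUFFER_FOR.items():
        line_buffers[buf_index] = block_info[line_name]

line_buffers = {}
block_info = {name: f"samples_of_{name}_line" for name in LINE_BUFFER_FOR}  # placeholders
store_reference_lines(block_info, line_buffers)
print(line_buffers)  # {3: ..., 4: ..., 1: ..., 2: ...}
```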
The video encoding apparatus may skip storing the information of some reference sample lines according to the position of the current block. For example, if the current block is the top-most block of a frame, slice, or tile, the information of the first reference sample line need not be stored, as it is not referenced by any subsequent block. Accordingly, the video encoding apparatus may omit storing the information of the first reference sample line.
The video encoding apparatus may also skip storing the information of some reference sample lines according to the scanning scheme. For example, a horizontal scanning scheme in which the upper-left block is the start block always uses only the second and third reference sample lines. Accordingly, the video encoding apparatus may skip storing information about the first and fourth reference sample lines.
The video encoding apparatus may further determine which reference sample lines to store based on whether the blocks neighboring the current block have been encoded. For example, in a horizontal scanning scheme in which the upper-left block is designated as the start block, the video encoding apparatus may store the first and fourth reference sample lines in the line buffers when the blocks above and to the left of the current block have not yet been encoded/decoded.
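The three storage-skipping rules above can be combined into one hedged predicate; the rule set and names below are illustrative assumptions, not the patent's normative behavior.

```python
def lines_to_store(is_top_row, scan_scheme, left_coded, above_coded):
    """Return the set of reference sample lines worth storing for the
    current block, given its position, the scan scheme, and the coding
    status of its above/left neighbors."""
    lines = {"first", "second", "third", "fourth"}
    # Top-most block of a frame/slice/tile: no later block reads the
    # first (top-boundary) line, so it may be skipped.
    if is_top_row:
        lines.discard("first")
    # Horizontal scan starting from the upper-left block only ever
    # reads the second and third lines...
    if scan_scheme == "horizontal_upper_left":
        lines -= {"first", "fourth"}
        # ...except that, per the text, the first and fourth lines are
        # stored when the above and left neighbors are still uncoded.
        if not above_coded and not left_coded:
            lines |= {"first", "fourth"}
    return lines

print(lines_to_store(is_top_row=True, scan_scheme="spiral_cw",
                     left_coded=True, above_coded=False))
# -> {'second', 'third', 'fourth'}
```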
The information about the reference sample line may comprise reconstructed sample values. Further, the information about the reference sample line may include information related to intra prediction, and may include information related to inter prediction. Furthermore, the information on the reference sample line may include information on transformation and quantization, information on filtering, and information on entropy encoding.
On the other hand, the same process of determining the line buffers and the positions of the reference sample lines stored therein, as in the example of fig. 10, may be used by the video decoding apparatus in decoding blocks.
The reference sample storage unit 714 stores information about the determined reference sample line in the corresponding line buffer.
Fig. 11 is a schematic diagram illustrating the locations of line buffers and reference sample lines stored following a clockwise spiral scan sequence in accordance with at least one embodiment of the present invention.
Fig. 11 shows an encoding/decoding sequence following the spiral scan order, in which the upper-left block is the start block. In the example of fig. 11, block C is the current block being encoded/decoded, block D is a block whose encoding/decoding has been completed, and block N is the next block to be encoded/decoded. The arrows attached to each block indicate the process of storing the determined reference sample lines in the corresponding line buffers.
On the other hand, as described above, the video encoding apparatus may generate high-level syntax for all partition information associated with the first and second block scanning schemes and may transfer the generated syntax to the video decoding apparatus. Alternatively, the video decoding apparatus may derive all or some of the partition information according to an agreement predefined between the encoder and the decoder.
Fig. 12 is a block diagram conceptually illustrating a video decoding apparatus using a spiral scan order according to at least one embodiment of the present invention.
The video decoding apparatus according to the present embodiment decodes a target block by using a spiral scan order. The video decoding apparatus may include all or part of a partition information acquisition unit 1202, a block decoder 1204, a reference sample determination unit 1206, and a reference sample storage unit 1208. The partition information acquisition unit 1202 in the video decoding apparatus corresponds to a step of obtaining block partition information for decoding, and performs the function of the entropy decoder 510. In addition, the block decoder 1204, the reference sample determination unit 1206, and the reference sample storage unit 1208 correspond to decoding and post-processing steps of a target block, and perform functions of the inverse quantizer 520, the inverse transformer 530, the predictor 540, and the loop filter unit 560.
As described above, the partition information acquisition unit 1202 may decode or derive the block scanning information, which includes all information related to the first block scanning scheme and the second block scanning scheme.
The first block scanning scheme in the block scanning information is based on the scanning orders according to the examples of figs. 8a to 8d. In addition, the second block scanning scheme in the block scanning information may be based on the slice partitioning shown in fig. 9.
The block decoder 1204 decodes each block following the order determined by the block scanning information. As described above, the process for decoding each block performed by the video decoding apparatus may include all or part of entropy decoding, inverse quantization, inverse transformation, prediction, and loop filtering.
The reference sample determination unit 1206 determines a reference sample line to be stored for future use from the decoded current block, and determines a line buffer for storing the determined reference sample line. The reference sample determination unit 1206 may determine the line buffer and the position of the reference sample line stored in the line buffer according to the example of fig. 10.
The reference sample storage unit 1208 stores information about the determined reference sample line in the corresponding line buffer.
Hereinafter, a method for encoding/decoding a current layer in a spiral scan order will be described with reference to fig. 13 and 14.
Fig. 13 is a flowchart of a video encoding method using a spiral scan order in accordance with at least one embodiment of the present invention.
The video encoding apparatus determines a block scanning scheme for the current layer (S1300). Here, the block scanning scheme includes the position of a start block of the current layer, which is partitioned into blocks of the same size, and a scanning order based on the start block. The scanning order may be one of a horizontal scan order, a vertical scan order, a clockwise spiral scan order, and a counterclockwise spiral scan order. Further, the start block of the current layer may be one of the upper-left block, the upper-right block, the lower-left block, and the lower-right block.
On the other hand, the video encoding apparatus may obtain the block scanning scheme for the current layer from a higher level, or may inherit the start block position and scanning order of the higher layer as the block scanning scheme of the current layer.
The video encoding apparatus encodes the current block representing each block following the block scanning scheme (S1302). As described above, the process for encoding each block performed by the video encoding apparatus may include all or part of prediction, transformation, quantization, loop filtering, and entropy encoding.
The video encoding apparatus determines a reference sample line for the encoded current block and determines a line buffer for storing the reference sample line (S1304).
The video encoding apparatus may use four sample lines corresponding to the boundaries of the current block as reference sample lines. The video encoding apparatus may determine the reference sample lines based on the position of the current block in the current layer, the scanning order, and whether the blocks adjacent to the current block have been encoded. Information about a reference sample line may be stored in at least one line buffer. The line buffer used to store the information may be chosen to lie near the boundaries of subsequently encoded blocks so that those blocks can readily use it.
The video encoding apparatus stores information about the reference sample line in the line buffer (S1306). The information on the reference sample line may include reconstructed sample values of the current block and information related to prediction of the current block. Furthermore, the information on the reference sample line may include information related to transformation, quantization, filtering, and entropy coding of the current block.
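Tying steps S1300 to S1306 together, the following sketch wires the earlier helpers into a skeletal encoding loop. encode_block() is a hypothetical stand-in for the prediction/transform/quantization/entropy pipeline, and the dictionary-based line buffers are an illustrative simplification, not the patent's implementation.

```python
def encode_block(block):
    """Hypothetical stand-in for prediction, transform, quantization,
    and entropy encoding; returns placeholder reference-line payloads."""
    return {name: (block, name) for name in ("first", "second", "third", "fourth")}

def encode_current_layer(frame_blocks, grid_w, grid_h):
    line_buffers = {}
    order = spiral_scan_order(grid_w, grid_h)            # S1300: scan scheme
    coded = set()
    for (x, y) in order:
        block_info = encode_block(frame_blocks[(x, y)])  # S1302: encode block
        keep = lines_to_store(                           # S1304: choose lines
            is_top_row=(y == 0),
            scan_scheme="spiral_cw",
            left_coded=(x - 1, y) in coded,
            above_coded=(x, y - 1) in coded,
        )
        for line in keep:                                # S1306: store lines
            line_buffers[LINE_BUFFER_FOR[line]] = block_info[line]
        coded.add((x, y))
    return line_buffers

blocks = {(x, y): f"blk_{x}_{y}" for x in range(4) for y in range(4)}
print(encode_current_layer(blocks, 4, 4))
```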
Fig. 14 is a flowchart of a video decoding method using a spiral scan order in accordance with at least one embodiment of the present invention.
The video decoding apparatus determines block scanning information for the current layer (S1400). Here, the block scanning information includes the position of a start block of the current layer, which is partitioned into blocks of the same size, and a scanning order based on the start block. The scanning order may be one of a horizontal scan order, a vertical scan order, a clockwise spiral scan order, and a counterclockwise spiral scan order. Further, the start block of the current layer may be one of the upper-left block, the upper-right block, the lower-left block, and the lower-right block.
On the other hand, the video decoding apparatus may decode the block scanning information of the current layer, or may inherit the start block position and scanning order of the higher layer as the block scanning information of the current layer.
The video decoding apparatus decodes a current block representing each block following the block scanning information (S1402). As described above, the process for decoding each block performed by the video decoding apparatus may include all or part of entropy decoding, inverse quantization, inverse transformation, prediction, and loop filtering.
The video decoding apparatus determines a reference sample line for the decoded current block, and determines a line buffer for storing the reference sample line (S1404).
The video decoding apparatus may use four sample lines corresponding to the boundaries of the current block as reference sample lines. The video decoding apparatus may determine the reference sample lines based on the position of the current block in the current layer, the scanning order, and whether the blocks adjacent to the current block have been decoded. Information about a reference sample line may be stored in at least one line buffer. The line buffer used to store the information may be chosen to lie near the boundaries of subsequently decoded blocks so that those blocks can readily use it.
The video decoding apparatus stores information about the reference sample line in the line buffer (S1406). The information on the reference sample line may include reconstructed sample values of the current block and information related to prediction of the current block. Further, the information on the reference sample line may include information related to entropy decoding, inverse transformation, inverse quantization, and filtering of the current block.
Although steps in the respective flowcharts are described as sequentially performed, these steps merely exemplify the technical ideas of some embodiments of the present invention. Accordingly, one of ordinary skill in the art to which the invention pertains may perform the steps by changing the order depicted in the various figures or by performing two or more steps in parallel. Accordingly, the steps in the various flowcharts are not limited to the order in which they occur as shown.
It should be understood that the foregoing description presents illustrative embodiments that may be implemented in various other ways. The functions described in some embodiments may be implemented by hardware, software, firmware, and/or combinations thereof. It should also be understood that the functional components described in this specification are labeled as "... unit" to strongly emphasize the possibility of their independent implementation.
On the other hand, the various methods or functions described in some embodiments may be implemented as instructions stored in a non-volatile recording medium, which may be read and executed by one or more processors. The non-volatile recording medium includes, for example, various types of recording devices that store data in a form readable by a computer system. For example, the non-volatile recording medium may include storage media such as an erasable programmable read-only memory (EPROM), a flash memory drive, an optical disk drive, a magnetic hard disk drive, and a solid state drive (SSD).
Although exemplary embodiments of the present invention have been described for illustrative purposes, those skilled in the art to which the present invention pertains will appreciate that various modifications, additions, and substitutions are possible without departing from the scope and spirit of the invention. The embodiments of the present invention have been described for brevity and clarity, and the scope of their technical ideas is not limited by these illustrations. Accordingly, it will be understood by those of ordinary skill in the art that the scope of the present invention is not limited by the embodiments explicitly described above but is defined by the claims and their equivalents.
(reference numerals)
702: block divider
704: first scan determination unit
706: slice divider
708: second scan determination unit
710: block encoder
712: reference sample determination unit
714: reference sample storage unit
Cross Reference to Related Applications
The present application claims priority from Korean Patent Application No. 10-2021-0030287, filed on March 8, 2021, and Korean Patent Application No. 10-2022-0027878, filed on March 4, 2022, each of which is incorporated herein by reference in its entirety.

Claims (18)

1. A method performed by a video encoding device for encoding a current layer, the method comprising:
determining a block scanning scheme for a current layer, the block scanning scheme including a position of a start block of the current layer divided into blocks of the same size, and a scanning order based on the start block, wherein the scanning order is one of a horizontal scanning order, a vertical scanning order, a clockwise spiral scanning order, or a counterclockwise spiral scanning order;
encoding a current block representing each block following the block scanning scheme;
determining a reference sample line for the current block after encoding;
determining a line buffer for storing a reference sample line; and
storing information about the reference sample line in the line buffer.
2. The method of claim 1, wherein the start block is one of an upper-left block, an upper-right block, a lower-left block, and a lower-right block of the current layer.
3. The method of claim 1, wherein determining a block scanning scheme comprises:
the block scanning scheme for the current layer is obtained from a high level, or the orientation and scanning order of the starting block at the high layer are inherited to be used as the block scanning scheme.
4. The method of claim 1, wherein encoding the current block comprises:
performing all or part of prediction, transformation, quantization, loop filtering, and entropy encoding on the current block.
5. The method of claim 1, wherein determining a reference sample line comprises:
utilizing four sample lines corresponding to boundaries of the current block as the reference sample lines.
6. The method of claim 1, wherein determining a reference sample line comprises:
determining the reference sample line based on the position of the current block in the current layer, the scanning order, and whether blocks adjacent to the current block have been encoded.
7. The method of claim 1, wherein determining a line buffer comprises:
determining, as the line buffer for storing the reference sample line, a line buffer located near a boundary of a block to be encoded subsequently.
8. The method of claim 1, wherein the information about the reference sample line comprises:
reconstructed sample values of the current block; and
information related to prediction of the current block.
9. A video encoding apparatus, comprising:
a scan determination unit configured to determine a block scan scheme for a current layer, the block scan scheme including a position of a start block of the current layer divided into blocks of the same size, and a scan order based on the start block, wherein the scan order is one of a horizontal scan order, a vertical scan order, a clockwise spiral scan order, or a counterclockwise spiral scan order;
a block encoder configured to encode a current block representing each block following a block scanning scheme;
a reference sample determining unit configured to determine a reference sample line of the current block that has been encoded, and to determine a line buffer for storing the reference sample line; and
a reference sample storage unit configured to store information about a reference sample line in the line buffer.
10. The video encoding device of claim 9, wherein the reference sample determination unit is configured to utilize four sample lines corresponding to boundaries of a current block as reference sample lines.
11. The video encoding apparatus according to claim 9, wherein the reference sample determination unit is configured to determine the reference sample line based on a position of the current block in the current layer, a scanning order, and whether a block adjacent to the current block has been encoded.
12. A method performed by a video decoding device for decoding a current layer, the method comprising:
determining block scanning information for a current layer, the block scanning information including a position of a start block of the current layer divided into blocks of the same size, and a scanning order based on the start block, wherein the scanning order is one of a horizontal scanning order, a vertical scanning order, a clockwise spiral scanning order, or a counterclockwise spiral scanning order;
decoding a current block representing each block following the block scanning information;
determining a reference sample line for the current block after decoding;
determining a line buffer for storing a reference sample line; and
storing information about the reference sample line in the line buffer.
13. The method of claim 12, wherein determining block scan information comprises:
decoding the block scanning information for the current layer, or
inheriting the start block position and scanning order of the higher layer to be used as the block scanning information.
14. The method of claim 12, wherein decoding the current block comprises:
performing all or part of entropy decoding, inverse quantization, inverse transformation, prediction, and loop filtering on the current block.
15. The method of claim 12, wherein determining a reference sample line comprises:
utilizing four sample lines corresponding to boundaries of the current block as the reference sample lines.
16. The method of claim 12, wherein determining a reference sample line comprises:
determining the reference sample line based on the position of the current block in the current layer, the scanning order, and whether blocks adjacent to the current block have been decoded.
17. The method of claim 12, wherein determining a line buffer comprises:
determining, as the line buffer for storing the reference sample line, a line buffer located near a boundary of a block to be decoded subsequently.
18. The method of claim 12, wherein the information about the reference sample line comprises:
reconstructed sample values of the current block; and
information related to prediction of the current block.
CN202280019848.1A 2021-03-08 2022-03-04 Method and apparatus for video encoding and decoding using spiral scan order Pending CN117044200A (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
KR20210030287 2021-03-08
KR10-2021-0030287 2021-03-08
PCT/KR2022/003103 WO2022191525A1 (en) 2021-03-08 2022-03-04 Video coding method and apparatus using spiral scan order

Publications (1)

Publication Number Publication Date
CN117044200A true CN117044200A (en) 2023-11-10

Family

ID=83281445

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202280019848.1A Pending CN117044200A (en) 2021-03-08 2022-03-04 Method and apparatus for video encoding and decoding using spiral scan order

Country Status (2)

Country Link
KR (1) KR20220126226A (en)
CN (1) CN117044200A (en)

Also Published As

Publication number Publication date
KR20220126226A (en) 2022-09-15

Similar Documents

Publication Publication Date Title
CN113812147B (en) Image encoding and decoding using intra-block copy
CN113892268A (en) Intra-frame prediction device and method based on prediction mode estimation
CN116472709A (en) Apparatus and method for video encoding and decoding
CN116530082A (en) Method and apparatus for video coding using intra prediction
CN116941241A (en) Video encoding and decoding method and apparatus using matrix-based cross component prediction
CN116636211A (en) Method and apparatus for encoding video using block merging
CN117044212A (en) Video coding and decoding method and device using deblocking filtering based on segmentation information
CN116530084A (en) Video codec using block-based deep learning model
CN116491114A (en) Image encoding and decoding method and apparatus using sub-block unit intra prediction
CN117044200A (en) Method and apparatus for video encoding and decoding using spiral scan order
US20230412811A1 (en) Method and apparatus for video coding using spiral scan order
US20230283768A1 (en) Method for predicting quantization parameter used in a video encoding/decoding apparatus
CN117581534A (en) Video encoding/decoding method and apparatus
CN118251891A (en) Method and apparatus for video coding and decoding using template matching-based intra prediction
CN116918323A (en) Video encoding and decoding method and apparatus for improving prediction signal of intra prediction
CN118901240A (en) Video coding method and apparatus using context model initialization
CN118369914A (en) Method and apparatus for video encoding and decoding using template matching based secondary MPM list
CN118160304A (en) Video encoding method and apparatus using various block division structures
CN117044197A (en) Method and apparatus for video encoding and decoding using derived intra prediction modes
CN116648907A (en) Block partitioning structure for efficient prediction and transformation, and method and apparatus for video encoding and decoding using the same
CN117917072A (en) Video encoding/decoding method and apparatus
CN116671104A (en) Method and apparatus for intra prediction using geometric transform-based block copy
CN118489244A (en) Method and apparatus for video encoding and decoding using improved AMVP-MERGE mode
CN116711312A (en) Image encoding and decoding method for adaptively determining intra-chroma direction prediction mode
CN118451706A (en) Template-based intra mode derivation method for chrominance components

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination