WO2019004664A1 - Method and device for processing a video signal - Google Patents

Method and device for processing a video signal

Info

Publication number
WO2019004664A1
Authority
WO
WIPO (PCT)
Prior art keywords
face
padding
image
degree
current face
Prior art date
Application number
PCT/KR2018/007098
Other languages
English (en)
Korean (ko)
Inventor
이배근
Original Assignee
주식회사 케이티
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 주식회사 케이티 filed Critical 주식회사 케이티
Publication of WO2019004664A1 publication Critical patent/WO2019004664A1/fr

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/597 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00 Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/10 Processing, recording or transmission of stereoscopic or multi-view image signals
    • H04N13/106 Processing image signals
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00 Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/10 Processing, recording or transmission of stereoscopic or multi-view image signals
    • H04N13/106 Processing image signals
    • H04N13/158 Switching image signals
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00 Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/30 Image reproducers
    • H04N13/332 Displays for viewing with the aid of special glasses or head-mounted displays [HMD]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00 Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/30 Image reproducers
    • H04N13/332 Displays for viewing with the aid of special glasses or head-mounted displays [HMD]
    • H04N13/339 Displays for viewing with the aid of special glasses or head-mounted displays [HMD] using spatial multiplexing
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/12 Selection from among a plurality of transforms or standards, e.g. selection between discrete cosine transform [DCT] and sub-band transform or selection between H.263 and H.264
    • H04N19/122 Selection of transform size, e.g. 8x8 or 2x4x8 DCT; Selection of sub-band transforms of varying structure or type

Definitions

  • the present invention relates to a video signal processing method and apparatus.
  • demand for high-resolution, high-quality images, such as HD images and UHD images, is increasing in various application fields.
  • as image data has higher resolution and higher quality, the amount of data increases relative to existing image data. Therefore, when the image data is transmitted over a medium such as a wired/wireless broadband line or stored on an existing storage medium, transmission and storage costs increase.
  • High-efficiency image compression techniques can be utilized to solve such problems as image data becomes high-resolution and high-quality.
  • image compression techniques include an inter-picture prediction technique for predicting a pixel value included in the current picture from a picture preceding or following the current picture, an intra-picture prediction technique for predicting a pixel value included in the current picture using pixel information within the current picture,
  • and an entropy encoding technique in which a short code is assigned to a value with a high frequency of appearance and a long code is assigned to a value with a low frequency of appearance.
  • Image data can be effectively compressed and transmitted or stored using such an image compression technique.
  • a method of encoding an image according to the present invention includes generating a 360-degree projection image including a plurality of faces by projectively transforming a three-dimensional 360-degree image onto a two-dimensional plane, adding a padding area to at least one boundary of a current face, and encoding padding-related information of the current face.
  • the padding region may be generated based on a sample included in at least a part of a face that is not adjacent to the current face in the 360-degree projection image.
  • a method of decoding an image according to the present invention includes decoding padding-related information of a current face, decoding a padding area added to at least one boundary of the current face based on the padding-related information, and generating a 360-degree image by back-projecting a 360-degree projection image including the current face onto a three-dimensional space.
  • the padding region may be generated based on a sample included in at least a portion of a face that is not adjacent to the current face in the 360-degree projection image.
  • when a neighboring face that adjoins the current face in the 360-degree projection image is not adjacent to the current face in the 360-degree image, the padding region may be added to the boundary of the current face that adjoins that neighboring face.
  • the padding region may be generated by copying a portion of a neighboring face that is not adjacent to the current face in the 360-degree projection image but neighbors the current face in the 360-degree image.
  • the padding region may be generated based on an average or weighted operation of a sample included in the current face and a sample included in a neighboring face that is not adjacent to the current face in the 360-degree projection image but neighbors the current face in the 360-degree image.
  • the shape of the padding area may be determined based on the shape of a neighboring face that is not adjacent to the current face in the 360-degree projection image but is adjacent to the current face in the 360-degree image.
  • the value of a sample in the padding region added to the current face may be determined according to its distance from the boundary of the current face.
  • the padding region may include a vertical padding region adjoining an upper or lower boundary of the current face and a horizontal padding region adjoining a left or right boundary of the current face,
  • and the length of the vertical padding region and the length of the horizontal padding region may be different.
  • the padding-related information may include at least one of information indicating whether the padding area exists, information indicating a position of the padding area, or information indicating a length of the padding area.
  • the encoding / decoding efficiency can be improved by projectively transforming the 360 degree image into two dimensions.
  • encoding/decoding efficiency can also be improved by adding a padding area to a picture boundary or a face boundary of a 360-degree projection image.
  • padding is performed using a neighboring face that neighbors the current face in three-dimensional space, thereby preventing deterioration of image quality.
  • FIG. 1 is a block diagram illustrating an image encoding apparatus according to an embodiment of the present invention.
  • FIG. 2 is a block diagram illustrating an image decoding apparatus according to an embodiment of the present invention.
  • FIG. 3 is a diagram illustrating a partition mode that can be applied to a coding block when a coding block is coded by inter-picture prediction.
  • FIGS. 4 to 6 are views illustrating a camera apparatus for generating a panoramic image.
  • FIG. 7 is a block diagram of a 360-degree video data generation apparatus and a 360-degree video play apparatus.
  • FIG. 8 is a flowchart showing the operation of a 360-degree video data generation apparatus and a 360-degree video play apparatus.
  • FIG. 9 shows a 2D projection method using an equirectangular projection (ERP) technique.
  • FIG. 11 shows a 2D projection method using an icosahedral projection technique.
  • FIG. 13 shows a 2D projection method using a truncated pyramid projection technique.
  • Fig. 15 is a diagram illustrating the conversion between the face 2D coordinate and the three-dimensional coordinate.
  • FIG. 16 is a diagram for explaining an example in which padding is performed on an ERP projection image.
  • FIG. 17 is a view for explaining an example in which the lengths of the padding regions in the horizontal direction and the vertical direction are set differently in an ERP projection image.
  • FIG. 18 is a diagram showing an example in which padding is performed at a face boundary.
  • FIG. 19 is a diagram showing an example of determining a sample value of a padding area between faces.
  • FIG. 20 is a view illustrating a CMP-based 360-degree projection image.
  • FIG. 21 is a diagram showing an example in which a plurality of pieces of data are included in one face.
  • FIG. 22 is a diagram showing an example in which each face is configured to include a plurality of faces.
  • FIGS. 23 and 24 are views showing a 360-degree projection image based on the TPP technique in which face overlap padding is performed.
  • FIG. 25 is a diagram showing a 360-degree projection image based on OHP technique considering continuity of images.
  • FIG. 26 is a diagram illustrating a 360-degree projection image based on the OHP technique in which face overlap padding is performed.
  • first, second, etc. may be used to describe various components, but the components should not be limited by the terms. The terms are used only for the purpose of distinguishing one component from another.
  • the first component may be referred to as a second component, and similarly, the second component may also be referred to as a first component.
  • the term 'and/or' includes any combination of a plurality of related listed items or any one of a plurality of related listed items.
  • FIG. 1 is a block diagram illustrating an image encoding apparatus according to an embodiment of the present invention.
  • the image encoding apparatus 100 may include a picture division unit 110, prediction units 120 and 125, a transform unit 130, a quantization unit 135, a reordering unit 160, an entropy encoding unit 165, an inverse quantization unit 140, an inverse transform unit 145, a filter unit 150, and a memory 155.
  • each of the components shown in FIG. 1 is shown independently to represent different characteristic functions in the image encoding apparatus, and does not mean that each component is composed of separate hardware or a single software unit. That is, the components are listed as separate components for convenience of explanation, and at least two of the components may be combined into one component, or one component may be divided into a plurality of components to perform a function.
  • the integrated embodiments and separate embodiments of the components are also included within the scope of the present invention, unless they depart from the essence of the present invention.
  • some of the components may not be essential components that perform essential functions of the present invention, but may be optional components merely for improving performance.
  • the present invention can be implemented with only the components essential for realizing the essence of the present invention, excluding the components used merely for performance improvement, and a structure including only the essential components, excluding the optional components used merely for performance improvement, is also included in the scope of the present invention.
  • the picture division unit 110 may divide the input picture into at least one processing unit.
  • the processing unit may be a prediction unit (PU), a transform unit (TU), or a coding unit (CU).
  • the picture division unit 110 may divide one picture into a plurality of combinations of coding units, prediction units, and transform units, and may select one combination of coding units, prediction units, and transform units so that the picture can be encoded.
  • one picture may be divided into a plurality of coding units.
  • a recursive tree structure such as a quad tree structure can be used.
  • a coding unit that is divided into other coding units, with one picture or the largest coding unit as a root, can be divided with as many child nodes as the number of divided coding units. Under certain constraints, a coding unit that is no longer divided becomes a leaf node. That is, when it is assumed that only square division is possible for one coding unit, one coding unit can be divided into at most four other coding units.
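  • as an illustration of the recursive quad-tree splitting described above, the following sketch splits a square coding unit into four equally sized children until a minimum size is reached; the always-split rule, the function name, and the block sizes are assumptions chosen for illustration only, since a real encoder would decide each split by rate-distortion cost.

```python
# minimal quad-tree splitting sketch: a square coding unit is either kept as a
# leaf or split into four equally sized child coding units, as described above.
def split_coding_unit(x, y, size, min_size=16):
    """Return the leaf coding units of a quad-tree as (x, y, size) tuples."""
    if size <= min_size:
        return [(x, y, size)]                      # leaf node: no further split
    half = size // 2
    leaves = []
    for dy in (0, half):                           # four children: NW, NE, SW, SE
        for dx in (0, half):
            leaves += split_coding_unit(x + dx, y + dy, half, min_size)
    return leaves

# a 64x64 coding tree unit split down to 16x16 leaves yields 16 coding units
print(len(split_coding_unit(0, 0, 64)))            # -> 16
```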
  • a coding unit may be used as a unit for performing coding, or may be used as a unit for performing decoding.
  • the prediction unit may be obtained by dividing one coding unit into at least one square or rectangle of the same size, or may be divided such that one prediction unit divided within a coding unit has a different shape and/or size from another prediction unit.
  • intra prediction can be performed without dividing the unit into a plurality of NxN prediction units.
  • the prediction units 120 and 125 may include an inter prediction unit 120 for performing inter prediction and an intra prediction unit 125 for performing intra prediction. It is possible to determine whether to use inter prediction or intra prediction for a prediction unit and to determine concrete information (e.g., intra prediction mode, motion vector, reference picture, etc.) according to each prediction method.
  • the processing unit in which the prediction is performed may be different from the processing unit in which the prediction method and its details are determined. For example, the prediction method and the prediction mode may be determined on a prediction unit basis, and the prediction may be performed on a transform unit basis.
  • the residual value (residual block) between the generated prediction block and the original block can be input to the conversion unit 130.
  • the prediction mode information, motion vector information, and the like used for prediction can be encoded by the entropy encoding unit 165 together with the residual value and transmitted to the decoder.
  • when a specific encoding mode is used, the entropy encoding unit 165 may directly encode the original block and transmit it to the decoder without generating a prediction block through the prediction units 120 and 125.
  • the inter prediction unit 120 may predict a prediction unit based on information of at least one of a previous picture or a subsequent picture of the current picture, and in some cases may predict a prediction unit based on information of a partially encoded region within the current picture.
  • the inter prediction unit 120 may include a reference picture interpolation unit, a motion prediction unit, and a motion compensation unit.
  • the reference picture interpolation unit may receive reference picture information from the memory 155 and generate pixel information of sub-integer pixels in the reference picture.
  • for luma pixels, a DCT-based interpolation filter having different filter coefficients may be used to generate pixel information of sub-integer pixels in units of 1/4 pixel.
  • for chroma pixels, a DCT-based 4-tap interpolation filter having different filter coefficients may be used to generate pixel information of sub-integer pixels in units of 1/8 pixel.
  • the motion prediction unit may perform motion prediction based on the reference picture interpolated by the reference picture interpolating unit.
  • Various methods such as Full Search-based Block Matching Algorithm (FBMA), Three Step Search (TSS), and New Three-Step Search Algorithm (NTS) can be used as methods for calculating motion vectors.
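  • a minimal sketch of the full search-based block matching mentioned above is shown below; the SAD cost, the 8x8 block size, and the search range are illustrative assumptions rather than values required by the text.

```python
import numpy as np

def full_search_mv(cur, ref, bx, by, bsize=8, search=8):
    """Return the integer motion vector (dx, dy) minimizing SAD for one block."""
    h, w = ref.shape
    block = cur[by:by + bsize, bx:bx + bsize].astype(np.int64)
    best_mv, best_sad = (0, 0), None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            x, y = bx + dx, by + dy
            if x < 0 or y < 0 or x + bsize > w or y + bsize > h:
                continue                           # candidate block leaves the picture
            cand = ref[y:y + bsize, x:x + bsize].astype(np.int64)
            sad = int(np.abs(block - cand).sum())  # sum of absolute differences
            if best_sad is None or sad < best_sad:
                best_mv, best_sad = (dx, dy), sad
    return best_mv, best_sad
```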
  • the motion vector may have a motion vector value of 1/2 or 1/4 pixel unit based on the interpolated pixel.
  • the motion prediction unit can predict the current prediction unit by varying the motion prediction method.
  • Various methods such as a skip method, a merge method, an AMVP (Advanced Motion Vector Prediction) method, and an Intra Block Copy method can be used as the motion prediction method.
  • the intra prediction unit 125 can generate a prediction unit based on reference pixel information around the current block which is pixel information in the current picture.
  • when a neighboring block of the current prediction unit is a block on which inter prediction has been performed, and the reference pixel is thus a pixel reconstructed by inter prediction, the reference pixel included in the block on which inter prediction has been performed may be replaced with reference pixel information of a neighboring block on which intra prediction has been performed. That is, when a reference pixel is not available, the unavailable reference pixel information may be replaced with at least one of the available reference pixels.
  • the prediction mode may have a directional prediction mode in which reference pixel information is used according to a prediction direction, and a non-directional mode in which direction information is not used in prediction.
  • the mode for predicting the luminance information may be different from the mode for predicting the chrominance information and the intra prediction mode information or predicted luminance signal information used for predicting the luminance information may be utilized to predict the chrominance information.
  • when performing intra prediction, if the size of the prediction unit is the same as the size of the transform unit, intra prediction may be performed on the prediction unit based on the pixels on the left side, the upper-left side, and the top of the prediction unit.
  • however, when performing intra prediction, if the size of the prediction unit differs from the size of the transform unit, intra prediction may be performed using reference pixels based on the transform unit. In addition, intra prediction using NxN division may be used only for the minimum coding unit.
  • the intra prediction method can generate a prediction block after applying an AIS (Adaptive Intra Smoothing) filter to the reference pixel according to the prediction mode.
  • the type of the AIS filter applied to the reference pixel may be different.
  • the intra prediction mode of the current prediction unit can be predicted from the intra prediction mode of the prediction unit existing around the current prediction unit.
  • when the intra prediction mode of the current prediction unit is predicted using mode information predicted from a neighboring prediction unit, if the intra prediction mode of the current prediction unit is the same as the intra prediction mode of the neighboring prediction unit, information indicating that the current prediction unit and the neighboring prediction unit have the same prediction mode can be transmitted using predetermined flag information,
  • and if the prediction mode of the current prediction unit is different from the prediction mode of the neighboring prediction unit, the prediction mode information of the current block can be encoded by performing entropy encoding.
  • a residual block including residual information, which is the difference between the prediction unit generated by the prediction units 120 and 125 and the original block of the prediction unit, may be generated.
  • the generated residual block may be input to the transform unit 130.
  • the transform unit 130 may transform the residual block, which includes the residual information between the original block and the prediction unit generated through the prediction units 120 and 125, using a transform method such as DCT (Discrete Cosine Transform), DST (Discrete Sine Transform), or KLT.
  • the decision to apply the DCT, DST, or KLT to transform the residual block may be based on the intra prediction mode information of the prediction unit used to generate the residual block.
  • the quantization unit 135 may quantize the values transformed into the frequency domain by the transform unit 130. The quantization factor may vary depending on the block or the importance of the image. The values calculated by the quantization unit 135 may be provided to the inverse quantization unit 140 and the reordering unit 160.
  • the reordering unit 160 can reorder the coefficient values with respect to the quantized residual values.
  • the reordering unit 160 may change the two-dimensional block type coefficient to a one-dimensional vector form through a coefficient scanning method.
  • the reordering unit 160 may scan from DC coefficients up to coefficients in the high-frequency region using a zig-zag scan method and change them into a one-dimensional vector form.
  • depending on the size of the transform unit and the intra prediction mode, a vertical scan that scans two-dimensional block-type coefficients in the column direction or a horizontal scan that scans two-dimensional block-type coefficients in the row direction may be used instead of the zig-zag scan. That is, which of the zig-zag scan, the vertical scan, and the horizontal scan is used can be determined according to the size of the transform unit and the intra prediction mode.
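  • the coefficient scanning just described can be illustrated with the small sketch below, which flattens a square block of quantized coefficients using a zig-zag, vertical, or horizontal scan; the scan orders are generic illustrations, not the exact scan tables of any particular codec.

```python
def zigzag_order(n):
    """Visit order of an n x n block along anti-diagonals (zig-zag scan)."""
    order = []
    for s in range(2 * n - 1):
        diag = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        order.extend(diag if s % 2 else diag[::-1])   # alternate diagonal direction
    return order

def scan_coefficients(block, mode="zigzag"):
    """Flatten a 2-D coefficient block into a 1-D list using the chosen scan."""
    n = len(block)
    if mode == "horizontal":                          # row by row
        return [block[r][c] for r in range(n) for c in range(n)]
    if mode == "vertical":                            # column by column
        return [block[r][c] for c in range(n) for r in range(n)]
    return [block[r][c] for r, c in zigzag_order(n)]  # DC first, high frequencies last
```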
  • the entropy encoding unit 165 may perform entropy encoding based on the values calculated by the reordering unit 160.
  • various encoding methods such as Exponential Golomb, Context-Adaptive Variable Length Coding (CAVLC), and Context-Adaptive Binary Arithmetic Coding (CABAC) may be used.
  • the entropy encoding unit 165 may receive various information, such as residual coefficient information and block type information of the coding unit, prediction mode information, division unit information, prediction unit information, transmission unit information, motion vector information, reference frame information, block interpolation information, and filtering information, from the reordering unit 160 and the prediction units 120 and 125, and encode the information.
  • the entropy encoding unit 165 can entropy-encode the coefficient value of the encoding unit input by the reordering unit 160.
  • the inverse quantization unit 140 and the inverse transform unit 145 inversely quantize the values quantized by the quantization unit 135 and inversely transform the values transformed by the transform unit 130.
  • the residual values generated by the inverse quantization unit 140 and the inverse transform unit 145 may be combined with the prediction unit predicted through the motion estimation unit, the motion compensation unit, and the intra prediction unit included in the prediction units 120 and 125 to generate a reconstructed block.
  • the filter unit 150 may include at least one of a deblocking filter, an offset correction unit, and an adaptive loop filter (ALF).
  • the deblocking filter can remove block distortion caused by boundaries between blocks in the reconstructed picture. In order to determine whether to perform deblocking, it may be determined whether to apply a deblocking filter to the current block based on the pixels included in a few columns or rows of the block. When a deblocking filter is applied to a block, a strong filter or a weak filter may be applied according to the required deblocking filtering strength. In applying the deblocking filter, when vertical filtering and horizontal filtering are performed, the horizontal filtering and the vertical filtering may be processed in parallel.
  • the offset correction unit may correct the offset of the deblocked image with respect to the original image in units of pixels.
  • pixels included in an image are divided into a predetermined number of areas, and then an area to be offset is determined and an offset is applied to the area.
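  • one way to realize the area-based offset correction described above is a band-offset style correction, sketched below; dividing the pixel value range into 32 bands and the per-band offset values are assumptions chosen only for illustration.

```python
import numpy as np

def band_offset(recon, offsets, num_bands=32, max_val=255):
    """Add a per-band offset to a reconstructed image (band-offset style sketch).

    recon   : 2-D array of reconstructed pixel values
    offsets : dict mapping band index -> offset; bands not listed are left unchanged
    """
    band_width = (max_val + 1) // num_bands            # e.g. 256 / 32 = 8 values per band
    bands = np.clip(recon, 0, max_val) // band_width   # band index of every pixel
    out = recon.astype(np.int32)
    for band, off in offsets.items():
        out[bands == band] += off                      # shift only the selected bands
    return np.clip(out, 0, max_val).astype(recon.dtype)
```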
  • Adaptive Loop Filtering can be performed based on a comparison between the filtered reconstructed image and the original image. After dividing the pixels included in the image into a predetermined group, one filter to be applied to the group may be determined and different filtering may be performed for each group.
  • the information related to whether to apply the ALF may be transmitted for each coding unit (CU), and the shape and the filter coefficient of the ALF filter to be applied may be changed according to each block. Also, an ALF filter of the same type (fixed form) may be applied irrespective of the characteristics of the application target block.
  • the memory 155 may store the reconstructed block or picture calculated through the filter unit 150 and the reconstructed block or picture stored therein may be provided to the predictor 120 or 125 when the inter prediction is performed.
  • FIG. 2 is a block diagram illustrating an image decoding apparatus according to an embodiment of the present invention.
  • the image decoder 200 may include an entropy decoding unit 210, a reordering unit 215, an inverse quantization unit 220, an inverse transform unit 225, prediction units 230 and 235, a filter unit 240, and a memory 245.
  • the input bitstream may be decoded in a procedure opposite to that of the image encoder.
  • the entropy decoding unit 210 can perform entropy decoding in a procedure opposite to that in which entropy encoding is performed in the entropy encoding unit of the image encoder. For example, various methods such as Exponential Golomb, Context-Adaptive Variable Length Coding (CAVLC), and Context-Adaptive Binary Arithmetic Coding (CABAC) may be applied in accordance with the method performed by the image encoder.
  • the entropy decoding unit 210 may decode information related to intra prediction and inter prediction performed in the encoder.
  • the reordering unit 215 can perform reordering of the bitstream entropy-decoded by the entropy decoding unit 210 based on the reordering method used in the encoder.
  • the coefficients expressed in one-dimensional vector form can be reconstructed and rearranged into coefficients of two-dimensional block form.
  • the reordering unit 215 can perform reordering by receiving information related to the coefficient scanning performed by the encoder and performing reverse scanning based on the scanning order performed by the encoder.
  • the inverse quantization unit 220 can perform inverse quantization based on the quantization parameters provided by the encoder and the coefficient values of the re-arranged blocks.
  • the inverse transform unit 225 may perform inverse DCT, inverse DST, or inverse KLT, that is, the inverse of the DCT, DST, or KLT performed by the transform unit of the image encoder, on the quantization result.
  • the inverse transform can be performed based on the transmission unit determined by the image encoder.
  • in the inverse transform unit 225 of the image decoder, a transform technique (e.g., DCT, DST, KLT) may be selectively applied according to a plurality of pieces of information such as the prediction method, the size of the current block, and the prediction direction.
  • the prediction units 230 and 235 can generate a prediction block based on the prediction block generation related information provided by the entropy decoding unit 210 and the previously decoded block or picture information provided in the memory 245.
  • as in the image encoder, when performing intra prediction, if the size of the prediction unit differs from the size of the transform unit, intra prediction is performed using reference pixels based on the transform unit. In addition, intra prediction using NxN division may be used only for the minimum coding unit.
  • the prediction units 230 and 235 may include a prediction unit determination unit, an inter prediction unit, and an intra prediction unit.
  • the prediction unit determination unit may receive various information, such as prediction unit information input from the entropy decoding unit 210, prediction mode information of the intra prediction method, and motion prediction related information of the inter prediction method, distinguish the prediction unit in the current coding unit, and determine whether the prediction unit performs inter prediction or intra prediction.
  • the inter prediction unit 230 may perform inter prediction on the current prediction unit based on information included in at least one of the previous picture or the subsequent picture of the current picture containing the current prediction unit, using the information necessary for inter prediction of the current prediction unit provided by the image encoder. Alternatively, inter prediction may be performed based on information of a partial region already reconstructed within the current picture containing the current prediction unit.
  • in order to perform inter prediction, it may be determined, on the basis of the coding unit, which of the skip mode, the merge mode, the AMVP mode, and the intra block copy mode is the motion prediction method of the prediction unit included in the corresponding coding unit.
  • the intra prediction unit 235 can generate a prediction block based on the pixel information in the current picture. If the prediction unit is a prediction unit that performs intra prediction, the intra prediction can be performed based on the intra prediction mode information of the prediction unit provided by the image encoder.
  • the intraprediction unit 235 may include an AIS (Adaptive Intra Smoothing) filter, a reference pixel interpolator, and a DC filter.
  • the AIS filter performs filtering on the reference pixels of the current block and can determine whether to apply the filter according to the prediction mode of the current prediction unit.
  • the AIS filtering can be performed on the reference pixel of the current block using the prediction mode of the prediction unit provided in the image encoder and the AIS filter information. When the prediction mode of the current block is a mode in which AIS filtering is not performed, the AIS filter may not be applied.
  • when the prediction mode of the prediction unit is a prediction mode in which intra prediction is performed based on pixel values obtained by interpolating reference pixels, the reference pixel interpolation unit may interpolate the reference pixels to generate reference pixels in sub-integer pixel units.
  • when the prediction mode of the current prediction unit is a prediction mode that generates a prediction block without interpolating the reference pixels, the reference pixels may not be interpolated.
  • the DC filter can generate a prediction block through filtering when the prediction mode of the current block is the DC mode.
  • the restored block or picture may be provided to the filter unit 240.
  • the filter unit 240 may include a deblocking filter, an offset correction unit, and an ALF.
  • the deblocking filter of the image decoder may be provided, from the image encoder, with information about the deblocking filter, such as whether a deblocking filter has been applied to the corresponding block or picture and, when it has been applied, whether a strong filter or a weak filter was applied.
  • the deblocking filter of the image decoder is provided with the deblocking filter related information from the image encoder, and the image decoder can perform deblocking filtering on the corresponding block.
  • the offset correction unit may perform offset correction on the reconstructed image based on the type of offset correction applied to the image and the offset value information during encoding.
  • the ALF can be applied to an encoding unit on the basis of ALF application information and ALF coefficient information provided from an encoder.
  • ALF information may be provided in a specific parameter set.
  • the memory 245 may store the reconstructed picture or block to be used as a reference picture or a reference block, and may also provide the reconstructed picture to the output unit.
  • hereinafter, a coding unit is used as an encoding unit for convenience of explanation, but it may also be a unit that performs decoding as well as encoding.
  • the current block indicates a block to be coded / decoded.
  • the current block may represent a coding tree block (or coding tree unit), a coding block (or coding unit), a transform block (or transform unit), a prediction block (or prediction unit), and the like, depending on the encoding/decoding step.
  • 'unit' represents a basic unit for performing a specific encoding / decoding process
  • 'block' may represent a sample array of a predetermined size.
  • the terms 'block' and 'unit' may be used interchangeably.
  • the encoding block (coding block) and the encoding unit (coding unit) have mutually equivalent meanings.
  • the basic block may be referred to as a coding tree unit.
  • the coding tree unit may be defined as a coding unit of the largest size allowed in a sequence or a slice. Information regarding whether the coding tree unit is square or non-square or about the size of the coding tree unit can be signaled through a sequence parameter set, a picture parameter set, or a slice header.
  • the coding tree unit can be divided into smaller size partitions. In this case, if the partition generated by dividing the coding tree unit is depth 1, the partition created by dividing the partition having depth 1 can be defined as depth 2. That is, the partition created by dividing the partition having the depth k in the coding tree unit can be defined as having the depth k + 1.
  • a partition of arbitrary size generated as the coding tree unit is divided can be defined as a coding unit.
  • the coding unit may be recursively divided or divided into basic units for performing prediction, quantization, transformation, or in-loop filtering, and the like.
  • a partition of arbitrary size generated as a coding unit is divided may be defined as a coding unit, or may be defined as a conversion unit or a prediction unit, which is a basic unit for performing prediction, quantization, conversion or in-loop filtering and the like.
  • a prediction block having the same size as the coding block or smaller than the coding block can be determined through predictive division of the coding block.
  • Predictive partitioning of the coded block can be performed by a partition mode (Part_mode) indicating the partition type of the coded block.
  • the size or shape of the prediction block may be determined according to the partition mode of the coding block.
  • the division type of the coding block can be determined through information specifying any one of the partition candidates.
  • the partition candidates available to the coding block may include an asymmetric partition type (for example, nLx2N, nRx2N, 2NxnU, 2NxnD) depending on the size, type, coding mode or the like of the coding block.
  • the partition candidate available to the coding block may be determined according to the coding mode of the current block. For example, FIG. 3 illustrates a partition mode that can be applied to a coding block when the coding block is coded by inter-picture prediction.
  • as in the example shown in FIG. 3, any one of eight partition modes can be applied to the coding block.
  • when the coding block is coded by intra-picture prediction, PART_2Nx2N or PART_NxN can be applied as the partition mode.
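  • the geometry of the partition modes listed above can be summarized as in the sketch below; the eight-mode set and the 1:3 split ratio of the asymmetric modes follow the usual HEVC-style convention and are stated here only as an assumption.

```python
def partition_sizes(mode, width, height):
    """Return the (width, height) of each prediction block of a coding block."""
    w, h = width, height
    table = {
        "PART_2Nx2N": [(w, h)],                              # no split
        "PART_2NxN":  [(w, h // 2)] * 2,                     # two horizontal halves
        "PART_Nx2N":  [(w // 2, h)] * 2,                     # two vertical halves
        "PART_NxN":   [(w // 2, h // 2)] * 4,                # four quadrants
        "PART_2NxnU": [(w, h // 4), (w, 3 * h // 4)],        # asymmetric, small part on top
        "PART_2NxnD": [(w, 3 * h // 4), (w, h // 4)],        # asymmetric, small part at the bottom
        "PART_nLx2N": [(w // 4, h), (3 * w // 4, h)],        # asymmetric, small part on the left
        "PART_nRx2N": [(3 * w // 4, h), (w // 4, h)],        # asymmetric, small part on the right
    }
    return table[mode]

# a 32x32 coding block with PART_2NxnU: a 32x8 block above a 32x24 block
print(partition_sizes("PART_2NxnU", 32, 32))
```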
  • PART_NxN may be applied when the coding block has a minimum size.
  • the minimum size of the coding block may be one previously defined in the encoder and the decoder.
  • information regarding the minimum size of the coding block may be signaled via the bitstream.
  • the minimum size of the coding block is signaled through the slice header, so that the minimum size of the coding block per slice can be defined.
  • the partition candidates available to the coding block may be determined differently depending on at least one of the size or type of the coding block. In one example, the number or type of partition candidates available to the coding block may be differently determined according to at least one of the size or type of the coding block.
  • the type or number of asymmetric partition candidates among the partition candidates available to the coding block may be limited depending on the size or type of the coding block. In one example, the number or type of asymmetric partition candidates available to the coding block may be differently determined according to at least one of the size or type of the coding block.
  • the size of the prediction block may have a size from 64x64 to 4x4.
  • when the coding block is coded by inter-picture prediction, the prediction block may be prevented from having a 4x4 size in order to reduce memory bandwidth when performing motion compensation.
  • FIGS. 4 to 6 show examples in which a plurality of cameras is used to simultaneously photograph up and down, left and right, or front and back.
  • a video generated by stitching a plurality of videos can be referred to as a panoramic video.
  • an image having a degree of freedom (Degree of Freedom) based on a predetermined center axis can be referred to as a 360-degree video.
  • the 360 degree video may be an image having rotational degrees of freedom for at least one of Yaw, Roll, and Pitch.
  • the camera structure (or camera arrangement) for acquiring 360-degree video may have a circular arrangement as in the example shown in FIG. 4, a one-dimensional vertical/horizontal arrangement as in the example shown in FIG. 5(a), or a two-dimensional arrangement (i.e., a combination of vertical and horizontal arrangements) as in the example shown in FIG. 5(b).
  • a plurality of cameras may be mounted on the spherical device.
  • FIG. 7 is a block diagram of a 360-degree video data generation apparatus and a 360-degree video play apparatus.
  • FIG. 8 is a flowchart illustrating the operations of a 360-degree video data generation apparatus and a 360-degree video play apparatus.
  • the 360-degree video data generation apparatus may include a projection unit 710, a frame packing unit 720, an encoding unit 730, and a transmission unit 740, and the 360-degree video play apparatus may include a file parsing unit 750, a decoding unit 760, a frame depacking unit 770, and an inverse transformation unit 780.
  • the encoding unit and the decoding unit shown in FIG. 7 may correspond to the image encoding apparatus and the image decoding apparatus shown in FIG. 1 and FIG. 2, respectively.
  • the data generation apparatus can determine a projection transformation technique of a 360-degree image generated by stitching an image photographed by a plurality of cameras.
  • the 3D shape of the 360-degree video is determined according to the determined projection transformation technique, and the 360-degree video is projected on the 2D plane according to the determined 3D shape (S801).
  • the projection transformation technique can represent a 3D shape of 360-degree video and an aspect in which 360-degree video is developed on the 2D plane.
  • 360-degree images can be approximated to have shapes such as spheres, cylinders, cubes, octahedrons, or icosahedrons in 3D space according to the projection transformation technique.
  • an image generated by projecting a 360-degree video onto a 2D plane can be referred to as a 360-degree projection image.
  • the 360 degree projection image may be composed of at least one face according to the projection transformation technique.
  • each surface constituting the polyhedron can be defined as a face.
  • the specific surface constituting the polyhedron may be divided into a plurality of regions, and each divided region may be configured to form a separate face.
  • a plurality of faces on the polyhedron may be configured to form one face.
  • even a 360-degree video approximated as a sphere can have multiple faces depending on the projection transformation technique.
  • Frame packing may be performed in the frame packing unit 720 in order to increase the encoding / decoding efficiency of the 360-degree video (S802).
  • the frame packing may include at least one of rearranging, resizing, warping, rotating, or flipping the face.
  • the 360 degree projection image can be converted into a form having a high encoding / decoding efficiency (for example, a rectangle) or discontinuous data between faces can be removed.
  • the frame packing may also be referred to as frame reordering or Region-wise Packing.
  • the frame packing may be selectively performed to improve the coding / decoding efficiency for the 360 degree projection image.
  • the 360-degree projection image or the 360-degree projection image in which the frame packing is performed may be encoded (S803).
  • the encoding unit 730 may encode information indicating a projection transformation technique for 360-degree video.
  • the information indicating the projection transformation technique may be index information indicating any one of a plurality of projection transformation techniques.
  • the encoding unit 730 can encode information related to frame packing for 360-degree video.
  • the information related to the frame packing may include at least one of whether frame packing has been performed, the number of faces, the position of a face, the size of a face, the shape of a face, or rotation information of a face.
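  • a sketch of how such frame-packing (region-wise packing) information might be represented and applied is given below; the field names, the 90-degree rotation steps, and the horizontal flip are assumptions made for illustration and do not reflect syntax defined in this document.

```python
import numpy as np
from dataclasses import dataclass

@dataclass
class FacePackingInfo:
    face_id: int      # which projected face this region carries
    dst_x: int        # top-left position of the face inside the packed frame
    dst_y: int
    rotation: int     # rotation applied before placement: 0, 90, 180 or 270 degrees
    flip: bool        # horizontal flip applied before placement

def pack_faces(faces, layout, frame_shape):
    """Place projected faces into a single packed frame according to `layout`."""
    frame = np.zeros(frame_shape, dtype=faces[0].dtype)
    for info in layout:
        face = faces[info.face_id]
        if info.flip:
            face = np.fliplr(face)                 # flip, then rotate, then place
        face = np.rot90(face, k=info.rotation // 90)
        h, w = face.shape
        frame[info.dst_y:info.dst_y + h, info.dst_x:info.dst_x + w] = face
    return frame
```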
  • the transmitting unit 740 encapsulates the bit stream and transmits the encapsulated data to the player terminal (S804).
  • the file parsing unit 750 can parse the file received from the content providing apparatus (S805).
  • the decoding unit 760 can decode the 360-degree projection image using the parsed data (S806).
  • the frame depacking unit 770 may perform frame depacking (region-wise depacking), which is the reverse of the frame packing performed on the content providing side (S807).
  • the frame de-packing may be to restore the frame-packed 360 degree projection image to before the frame packing is performed.
  • frame depacking may be to reverse the face rearrangement, resizing, warping, rotation, or flipping performed at the data generation apparatus.
  • the inverse transformation unit 780 can perform inverse projection on the 360 degree projection image on the 2D plane in 3D form according to the projection transformation technique of 360 degree video (S808).
  • projection transformation techniques may include at least one of Equirectangular Projection (ERP), Cube Map Projection (CMP), Icosahedral Projection (ISP), Octahedron Projection (OHP), Truncated Pyramid Projection (TPP), Sphere Segment Projection (SSP), Equatorial Cylindrical Projection (ECP), and rotated spherical projection (RSP).
  • FIG. 9 shows a 2D projection method using an equirectangular projection (ERP) technique.
  • the equirectangular projection is a method of projecting the pixels corresponding to a sphere onto a rectangle having an aspect ratio of N:1, and is the most widely used 2D transformation technique.
  • N may be 2, or may be 2 or less or 2 or more real numbers.
  • under the equirectangular projection, the actual length on the sphere corresponding to a unit length on the 2D plane becomes shorter as the position gets closer to the poles of the sphere.
  • for example, the coordinates of both ends of a unit length on the 2D plane may correspond to a distance of 20 cm near the equator of the sphere, but only to a distance of 5 cm near the poles of the sphere.
  • accordingly, the equirectangular projection method has a disadvantage in that the image is distorted near the poles of the sphere and the coding efficiency is lowered.
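  • a small numeric sketch of this effect is given below, assuming a sphere of radius R and an ERP image W samples wide: the arc length on the sphere covered by one horizontal sample step shrinks with the cosine of the latitude, so steps near the poles cover far less of the sphere than steps near the equator.

```python
import math

def arc_per_horizontal_sample(radius_m, width_samples, latitude_deg):
    """Arc length on the sphere covered by one horizontal ERP sample at a latitude."""
    circumference = 2.0 * math.pi * radius_m * math.cos(math.radians(latitude_deg))
    return circumference / width_samples

R, W = 1.0, 3840          # illustrative values: 1 m sphere, 3840-sample-wide ERP image
for lat in (0, 45, 80, 89):
    print(lat, round(arc_per_horizontal_sample(R, W, lat) * 100, 4), "cm")
# near the equator one sample step covers about 0.16 cm of the sphere, but near the
# poles it covers only a small fraction of that, which is the distortion noted above.
```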
  • the cube map projection method approximates a 360-degree video with a cube and then transforms the cube into 2D. In this case, the 360-degree video is mapped onto six faces (or planes) of the cube.
  • the cube projection method has an advantage in that the coding efficiency is higher than that of the equirectangular projection method.
  • the 2D projection converted image may be rearranged into a rectangular shape to perform encoding / decoding.
  • FIG. 11 shows a 2D projection method using an icosahedral projection technique.
  • the icosahedral projection method is a method of approximating a 360-degree video with an icosahedron and transforming it into 2D.
  • the icosahedral projection technique has strong continuity between faces.
  • the octahedron projection method is a method of approximating a 360 degree video to an octahedron and transforming it into 2D.
  • the octahedral projection technique is characterized by strong continuity between faces. As in the example shown in FIG. 12, it is possible to perform encoding / decoding by rearranging the faces in the 2D projection-converted image.
  • FIG. 13 shows a 2D projection method using a truncated pyramid projection technique.
  • the truncated pyramid projection technique is a method of approximating a 360-degree video with a truncated pyramid and transforming it into 2D.
  • under the truncated pyramid projection technique, frame packing may be performed such that the face of a specific viewpoint has a different size from the neighboring faces.
  • for example, the Front face may have a larger size than the side faces and the Back face.
  • accordingly, the amount of image data for the specific viewpoint is large, and the encoding/decoding efficiency of the specific viewpoint is higher than that of the other viewpoints.
  • the SSP is a method of performing 2D projection transformation by dividing spherical 360 degree video into high latitude regions and mid-latitude regions. Specifically, as in the example shown in Fig. 14, two high-latitude regions in the north and south directions of the sphere can be mapped to two circles on the 2D plane, and the mid-latitude region of the sphere can be mapped to a rectangle on the 2D plane like the ERP.
  • the boundary between high latitudes and mid-latitudes may be 45 degrees latitude or above or below latitude 45 degrees.
  • ECP is a method of transforming spherical 360 degree video into cylindrical shape and then 2D cylindrical projection of 360 degree video. Specifically, when the ECP is followed, the upper and lower surfaces of the cylinder can be mapped to two circles on the 2D plane, and the body of the cylinder can be mapped to a rectangle on the 2D plane.
  • the RSP represents a method of projecting a sphere-shaped 360-degree video into two ellipse-shaped regions on a 2D plane, similar to the way the surface of a tennis ball is divided into two segments.
  • Each sample of the 360 degree projection image can be identified by face 2D coordinates.
  • the face 2D coordinates may include an index f for identifying the face where the sample is located, and coordinates (m, n) representing a sample grid in the 360 degree projection image.
  • FIG. 15 is an illustration to illustrate the conversion between face 2D coordinates and three-dimensional coordinates.
  • the conversion between the three-dimensional coordinates (x, y, z) and the face 2D coordinates (f, m, n) can be performed using the following equations.
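  • the equations themselves are not reproduced in this text, so the sketch below only illustrates the idea of such a conversion for the single-face ERP case, using one common (assumed) longitude/latitude convention; the actual equations in the specification, and the per-face equations of multi-face projections such as CMP, may differ.

```python
import math

def erp_to_xyz(m, n, width, height):
    """Map an ERP sample position (m, n) to a point (x, y, z) on the unit sphere.

    Assumed convention: m in [0, width) spans longitude [-pi, pi),
    n in [0, height) spans latitude [+pi/2, -pi/2].
    """
    lon = ((m + 0.5) / width - 0.5) * 2.0 * math.pi
    lat = (0.5 - (n + 0.5) / height) * math.pi
    x = math.cos(lat) * math.cos(lon)
    y = math.sin(lat)
    z = -math.cos(lat) * math.sin(lon)
    return x, y, z

def xyz_to_erp(x, y, z, width, height):
    """Inverse mapping from a unit-sphere point back to ERP sample coordinates."""
    lon = -math.atan2(z, x)
    lat = math.asin(max(-1.0, min(1.0, y)))
    m = (lon / (2.0 * math.pi) + 0.5) * width - 0.5
    n = (0.5 - lat / math.pi) * height - 0.5
    return m, n
```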
  • the current picture may include at least one face.
  • the number of faces may be 1, 2, 3, 4 or more natural numbers, depending on the projection method.
  • f may be set to a value equal to or less than the number of faces.
  • the current picture may include at least one face having the same temporal order or output order (POC).
  • the number of faces constituting the current picture may be fixed or variable.
  • the number of faces constituting the current picture may be limited so as not to exceed a predetermined threshold value.
  • the threshold value may be a fixed value predefined in the encoder and the decoder.
  • information regarding the maximum number of faces constituting one picture may be signaled through the bit stream.
  • faces can be determined by partitioning the current picture using at least one of horizontal lines, vertical lines, or diagonal lines, depending on the projection method.
  • Each face in the picture may be assigned an index to identify each face.
  • Each face may be capable of parallel processing, such as a tile or a slice. Accordingly, when intra prediction or inter prediction of the current block is performed, a neighboring block belonging to a different face from the current block can be judged as unavailable.
  • faces for which parallel processing is not allowed may be defined, or interdependent faces may be defined.
  • faces for which parallel processing is not allowed, or interdependent faces, may be encoded/decoded sequentially instead of being encoded/decoded in parallel. Accordingly, even if a neighboring block belongs to a different face than the current block, the neighboring block may be determined to be available for intra prediction or inter prediction of the current block, depending on whether inter-face parallel processing is possible or whether the faces are dependent on each other.
  • padding can be performed at a picture or face boundary.
  • the padding may be performed as a part of performing the frame packing (S802), or may be performed as a separate step before performing the frame packing.
  • padding may be performed in the preprocessing process before encoding the 360-degree projection image in which the frame packing is performed, or padding may be performed as a part of the encoding step S803.
  • the padding can be performed considering the continuity of the 360 degree image.
  • the continuity of the 360 degree image may mean spatially continuous when the 360 degree projection image is projected backward as a sphere or a polyhedron.
  • in the 360-degree projection image, spatially continuous faces have mutual continuity, and padding at a picture boundary or a face boundary may be performed using spatially continuous samples.
  • FIG. 16 is a diagram for explaining an example in which padding is performed on an ERP projection image.
  • the upper boundary on the left has continuity with the upper boundary on the right.
  • accordingly, pixels G and H outside the upper-left boundary can be predicted to be similar to the inner pixels G' and H' of the upper-right boundary, and pixels I and J outside the upper-right boundary can be predicted to be similar to the inner pixels I' and J' of the upper-left boundary.
  • likewise, the lower boundary on the left has continuity with the lower boundary on the right.
  • accordingly, pixels K and L outside the lower-left boundary can be predicted to be similar to the inner pixels K' and L' of the lower-right boundary,
  • and pixels M and N outside the lower-right boundary can be predicted to be similar to the inner pixels M' and N' of the lower-left boundary.
  • padding can be performed at the boundary of the 360 degree projection image or at the boundary between faces.
  • the padding can be performed using samples contained inside the boundary having continuity with the boundary where the padding is performed.
  • for example, padding may be performed at the left boundary of the 360-degree projection image using the samples adjacent to the right boundary,
  • and padding may be performed at the right boundary of the 360-degree projection image using the samples adjacent to the left boundary. That is, at positions A, B, and C of the left boundary, padding can be performed using samples at positions A', B', and C' inside the right boundary, and at positions D, E, and F of the right boundary, padding can be performed using samples at positions D', E', and F' inside the left boundary.
  • likewise, padding may be performed at the upper-left boundary using samples adjacent to the upper-right boundary,
  • and padding can be performed at the upper-right boundary using samples adjacent to the upper-left boundary. That is, at the G and H positions of the upper-left boundary, padding is performed using the samples at the G' and H' positions inside the upper-right boundary, and at the I and J positions of the upper-right boundary, padding can be performed using the samples at the I' and J' positions inside the upper-left boundary.
  • in addition, padding may be performed at the lower-left boundary using samples adjacent to the lower-right boundary, and at the lower-right boundary using samples adjacent to the lower-left boundary. That is, at the K and L positions of the lower-left boundary, padding is performed using samples at the K' and L' positions inside the lower-right boundary, and at the M and N positions of the lower-right boundary, padding can be performed using the samples at the M' and N' positions inside the lower-left boundary.
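  • a sketch of the boundary padding just described for an ERP image is given below, assuming k padding samples on every side; the left/right padding wraps around horizontally, while the top/bottom padding takes rows just inside the edge, flipped vertically and shifted by half the width, because crossing a pole lands 180 degrees away in longitude.

```python
import numpy as np

def pad_erp(image, k):
    """Pad an ERP image with k samples on each side using spherical continuity."""
    h, w = image.shape[:2]
    # top/bottom padding: rows just inside the edge, flipped vertically and shifted
    # by half the width so that the padded rows continue across the pole.
    top = np.roll(image[:k][::-1], w // 2, axis=1)
    bottom = np.roll(image[h - k:][::-1], w // 2, axis=1)
    tall = np.concatenate([top, image, bottom], axis=0)
    # left/right padding: plain horizontal wrap-around of the heightened image.
    return np.concatenate([tall[:, w - k:], tall, tall[:, :k]], axis=1)

# usage sketch: produces a (H + 2k) x (W + 2k) padded ERP picture
# padded = pad_erp(erp_picture, k=4)
```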
  • an area where padding is performed may be referred to as a padding area, and a padding area may include a plurality of sample lines.
  • the number of sample lines included in the padding area can be defined as the length of the padding area or the padding size.
  • the length of the padding area is shown as k in both the horizontal and vertical directions.
  • the length of the padding area may be set differently for each horizontal or vertical direction, or different for each face boundary.
  • large distortion occurs at the upper or lower end of the 360 degree projection image using the ERP projection transformation.
  • FIG. 17 is a view for explaining an example in which the lengths of the padding regions in the horizontal direction and the vertical direction are set differently in an ERP projection image.
  • the length of the arrow indicates the length of the padding area.
  • the length of the padding area performed in the horizontal direction and the length of the padding area performed in the vertical direction may be set differently, as in the example shown in FIG. For example, if k columns of samples are generated through padding in the horizontal direction, padding may be performed such that 2k rows of samples are generated in the vertical direction.
  • alternatively, padding may be performed with the same length in both the vertical direction and the horizontal direction, and the length of the padding area may then be extended through interpolation in at least one of the vertical direction or the horizontal direction.
  • for example, k sample lines can be generated in each of the vertical and horizontal directions, and k additional sample lines can then be generated in the vertical direction through interpolation or the like. That is, k sample lines are generated in both the horizontal and vertical directions (see FIG. 16), and k sample lines are further generated in the vertical direction so that the length in the vertical direction becomes 2k (see FIG. 17).
  • interpolation may be performed using at least one of the samples contained inside the boundary or the samples contained outside the boundary. For example, after copying the samples inside the lower boundary to the outside of the padding area adjacent to the upper boundary, an additional padding area can be created by interpolating the copied samples and the samples contained in the padding area adjacent to the upper boundary.
  • the interpolation filter may include at least one of a vertical direction filter and a horizontal direction filter. Depending on the position of the sample to be produced, either the vertical filter or the horizontal filter may be selectively used. Alternatively, the vertical filter and the horizontal filter may be used simultaneously to generate a sample included in the additional padding area.
  • the length n in the horizontal direction of the padding area and the length m in the vertical direction of the padding area may have the same value or may have different values.
  • here, n and m are integers equal to or greater than 0, and may have the same value, or one of m and n may have a smaller value than the other.
  • m and n can be encoded in the encoder and signaled through the bit stream.
  • the length n in the horizontal direction and the length m in the vertical direction in the encoder and decoder may be predefined.
  • the padding area may be generated by copying samples located inside the image.
  • the padding region located adjacent to a predetermined boundary may be generated by copying a sample located inside the boundary having continuity with a predetermined boundary in 3D space.
  • a padding area located at the left boundary of the image may be generated by copying the sample adjacent to the right border of the image.
  • a padding area may be created using at least one sample inside the boundary to be padded and at least one sample outside the boundary. For example, after padding the spatially contiguous samples with the boundary to be padded to the outside of the boundary, a weighted average calculation or an average calculation is performed between the copied samples and the samples included in the boundary, Can be determined. 16 and 17, the sample value of the padding region located at the left boundary of the image may include at least one sample adjacent to the left boundary of the image and at least one sample adjacent to the right boundary of the image Weighted average or averaged.
  • the weight applied to each sample in the weighted average operation may be determined based on the distance to the boundary where the padding region is located. For example, among the samples in the padding region located at the left boundary, a sample close to the left boundary is derived by giving a larger weight to the samples located inside the left boundary, while a sample far away from the left boundary is derived by giving a larger weight to the samples having continuity with the left boundary (that is, the samples adjacent to the right boundary of the image). A sketch of this distance-dependent weighting follows.
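A minimal sketch of the distance-dependent weighting at the image boundary, assuming an equirectangular-style image whose left and right boundaries are continuous in 3D space. The linear weight profile and the choice of using only the outermost columns are illustrative; the actual weights may be signalled or defined differently.

```python
import numpy as np

def pad_left_weighted(img, n):
    """Create an n-column padding area to the left of the image.

    Padded samples close to the left boundary lean on the column just inside
    the left boundary; samples far from it lean on the column just inside the
    right boundary (assumed continuous with the left boundary in 3D space).
    """
    left_col = img[:, 0].astype(np.float64)    # inside the left boundary
    right_col = img[:, -1].astype(np.float64)  # inside the right boundary

    pad = np.empty((img.shape[0], n), dtype=np.float64)
    for j in range(n):
        dist_to_left = n - j                    # 1 .. n, measured in columns
        w_left = 1.0 - (dist_to_left - 1) / n   # closer to the boundary => larger weight
        pad[:, j] = w_left * left_col + (1.0 - w_left) * right_col
    return np.hstack([pad, img.astype(np.float64)])

img = np.linspace(0, 255, 6 * 8).reshape(6, 8)
print(pad_left_weighted(img, 4).shape)          # (6, 12)
```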
  • frame packing can be performed by adding a padding area between faces. That is, a 360 degree projection image can be generated by adding a padding area to the face boundary.
  • FIG. 18 is a diagram showing an example in which padding is performed at the boundary of a face.
  • for convenience, based on the drawing shown in FIG. 18 (a), the face located at the upper end of the 360-degree projection image will be referred to as the upper face, and the face located at the lower end of the 360-degree projection image will be referred to as the lower face.
  • the upper face may represent any one of faces 1, 2, 3, and 4, and the lower face may represent any one of faces 5, 6, 7, and 8.
  • a padding area may be set in the form of surrounding a predetermined face.
  • a padding region containing m samples may be created.
  • in the illustrated example, the padding area is set so as to surround the face, but the padding area may be set at only a part of the face boundary. That is, unlike in the example shown in FIG. 18 (b), frame packing may be performed by adding a padding area only at the boundary of the image, or only between the faces.
  • frame packing may be performed by adding a padding area only between faces at which image discontinuity occurs, in consideration of the continuity between faces.
  • the length of the padding area between the faces may be set the same or may be set differently depending on the position.
  • the length n in the horizontal direction of the padding region located at the left or right side of a predetermined face and the length m in the vertical direction of the padding region located at the upper or lower end of the predetermined face may have the same value or may have different values.
  • n and m are integers equal to or greater than 0 and may have the same value, or one of m and n may have a smaller value than the other.
  • m and n can be encoded in the encoder and signaled through the bit stream.
  • the length n in the horizontal direction and the length m in the vertical direction may be predefined in the encoder and decoder in accordance with the projection conversion method, the position of the face, the size of the face or the shape of the face.
  • the sample value of the padding area may be determined based on the sample included in the predetermined face or the sample included in the predetermined face and the sample included in the face adjacent to the predetermined face.
  • a sample value of a padding area adjacent to a boundary of a predetermined face may be generated by copying a sample included in the face or interpolating samples included in the face.
  • the upper extension region U of the upper face may be created by copying a sample adjacent to the boundary of the upper face, or by interpolating a predetermined number of samples adjacent to the boundary of the upper face.
  • the lower extension region D of the lower face may be generated by copying a sample adjacent to the boundary of the lower face or by interpolating a predetermined number of samples adjacent to the boundary of the lower face.
  • a sample value of a padding area adjacent to a boundary of a predetermined face may be generated using a sample value included in a face spatially adjacent to the face.
  • the inter-face adjacency can be determined based on whether the faces have continuity when the 360 degree projection image is projected back onto the 3D space.
  • a sample value of a padding area adjacent to a boundary of a predetermined face may be generated by copying a sample included in a face spatially adjacent to that face, or may be generated by interpolating a sample included in the face and a sample included in the face spatially adjacent to it.
  • the left portion of the upper extended region of the second face may be generated based on the samples included in the first face, and the right portion may be generated based on the samples included in the third face.
  • FIG. 19 is a diagram showing an example of determining a sample value of a padding area between faces.
  • the padding region between the first face and the second face may be obtained by weighted averaging at least one sample included in the first face and at least one sample included in the second face.
  • the padding region between the upper face and the lower face can be obtained by weighted averaging the upper extension region U and the lower extension region D.
  • the weight w may be determined based on the information encoded and signaled by the encoder. Alternatively, depending on the position of the sample in the padding region, the weight w may be variably determined. For example, the weight w may be determined based on the distance from the position of the sample in the padding region to the first face and the distance from the position of the sample in the padding region to the second face.
  • Equations (4) and (5) show examples in which the weight w is variably determined according to the position of the sample.
  • a sample value of the padding area is generated based on Equation (4) in the lower extension region close to the lower face, and a sample value of the padding area is generated based on Equation (5) in the upper extension region close to the upper face.
  • the filter for the weighting operation may have a vertical direction, a horizontal direction, or a predetermined angle. If the weighting filter has a predetermined angle, the sample included in the first face and the sample included in the second face that are located on the line at the predetermined angle from a sample in the padding region may be used to determine the value of that sample. A sketch of such a weighted blend is shown below.
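The position-dependent blend between the upper extension region U and the lower extension region D can be pictured with the sketch below. Equations (4) and (5) are not reproduced in this excerpt, so a simple linear weight is used as a stand-in; the real weights may differ.

```python
import numpy as np

def blend_face_gap(U, D):
    """Fill the padding rows between an upper and a lower face.

    U : extension of the upper face grown downward  (k x W)
    D : extension of the lower face grown upward    (k x W)
    The row closest to the upper face relies mostly on U, the row closest to
    the lower face mostly on D (linear stand-in for the position-dependent
    weight w of Equations (4) and (5)).
    """
    k, _ = U.shape
    out = np.empty_like(U, dtype=np.float64)
    for r in range(k):
        w = (r + 1) / (k + 1)            # grows toward the lower face
        out[r, :] = (1.0 - w) * U[r, :] + w * D[r, :]
    return out

U = np.full((3, 8), 100.0)               # toy upper-face extension
D = np.full((3, 8), 200.0)               # toy lower-face extension
print(blend_face_gap(U, D)[:, 0])        # [125. 150. 175.]
```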
  • the padding region may be generated using only samples included in either the first face or the second face. For example, if any one of the samples included in the first face or the sample included in the second face is not available, padding can be performed using only the available samples. Alternatively, padding may be performed by replacing the unavailable sample with the surrounding available sample.
  • the padding-related embodiments above are described based on a specific projection transformation method.
  • however, padding can be performed on the same principle in projection transformation methods other than the exemplified one.
  • padding can be performed at a face boundary or an image boundary even in a 360 degree projection image based on CMP, OHP, ECP, RSP, TPP, and the like.
  • padding related information can be signaled through the bitstream.
  • the padding related information may include whether padding has been performed, the position of the padding area or the padding size, and the like.
  • padding-related information may be signaled on a picture, slice, or face basis. In one example, information indicating whether padding was performed on the top boundary, bottom boundary, left boundary, or right boundary, together with the padding size, may be signaled on a per-face basis. A hypothetical per-face structure is sketched below.
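The exact syntax elements are not specified in this excerpt, so the following is only a hypothetical container for the items listed above; none of the field names are taken from the document or from any standard.

```python
from dataclasses import dataclass

@dataclass
class FacePaddingInfo:
    """Hypothetical per-face padding information (illustrative field names)."""
    padding_enabled: bool   # whether padding has been performed for this face
    pad_top: int            # padding size at the top boundary
    pad_bottom: int         # padding size at the bottom boundary
    pad_left: int           # padding size at the left boundary
    pad_right: int          # padding size at the right boundary

# e.g. padding only at the left and right boundaries of a face
info = FacePaddingInfo(True, 0, 0, 4, 4)
print(info)
```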
  • a 360 degree image can be projected and converted into a two dimensional image composed of a plurality of faces.
  • a 360 degree image can be projected and transformed into a two dimensional image composed of six faces.
  • the six faces may be arranged in a 2x3 form, or in a 3x2 form, as in the example shown in FIG. 20.
  • FIG. 20 shows a 360-degree projection image in the 3x2 form.
  • in FIG. 20, six square faces of MxM size are illustrated as arranged in the 3x2 form. A sketch of this frame packing is given below.
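A minimal sketch of packing six MxM faces into a 3x2 frame. The order of the faces within the grid is an assumption for illustration; the actual arrangement depends on the frame-packing configuration.

```python
import numpy as np

def pack_3x2(faces):
    """Pack six M x M faces into a 3x2 (three columns, two rows) image."""
    top = np.hstack(faces[:3])      # first three faces on the upper row
    bottom = np.hstack(faces[3:])   # last three faces on the lower row
    return np.vstack([top, bottom])

M = 4
faces = [np.full((M, M), i, dtype=np.float64) for i in range(6)]
print(pack_3x2(faces).shape)        # (8, 12) == (2M, 3M)
```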
  • a predetermined face can be configured to include not only the area corresponding to the predetermined face but also an area adjacent to the corresponding area.
  • a 360-degree image approximated to a cube can be projected and transformed onto a 2D plane such that one surface of the cube becomes one face, as in the example shown in FIG.
  • the Nth surface of the cube may constitute the face with index N in the 360-degree projection image.
  • alternatively, a face can be configured so that data of a plurality of surfaces is included in one face.
  • the data of a plurality of surfaces may include at least a partial area of at least one of a surface corresponding to a predetermined face (hereinafter, referred to as a 'corresponding surface') and a plurality of surfaces adjacent to the corresponding surface.
  • FIG. 21 is a diagram showing an example in which data of a plurality of surfaces is included in one face.
  • face 0 may be configured to include the surface located at the front and at least partial areas of the surfaces adjacent to the front surface. That is, a 360-degree image may be projected and transformed so that face 0 includes at least part of its corresponding surface (i.e., the surface located at the front) and at least partial areas of the corresponding surfaces of face 2, face 3, face 4, and face 5. Accordingly, a part of the data included in face 0 may overlap with data included in face 2, face 3, face 4, and face 5.
  • each face may be configured to include a plurality of surfaces.
  • that is, each face can be configured to include data of a plurality of surfaces.
  • each face may be configured to include a corresponding area and partial areas of the four surfaces adjacent to the corresponding area, as in the example shown in Fig.
  • the number of adjacent faces included in each face may be set differently from the example shown in Fig.
  • a predetermined face may be configured to include only the corresponding surface and partial areas of the adjacent surfaces to the left and right of the corresponding surface, or only the corresponding surface and partial areas of the adjacent surfaces above and below the corresponding surface. That is, an area including data of another surface can be set only on the left and right sides, or only on the upper and lower sides, of the face.
  • the number of adjacent surfaces included in a face may be determined differently depending on the position of the face.
  • faces 2, 3, 4, and 5 in FIG. 22, which are located at the left and right boundaries of the image, may be configured to include the corresponding surface and partial areas of the surfaces adjacent to it, while faces 1 and 6 may be configured to include the corresponding area and partial areas of only two surfaces adjacent to the corresponding surface.
  • an area generated based on a surface adjacent to the surface corresponding to a face may be defined as a padding area.
  • the padding sizes for the vertical direction and the horizontal direction may have the same value.
  • the padding size for the vertical and horizontal directions is illustrated as being set to k. Unlike the illustrated example, the padding sizes for the vertical and horizontal directions may be set differently.
  • padding sizes for the vertical and horizontal directions may be set differently depending on the face.
  • the padding size in the horizontal direction at the face located at the left or right boundary can be set larger than the padding size in the vertical direction.
  • the padding size may be set differently for each face.
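The face configuration described above, a corresponding surface surrounded by k-sample strips taken from the adjacent surfaces, can be sketched as follows. Which strip of each neighbour is copied, and how the neighbours are oriented, depends on the cube layout and is only assumed here; the corner handling shown is just one possible choice.

```python
import numpy as np

def overlapped_face(corr, top_n, bottom_n, left_n, right_n, k):
    """Build an (M+2k) x (M+2k) face from the M x M corresponding surface
    plus k-sample strips of the four adjacent surfaces.

    corr : M x M corresponding surface.
    *_n  : M x M adjacent surfaces, assumed already oriented so that the
           strip facing `corr` is the one copied (orientation handling is
           omitted and depends on the layout).
    """
    M = corr.shape[0]
    out = np.zeros((M + 2 * k, M + 2 * k), dtype=corr.dtype)
    out[k:k + M, k:k + M] = corr
    out[:k, k:k + M] = top_n[-k:, :]        # strip of the top neighbour
    out[k + M:, k:k + M] = bottom_n[:k, :]  # strip of the bottom neighbour
    out[k:k + M, :k] = left_n[:, -k:]       # strip of the left neighbour
    out[k:k + M, k + M:] = right_n[:, :k]   # strip of the right neighbour
    # corners: replicate the nearest sample of the horizontal strips (one option)
    out[:k, :k] = out[:k, k:k + 1]
    out[:k, k + M:] = out[:k, k + M - 1:k + M]
    out[k + M:, :k] = out[k + M:, k:k + 1]
    out[k + M:, k + M:] = out[k + M:, k + M - 1:k + M]
    return out

M, k = 4, 1
faces = [np.full((M, M), v, dtype=np.float64) for v in (0, 1, 2, 3, 4)]
print(overlapped_face(*faces, k).shape)     # (6, 6) == (M+2k, M+2k)
```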
  • a predetermined face can also be configured by resampling the corresponding surface of the face to a size smaller than the face, and then padding the remaining region around the resampled image.
  • the image corresponding to the front face may be resampled to a size smaller than MxM, and the resampled image may be disposed at the center of face 0. Thereafter, padding can be performed on the remaining area of the face 0 excluding the resampled image.
  • Resampling can be used to reduce the size of at least one of the width or height of the image corresponding to the corresponding surface.
  • resampling may be performed to make the width and height of the image corresponding to the front face smaller than M, as in the example shown in FIG. That is, a filter for resampling can be applied to both the horizontal direction and the vertical direction.
  • resampling may be performed in order to keep the size of either the width or the height of the image corresponding to the corresponding surface at M, while making the size of the other one smaller than M. That is, a filter for resampling can be applied only in the horizontal direction or the vertical direction.
  • the padding may be performed using at least one of a sample (or block) located at the boundary of the corresponding surface or a sample (or block) included in the plane adjacent to the corresponding surface.
  • the value of a sample included in the padding region may be generated by copying a sample located at the boundary of the corresponding surface or a sample included in a surface adjacent to the corresponding surface, or may be generated based on an averaging operation or a weighting operation of a sample located at the boundary of the corresponding surface and a sample included in the adjacent surface. A sketch of the resampling-and-padding variant is given below.
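A small sketch of the resampling variant: the corresponding surface is down-sampled to (M-2k) x (M-2k), placed at the centre of the M x M face, and the surrounding ring is filled by replicating the boundary of the resampled image. The nearest-neighbour resampler and the copy-based fill are assumptions; averaging with samples of adjacent surfaces is another option mentioned above.

```python
import numpy as np

def nearest_resample(img, new_h, new_w):
    """Tiny nearest-neighbour resampler, used only for illustration."""
    h, w = img.shape
    rows = np.arange(new_h) * h // new_h
    cols = np.arange(new_w) * w // new_w
    return img[np.ix_(rows, cols)]

def resample_and_pad(corr, k):
    """Down-sample the M x M corresponding surface and fill the remaining
    ring of the M x M face by replicating the resampled image boundary."""
    M = corr.shape[0]
    small = nearest_resample(corr, M - 2 * k, M - 2 * k)
    # replicate the boundary of the resampled image into the remaining ring
    return np.pad(small, k, mode='edge')

corr = np.arange(36, dtype=np.float64).reshape(6, 6)
print(resample_and_pad(corr, 1).shape)      # (6, 6)
```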
  • the projection transformation method of constructing the face using the corresponding surface and the adjacent surface adjacent to the corresponding surface can be defined as Overlapped Face Projection.
  • the face overlap projection conversion method can be applied to projection conversion techniques in which a plurality of faces is generated.
  • the face overlap projection conversion method may be applied to ISP, OHP, TPP, SSP, ECP, or RSP.
  • Information regarding the face overlap projection conversion method can be signaled through the bit stream.
  • the information on the face overlap projection conversion method may include at least one of information indicating whether or not the face overlap projection conversion method is used, information indicating the number of adjacent surfaces included in a face, information indicating whether or not a padding area exists, information indicating the padding size, or information indicating whether or not the padding area has been created using neighboring faces adjacent to the current face in the three-dimensional space.
  • faces that are spatially continuous in the 3D space may not be spatially continuous on the 2D plane.
  • the front face and the left face are spatially continuous in the 3D space, but the front face and the left face are not spatially continuous on the 2D plane.
  • accordingly, artifacts are relatively large at face boundaries that are adjacent to each other in the 3D space but are not adjacent to each other after being projected and transformed onto the 2D plane, and subjective image quality may be lowered as a result.
  • padding can be performed for a predetermined face using data of a face that is not neighboring the predetermined face in the 360-degree projection image. More specifically, a padding area can be added to the boundary of the predetermined face using data of a face which is not neighboring the predetermined face.
  • in other words, padding can be performed using a face that is not adjacent to the predetermined face on the 2D plane but is adjacent to the predetermined face when the 360-degree projection image is reconstructed in 3D.
  • padding the boundary of the current face using data of a face (or sub-face) that is continuous with it in the 3D space, although not adjacent to it in the 360-degree projection image, can be defined as Overlapped Face Padding.
  • Face overlap padding can be performed by copying a portion of the face that is not adjacent to the current face. That is, the padding area added to the border of the current face may be a copy of a face area that is not adjacent to the current face.
  • face overlap padding may be performed based on a sample (or block) adjacent to the boundary of the current face and a sample (or block) adjacent to the boundary of the face that is not adjacent to the current face.
  • that is, the value of a sample included in the padding area may be generated by copying a sample located at the boundary of the current face or a sample located at the boundary of the face not adjacent to the current face, or may be generated based on an averaging operation or a weighting operation of a sample located at the boundary of the current face and a sample located at the boundary of the face that is not adjacent to the current face.
  • face overlap padding may be performed considering at least one of the continuity of the current face or the shape of the face that is not adjacent to the current face. Specifically, a padding area may be added to a boundary of the current face at which another face is spatially continuous with the current face in the 3D space. At this time, the shape of the padding area to be added may be determined based on the shape of the face that is not neighboring the current face (a sketch of this copy-based padding is shown below). Referring to the drawings, face overlap padding will be described in more detail.
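As an illustration of copy-based face overlap padding, the sketch below looks up, for a given boundary of the current face, the face that is continuous with it in 3D space and copies a k-sample strip from that face. The continuity table is a made-up example for a cube-map style layout and is not taken from the document or from any particular frame-packing specification.

```python
import numpy as np

# Hypothetical 3D-continuity table: for each (face, boundary) it names the
# face and boundary that touch it in 3D space (illustrative entries only).
CONTINUITY = {
    ("front", "left"): ("left", "right"),
    ("front", "top"): ("top", "bottom"),
}

def overlap_pad(faces, face_id, boundary, k):
    """Return a k-sample strip for `boundary` of `face_id`, copied from the
    face that is continuous with it in 3D space (face overlap padding)."""
    src_id, src_boundary = CONTINUITY[(face_id, boundary)]
    src = faces[src_id]
    if src_boundary == "right":
        return src[:, -k:]          # strip inside the source's right boundary
    if src_boundary == "left":
        return src[:, :k]
    if src_boundary == "bottom":
        return src[-k:, :]
    return src[:k, :]               # "top"

faces = {name: np.full((4, 4), i, dtype=np.float64)
         for i, name in enumerate(["front", "left", "top"])}
print(overlap_pad(faces, "front", "left", 2).shape)   # (4, 2)
```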
  • FIGS. 23 and 24 are views showing a 360-degree projection image based on the TPP technique in which face overlap padding is performed.
  • in the 3D space, the front face is continuous with the top face, the bottom face, the right face, and the left face.
  • the padding area added to a boundary of the front face can be generated based on the face that is adjacent to that boundary when the 360-degree projection image is reconstructed in 3D. For example, since the left boundary of the front face is adjacent to the left face in the 3D space, a padding area generated using data included in the left face may be added to the left boundary of the front face, as in the example shown in FIG. 23.
  • padding may be performed only on one side boundary of the 360-degree projection image, or padding may be performed on multiple boundaries of the 360-degree projection image, as in the example shown in FIG.
  • for example, a padding area generated using data included in the top face may be added to the upper boundary of the front face, while a padding area generated using data included in the bottom face may be added to the lower boundary of the front face.
  • a padding area generated using data included in the front face can be added to the upper boundary of the top face and the lower boundary of the bottom face.
  • on the other hand, the right boundary of the front face is already adjacent to the right face, which is continuous with it in the 3D space, and a padding area may not be added there. That is, a padding area may be added to a boundary whose neighboring face on the 2D plane is not continuous with the current face in the 3D space, or to a boundary at which no neighboring face exists.
  • the shape of the padding area may be determined based on the shape of the face that is not adjacent to the current face. For example, since the top face, the bottom face, and the left face have a trapezoidal shape, a padding area added to each boundary of the front face may be a copy of a part of a trapezoid, as in the example shown in FIG. By the same principle, the padding areas added to the upper boundary of the top face and the lower boundary of the bottom face may be copies of a part of a rectangle, as in the example shown in Fig.
  • the padding area at the top border of the front face may be a copy of a part of the top face rotated by 180 degrees, and the padding area at the bottom border of the front face may be a copy of a part of the bottom face rotated by 180 degrees.
  • similarly, the padding areas at the upper boundary of the top face and the lower boundary of the bottom face may be copies of a portion of the front face rotated by 180 degrees (a sketch of the rotation step follows).
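The 180-degree rotation step can be shown with a few lines of NumPy. The actual TPP faces are trapezoidal; the rectangular arrays below are only a stand-in to show the rotation and the strip extraction, and which strip is taken is an assumption about the layout.

```python
import numpy as np

def rotated_top_padding(top_face, k):
    """Padding strip for the top border of the front face, taken from the
    top face rotated by 180 degrees (rectangular stand-in for the
    trapezoidal faces of the TPP layout)."""
    rotated = np.rot90(top_face, 2)     # 180-degree rotation
    return rotated[-k:, :]              # k rows that will sit above the front face

top_face = np.arange(16, dtype=np.float64).reshape(4, 4)
print(rotated_top_padding(top_face, 2))
```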
  • the 360-degree projection image may be required to have a rectangular shape. Accordingly, when a non-rectangular padding region is added to the boundary of a face, an inactive region that does not fill the rectangle may occur. For example, in the examples shown in FIGS. 23 and 24, a trapezoidal padding area may be added among the areas surrounding the front face, and the remaining area may be set as an inactive area.
  • a pixel included in the inactive area may have a value calculated based on a predefined value or a bit depth.
  • a pixel in the inactive area may have an intermediate value of the maximum value that can be represented by the bit depth.
  • pixels in the inactive region can have 128, which is the intermediate value of the maximum value that can be represented by 8 bits.
  • pixels in the inactive region can be 512, which is an intermediate value of the maximum value that can be expressed by 10 bits.
  • a pixel in an inactive area may be determined by a sample located at the boundary of the face or padding area.
  • pixels in the inactive area can be generated by copying samples located at the boundary of the face or padding area.
  • alternatively, a pixel in the inactive area may be generated based on an averaging operation or a weighting operation of the sample in the padding area lying on the same horizontal line and the sample in the padding area lying on the same vertical line (a sketch of the bit-depth-based fill option is shown below).
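A minimal sketch of the bit-depth-based fill: inactive samples receive the mid-value of the representable range (128 for 8 bits, 512 for 10 bits). The mask-based interface is an assumption for illustration only.

```python
import numpy as np

def fill_inactive(image, inactive_mask, bit_depth=8):
    """Fill inactive samples with the mid-value of the bit-depth range,
    which is one of the options listed above."""
    mid = 1 << (bit_depth - 1)          # 128 for 8 bits, 512 for 10 bits
    out = image.copy()
    out[inactive_mask] = mid
    return out

img = np.zeros((4, 4), dtype=np.int32)
mask = np.zeros((4, 4), dtype=bool)
mask[0, :2] = True                       # toy inactive corner
print(fill_inactive(img, mask, bit_depth=10)[0])   # [512 512   0   0]
```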
  • FIG. 25 is a diagram showing a 360-degree projection image based on OHP technique considering continuity of images.
  • the four upper faces constituting the octahedron are referred to as faces 1, 2, 3, and 4, respectively, and the four lower faces are referred to as faces 5, 6, 7, and 8, respectively.
  • it is desirable to arrange faces 1, 2, and 3, which are spatially continuous in the 3D space, continuously on the 2D plane, and to arrange faces 5, 6, and 7, which are spatially continuous in the 3D space, continuously on the 2D plane. Further, using the fact that face 2 and face 6 are continuous in the 3D space, face 2 and face 6 can be arranged so as to be adjacent to each other.
  • the remaining faces 4 and 8 may each be bisected, and the bisected parts may be placed in the remaining portions of the rectangle. Accordingly, a rectangular 360-degree projection image can be obtained, as in the example shown in FIG. 25. For convenience of explanation, the bisected faces are distinguished with '-1' and '-2', respectively.
  • faces that are adjacent to each other on the 2D plane may not be adjacent to each other when the 360-degree projection image is restored to 3D.
  • for example, faces 8-2 and 5, faces 8-1 and 7, faces 4-2 and 1, and faces 4-1 and 3 are neighbors on the 2D plane, but they are not neighbors in the 3D space.
  • padding can be performed at a face-to-face boundary that is contiguous on the 2D plane but not contiguous in the 3D space.
  • FIG. 26 is a diagram illustrating a 360-degree projection image based on the OHP technique in which face overlap padding is performed.
  • a padding area may be added to a border where there is no neighbor face.
  • a padding area may be added to the upper boundary of face 8-2 and the upper boundary of face 4-2, and a padding area may be added to the lower boundary of face 8-1 and the lower boundary of face 4-1.
  • the value of a sample included in the padding region may be generated based on at least one of a sample included in the current face or a sample included in a face neighboring the current face.
  • here, the neighboring face may include a face neighboring the current face on the 2D plane, or a face neighboring the current face when the 360-degree projection image is projected back into the 3D space.
  • for example, the samples included in the padding region located between faces 8-2 and 5 may be generated based on an averaging or weighting operation of the samples included in face 8 and the samples included in face 5.
  • alternatively, the padding may be performed by copying a portion of a face that is neighboring in the 3D space.
  • the padding area added to the upper boundaries of faces 8-2 and 4-2 may be generated by copying a portion of face 8-1 and face 4-1.
  • the padding area added to the lower boundaries of the faces 8-1 and 4-1 may be generated by copying a part of the faces 8-2 and 4-2.
  • as described above, the shape of the padding area may be determined based on the shape of the face that is not adjacent to the current face.
  • the padding areas added to the upper boundaries of faces 8-2 and 4-2 may have a trapezoidal shape corresponding to portions of the triangular faces 8-1 and 4-1.
  • the padding area added to the bottom edges of the faces 8-1 and 4-1 may also be a trapezoidal shape corresponding to a part of the faces 8-2 and 4-2.
  • inactive areas can occur in a 360 degree projection image.
  • the sample value in the inactive area may have a predefined value, a value determined by bit depth, or a value determined by an adjacent sample.
  • FIGS. 23 and 26 illustrate face overlap padding based on the TPP and OHP techniques, respectively, but face overlap padding can also be applied to other projection transformation techniques in which a plurality of faces is generated.
  • face overlap padding may be applied to ISP, CMP, TPP, SSP, ECP or RSP.
  • Information regarding the face overlap padding can be signaled through the bit stream.
  • the information on the face overlap padding may include at least one of information indicating whether face overlap padding is used, information indicating whether a padding area exists, information indicating the position of the padding area, or information indicating the padding size.
  • each of the components (for example, units, modules, etc.) constituting the block diagram may be implemented as a hardware device or software, and a plurality of components may be combined into one hardware device or software.
  • the above-described embodiments may be implemented in the form of program instructions that may be executed through various computer components and recorded in a computer-readable recording medium.
  • the computer-readable recording medium may include program commands, data files, data structures, and the like, alone or in combination.
  • examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tape, optical recording media such as CD-ROMs and DVDs, magneto-optical media such as floptical disks, and hardware devices specifically configured to store and execute program instructions, such as ROM, RAM, and flash memory.
  • the hardware device may be configured to operate as one or more software modules for performing the processing according to the present invention, and vice versa.
  • the present invention can be applied to an electronic device capable of encoding / decoding an image.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • Discrete Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The present invention relates to an image encoding method comprising the steps of: generating a 360-degree projection image containing a plurality of faces by projectively transforming a three-dimensional 360-degree image onto a two-dimensional plane; adding a padding region to a boundary on at least one side of a current face among the plurality of faces; and encoding information related to the padding of the current face, wherein the padding region is generated on the basis of a sample included in at least a portion of a face that is not adjacent to the current face in the 360-degree projection image.
PCT/KR2018/007098 2017-06-26 2018-06-22 Procédé et dispositif de traitement de signal vidéo WO2019004664A1 (fr)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
KR10-2017-0080495 2017-06-26
KR20170080495 2017-06-26
KR10-2017-0080496 2017-06-26
KR20170080496 2017-06-26

Publications (1)

Publication Number Publication Date
WO2019004664A1 true WO2019004664A1 (fr) 2019-01-03

Family

ID=64742881

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2018/007098 WO2019004664A1 (fr) 2017-06-26 2018-06-22 Procédé et dispositif de traitement de signal vidéo

Country Status (2)

Country Link
KR (1) KR20190001548A (fr)
WO (1) WO2019004664A1 (fr)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112291542A (zh) * 2020-10-16 2021-01-29 合肥安达创展科技股份有限公司 一种封闭式多面投影的显示方法及系统

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20130070646A (ko) * 2010-10-01 2013-06-27 제너럴 인스트루먼트 코포레이션 유연한 분할에서 영상 경계 패딩을 활용하는 코딩 및 디코딩
KR20160001430A (ko) * 2014-06-27 2016-01-06 삼성전자주식회사 영상 패딩영역의 비디오 복호화 및 부호화 장치 및 방법
US20160277762A1 (en) * 2015-03-20 2016-09-22 Qualcomm Incorporated Downsampling process for linear model prediction mode

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
HE, YUWEN ET AL.: "AHG8: Geometry Padding for 360 Video Coding", JVET-D0075 (VERSION 3), JOINT VIDEO EXPLORATION TEAM (JVET) OF ITU-T SG 16 WP 3, 15 October 2016 (2016-10-15), Chengdu, CN, pages 1 - 10 *
LEE, YA-HSUAN ET AL.: "AHG 8: An Improvement on Compact Octahedron Projection with Padding", JVET -F0053 (VERSION 4), JOINT VIDEO EXPLORATION TEAM (JVET) OF ITU-T SG 16 WP 3, 4 April 2017 (2017-04-04), Hobart, AU, pages 1 - 8 *

Also Published As

Publication number Publication date
KR20190001548A (ko) 2019-01-04

Similar Documents

Publication Publication Date Title
WO2018117706A1 (fr) Procédé et dispositif de traitement de signal vidéo
WO2018106047A1 (fr) Procédé et appareil de traitement de signal vidéo
WO2020218793A1 (fr) Procédé de codage basé sur une bdpcm et dispositif associé
WO2020246849A1 (fr) Procédé de codage d'image fondé sur une transformée et dispositif associé
WO2018236028A1 (fr) Procédé de traitement d'image basé sur un mode d'intra-prédiction et appareil associé
WO2019132577A1 (fr) Procédé et dispositif d'encodage et de décodage d'image, et support d'enregistrement avec un train de bits stocké dedans
WO2018236031A1 (fr) Procédé de traitement d'image basé sur un mode d'intraprédiction, et appareil associé
WO2020256389A1 (fr) Procédé de décodage d'image sur la base d'une bdpcm et dispositif associé
WO2018044089A1 (fr) Procédé et dispositif pour traiter un signal vidéo
WO2018124819A1 (fr) Procédé et appareil pour traiter des signaux vidéo
WO2020180119A1 (fr) Procédé de décodage d'image fondé sur une prédiction de cclm et dispositif associé
WO2020246805A1 (fr) Dispositif et procédé de prédiction intra basée sur une matrice
WO2019009600A1 (fr) Procédé et appareil de décodage d'image utilisant des paramètres de quantification basés sur un type de projection dans un système de codage d'image pour une vidéo à 360 degrés
WO2018174531A1 (fr) Procédé et dispositif de traitement de signal vidéo
WO2018221946A1 (fr) Procédé et dispositif de traitement de signal vidéo
WO2016200235A1 (fr) Procédé de traitement d'image basé sur un mode de prédiction intra et appareil associé
WO2018131830A1 (fr) Procédé et dispositif de traitement de signal vidéo
WO2021034161A1 (fr) Dispositif et procédé de prédiction intra
WO2020149616A1 (fr) Procédé et dispositif de décodage d'image sur la base d'une prédiction cclm dans un système de codage d'image
WO2020055208A1 (fr) Procédé et appareil de prédiction d'image pour prédiction intra
WO2018174542A1 (fr) Procédé et dispositif de traitement de signal vidéo
WO2019004664A1 (fr) Procédé et dispositif de traitement de signal vidéo
WO2019045393A1 (fr) Procédé et dispositif de traitement de signal vidéo
WO2019182293A1 (fr) Procédé et appareil de traitement de signal vidéo
WO2019190197A1 (fr) Procédé et appareil de traitement de signal vidéo

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18823489

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 21.04.2020)

122 Ep: pct application non-entry in european phase

Ref document number: 18823489

Country of ref document: EP

Kind code of ref document: A1