KR20150043977A - Method and apparatus for video encoding/decoding based on multi-layer - Google Patents
- Publication number
- KR20150043977A KR20140135694A
- Authority
- KR
- South Korea
- Prior art keywords
- poc
- picture
- value
- reset
- layer
- Prior art date
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/103—Selection of coding mode or of prediction mode
- H04N19/105—Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/30—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
Abstract
Description
The present invention relates to encoding and decoding of video having a multi-layer structure, and more particularly to a method of setting the picture order count (POC) of pictures in the same access unit (AU) and of calculating the POC of reference pictures in a decoded picture buffer (DPB).
Recently, as a multimedia environment has been established, a variety of terminals and networks have been used, and user demands have been diversified accordingly.
For example, as the performance and computing capability of a terminal are diversified, the performance to be supported varies depending on a device. In addition, the network in which the information is transmitted is also diversified not only by the external structure such as a wired / wireless network, but also by the type of information to be transmitted, information amount and speed, and the like. The user selects the terminal and the network to be used according to the desired function, and the spectrum of the terminal and the network provided by the enterprise to the user is also diversified.
In this regard, recently, broadcasting having a high definition (HD) resolution has been expanded not only in the domestic market but also in the world, so that many users are accustomed to high resolution and high quality video. Accordingly, many video service related organizations are making efforts to develop next generation video equipment.
In addition, with the increasing interest in UHD (Ultra High Definition), which has a resolution more than four times that of HDTV in addition to HDTV, there is a growing demand for a technology for compressing and processing higher resolution and higher quality images.
To compress and process an image, the following techniques can be used: an inter prediction technique that predicts pixel values in the current picture from temporally previous and/or subsequent pictures; an intra prediction technique that predicts pixel values in the current picture using pixel information within the current picture; and an entropy encoding technique that assigns short codes to symbols with a high frequency of appearance and long codes to symbols with a low frequency of appearance.
As described above, given terminals and networks that support different functions and given diversified user demands, the quality, size, and frame rate of the supported image need to be diversified accordingly.
Because of heterogeneous communication networks and terminals of various functions and types, scalability that supports various image qualities, resolutions, sizes, frame rates, and views has become an important function of a video format.
Therefore, in order to provide the service required by a user in various environments based on a highly efficient video coding method, it is necessary to provide a scalability function that enables efficient video encoding and decoding in terms of time, space, image quality, and view.
The present invention provides a method and apparatus for equally setting the POC of pictures in an AU in scalable video coding comprising a plurality of layers.
The present invention provides a method and apparatus for calculating the POC of reference pictures in a DPB referenced by a current picture when the POC of the current picture is reset, in scalable video coding comprising a plurality of layers.
The present invention provides a method and apparatus for signaling whether a POC of a current picture has been reset in scalable video coding comprising a plurality of layers.
According to an embodiment of the present invention, an image decoding method supporting a plurality of layers is provided. The image decoding method includes: decoding POC reset information indicating whether the picture order count (POC) of a current picture has been reset to 0; resetting the POC value of the current picture based on the POC reset information and calculating the POC value of each of a long-term reference picture and a short-term reference picture in a decoded picture buffer (DPB) referenced by the current picture; and constructing a reference picture set (RPS) for inter prediction of the current picture based on the POC value of the long-term reference picture and the POC value of the short-term reference picture.
According to another embodiment of the present invention, an image decoding apparatus supporting a plurality of layers is provided. The image decoding apparatus includes: a decoding unit that decodes POC reset information indicating whether the picture order count (POC) of a current picture has been reset to 0; and a prediction unit that resets the POC value of the current picture based on the POC reset information, calculates the POC value of each of a long-term reference picture and a short-term reference picture in a decoded picture buffer (DPB) referenced by the current picture, and constructs a reference picture set (RPS) for inter prediction of the current picture based on the POC value of the long-term reference picture and the POC value of the short-term reference picture.
According to the present invention, there is provided a method for equally resetting the POC of pictures in an AU when the POCs of the pictures in the same AU are not the same. In addition, even if the POC value of the current picture is reset, the reference pictures in the decoded picture buffer referenced by the current picture can be normally identified.
FIG. 1 is a block diagram illustrating a configuration of an image encoding apparatus according to an embodiment of the present invention.
FIG. 2 is a block diagram illustrating a configuration of an image decoding apparatus according to an embodiment of the present invention.
FIG. 3 is a conceptual diagram schematically showing an embodiment of a scalable video coding structure using a plurality of layers to which the present invention can be applied.
FIG. 4 is a flowchart schematically showing a method of resetting the POC of pictures in a scalable video coding structure including a plurality of layers and constructing a reference picture set for inter prediction based on the reset POC of the pictures, according to an embodiment of the present invention.
FIG. 5 is an example of a scalable video structure including a plurality of layers, shown to explain a process of resetting POC values of pictures in an AU according to an embodiment of the present invention.
FIG. 6 is a diagram illustrating a process of resetting the POC value of reference pictures in a DPB based on information (e.g., poc_reset_flag) indicating whether the POC value of the current picture is reset to 0, according to an embodiment of the present invention.
FIG. 7 is a diagram illustrating a method for calculating a POC value of long-term reference pictures in a DPB according to an embodiment of the present invention.
Hereinafter, embodiments of the present invention will be described in detail with reference to the drawings. In describing the embodiments of the present invention, if the detailed description of related known structures or functions is deemed to obscure the subject matter of the present specification, the description may be omitted.
When an element is referred to herein as being "connected" or "coupled" to another element, it may be directly connected or coupled to the other element, or intervening elements may be present. In addition, describing a specific configuration as being "included" in this specification does not exclude other configurations; it means that additional configurations may be included in the practice of the present invention or within its technical scope.
The terms first, second, etc. may be used to describe various configurations, but the configurations are not limited by the term. The terms are used for the purpose of distinguishing one configuration from another. For example, without departing from the scope of the present invention, the first configuration may be referred to as the second configuration, and similarly, the second configuration may be named as the first configuration.
In addition, the constituent elements shown in the embodiments of the present invention are shown independently to represent different characteristic functions; this does not mean that each constituent element is composed of separate hardware or a single software unit. That is, the constituent units are listed separately for convenience of explanation, and at least two constituent units may be combined into one constituent unit, or one constituent unit may be divided into a plurality of constituent units to perform a function. The integrated embodiments and the separate embodiments of each component are also included in the scope of the present invention unless they depart from the essence of the present invention.
In addition, some of the components are not essential components that perform essential functions of the present invention, but may be optional components that merely improve performance. The present invention can be implemented with only the components essential for realizing its essence, excluding the components used for performance improvement, and a structure that includes only the essential components, excluding the optional components used for performance improvement, is also included in the scope of the present invention.
FIG. 1 is a block diagram illustrating a configuration of an image encoding apparatus according to an embodiment of the present invention.
A scalable video encoding apparatus supporting a multi-layer structure can be implemented by extending a general video encoding apparatus having a single layer structure. The block diagram of FIG. 1 shows an embodiment of an image encoding apparatus that can be the basis of a scalable video encoding apparatus applicable to a multi-layer structure.
1, an
The image encoding
In the intra mode, the
In the intra mode, the
In the inter mode, the
In the case of a multi-layer structure, inter prediction that is applied in inter mode may include inter-layer prediction. The
On the other hand, when the current layer picture and the reference layer picture are the same size, sampling applied to the reference layer picture may mean generation of a reference sample by sample copy from the reference layer picture. The sampling applied to the reference layer picture may mean upsampling when the resolution of the current layer picture and the reference layer picture are different.
For example, in a case where the inter-layer resolution is different, an inter-layer reference picture may be constructed by up-sampling a reconstructed picture of a reference layer between layers supporting scalability regarding resolution.
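The two cases of sampling described above (sample copy when the sizes match, up-sampling when the resolutions differ) can be sketched as follows. This is a minimal illustration only: the function name `build_inter_layer_reference` is hypothetical, and nearest-neighbour interpolation is an assumed stand-in for whatever resampling filter an actual codec would use.

```python
def build_inter_layer_reference(ref_picture, cur_width, cur_height):
    """Return a reference picture whose size matches the current layer.

    ref_picture is a list of rows of sample values from the reference layer.
    Nearest-neighbour interpolation is an illustrative assumption, not the
    filter of any particular codec.
    """
    ref_height = len(ref_picture)
    ref_width = len(ref_picture[0])

    if (ref_width, ref_height) == (cur_width, cur_height):
        # Same size: "sampling" is a plain sample copy.
        return [row[:] for row in ref_picture]

    # Different resolution: map each target sample to its nearest source sample.
    return [
        [
            ref_picture[y * ref_height // cur_height][x * ref_width // cur_width]
            for x in range(cur_width)
        ]
        for y in range(cur_height)
    ]
```

For a 2x2 reference picture and a 4x4 current layer, each reference sample is repeated over a 2x2 block of the inter-layer reference picture.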
Which layer picture to use to construct an interlayer reference picture can be determined in consideration of coding cost and the like. The encoding apparatus can transmit to the decoding apparatus information specifying a layer to which a picture to be used as an interlayer reference picture belongs.
In the layer referred to in inter-layer prediction, that is, the reference layer, the picture used for prediction of the current block may be a picture of the same access unit (AU) as the current picture (the picture currently subject to prediction in the current layer).
The
The transforming
When the transform skip mode is applied, the transforming
The
The
The encoding parameters are information necessary for encoding and decoding, and may include information that can be inferred during encoding or decoding as well as information, such as a syntax element, that is encoded by the encoding apparatus and transmitted to the decoding apparatus.
For example, the encoding parameters may include values or statistics such as the intra/inter prediction mode, motion vector, reference picture index, coded block pattern, presence of a residual signal, transform coefficient, quantized transform coefficient, quantization parameter, and block size.
The residual signal may mean the difference between the original signal and the predicted signal, a signal obtained by transforming that difference, or a signal obtained by transforming and quantizing that difference. In block units, the residual signal may be referred to as a residual block.
When entropy coding is applied, a small number of bits is allocated to a symbol with a high probability of occurrence and a large number of bits is allocated to a symbol with a low probability of occurrence, so that the size of the bit string for the symbols to be coded can be reduced. Therefore, the compression performance of image encoding can be enhanced through entropy encoding.
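The principle of giving frequent symbols fewer bits can be illustrated with a small sketch. Huffman coding is used here purely as an assumed, familiar example of the principle; the text does not state that it is the entropy coder used by the apparatus, and the function name `huffman_code_lengths` is hypothetical.

```python
import heapq
from collections import Counter


def huffman_code_lengths(symbols):
    """Return {symbol: code length in bits} for a Huffman code of `symbols`."""
    counts = Counter(symbols)
    if len(counts) == 1:
        # A single distinct symbol still needs one bit per occurrence.
        return {next(iter(counts)): 1}

    # Heap entries: (weight, tie_breaker, {symbol: depth so far}).
    # The unique tie_breaker keeps the dictionaries from ever being compared.
    heap = [(w, i, {s: 0}) for i, (s, w) in enumerate(counts.items())]
    heapq.heapify(heap)
    next_id = len(heap)

    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        merged = {s: d + 1 for s, d in {**left, **right}.items()}
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1

    return heap[0][2]
```

For the input "aaaaaaaabbbc", the most frequent symbol "a" receives a 1-bit code while "b" and "c" receive 2-bit codes, so the 12 symbols fit in 16 bits instead of the 24 bits a fixed 2-bit code would need.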
The
Since the
The restoration block passes through the
FIG. 2 is a block diagram illustrating a configuration of an image decoding apparatus according to an embodiment of the present invention.
A scalable video decoding apparatus supporting a multi-layer structure can be implemented by extending a general video decoding apparatus having a single layer structure. The block diagram of FIG. 2 shows an embodiment of an image decoding apparatus that can be the basis of a scalable video decoding apparatus applicable to a multi-layer structure.
2, the
The
In the intra mode, the switch is switched to the intra mode, and in the inter mode, the switch can be switched to the inter mode.
The
The
The quantized coefficients are inversely quantized in the
In the intra mode, the
In the inter mode, the
In the case of a multi-layer structure, inter prediction that is applied in inter mode may include inter-layer prediction. The
On the other hand, when the current layer picture and the reference layer picture are the same size, sampling applied to the reference layer picture may mean generation of a reference sample by sample copy from the reference layer picture. The sampling applied to the reference layer picture may mean upsampling when the resolution of the current layer picture and the reference layer picture are different.
For example, if inter-layer resolution is different, and inter-layer prediction is applied between layers supporting scalability regarding resolution, an inter-layer reference picture may be constructed by up-sampling reconstructed pictures of a reference layer.
At this time, information specifying a layer to which a picture to be used as an interlayer reference picture belongs can be transmitted from the encoding apparatus to the decoding apparatus.
In the layer referred to in inter-layer prediction, that is, the reference layer, the picture used for prediction of the current block may be a picture of the same access unit (AU) as the current picture (the picture currently subject to prediction in the current layer).
The restored residual block and the prediction block are added by the
The restored picture is filtered by the
The
In FIG. 1 and FIG. 2, it is described that one encoding / decoding apparatus processes all of encoding / decoding for a multi-layer. However, this is for convenience of explanation, and the encoding / decoding apparatus may be configured for each layer.
In this case, the encoding/decoding apparatus of the upper layer can perform encoding/decoding of the upper layer using information of the upper layer and information of the lower layer. For example, the prediction unit (inter prediction unit) of the upper layer may perform intra prediction or inter prediction on the current block using pixel information or picture information of the upper layer, or may receive reconstructed picture information from the lower layer and use it to perform inter prediction (inter-layer prediction) on the current block of the upper layer. Only inter-layer prediction has been described here as an example; however, the encoding/decoding apparatus can encode/decode the current layer using information of other layers regardless of whether an apparatus is configured for each layer or a single apparatus processes multiple layers.
In the present invention, a layer may include a view. In this case, inter-layer prediction is not performed simply by using information of a lower layer to predict an upper layer; it may be performed using information of another layer among the layers specified as dependent by information specifying inter-layer dependency.
FIG. 3 is a conceptual diagram schematically showing an embodiment of a scalable video coding structure using a plurality of layers to which the present invention can be applied. In FIG. 3, GOP (Group of Pictures) denotes a group of pictures.
In order to transmit video data, a transmission medium is required, and the performance of the transmission medium varies depending on various network environments. A scalable video coding method may be provided for application to these various transmission media or network environments.
A video coding method supporting scalability (hereinafter referred to as 'scalable coding' or 'scalable video coding') can improve encoding and decoding performance by removing redundancy between layers using inter-layer texture information, motion information, residual signals, and the like. The scalable video coding method can provide various kinds of scalability in terms of space, time, image quality, and view according to ambient conditions such as the transmission bit rate and the transmission error rate.
Scalable video coding can use a multi-layer structure to provide a bitstream applicable to various network conditions. For example, the scalable video coding structure may include a base layer that compresses and processes image data using a general image decoding method, and an enhancement layer that compresses and processes image data using both the decoding information of the base layer and a general image decoding method.
The base layer may also be referred to as a lower layer, and the enhancement layer may also be referred to as a higher layer. Here, a lower layer may mean a layer supporting lower scalability than a specific layer, and an upper layer may mean a layer supporting higher scalability than a specific layer. In addition, a layer that is referred to in the encoding/decoding of another layer may be called a reference layer, and a layer that is encoded/decoded using another layer may be called a current layer. The reference layer may be a layer lower than the current layer, and the current layer may be a layer higher than the reference layer.
Here, a layer means a set of images and bitstreams that are distinguished on the basis of space (e.g., image size), time (e.g., decoding order, image output order, frame rate), image quality, complexity, view, and the like.
Referring to FIG. 3, for example, the base layer may be defined as standard definition (SD) with a frame rate of 15 Hz and a bit rate of 1 Mbps, the first enhancement layer may be defined as high definition (HD) with a frame rate of 30 Hz, and the second enhancement layer may be defined as 4K-UHD (ultra high definition) with a frame rate of 60 Hz and a bit rate of 27.2 Mbps.
The format, the frame rate, the bit rate, and the like are one example, and can be determined as needed. Also, the number of layers to be used is not limited to the present embodiment, but can be otherwise determined depending on the situation. For example, if the transmission bandwidth is 4 Mbps, the frame rate of the first enhancement layer HD may be reduced to 15 Hz or less.
In the embodiment of FIG. 3, the scalable video coding method can provide temporal, spatial, image quality, and view scalability by the method described above. In this specification, scalable video coding means scalable video encoding from the encoding perspective and scalable video decoding from the decoding perspective.
Meanwhile, pictures in the same AU (Access Unit) have the same picture order count (POC) value.
The POC is a value that can identify pictures in the same layer and may be a value indicating an output order of decoded pictures output from a decoded picture buffer (DPB).
An AU includes coded pictures with the same output time. For example, in a scalable video coding structure composed of a plurality of layers, when picture A of the first layer and picture B of the second layer have the same output time, picture A of the first layer and picture B of the second layer belong to the same AU.
If the pictures in the same AU have different picture types, pictures in the same AU may have different POC values. Thus, if the POCs of the pictures in the same AU are not the same, a method of setting the pictures in the AU to have the same POC value is needed. In addition, by resetting the POCs of the pictures in the AU, there is a need for a method that can calculate the POC of the reference pictures in the DPB so that existing reference pictures in the DPB can be normally identified.
The present invention provides a method of resetting the POC values of the pictures in an AU and the POC values of the reference pictures in the DPB when pictures having different POCs are included in the same AU in a multi-layer-based image encoding/decoding process, and a method of constructing a reference picture set (RPS) based on the reset POCs.
The present invention relates to encoding and decoding an image including a plurality of layers or views, where the plurality of layers or views may be expressed as a first, second, third, and n-th layer, or as a first, second, third, and n-th view.
Hereinafter, for convenience of explanation, an embodiment of the present invention is described using an image in which a first layer and a second layer exist, but the same method can be applied to an image in which more layers or views exist. Also, the first layer may be expressed as a base layer or a reference layer, and the second layer may be expressed as an enhancement layer or a current layer.
FIG. 4 is a flowchart schematically showing a method of resetting the POC of pictures in a scalable video coding structure including a plurality of layers and constructing a reference picture set for inter prediction based on the reset POC of the pictures, according to an embodiment of the present invention. The method of FIG. 4 can be performed in the image encoding apparatus of FIG. 1 and the image decoding apparatus of FIG. 2 described above.
Referring to FIG. 4, the encoding / decoding apparatus calculates a POC value of a current picture to be coded / decoded (hereinafter referred to as a current picture) (S410).
As described above, the POC is an identifier for identifying pictures in a layer having the same layer identifier (nuh_layer_id) value in a coded video bitstream, and may be a value indicating the output order of pictures output from the DPB.
For example, the POC value may be larger for pictures that are output later from the DPB, and the POC value may be 0 for a specific picture.
The specific picture may be an IRAP (Intra Random Access Point) picture that is the first picture in the bitstream in decoding order, and the POC value of the IRAP picture may be 0. In other words, since an IRAP picture can be decoded without decoding the pictures that precede it in decoding order, the POC value of the IRAP picture can be 0. An IRAP picture is a picture that serves as a random access point; it contains only I (intra) slices (slices decoded using only intra prediction) and may be an instantaneous decoding refresh (IDR) picture, a clean random access (CRA) picture, or a broken link access (BLA) picture. An IDR picture may be the first picture in the bitstream in decoding order, or may appear in the middle of the bitstream. A CRA picture may be the first picture in the bitstream in decoding order, or may appear in the middle of the bitstream for normal playback. A BLA picture has functions and properties similar to those of a CRA picture, and refers to a picture that exists in the middle of a bitstream as a random access point when coded pictures are spliced or the bitstream is broken in the middle.
The POC value can be calculated using the most significant bit (MSB) value (POC_MSB) of the POC value and the LSB (least significant bit) value (POC_LSB) of the POC value.
At this time, the POC_LSB value can be transmitted in a slice segment header of the corresponding picture, and the POC_MSB value can be calculated according to the type of the corresponding picture in the following manner.
(1-1) In the case of a picture that is not an IRAP picture, that is, a non-IRAP picture
The POC_MSB value of a non-IRAP picture is calculated from the following: the POC (prevPOC) of the previous picture, which is the previously decoded picture with a temporal sublayer identifier (temporal_id) of 0 that is closest to the current picture (that is, whose POC differs least from the POC of the current picture); the POC_LSB (prevPOCLSB) and POC_MSB (prevPOCMSB) of the previous picture, obtained using prevPOC and the maximum POC LSB value (MaxPicOrderCntLsb) transmitted in the SPS (Sequence Parameter Set); and the POC_LSB (slice_pic_order_cnt_lsb) of the current picture, signaled in the slice segment header of the current picture.
(1-2) In the case of an IRAP picture
The POC value of the IDR picture is always assumed to be '0'.
If the first picture in the bitstream is a CRA picture or a BLA picture, the POC_MSB value of the CRA picture or BLA picture is assumed to be '0', and the POC_LSB (slice_pic_order_cnt_lsb) value signaled in the slice segment header of the current picture can be used as the POC value of the CRA picture or BLA picture.
If the CRA picture is not the first picture in the bitstream, the POC value of the CRA picture can be calculated to be the same as the POC value of the non-IRAP picture.
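The cases (1-1) and (1-2) above can be sketched as a single function. This is a sketch under stated assumptions: the text names the inputs (prevPOC, MaxPicOrderCntLsb, slice_pic_order_cnt_lsb) but does not spell out the comparison used to derive POC_MSB, so the wrap-around rule below is assumed to follow the well-known HEVC derivation. The function name `derive_poc` and its parameter names are hypothetical.

```python
def derive_poc(poc_lsb, max_poc_lsb, prev_poc=None,
               is_idr=False, first_in_bitstream=False):
    """Derive a POC value from its signaled LSB part.

    poc_lsb            -- slice_pic_order_cnt_lsb of the current picture
    max_poc_lsb        -- MaxPicOrderCntLsb from the SPS
    prev_poc           -- POC of the previous temporal_id == 0 picture (prevPOC)
    is_idr             -- True for an IDR picture
    first_in_bitstream -- True for a CRA/BLA picture that starts the bitstream
    """
    if is_idr:
        # (1-2) The POC of an IDR picture is always 0.
        return 0
    if first_in_bitstream or prev_poc is None:
        # (1-2) POC_MSB is taken as 0, so the POC equals POC_LSB.
        return poc_lsb

    # (1-1) Split prevPOC into its LSB and MSB parts.
    prev_lsb = prev_poc % max_poc_lsb   # prevPOCLSB
    prev_msb = prev_poc - prev_lsb      # prevPOCMSB

    # Assumed HEVC-style rule: detect whether the LSB wrapped around.
    if poc_lsb < prev_lsb and prev_lsb - poc_lsb >= max_poc_lsb // 2:
        poc_msb = prev_msb + max_poc_lsb   # LSB wrapped forward
    elif poc_lsb > prev_lsb and poc_lsb - prev_lsb > max_poc_lsb // 2:
        poc_msb = prev_msb - max_poc_lsb   # LSB wrapped backward
    else:
        poc_msb = prev_msb

    return poc_msb + poc_lsb
```

For example, with MaxPicOrderCntLsb equal to 16 and prevPOC equal to 14, a signaled LSB of 2 is interpreted as a wrap-around and yields a POC of 18 rather than 2.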
If there is a picture in the AU having a POC different from the POC of the current picture, that is, if the pictures in the AU have different POC values, the encoding/decoding apparatus resets the POC values so that the pictures in the AU have the same POC value. The process of resetting the POC values of pictures in the AU will be described with reference to FIGS. 5 and 6.
FIG. 5 is an example of a scalable video structure including a plurality of layers shown to explain a process of resetting POC values of pictures in an AU according to an embodiment of the present invention.
The scalable video shown in FIG. 5 may be an image including a first layer (Layer 0) and a second layer (Layer 1). For example, the first layer (Layer 0) may be a lower layer and the second layer (Layer 1) may be an upper layer. The second layer (Layer 1) may provide a higher scalability than the first layer (Layer 0).
Referring to FIG. 5, when there are IRAP pictures and non-IRAP pictures in the same AU, as in AU 'A' and AU 'B', pictures in the same AU may have different POC values.
At this time, the encoding / decoding apparatus can reset the POC values of the pictures in the AU so that all the pictures in the AU have the same POC value. For example, the encoding / decoding apparatus can reset the POC value of the picture to a predetermined value. The predetermined value may be '0'.
The encoding apparatus can signal to the decoding apparatus that the POC value of the picture has been reset to a predetermined value (e.g., 0). For example, the encoding apparatus can transmit information indicating whether or not the POC value of the picture has been reset to 0 to the decoder through the slice segment header.
Tables 1 and 2 are examples of slice segment header syntax for signaling POC reset information indicating whether the POC value of a picture according to an embodiment of the present invention has been reset to zero.
Referring to Table 1 and Table 2, poc_reset_flag indicates whether the POC value of the current picture is reset to 0 or not. For example, if the value of poc_reset_flag is 1, it indicates that the POC value of the current picture has been reset to 0, and if the value of poc_reset_flag is 0, it indicates that the POC value of the current picture has not been reset to 0.
The poc_reset_flag can be transmitted through the slice segment header according to the value of cross_layer_irap_aligned_flag signaled in the VPS (Video Parameter Sets) extension. For example, if the value of cross_layer_irap_aligned_flag signaled in the VPS extension is zero, poc_reset_flag may be transmitted via the slice segment header.
The cross_layer_irap_aligned_flag is information indicating that, when picture A of layer A in an AU is an IRAP picture, picture B in the same AU belonging to a reference layer of layer A is also an IRAP picture. For example, when the value of cross_layer_irap_aligned_flag is 1, it indicates that if there is an IRAP picture in the AU, the pictures in the AU are all IRAP pictures. In this case, the network abstraction layer (NAL) unit types of the IRAP pictures in the same AU may all be the same.
If the current picture is an IDR picture, poc_reset_flag may not be signaled.
If poc_reset_flag is not present, its value can be inferred to be 0.
A constraint can be imposed that poc_reset_flag must have the same value in all slices constituting a picture.
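The presence and inference rules for poc_reset_flag described above can be sketched as follows. This is a minimal sketch, not the syntax of Tables 1 and 2: it assumes the flag is read only when cross_layer_irap_aligned_flag is 0 and the current picture is not an IDR picture. The function names and the `read_bit` callable (standing in for a bitstream reader) are hypothetical.

```python
def read_poc_reset_flag(cross_layer_irap_aligned_flag, is_idr, read_bit):
    """Return poc_reset_flag for one slice segment header.

    read_bit is a hypothetical callable returning the next bit of the
    slice segment header; it is only invoked when the flag is present.
    """
    if cross_layer_irap_aligned_flag == 0 and not is_idr:
        return read_bit()
    # Flag not signaled: its value is inferred to be 0.
    return 0


def picture_poc_reset_flag(slice_flags):
    """Check that all slices of a picture agree and return the shared value."""
    if len(set(slice_flags)) != 1:
        raise ValueError("poc_reset_flag must be identical in all slices")
    return slice_flags[0]
```

The second helper reflects the constraint that every slice of a picture carries the same poc_reset_flag value, so a decoder may treat a mismatch as a non-conforming bitstream.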
Referring again to FIG. 5, AU 'A' includes an IRAP picture of a first layer (e.g., an IDR picture) and a non-IRAP picture of a second layer (Layer 1). As described above, since the POC value of the IDR picture is 0, the POC value of the IDR picture of the first layer (Layer 0) can be derived as zero. The POC value of the non-IRAP picture of
For example, since the POC value of the IDR picture of the first layer (Layer 0) in AU 'A' is 0, the encoding apparatus does not need to reset the POC value of the IDR picture to 0, and its poc_reset_flag value is not set to 1. Since the POC value of the non-IRAP picture of the second layer (Layer 1) in AU 'A' is not 0, the encoding apparatus can reset the POC value of the non-IRAP picture of the second layer (Layer 1) to 0 so that it matches the POC of the IDR picture of the first layer (Layer 0), and can set its poc_reset_flag value to 1.
AU 'B' includes a non-IRAP picture of the first layer and an IRAP picture of the second layer (e.g., a CRA picture). The POC values of the non-IRAP picture of the first layer and the CRA picture of the second layer can be calculated using the MSB and the LSB of the POC value as described above (see methods (1-1) and (1-2)) and may be, for example, non-zero values. In this case, since the pictures in AU 'B' may have different POC values, the encoding apparatus can reset the POC values of the pictures in AU 'B', set POC reset information (e.g., poc_reset_flag) indicating whether the POC is reset, and signal it to the decoding apparatus through the slice segment header.
For example, if the non-IRAP picture of the first layer (Layer 0) and the CRA picture of the second layer (Layer 1) in AU 'B' have non-zero POC values that differ from each other, the encoding apparatus can reset the POC of the non-IRAP picture of the first layer (Layer 0) and the POC of the CRA picture of the second layer (Layer 1) to 0 in order to equalize the POC values of the pictures in AU 'B', and can set poc_reset_flag of the non-IRAP picture of the first layer (Layer 0) and poc_reset_flag of the CRA picture of the second layer (Layer 1) to 1.
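The encoder-side decisions for AU 'A' and AU 'B' can be illustrated with a short sketch. The function name, the dictionary layout, and the concrete POC values (5, 7, 3) are hypothetical; the rule shown (reset every non-zero POC when the POCs in an AU are not aligned) is one reading of the examples above, not a definitive encoder algorithm.

```python
def choose_poc_reset(au_pocs):
    """Decide, per layer, the POC after reset and the poc_reset_flag.

    au_pocs maps a layer id to the POC value derived before any reset.
    Returns a dict: layer id -> (poc_after_reset, poc_reset_flag).
    """
    aligned = len(set(au_pocs.values())) <= 1
    decisions = {}
    for layer, poc in au_pocs.items():
        if aligned or poc == 0:
            # Nothing to do: the POC is already 0 or already matches the AU.
            decisions[layer] = (poc, 0)
        else:
            # Reset to 0 so that all pictures in the AU share one POC value.
            decisions[layer] = (0, 1)
    return decisions


# AU 'A': IDR picture in Layer 0 (POC 0), non-IRAP picture in Layer 1 (POC 5).
print(choose_poc_reset({0: 0, 1: 5}))
# AU 'B': non-IRAP picture in Layer 0 (POC 7), CRA picture in Layer 1 (POC 3).
print(choose_poc_reset({0: 7, 1: 3}))
```

In the AU 'A' case only Layer 1 receives poc_reset_flag = 1, while in the AU 'B' case both layers do, matching the two examples in the text.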
Meanwhile, the decoding apparatus receives the slice segment header from the encoding apparatus and, based on the POC reset information (for example, poc_reset_flag) parsed from the slice segment header, which indicates whether the POC value of the current picture is reset to 0, can reset the POC value of the current picture to zero. At this time, if reference pictures for the current picture exist in the DPB, the POC values of those reference pictures must also be reset because the POC value of the current picture is reset. The decoding apparatus can calculate the POC values of the reference pictures in the DPB in the same manner as in the embodiment of FIG. 6.
FIG. 6 is a diagram illustrating a process of resetting the POC values of the reference pictures in the DPB based on the POC reset information (e.g., poc_reset_flag) indicating whether the POC value of the current picture is reset to 0, according to an embodiment of the present invention.
Referring to FIG. 6, when the poc_reset_flag value parsed from the slice segment header is 1, that is, when the POC reset information indicates that the POC value of the current picture has been reset to 0, the decoding apparatus can reset the POC values of the reference pictures in the DPB based on the POC value of the current picture.
For example, the decoding apparatus can calculate the POC value of the current picture using the MSB and the LSB of the POC value (methods (1-1) and (1-2)) and decode the current picture (S610). The decoding apparatus then resets the POC values of the reference pictures in the DPB by the POC value of the decoded current picture (S620), and resets the POC value of the current picture to zero (S630).
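Steps S620 and S630 can be sketched as below. The sketch assumes that "resetting by the POC value of the current picture" means subtracting the current picture's POC from each reference picture's POC, which keeps the POC distances unchanged; the function name and the sample DPB contents are hypothetical.

```python
def apply_poc_reset(current_poc, dpb_pocs, poc_reset_flag):
    """Return (new_current_poc, new_dpb_pocs) after an optional POC reset.

    current_poc is the POC derived in S610; dpb_pocs lists the POC values
    of the reference pictures currently held in the DPB.
    """
    if poc_reset_flag != 1:
        return current_poc, list(dpb_pocs)
    # S620 (assumed semantics): shift every reference picture by the
    # current picture's POC so relative distances are preserved.
    shifted = [poc - current_poc for poc in dpb_pocs]
    # S630: the current picture's POC becomes 0.
    return 0, shifted


new_poc, new_dpb = apply_poc_reset(331, [329, 330, 84], 1)
print(new_poc, new_dpb)  # 0 [-2, -1, -247]
```

Because every POC in the DPB is shifted by the same amount, a picture that was two positions before the current picture is still two positions before it after the reset.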
Referring again to FIG. 4, the encoding / decoding apparatus decodes a reference picture set for inter prediction of a current picture based on POC reset information (for example, poc_reset_flag) indicating whether or not the POC value of the current picture is reset to 0 (S420).
The reference picture set refers to a set of reference pictures of the current picture, and may be composed of reference pictures preceding the current picture in decoding order. The reference picture can be used for inter prediction of the current picture.
The reference picture set includes a forward short-term reference picture set (PocStCurrBefore) referenced by the current picture, a backward short-term reference picture set (PocStCurrAfter) referenced by the current picture, a short-term reference picture set (PocStFoll) not referenced by the current picture, a long-term reference picture set (PocLtCurr) referenced by the current picture, and a long-term reference picture set (PocLtFoll) not referenced by the current picture.
The encoding / decoding apparatus can derive the POC value of the reference picture constituting the reference picture set differently according to the POC reset information (for example, poc_reset_flag) indicating whether the POC value of the current picture is reset to 0.
(2-1) When the poc_reset_flag value parsed from the slice segment header is 0 (that is, when the POC reset information indicates that the POC value of the current picture is not reset to 0), the encoding/decoding apparatus can calculate the POC values of the reference pictures referred to by the slices constituting the current picture as follows.
In the case of a short-term reference picture, the POC value of the short-term reference picture can be calculated using the delta_poc value signaled in the slice segment header for each short-term reference picture and the POC value of the decoded current picture. Here, the delta_poc value may be the POC difference between the current picture and the i-th short-term reference picture, or the POC difference between the (i+1)-th short-term reference picture and the i-th short-term reference picture.
In the case of a long-term reference picture, the POC_LSB value or the POC value of the long-term reference picture can be calculated using the POC_LSB value (pocLsbLt[i]) indicating the LSB of each long-term reference picture POC signaled in the slice segment header, the value (delta_poc_msb_cycle_lt[i]) for calculating the MSB (POC_MSB) of each long-term reference picture POC, and the POC value and POC_LSB value of the decoded current picture.
Although a long-term reference picture can basically be identified by its POC_LSB alone, the reference pictures may include other pictures having the same POC_LSB as the long-term reference picture. In this case, the value (delta_poc_msb_cycle_lt) for calculating the POC_MSB value of the long-term reference picture is additionally signaled so that the reference pictures can be distinguished.
In Equation (1), pocLsbLt[i] is the POC_LSB value of the i-th long-term reference picture signaled in the slice segment header. PicOrderCntVal is the POC value of the decoded current picture. MaxPicOrderCntLsb is a value signaled in the Sequence Parameter Set (SPS). DeltaPocMsbCycleLt[i] is a value derived from delta_poc_msb_cycle_lt signaled in the slice segment header, and can be derived as shown in Equation (2).
In Equation (2), the condition if (i == 0 || i == num_long_term_sps) denotes the case where i indexes the zeroth long-term reference picture or i is equal to num_long_term_sps, the number of long-term reference picture entries taken from the SPS.
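Equations (1) and (2) are not reproduced in this text. As a hedged illustration, the sketch below follows the long-term reference picture derivation of the HEVC (H.265) specification, which uses the same symbols (pocLsbLt, PicOrderCntVal, MaxPicOrderCntLsb, DeltaPocMsbCycleLt); it is an assumption that the equations of this document have the same form. The Python function names and the sample values are hypothetical.

```python
def accumulate_delta_poc_msb_cycle_lt(delta_poc_msb_cycle_lt, num_long_term_sps):
    """Derive DeltaPocMsbCycleLt[i] from the signaled delta_poc_msb_cycle_lt[i]."""
    acc = []
    for i, delta in enumerate(delta_poc_msb_cycle_lt):
        if i == 0 or i == num_long_term_sps:
            # Start of a group: the signaled value is used directly.
            acc.append(delta)
        else:
            # Otherwise the value is coded relative to the previous entry.
            acc.append(delta + acc[i - 1])
    return acc


def long_term_poc(poc_lsb_lt, msb_present, delta_poc_msb_cycle,
                  pic_order_cnt_val, max_pic_order_cnt_lsb):
    """Derive pocLt for one long-term reference picture (no POC reset)."""
    poc_lt = poc_lsb_lt
    if msb_present:
        current_lsb = pic_order_cnt_val & (max_pic_order_cnt_lsb - 1)
        poc_lt += (pic_order_cnt_val
                   - delta_poc_msb_cycle * max_pic_order_cnt_lsb
                   - current_lsb)
    return poc_lt


cycles = accumulate_delta_poc_msb_cycle_lt([1, 2, 5], num_long_term_sps=0)
print(cycles)                                        # [1, 3, 8]
print(long_term_poc(20, True, cycles[2], 331, 32))   # 84
```

When the MSB information is not present, the function returns only the LSB, which mirrors the statement that a long-term reference picture is basically identified by its POC_LSB.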
(2-2) When the poc_reset_flag value parsed from the slice segment header is 1 (that is, when the POC reset information indicates that the POC value of the current picture is reset to 0), the encoding/decoding apparatus can calculate the POC values of the reference pictures referred to by the slices constituting the current picture as follows.
In the case of a short-term reference picture, the POC value of the short-term reference picture can be calculated using the delta_poc value signaled in the slice segment header for each short-term reference picture and the reset POC value (= 0) of the current picture. Here, the delta_poc value may be the POC difference between the current picture and the i-th short-term reference picture, or the POC difference between the (i+1)-th short-term reference picture and the i-th short-term reference picture.
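The short-term derivation in (2-1) and (2-2) differs only in the POC value used for the current picture, which a small sketch makes explicit. The function name, the `chained` parameter, and the sign convention (delta_poc equals the reference POC minus the base POC) are assumptions for illustration and are not taken from the source.

```python
def short_term_pocs(current_poc, delta_pocs, chained=False):
    """Derive short-term reference picture POCs from signaled delta_poc values.

    chained=False: each delta is relative to the current picture.
    chained=True:  the first delta is relative to the current picture and
                   each later delta is relative to the previous reference.
    """
    if not chained:
        return [current_poc + delta for delta in delta_pocs]
    pocs = []
    base = current_poc
    for delta in delta_pocs:
        base += delta
        pocs.append(base)
    return pocs


# (2-1) poc_reset_flag == 0: use the decoded POC of the current picture.
print(short_term_pocs(10, [-1, -2, -4]))               # [9, 8, 6]
# (2-2) poc_reset_flag == 1: the current POC has been reset to 0.
print(short_term_pocs(0, [-1, -2, -4]))                # [-1, -2, -4]
# Same pictures as the previous call, signaled as chained differences.
print(short_term_pocs(0, [-1, -1, -2], chained=True))  # [-1, -2, -4]
```

With a reset current POC of 0, the reference POCs reduce to the signaled differences themselves in the non-chained form.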
In the case of a long-term reference picture, the POC_LSB value or the POC value of the long-term reference picture can be calculated by Equation (3) using the difference value (delta_poc_lsb) between the POC_LSB value (pocLsbLt) indicating the LSB of the long-term reference picture POC signaled in the slice segment header and the POC_LSB value (slice_pic_order_cnt_lsb) of the current picture. The long-term reference picture can be distinguished by the PocLt value derived by Equation (3).
In Equation (3), the difference value delta_poc_lsb between the POC_LSB (pocLsbLt[i]) of the long-term reference picture signaled in the slice segment header and the POC_LSB value of the current picture may have a value within a range from 0 to MaxPicOrderCntLsb - 1.
When the reference pictures include pictures having the same POC_LSB (pocLsbLt) as the long-term reference picture, the POC value of the long-term reference picture can be calculated using the difference value (delta_poc_lsb) derived by Equation (3) and the value (delta_poc_msb_cycle_lt) for calculating the POC_MSB value.
Although a long-term reference picture can basically be identified by its POC_LSB alone, the reference pictures may include other pictures having the same POC_LSB as the long-term reference picture. In this case, the value (delta_poc_msb_cycle_lt) for calculating the POC_MSB value of the long-term reference picture is additionally signaled so that the reference pictures can be distinguished.
As described above, the encoding/decoding apparatus can calculate the POC value of a reference picture using a different method depending on the POC reset information (for example, poc_reset_flag) indicating whether the POC value of the current picture is reset to 0.
The encoding / decoding apparatus can construct a reference picture set based on the POC of the derived short-term reference picture and the POC of the long-term reference picture, and can perform inter-prediction of the current picture using the reference picture set.
FIG. 7 is a diagram illustrating a method for calculating a POC value of long term reference pictures in a DPB according to an embodiment of the present invention.
Referring to FIG. 7, when the poc_reset_flag value parsed in the slice segment header is 1, that is, when the POC reset information indicates that the POC value of the current picture has been reset to 0, the decoding apparatus calculates the POC value and the POC_LSB value of the current picture, The POC value of the long term reference picture in the DPB can be calculated using the information related to the long term reference picture transmitted in the slice segment header of the current picture.
For example, assume that the poc_reset_flag value of the current picture is 1 and the POC value of the current picture is 331. In this case, the POC of the long-term reference picture corresponding to i = 2 in the DPB can be calculated as follows, using Equations (3) and (4) described in (2-2).
delta_poc_lsb[2] = pocLsbLt[2] - slice_pic_order_cnt_lsb = 20 - 11 = 9
pocLt[2] = delta_poc_lsb[2] & (MaxPicOrderCntLsb - 1) = 9 & (32 - 1) = 9, where MaxPicOrderCntLsb is 32.
Since delta_poc_msb_present_flag is 1, the POC is calculated using delta_poc_msb_cycle_lt[i]. Since delta_poc_lsb[i] has a value larger than 0, pocLt = pocLt[2] - DeltaPocMsbCycle * MaxPicOrderCntLsb = 9 - 8 * 32 = -247, where DeltaPocMsbCycle can be obtained by Equation (2).
The decoding apparatus resets the POC value of the long-term reference picture corresponding to i = 2 in the DPB to -247, and can identify the long-term reference picture corresponding to i = 2 among the pictures in the DPB by the reset POC value.
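The arithmetic of this example can be checked with a few lines of code. The sketch covers only the branch used in the example (delta_poc_lsb larger than 0), since the other branch is not spelled out in this text; the function name is hypothetical, and DeltaPocMsbCycle = 8 is taken from the example.

```python
MAX_PIC_ORDER_CNT_LSB = 32


def long_term_poc_after_reset(poc_lsb_lt, slice_pic_order_cnt_lsb,
                              delta_poc_msb_cycle,
                              max_lsb=MAX_PIC_ORDER_CNT_LSB):
    """POC of a long-term reference picture when poc_reset_flag is 1."""
    delta_poc_lsb = poc_lsb_lt - slice_pic_order_cnt_lsb
    if delta_poc_lsb <= 0:
        # Only the branch exercised by the worked example is sketched here.
        raise NotImplementedError("delta_poc_lsb <= 0 is not covered")
    poc_lt = delta_poc_lsb & (max_lsb - 1)
    return poc_lt - delta_poc_msb_cycle * max_lsb


current_poc = 331
current_lsb = current_poc & (MAX_PIC_ORDER_CNT_LSB - 1)  # 331 mod 32 = 11
print(long_term_poc_after_reset(20, current_lsb, 8))     # -247
```

The result is also what one obtains by deriving the long-term POC without a reset (20 + 331 - 8 * 32 - 11 = 84) and then subtracting the current picture's POC of 331, so the two views of the reset are consistent for this example.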
The method according to the present invention may be implemented as a program for execution on a computer and stored in a computer-readable recording medium. Examples of the computer-readable recording medium include a ROM, a RAM, a CD-ROM, a floppy disk, an optical data storage device, and the like, and the medium may also be implemented in the form of a carrier wave (for example, transmission over the Internet).
The computer readable recording medium may be distributed over a networked computer system so that computer readable code can be stored and executed in a distributed manner. And, functional programs, codes and code segments for implementing the above method can be easily inferred by programmers of the technical field to which the present invention belongs.
In the above-described embodiments, the methods are described on the basis of a flowchart as a series of steps or blocks, but the present invention is not limited to the order of the steps, and some steps may occur in a different order or simultaneously. Those skilled in the art will also understand that the steps depicted in the flowcharts are not exclusive, that other steps may be included, and that one or more steps in a flowchart may be deleted without affecting the scope of the invention.
The foregoing description is merely illustrative of the technical idea of the present invention, and various changes and modifications may be made by those skilled in the art without departing from the essential characteristics of the present invention. Therefore, the embodiments disclosed in the present invention are intended to illustrate rather than limit the scope of the present invention, and the scope of the technical idea of the present invention is not limited by these embodiments. The scope of protection of the present invention should be construed according to the claims, and all technical ideas within the scope of the claims should be construed as being included in the scope of the present invention.
Claims (14)
Decoding POC reset information indicating whether a picture order count (POC) of a current picture has been reset to zero;
Calculating, based on the POC reset information, a POC value of the current picture and a POC value of each of a long-term reference picture and a short-term reference picture in a DPB (Decoded Picture Buffer) referenced by the current picture; and
Constructing a reference picture set (RPS) for inter prediction of the current picture based on the POC value of the long-term reference picture and the POC value of the short-term reference picture.
Wherein the POC value of the current picture is reset to 0 when the POC reset information indicates that the POC of the current picture is reset to zero.
Wherein, when the POC reset information indicates that the POC of the current picture is reset to 0, the POC value of the short-term reference picture is calculated using the reset POC value of the current picture and the POC difference value between the current picture and the short-term reference picture.
Wherein, when the POC reset information indicates that the POC of the current picture has been reset to 0, the POC value of the long-term reference picture is calculated using a POC LSB difference value between a POC LSB value indicating the LSB (Least Significant Bit) of the long-term reference picture POC and a POC LSB value indicating the LSB of the current picture POC.
Wherein, when the DPB contains reference pictures having the same POC LSB value as the long-term reference picture POC, the POC value of the long-term reference picture is calculated using the POC LSB difference value and a value used for determining the MSB (Most Significant Bit) of the long-term reference picture POC.
Wherein the POC reset information is signaled by an encoder when an AU (Access Unit) includes both an IRAP picture and a non-IRAP picture.
Wherein the current picture is a non-IRAP picture included in the AU.
A decoding unit for decoding POC reset information indicating whether a picture order count (POC) of a current picture is reset to 0; and
A prediction unit configured to calculate, based on the POC reset information, a POC value of the current picture and a POC value of each of a long-term reference picture and a short-term reference picture in a DPB (Decoded Picture Buffer) referenced by the current picture, and to construct a reference picture set (RPS) for inter prediction of the current picture based on the POC value of the long-term reference picture and the POC value of the short-term reference picture.
Wherein the POC value of the current picture is reset to 0 when the POC reset information indicates that the POC of the current picture is reset to zero.
Wherein, when the POC reset information indicates that the POC of the current picture is reset to 0, the POC value of the short-term reference picture is calculated using the reset POC value of the current picture and the POC difference value between the current picture and the short-term reference picture.
Wherein, when the POC reset information indicates that the POC of the current picture has been reset to 0, the POC value of the long-term reference picture is calculated using a POC LSB difference value between a POC LSB value indicating the LSB (Least Significant Bit) of the long-term reference picture POC and a POC LSB value indicating the LSB of the current picture POC.
Wherein, when the DPB contains reference pictures having the same POC LSB value as the long-term reference picture POC, the POC value of the long-term reference picture is calculated using the POC LSB difference value and a value used for determining the MSB (Most Significant Bit) of the long-term reference picture POC.
Wherein the POC reset information is signaled by an encoder when an AU (Access Unit) includes both an IRAP picture and a non-IRAP picture.
Wherein the current picture is a non-IRAP picture included in the AU.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US14/511,333 US20150103912A1 (en) | 2013-10-11 | 2014-10-10 | Method and apparatus for video encoding/decoding based on multi-layer |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR20130121133 | 2013-10-11 | ||
KR1020130121133 | 2013-10-11 |
Publications (1)
Publication Number | Publication Date |
---|---|
KR20150043977A true KR20150043977A (en) | 2015-04-23 |
Family
ID=53036360
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
KR20140135694A KR20150043977A (en) | 2013-10-11 | 2014-10-08 | Method and apparatus for video encoding/decoding based on multi-layer |
Country Status (1)
Country | Link |
---|---|
KR (1) | KR20150043977A (en) |
-
2014
- 2014-10-08 KR KR20140135694A patent/KR20150043977A/en not_active Application Discontinuation
Legal Events
Date | Code | Title | Description |
---|---|---|---|
WITN | Withdrawal due to no request for examination |