KR20230148506A

KR20230148506A - A method and an apparatus for processing a video signal

Info

Publication number: KR20230148506A
Application number: KR1020220047306A
Authority: KR
Inventors: 임성원
Original assignee: 주식회사 케이티
Priority date: 2022-04-18
Filing date: 2022-04-18
Publication date: 2023-10-25

Abstract

The present invention provides a method of encoding/decoding a video signal based on intra prediction, and an apparatus therefor.

Description

Video signal processing method and apparatus {A METHOD AND AN APPARATUS FOR PROCESSING A VIDEO SIGNAL}

The present invention relates to a method and an apparatus for processing a video signal.

Video is compression-coded by removing spatial and temporal redundancy and inter-view redundancy, and the coded video can be transmitted over a communication line or stored in a form suitable for a storage medium.

An object of the present invention is to improve the coding efficiency of a video signal.

To achieve this object, a method of encoding/decoding a video signal using intra prediction, and an apparatus therefor, are provided.

The video signal processing method and apparatus according to the present invention can improve video encoding/decoding efficiency.

Ultra-high-resolution video has recently become central not only to digital broadcasting but also to streaming services such as Netflix and YouTube. In addition to conventional 2D video, VR and 3D video services are being commercialized, and these services are available on mobile devices such as smartphones as well as on digital TVs. What these services have in common is that none of them is feasible without video compression. For 1080p@60Hz, commonly called Full-HD, a 1920x1080 picture must be transmitted 60 times per second. 3D video needs twice as much data because information must be delivered to both eyes, and ultra-high-resolution services such as 4K (4096x2048) and 8K (8192x4096) must transmit a picture 120 or more times per second, so they generate far more data than Full-HD. Handling this volume of data requires technologies from several fields, including communication bandwidth and video compression.

1. Prediction technology

Prediction generates a predicted value for each original value and then subtracts one from the other to produce a residual value. This removes redundant data from the original values. In the encoder, a picture is partitioned into blocks and encoded block by block, so prediction can likewise be performed per block. Once a prediction block has been generated, a residual block can be obtained by subtracting it from the original block to be encoded. Conversely, the decoder reconstructs a block by adding the prediction block to the reconstructed residual block. The result is called a reconstructed block. If the encoder performed lossless coding, the reconstructed block is identical to the original block. If the encoder performed lossy coding, the reconstructed block may hold values that are similar to, but different from, those of the original block.
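The relationship between the original, prediction, residual and reconstructed blocks can be sketched as follows. This is a minimal illustration, not the coding path of the invention: the function names are hypothetical, and `quantize` is only a stand-in for whatever lossy stage an encoder applies to the residual.

```python
def subtract_blocks(original, prediction):
    """Encoder side: residual = original - prediction, sample by sample."""
    return [[o - p for o, p in zip(o_row, p_row)]
            for o_row, p_row in zip(original, prediction)]


def add_blocks(prediction, residual):
    """Decoder side: reconstructed = prediction + residual."""
    return [[p + r for p, r in zip(p_row, r_row)]
            for p_row, r_row in zip(prediction, residual)]


def quantize(residual, step):
    """Hypothetical lossy stage: snap each residual to a multiple of step."""
    return [[round(r / step) * step for r in row] for row in residual]


original = [[52, 55], [61, 59]]
prediction = [[50, 50], [60, 60]]

residual = subtract_blocks(original, prediction)

# Lossless path: the residual is carried exactly.
lossless = add_blocks(prediction, residual)

# Lossy path: the residual is quantized before reconstruction.
lossy = add_blocks(prediction, quantize(residual, 4))
```

With an exact residual the reconstructed block equals the original; with the quantized residual it is only close to it.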

Prediction reduces the redundancy of the data to be encoded. The following figure shows an example of intra prediction.

Figure 1

As shown in Figure 1, intra prediction has modes 0 to 66. Mode 0 is also called the planar mode, mode 1 the DC mode, and modes 2 to 66 the directional modes.

Under the planar mode, assigned to mode 0, prediction samples can be generated as shown in Figure 2 below.

Figure 2

In Figure 2, T and L are examples of the neighboring reference samples used to generate a prediction block in the planar mode. T denotes the reference sample at the top-right corner, and L denotes the reference sample at the bottom-left corner. P1 is the prediction sample for the horizontal direction; it can be generated by linearly interpolating between T and the reference sample at the same Y-axis position as P1. P2 is the prediction sample for the vertical direction; it can be generated by linearly interpolating between L and the reference sample at the same X-axis position as P2. The final prediction sample is then generated as a weighted sum of P1 and P2, using Equation (1) below.

P = α × P1 + β × P2    (1)

The weights α and β may be determined by taking the width and height of the current block into account. Depending on the width and height, α and β may have the same value or different values. If the width and height of the block are equal, α and β can be set equal so that the prediction sample is the average of P1 and P2. If one side of the block is longer than the other, a lower value may be applied to the weight corresponding to the longer side. Alternatively, a higher value may be applied to the weight corresponding to the longer side.
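A minimal sketch of the planar computation described above, under stated assumptions: the names `planar_predict`, `alpha` and `beta` are hypothetical, the weights are assumed to sum to 1, the interpolation positions follow the convention common in block-based codecs, and floating-point arithmetic is used for readability where a real codec would use integers.

```python
def planar_predict(top, left, top_right, bottom_left, alpha=0.5, beta=0.5):
    """Return a height x width block of planar prediction samples.

    top         -- reference samples above the block (length = width)
    left        -- reference samples left of the block (length = height)
    top_right   -- reference sample T at the top-right corner
    bottom_left -- reference sample L at the bottom-left corner
    alpha, beta -- weights of the horizontal and vertical predictions
    """
    width, height = len(top), len(left)
    block = []
    for y in range(height):
        row = []
        for x in range(width):
            # P1: interpolate along the row between left[y] and T.
            p1 = ((width - 1 - x) * left[y] + (x + 1) * top_right) / width
            # P2: interpolate along the column between top[x] and L.
            p2 = ((height - 1 - y) * top[x] + (y + 1) * bottom_left) / height
            row.append(alpha * p1 + beta * p2)
        block.append(row)
    return block


flat = planar_predict([100] * 4, [100] * 4, 100, 100)
ramp = planar_predict([10, 20, 30, 40], [10, 10, 10, 10], 50, 10)
```

With the default weights the result is the plain average of P1 and P2, which matches the equal-width-and-height case in the text; passing unequal `alpha` and `beta` models the non-square cases.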

The following describes, using the example of Figure 3, how prediction samples are generated under the DC mode, assigned to mode 1.

Figure 3

As shown in Figure 3, the average of the neighboring reference samples is calculated, and the calculated value is assigned to every prediction sample in the prediction block. The reference samples include top reference samples and left reference samples. Depending on the shape of the block, the average may be calculated using only the top reference samples or only the left reference samples. For example, if the width of the block is greater than its height, or if the ratio between the width and the height is greater than or equal to (or less than or equal to) a predefined value, the average may be calculated using only the top reference samples. Conversely, if the width of the block is smaller than its height, or if the ratio between the width and the height is less than or equal to (or greater than or equal to) a predefined value, the average may be calculated using only the left reference samples.
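The shape-dependent choice of reference samples can be sketched as below. The function names are hypothetical, only the simple width-versus-height comparison is shown (the ratio-threshold variant is omitted), and the rounding rule is an assumption.

```python
def dc_value(top, left):
    """Average the reference samples chosen according to the block shape."""
    width, height = len(top), len(left)
    if width > height:
        refs = top            # wide block: top reference samples only
    elif width < height:
        refs = left           # tall block: left reference samples only
    else:
        refs = top + left     # square block: both sides
    # Integer average with rounding to nearest (assumed rounding rule).
    return (sum(refs) + len(refs) // 2) // len(refs)


def dc_predict(top, left):
    """Fill a height x width prediction block with the DC value."""
    dc = dc_value(top, left)
    return [[dc] * len(top) for _ in left]


wide = dc_predict([10, 20, 30, 40], [50, 50])
```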

In a directional mode, each position is projected toward the reference samples according to the angle of that mode. If a reference sample exists at the projected position, that reference sample is set as the prediction sample. If no reference sample exists at the projected position, interpolation is performed using the neighboring reference samples, and the interpolated value is set as the prediction sample. Figure 4 below shows an example.

Figure 4

In the example above, projecting from the position of prediction sample B toward the reference samples along the angle lands on a reference sample, so that reference sample is set as the prediction sample. Projecting from the position of prediction sample A along the angle does not land on a reference sample at an integer position. In this case, interpolation is performed using the neighboring integer-position reference samples, and the interpolated value (a fractional-position reference sample) is set as the prediction sample.
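The two cases, B landing on an integer position and A landing between two reference samples, can be sketched for a mode that projects onto the top reference row. Everything specific here is an assumption for illustration: the 1/32-sample precision, the two-tap linear interpolation, the parameter `step` (horizontal displacement per row in 1/32 units) and the function name are not taken from the source.

```python
FRAC_BITS = 5                      # assumption: 1/32-sample precision
FRAC_MASK = (1 << FRAC_BITS) - 1


def directional_sample(top_ref, x, y, step):
    """Predict the sample at column x, row y from the top reference row.

    top_ref[i] is the reference sample directly above column i.
    step is the non-negative displacement per row, in 1/32 samples.
    """
    pos = (y + 1) * step           # projected displacement for this row
    idx = x + (pos >> FRAC_BITS)   # integer part of the projected position
    frac = pos & FRAC_MASK         # fractional part of the projected position
    if frac == 0:
        # Case B: the projection lands exactly on a reference sample.
        return top_ref[idx]
    # Case A: interpolate between the two nearest integer-position samples.
    a, b = top_ref[idx], top_ref[idx + 1]
    return ((32 - frac) * a + frac * b + 16) >> FRAC_BITS


ref = [0, 32, 64, 96, 128, 160, 192, 224, 256]
on_grid = directional_sample(ref, 0, 0, 32)    # lands on ref[1]
between = directional_sample(ref, 0, 0, 16)    # halfway between ref[0] and ref[1]
```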

Information about the intra prediction mode applied to a block may be explicitly signaled through the bitstream. This information may be generated on the basis of an intra prediction mode list. The intra prediction mode list may contain a plurality of intra prediction mode candidates. An intra prediction mode candidate may be derived from the intra prediction mode of a neighboring block adjacent to the current block.

For example, when the intra prediction mode is signaled using the intra prediction mode list, one of the intra prediction mode candidates may be selected as a predictor of the intra prediction mode, and the difference between the intra prediction mode of the current block and the predictor may be encoded and signaled. In this case, an index indicating the candidate selected as the predictor from the intra prediction mode list may be encoded and signaled to the decoder.

Alternatively, whether a candidate identical to the intra prediction mode of the current block exists may be signaled. For example, a 1-bit flag may indicate whether the list contains a candidate identical to the intra prediction mode of the current block. If such a candidate exists, an index indicating which candidate in the intra prediction mode list is identical to the intra prediction mode of the current block is signaled to the decoder. If no such candidate exists, indices are reassigned to the remaining intra prediction modes, excluding the candidates in the intra prediction mode list, and the index corresponding to the intra prediction mode of the current block may then be signaled.
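The flag-plus-index scheme in the last paragraph can be sketched as a pair of encode and decode functions. The function names and the example candidate list are hypothetical, the candidates are assumed to be distinct, and the entropy coding of the flag and index is left out.

```python
NUM_MODES = 67  # modes 0 to 66, as in Figure 1


def remaining_modes(candidates):
    """Modes not in the candidate list, in ascending order (re-indexed)."""
    return [m for m in range(NUM_MODES) if m not in candidates]


def encode_mode(mode, candidates):
    """Return (flag, index) describing mode relative to the candidate list."""
    if mode in candidates:
        return 1, candidates.index(mode)            # index into the list
    return 0, remaining_modes(candidates).index(mode)  # index among the rest


def decode_mode(flag, index, candidates):
    """Inverse of encode_mode."""
    if flag == 1:
        return candidates[index]
    return remaining_modes(candidates)[index]


candidates = [0, 1, 18]                 # hypothetical candidate list
hit = encode_mode(18, candidates)       # mode is in the list
miss = encode_mode(20, candidates)      # mode is not in the list
```

Removing the candidates before re-indexing shrinks the range of the second index from 67 to 64 values in this example, which is what makes the non-candidate case cheaper to code.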

2. Line-based coding method

The intra prediction described above is applied per block. Consequently, prediction accuracy falls as the distance from the reference samples grows within the block. To address this, a block can be split into lines, and each line can be encoded and decoded before encoding proceeds to the next line.

When split into lines, an NxN block can be divided into N 1xN blocks or N Nx1 blocks. The following figure shows an example. In this figure, the block is assumed to be 4x4 and to be split vertically into four 1x4 blocks. The intra prediction mode used is assumed to be mode 18 of Figure 1.

Figure 5

As shown in (a) of Figure 5, prediction samples P11 to P41 are generated using reference samples A' to D'. Residual values for P11 to P41 are then generated, encoded and decoded to produce reconstructed samples R11 to R41. Next, as shown in (b), R11 to R41 are set as reference samples to generate prediction samples P12 to P42. Residual values for P12 to P42 are then generated, encoded and decoded to produce reconstructed samples R12 to R42. When encoding is performed in this way, the distance between a reference sample and its prediction sample is always 1.
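The line-by-line loop of Figure 5 can be sketched as follows, assuming that mode 18 is a horizontal mode that copies the sample immediately to the left. The function name is hypothetical, and the scalar quantizer with step size `step` is only a stand-in for the actual residual coding and decoding, which the text does not detail.

```python
def code_block_by_columns(block, left_ref, step=1):
    """Code a block one vertical line (column) at a time.

    block    -- original samples, indexed block[y][x]
    left_ref -- reconstructed samples to the left of the block (A' to D')
    step     -- quantizer step of the stand-in residual coder (1 = lossless)
    Returns the reconstructed block.
    """
    height, width = len(block), len(block[0])
    recon = [[0] * width for _ in range(height)]
    ref = list(left_ref)
    for x in range(width):
        for y in range(height):
            pred = ref[y]                          # horizontal prediction
            residual = block[y][x] - pred
            coded = round(residual / step) * step  # stand-in code + decode
            recon[y][x] = pred + coded
        # The line just reconstructed becomes the reference for the next
        # line, so the reference is always at distance 1.
        ref = [recon[y][x] for y in range(height)]
    return recon


block = [[10, 12, 15, 20],
         [11, 13, 16, 21],
         [9, 12, 14, 19],
         [10, 11, 15, 22]]
exact = code_block_by_columns(block, [10, 10, 10, 10], step=1)
coarse = code_block_by_columns(block, [10, 10, 10, 10], step=4)
```

Because each line is predicted from reconstructed rather than original samples, the decoder can repeat the same loop and stay in sync with the encoder even when the residual coding is lossy.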

The intra prediction modes available in the example above are those of Figure 1, but the set of available modes can be adjusted according to the width-to-height ratio of the block. For example, if the height of a split block is greater than its width, as in Figure 5, only the intra prediction modes that point to the left of the block (for example, modes 2 to 34 of Figure 1) may be made available.

Alternatively, a different intra prediction mode may be used for each line. Or the intra prediction mode of a line may be obtained by adding an offset to the intra prediction mode used for the previous line. The following figure shows an example in which the intra prediction mode used for the first line is mode 18 and the offset is -1.

Figure 6

The offset used in Figure 6 may be signaled for each block.
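Deriving the per-line modes from a first-line mode and a block-level offset can be sketched as below. The function name is hypothetical, and the range check is an assumption: the text does not say how a mode that leaves the range 0 to 66 should be handled.

```python
def line_modes(first_mode, offset, num_lines):
    """Mode of each line: the previous line's mode plus a fixed offset."""
    modes = [first_mode + i * offset for i in range(num_lines)]
    # Assumed behavior: reject mode numbers outside the range of Figure 1.
    if any(m < 0 or m > 66 for m in modes):
        raise ValueError("derived mode outside 0..66")
    return modes


modes = line_modes(18, -1, 4)
```

For the values of Figure 6 (first mode 18, offset -1, four lines) this gives modes 18, 17, 16 and 15, so only the first mode and one offset need to be signaled for the block.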

Alternatively, unlike Figure 6, instead of encoding and decoding each line before moving on to the next, only the intra prediction mode may be set differently for each line; all prediction values are then generated for the whole block, and encoding and decoding are performed per block. The following figure shows an example.

Figure 7

Alternatively, instead of signaling an intra prediction mode per block and additionally signaling the offset used for each line, each line may be assigned only an intra prediction mode that exists in the MPM (most probable mode) list. In this case, MPM index information may be signaled for each line.

Alternatively, unlike the examples above, two intra prediction modes may be used for each block, and a weighted sum may be performed with different weights applied to each line. The following figure shows an example of the matrix used when applying the weighted sum.

Figure 8

For example, if the two modes used in Figure 8 are the planar mode and mode 18 of Figure 1, prediction samples are generated for each line using both modes and then combined by a weighted sum to produce the final prediction samples. In Figure 8, the prediction samples P11 to P41 are the result of a 7:1 weighted sum of the samples generated by the planar mode and by mode 18. Similarly, the prediction samples P14 to P44 are the result of a 1:7 weighted sum of the samples generated by the planar mode and by mode 18.
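The per-line weighted sum can be sketched as below for a 4x4 block split into vertical lines. Only the 7:1 weights of the first line and the 1:7 weights of the fourth line come from the text; the 5:3 and 3:5 weights of the middle lines, the rounding rule and the function name are assumptions made for illustration.

```python
# (weight of mode A, weight of mode B) per vertical line; each pair sums to 8.
# The first and last pairs follow the text; the middle two are assumed.
LINE_WEIGHTS = [(7, 1), (5, 3), (3, 5), (1, 7)]


def blend_lines(pred_a, pred_b, weights=LINE_WEIGHTS):
    """Blend two prediction blocks with a different weight pair per column."""
    blended = []
    for row_a, row_b in zip(pred_a, pred_b):
        blended.append([
            (wa * a + wb * b + 4) >> 3     # divide by 8 with rounding
            for a, b, (wa, wb) in zip(row_a, row_b, weights)
        ])
    return blended


pred_planar = [[80] * 4 for _ in range(4)]   # stand-in planar prediction
pred_mode18 = [[0] * 4 for _ in range(4)]    # stand-in mode-18 prediction
final = blend_lines(pred_planar, pred_mode18)
```

With these stand-in inputs each row of `final` moves from mostly planar in the first line to mostly mode 18 in the fourth line.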

Alternatively, a weighted reference sample line may be generated and used for intra prediction. The following figure shows an example.

Figure 9

As shown in Figure 9, when intra prediction is performed on the third line using mode 16, the weighted samples R12' to R42' may be used as the reference samples for the third line instead of R12 to R42. R12' to R42' may be generated, according to the prediction direction, as a weighted sum of the reference samples in the first line and the reference samples in the second line.

When encoding and decoding are performed per block rather than per line, a transform may be applied to the residual values. A 1-D transform may be performed per line, or a 2-D transform may be performed per block. In this case, information indicating which of the 1-D and 2-D transforms was used may be signaled.

Claims

Video signal encoding/decoding method using intra prediction.