KR20090016970A - Method for encoding and decoding video and apparatus thereof - Google Patents

Method for encoding and decoding video and apparatus thereof

Info

Publication number
KR20090016970A
Authority
KR
South Korea
Prior art keywords
zoom
resampling
reference picture
picture
flag
Prior art date
Application number
KR1020070081343A
Other languages
Korean (ko)
Inventor
박승욱 (Seung-wook Park)
전병문 (Byeong-moon Jeon)
Original Assignee
엘지전자 주식회사 (LG Electronics Inc.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 엘지전자 주식회사 (LG Electronics Inc.)
Priority to KR1020070081343A
Publication of KR20090016970A

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/103 Selection of coding mode or of prediction mode
    • H04N19/105 Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
    • H04N19/117 Filters, e.g. for pre-processing or post-processing
    • H04N19/132 Sampling, masking or truncation of coding units, e.g. adaptive resampling, frame skipping, frame interpolation or high-frequency transform coefficient masking
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, the unit being an image region, e.g. an object
    • H04N19/172 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, the region being a picture, frame or field
    • H04N19/18 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, the unit being a set of transform coefficients
    • H04N19/189 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding
    • H04N19/196 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, being specially adapted for the computation of encoding parameters, e.g. by averaging previously computed encoding parameters
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51 Motion estimation or motion compensation
    • H04N19/513 Processing of motion vectors

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computing Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The present invention relates to a video encoding and decoding method and an apparatus therefor. When video data is zoomed in or out, motion compensation is performed using the zoom-in/zoom-out information, enabling efficient encoding and decoding of zoomed video data.

Description

Method for encoding and decoding video and apparatus thereof

The present invention relates to a video encoding and decoding method and apparatus, and more particularly, to a video encoding and decoding method and apparatus that perform motion compensation with zoom-in/zoom-out correction when video data is zoomed in or out.

Video data is generally processed and transmitted in the form of bitstreams. Typical video encoding and decoding methods achieve high compression efficiency by forming a predictive picture for the current picture from a reference picture and encoding only the difference between the current picture and the predictive picture. The more strongly the predictive picture correlates with the current picture, the fewer bits are needed to encode the difference, and the higher the compression efficiency; it is therefore desirable to form the best possible predictive picture. In many video compression standards, including Moving Picture Experts Group (MPEG)-1, MPEG-2, and MPEG-4, motion-estimated versions of previous reference pictures are used as predictive pictures for the current picture. With single-picture prediction ("P" pictures), the reference picture is not scaled when forming the predictive picture; with bi-directional picture prediction ("B" pictures), two different reference pictures are averaged with the same weighting factor to form a single predictive picture.

An object of the present invention is to provide a video encoding and decoding method and apparatus that achieve efficient encoding and decoding by performing motion compensation with zoom-in/zoom-out correction when video data is zoomed in or out.

According to an aspect of the present invention, a video encoding method includes: obtaining zoom-in/zoom-out information of a current picture based on at least one reference picture; obtaining at least one resampled reference picture by downsampling or upsampling the reference picture according to the information; obtaining a motion vector of the current picture based on the resampled reference picture; and obtaining a predictive picture by motion-compensating the resampled reference picture based on the motion vector.

Meanwhile, a video decoding method according to the present invention includes: receiving a bitstream and extracting zoom-in/zoom-out information, a motion vector, and a residual signal; decoding the zoom-in/zoom-out information, the motion vector, and the residual signal; obtaining at least one resampled reference picture by downsampling or upsampling a reference picture based on the decoded information; and obtaining a predictive picture by motion-compensating the resampled reference picture based on the decoded motion vector.

Meanwhile, a video encoding method according to the present invention includes, in the encoding process: signaling a resampling flag for zoom-in/zoom-out correction at a first syntax level in the video sequence; and, when the resampling flag at the first syntax level indicates that resampling is enabled for at least one reference picture corresponding to the current picture in the video sequence, signaling at least one resampling ratio for zoom-in/zoom-out correction of the current picture at a second syntax level lower than the first syntax level. The zoom-in/zoom-out correction may resample the at least one reference picture to generate a resampled reference picture based on the at least one resampling ratio.
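
The two-level signaling described above can be illustrated with a minimal sketch. This is not the patent's actual bitstream syntax; the field names (resampling_flag, resampling_ratio) and the dictionary-based "headers" are assumptions for illustration only.

```python
# Hypothetical sketch of two-level signaling: a sequence-level (first syntax
# level) flag, and a per-picture (second syntax level) resampling ratio that
# is present only when the sequence-level flag enables resampling.

def write_sequence_header(resampling_flag):
    """First (higher) syntax level: one flag for the whole sequence."""
    return {"resampling_flag": resampling_flag}

def write_picture_header(seq_header, resampling_ratio):
    """Second (lower) syntax level: the per-picture ratio is signaled only
    when the sequence-level flag is set."""
    hdr = {}
    if seq_header["resampling_flag"]:
        # ratio > 1 -> the reference picture will be upsampled (zoom in)
        # ratio < 1 -> the reference picture will be downsampled (zoom out)
        hdr["resampling_ratio"] = resampling_ratio
    return hdr

seq = write_sequence_header(True)
pic = write_picture_header(seq, 2.0)   # ratio carried at the picture level
```

When the sequence-level flag is off, no per-picture ratio is transmitted at all, which is the bit-saving point of placing the flag at the higher syntax level.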

Meanwhile, a video decoding method according to the present invention includes, in the decoding process: receiving and processing a resampling flag for zoom-in/zoom-out correction at a first syntax level in the video sequence; and, when the resampling flag at the first syntax level indicates that resampling is enabled for at least one reference picture corresponding to the current picture in the video sequence, receiving and processing at least one resampling ratio for zoom-in/zoom-out correction of the current picture at a second syntax level lower than the first syntax level. The zoom-in/zoom-out correction may resample the at least one reference picture to generate a resampled reference picture based on the at least one resampling ratio.

Meanwhile, a video encoding apparatus according to the present invention may include: a storage unit in which a reference picture is stored; a zoom-in/zoom-out correction unit that obtains zoom-in/zoom-out information of a current picture based on at least one stored reference picture and downsamples or upsamples the reference picture according to the information to generate at least one resampled reference picture; a motion estimator that obtains a motion vector of the current picture based on the resampled reference picture; and a motion compensator that obtains a predictive picture by motion-compensating the resampled reference picture based on the motion vector.

Meanwhile, a video decoding apparatus according to the present invention may include: a decoder that receives a bitstream and extracts and decodes zoom-in/zoom-out information, a motion vector, and a residual signal; a storage unit in which a reference picture is stored; a zoom-in/zoom-out correction unit that generates at least one resampled reference picture by downsampling or upsampling the reference picture based on the decoded zoom-in/zoom-out information; and a motion compensator that obtains a predictive picture by motion-compensating the resampled reference picture based on the decoded motion vector.

In order to achieve the above object, the present invention can provide a processor-readable recording medium having recorded thereon a program for executing the video encoding method in a processor.

In order to achieve the above object, the present invention can provide a processor-readable recording medium having recorded thereon a program for executing the video decoding method in a processor.

The video encoding and decoding method and apparatus according to the present invention perform motion compensation with zoom-in/zoom-out correction when video data is zoomed in or out, thereby enabling efficient encoding and decoding.

Hereinafter, the present invention will be described in more detail with reference to the drawings.

FIG. 1 is a diagram comparing a reference picture and a current picture during zoom in/zoom out.

Referring to the drawings, FIG. 1A shows an example of a reference picture and FIG. 1B an example of a current picture. Comparing the two, it can be seen that the current picture is zoomed in relative to the reference picture: block A of FIG. 1B is larger than block B of FIG. 1A.

Many video compression standards, including Moving Picture Experts Group (MPEG)-1, MPEG-2, and MPEG-4, perform motion prediction and compensation on a block-by-block basis. However, in a zoom-in/zoom-out situation such as that of FIG. 1, a conventional video encoding method that neither scales the reference picture nor applies weighted prediction simply performs motion estimation and compensation block by block. The difference between the prediction block and the current block therefore grows, the residual data increases, and coding efficiency suffers.

FIG. 2 is a diagram comparing the current picture with a reference picture resampled from the reference picture of FIG. 1. In the zoom-in situation of FIG. 1, the reference picture is resampled, specifically upsampled, so that block A of FIG. 2B and block B of FIG. 2A have the same size. As a result of upsampling the reference picture, the entire upsampled reference picture is larger than the current picture.

FIG. 3 is a flowchart illustrating an embodiment of a video encoding method of the present invention. Referring to the drawing, zoom-in/zoom-out information of the current picture is first obtained based on at least one reference picture (S300). The zoom-in/zoom-out information may include whether the current picture is zoomed in or out relative to the reference picture and, if so, the resampling ratio of the reference picture.

Next, at least one resampled reference picture is obtained by resampling the reference picture according to the zoom-in/zoom-out information (S310). Resampling means upsampling or downsampling: if the resampling ratio is greater than 1, the reference picture is upsampled; if it is less than 1, it is downsampled. An upsampled reference picture consists of the original pixels of the reference picture combined with the pixels generated by upsampling, whereas a downsampled reference picture consists only of the pixels generated by downsampling.
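
Step S310 can be sketched on a single row of pixels. This toy example uses plain linear interpolation in place of the patent's phase-dependent filter tables (described with FIGS. 5 to 8 below), so the filtering itself is an assumption; only the ratio-driven choice between upsampling and downsampling follows the text.

```python
# Toy illustration of S310: resample a 1-D row of pixels by a ratio.
# ratio > 1 -> upsampling (zoom in), ratio < 1 -> downsampling (zoom out).

def resample_row(row, ratio):
    """Return the row resampled by `ratio` using linear interpolation."""
    n_out = max(1, round(len(row) * ratio))
    out = []
    for i in range(n_out):
        pos = i / ratio                       # position on the source grid
        left = min(int(pos), len(row) - 1)    # nearest original pixel to the left
        right = min(left + 1, len(row) - 1)
        frac = pos - left                     # fractional offset (the "phase")
        out.append(round((1 - frac) * row[left] + frac * row[right]))
    return out

row = [10, 20, 30, 40]
up = resample_row(row, 2.0)    # 8 pixels: upsampled (zoom-in) reference
down = resample_row(row, 0.5)  # 2 pixels: downsampled (zoom-out) reference
```

Note how the upsampled row interleaves the original pixels with newly generated ones, while the downsampled row retains only generated samples, matching the distinction drawn in the paragraph above.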

Next, a motion vector of the current picture is obtained based on the resampled reference picture (S320). For each block of the current picture, the most similar block in the resampled reference picture is found, and the motion vector is calculated accordingly.

Next, a predictive picture is obtained by motion-compensating the resampled reference picture based on the motion vector (S330). The motion-compensated resampled reference picture is called the predictive picture.
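
Steps S320 and S330 can be sketched together on 1-D "pictures". The full-search block matching by sum of absolute differences (SAD) below is a common technique assumed for illustration; the patent does not prescribe a particular matching criterion.

```python
# Sketch of S320 (motion estimation) and S330 (motion compensation) in 1-D.

def best_match(ref, cur_block, search_range):
    """S320: offset in the resampled reference whose block best matches
    cur_block, by minimum sum of absolute differences."""
    n = len(cur_block)
    best_off, best_sad = 0, float("inf")
    for off in range(0, min(search_range, len(ref) - n) + 1):
        sad = sum(abs(r - c) for r, c in zip(ref[off:off + n], cur_block))
        if sad < best_sad:
            best_off, best_sad = off, sad
    return best_off

def motion_compensate(ref, offset, n):
    """S330: the prediction block is the matched region of the reference."""
    return ref[offset:offset + n]

ref = [0, 0, 10, 20, 30, 0, 0]       # resampled reference row
cur = [10, 20, 30]                   # current block
mv = best_match(ref, cur, 4)         # 1-D motion vector (offset)
pred = motion_compensate(ref, mv, 3) # prediction block
```

Because matching is done against the resampled reference, a pure zoom produces small residuals here, which is exactly the efficiency gain the method targets.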

Although not shown in the figure, after the predictive picture is obtained, a step of encoding the difference between the current picture and the predictive picture may be further performed. The difference may be referred to as a residual picture or residual signal, and the encoding step may be performed in units of blocks. Specifically, the encoding may include transforming the residual signal into the frequency domain, quantizing the transformed residual signal, and entropy-encoding the quantized, frequency-transformed residual signal.

Also, although not shown in the figure, a step of encoding the motion vector may be further performed; in detail, the motion vector may be entropy encoded. Likewise, a step of encoding the zoom-in/zoom-out information may be further performed, and the zoom-in/zoom-out information may be entropy encoded. Meanwhile, each of the steps illustrated in FIG. 3, as well as the residual-signal, motion-vector, and zoom-in/zoom-out information encoding steps, may be performed in various block units such as 16×16, 8×8, 8×4, 4×8, and 4×4.

FIG. 4 is a flowchart illustrating an example of obtaining the resampled reference picture of FIG. 3. FIG. 5 illustrates upsampling of a pixel, FIG. 6 the filter coefficients used in the upsampling of FIG. 5, FIG. 7 downsampling of a pixel, and FIG. 8 the filter coefficients used in the downsampling of FIG. 7.

Referring to the drawings, in obtaining the resampled reference picture of FIG. 4, the position information of each pixel to be generated is first calculated from the resampling ratio included in the zoom-in/zoom-out information (S400). It is assumed that the current picture is zoomed in or out relative to the reference picture, and the resampling ratio may be derived by comparing the reference picture with the current picture. Once the resampling ratio is determined, the position of each generated pixel is expressed relative to the original pixels of the reference picture. In FIG. 5 (upsampling) and FIG. 7 (downsampling), this position information is quantized into 16 phases, from 0 to 15. For example, when the resampling ratio is 2, the phase is 8 and the generated pixel lies midway between two original pixels. FIG. 5 uses 5 as the position information (phase), and FIG. 7 uses 8.
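
Step S400 can be sketched directly from the description above: for each generated pixel, take its position on the original pixel grid and quantize the fractional offset into one of the 16 phases. The rounding convention is an assumption; the patent only fixes the 16-level grid.

```python
# Sketch of S400: phase (position information, 0..15) of each generated pixel.

def pixel_phase(out_index, ratio):
    """Quantize the generated pixel's fractional grid offset into 16 phases."""
    pos = out_index / ratio          # position on the original pixel grid
    frac = pos - int(pos)            # fractional offset from the left original pixel
    return int(round(frac * 16)) % 16

# Ratio 2 (zoom in by 2x): every other output pixel sits midway between
# original pixels, giving the phase-8 case described in the text.
phases = [pixel_phase(i, 2.0) for i in range(4)]
```

The phase alone then indexes the coefficient tables of FIG. 6 or FIG. 8 in the next step, so the filter need not be recomputed per pixel.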

Next, predetermined filter coefficients are selected based on the position information (S410). For upsampling, FIG. 6 shows four filter coefficients determined by the position information; for downsampling, FIG. 8 shows twelve filter coefficients determined by the position information.

Next, the value of each resampled pixel is calculated from the selected filter coefficients and the pixel values in the reference picture (S420). During upsampling, filtering is performed according to the formula shown in FIG. 5 (B' = a*A + b*B + c*C + d*D), and during downsampling according to the formula shown in FIG. 7 (F' = a*A + b*B + ... + h*H + ... + l*L), finally yielding the generated pixel value.
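
The 4-tap upsampling formula of FIG. 5 transcribes directly to code. The coefficient values used below are an illustrative assumption (a bilinear half-pixel filter); the actual phase-dependent tables of FIG. 6 are not reproduced in this text.

```python
# Direct transcription of FIG. 5's formula: B' = a*A + b*B + c*C + d*D.
# Coefficients (a, b, c, d) come from a phase-indexed table; the bilinear
# pair (0, 1/2, 1/2, 0) used here is only an illustrative half-pixel choice.

def filter_4tap(coeffs, pixels):
    """Weighted sum of four neighboring reference pixels."""
    a, b, c, d = coeffs
    A, B, C, D = pixels
    return a * A + b * B + c * C + d * D

value = filter_4tap((0, 0.5, 0.5, 0), (10, 20, 30, 40))
```

The 12-tap downsampling formula of FIG. 7 has the same shape with twelve coefficient/pixel pairs; the longer filter acts as a low-pass filter to avoid aliasing when pixels are discarded.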

Meanwhile, as described above, during upsampling the final pixels consist of the original pixels of the reference picture together with the pixels generated according to the equation of FIG. 5, and an upsampled reference picture containing these pixels is generated. Such upsampling increases the number of pixels and thus the size of the resampled picture (block).

In downsampling, the final pixels consist only of the pixels generated according to the equation of FIG. 7. Such downsampling reduces the number of pixels and thus the size of the resampled picture (block).

FIG. 9 is a flowchart illustrating an embodiment of a video decoding method according to the present invention.

Referring to the drawings, a bitstream is first received, and the zoom-in/zoom-out information, motion vector, and residual signal contained in it are extracted (S800). The zoom-in/zoom-out information may indicate whether zoom in/zoom out occurred and, if so, the resampling ratio.

Next, the zoom-in/zoom-out information, the motion vector, and the residual signal are decoded (S810). Each extracted element is decoded if it was encoded; if not, no decoding is needed. The decoding step is the inverse of the encoding method described above and includes entropy decoding, inverse quantization, and inverse transformation.

Next, a resampled reference picture is obtained by resampling the reference picture according to the zoom-in/zoom-out information (S820). When zoom in/zoom out occurred, the resampled reference picture is generated by resampling the reference picture at the resampling ratio included in the decoded zoom-in/zoom-out information.

Next, a predictive picture is obtained based on the decoded motion vector (S830). The resampled reference picture is motion-compensated using the decoded motion vector, and the motion-compensated result is the predictive picture.

Next, the decoded picture is obtained by adding the predictive picture and the decoded residual signal (S840).
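
Step S840 is a simple pixel-wise addition. The clipping to the valid sample range shown below is a standard decoder detail assumed here, not something the text specifies.

```python
# Sketch of S840: decoded picture = prediction + decoded residual,
# clipped to the valid 8-bit pixel range (clipping is an assumed detail).

def reconstruct(pred, residual, bit_depth=8):
    lo, hi = 0, (1 << bit_depth) - 1
    return [min(hi, max(lo, p + r)) for p, r in zip(pred, residual)]

decoded = reconstruct([100, 250, 5], [10, 10, -10])
```

The reconstructed picture can then be stored as a reference picture for decoding subsequent pictures, as the apparatus description below explains.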

FIG. 10 is a block diagram showing an embodiment of a video encoding apparatus of the present invention.

Referring to the drawings, the video encoding apparatus of the present invention may include a storage unit 900, a zoom-in/zoom-out correction unit 910, a motion estimator 920, a motion compensator 930, an inverse quantizer 970, an inverse transform unit 980, and an encoder 990. The encoder 990 may include a transform unit 940, a quantizer 950, and an entropy encoder 960.

The storage unit 900 stores the reconstructed (decoded) reference picture for use in predicting a subsequent picture.

The zoom-in/zoom-out correction unit 910 obtains zoom-in/zoom-out information of the current picture using at least one reference picture stored in the storage unit 900, and downsamples or upsamples the reference picture according to that information to generate at least one resampled reference picture. The zoom-in/zoom-out information may include whether the current picture is zoomed in or out, the reference picture serving as the basis for the zoom, and the resampling ratio used to downsample or upsample that reference picture. When the current picture is zoomed in or out, a resampled reference picture is generated by resampling the corresponding reference picture at the resampling ratio.

The motion estimator 920 compares the resampled reference picture with the current picture to estimate the motion of the current picture and calculate a motion vector. That is, the motion of a given block in the current picture is estimated with reference to blocks in the resampled reference picture, yielding the motion vector.

The motion compensator 930 calculates a predictive picture by motion-compensating the resampled reference picture based on the motion vector. The difference between the predictive picture and the current picture is called the residual picture or residual signal.

Meanwhile, the encoder 990 encodes the residual signal. The encoder 990 may include a transformer 940, a quantizer 950, and an entropy encoder 960.

First, the transform unit 940 converts video data in the spatial domain into frequency-domain (spectral) data. When motion estimation and compensation are performed, it is the residual signal that is transformed into the frequency domain. The transform unit 940 may use various techniques such as the discrete cosine transform (DCT), wavelet transform, Hadamard transform, or integer DCT.

Next, the quantizer 950 quantizes the transformed frequency domain data. When motion estimation and compensation are performed, the transformed residual signal is quantized. Quantization may be performed by various techniques such as scalar quantization, vector quantization, adaptive quantization, and non-adaptive quantization.

Next, the entropy encoder 960 compresses the output of the quantizer 950 as well as other additional information (motion vectors, zoom-in/zoom-out information). Entropy coding may be implemented with techniques such as Huffman coding, run-length coding, LZ coding, dictionary coding, exponential Golomb coding (Exp-Golomb), context-adaptive binary arithmetic coding (CABAC), and context-adaptive variable-length coding (CAVLC). The entropy encoder 960 may use different coding techniques for different kinds of information (e.g., DC coefficients, AC coefficients, and other kinds of additional information), and may select from among a plurality of code tables within a specific coding technique.

Meanwhile, for motion estimation and compensation of subsequent pictures, inverse quantization and inverse transformation are performed by the inverse quantizer 970 and the inverse transform unit 980. After the inverse transform, the reconstructed picture is stored in the storage unit 900 as a reference picture. Although not shown in the figure, a loop filter unit may be disposed after the inverse transform unit 980 to adaptively smooth discontinuities at block boundaries of the reconstructed picture. These operations may be performed in various block units such as 16×16, 8×8, 8×4, 4×8, and 4×4.

FIG. 11 is a block diagram illustrating an example of the zoom-in/zoom-out correction unit of FIG. 10.

Referring to the drawings, the zoom-in/zoom-out correction unit 910 may include a position information calculator 1000, a filter coefficient selector 1010, and a resampling filter unit 1020.

The position information calculator 1000 calculates the position information of each pixel to be generated by resampling, based on the resampling ratio included in the zoom-in/zoom-out information and the original pixels of the reference picture. As in FIG. 5 and FIG. 7, the positions between original pixels are quantized into 16 phases, from 0 to 15. For example, when the resampling ratio is 2, the generated pixel lies at the center between original pixels, corresponding to phase 8. FIG. 5 uses 5 as the position information (phase), and FIG. 7 uses 8.

The filter coefficient selector 1010 selects predetermined filter coefficients based on the position information. For upsampling, with a resampling ratio greater than 1, four filter coefficients are selected according to the position information, as shown in FIG. 6. For downsampling, with a resampling ratio less than 1, twelve filter coefficients are selected according to the position information, as shown in FIG. 8.

The resampling filter unit 1020 calculates each resampled pixel value from the selected filter coefficients and the values of the original pixels in the reference picture. That is, a new generated pixel is produced according to the equation of FIG. 5 (B' = a*A + b*B + c*C + d*D) or the equation of FIG. 7 (F' = a*A + b*B + ... + l*L). When a reference picture is upsampled, the final pixels consist of the original pixels and the generated pixels, yielding an upsampled reference picture. When a reference picture is downsampled, the final pixels consist of the generated pixels, yielding a downsampled reference picture.

FIG. 12 is a block diagram showing an embodiment of a video decoding apparatus of the present invention.

Referring to the drawings, the video decoding apparatus of the present invention may include a decoder 1100, a storage 1130, a zoom in / zoom out corrector 1140, and a motion compensator 1150.

The decoder 1100 includes an entropy decoder 1105, an inverse quantizer 1115, and an inverse transform unit 1125. The entropy decoder 1105 decodes entropy-encoded data by inverting the entropy encoding performed by the encoding apparatus; that is, it decodes entropy-encoded data such as the residual signal and the additional information (motion vectors, zoom-in/zoom-out information). As described for the entropy encoder 960, entropy decoding may be implemented with various techniques, may use different decoding techniques for different kinds of information (e.g., DC coefficients, AC coefficients, and other kinds of additional information), and may select from a plurality of code tables within a particular decoding technique. The zoom-in/zoom-out information may include whether the current picture is zoomed in or out, the reference picture serving as the basis for the zoom, and the resampling ratio used to downsample or upsample the reference picture.

The inverse quantizer 1115 dequantizes the entropy-decoded data; in other words, the entropy-decoded residual signal is inversely quantized. Inverse quantization may be performed by various techniques such as scalar, vector, adaptive, or non-adaptive inverse quantization.

The inverse transform unit 1125 converts the inversely quantized frequency-domain data back into spatial-domain video data; that is, the inversely quantized residual signal is inverse-transformed to recover the residual signal, using the inverse of the various transform techniques described above.

The storage unit 1130 stores a previously reconstructed picture for use as a reference picture.

The zoom in / zoom out corrector 1140 may generate at least one resampled reference picture by downsampling or upsampling a reference picture, based on the at least one reference picture stored in the storage 1130 and the decoded zoom in / zoom out information. When the current picture is zoomed in or zoomed out, the resampled reference picture is generated by resampling the corresponding reference picture using the resampling ratio.

As illustrated in FIG. 11, the zoom in / zoom out corrector 1140 includes a position information calculator 1000, a filter coefficient selector 1010, and a resampling filter unit 1020. The position information calculator 1000 calculates the position information of a resampling pixel to be generated, based on the resampling ratio included in the zoom in / zoom out information; the filter coefficient selector 1010 selects filter coefficients based on the phase of the position information; and the resampling filter unit 1020 calculates the resampling pixel value based on the selected filter coefficients and the pixel values in the reference picture. When the resampling ratio is greater than 1, the zoom in / zoom out corrector 1140 generates the resampled reference picture based on a combination of the pixels in the reference picture and the resampling pixels generated by upsampling; when the resampling ratio is less than 1, the resampled reference picture is generated based on the resampling pixels generated by downsampling.
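The three steps above (position calculation, phase-dependent coefficient selection, and filtering) can be sketched for a single row of pixels. The two-tap linear-interpolation filter below is an assumption for illustration; the patent does not fix the actual filter length or coefficient values.

```python
def resample_row(pixels, ratio):
    """Resample one row of pixels by `ratio` (>1 upsamples, <1 downsamples).
    For each output pixel: compute its position in the reference row, take
    the fractional part as the phase, and use (1-phase, phase) as the
    two-tap filter coefficients applied to the neighboring pixels."""
    n_out = round(len(pixels) * ratio)
    out = []
    for j in range(n_out):
        pos = j / ratio                      # position in the reference row
        i = int(pos)
        phase = pos - i                      # selects the filter coefficients
        i2 = min(i + 1, len(pixels) - 1)     # clamp at the right picture edge
        out.append((1.0 - phase) * pixels[i] + phase * pixels[i2])
    return out
```

With ratio 2.0 the output interleaves original and interpolated pixels (the "combination" case described above); with ratio 0.5 every output pixel is a resampled value.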

The motion compensator 1150 obtains a predictive picture by motion-compensating the resampled reference picture based on the decoded motion vector.

Meanwhile, the video decoding apparatus according to an embodiment of the present invention may further include a synthesizer configured to generate a decoded picture by synthesizing the predictive picture and the decoded residual signal. The decoded picture generated by the synthesizer is stored in the storage 1130. Although not shown in the drawing, a loop filter unit may be disposed between the synthesizer and the storage 1130 in order to adaptively smooth discontinuities between the blocks of the decoded picture. Meanwhile, these operations may be performed in various block units, such as 16×16, 8×8, 8×4, 4×8, and 4×4.
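The synthesis step can be sketched as adding the residual to the prediction and clipping each sample to the valid range. The function name and the nested-list block representation are illustrative assumptions.

```python
def reconstruct_block(pred, resid, bit_depth=8):
    """Synthesize a decoded block: add the decoded residual to the
    predictive block and clip each sample to [0, 2^bit_depth - 1]."""
    max_val = (1 << bit_depth) - 1
    return [[min(max(p + r, 0), max_val) for p, r in zip(prow, rrow)]
            for prow, rrow in zip(pred, resid)]
```

Clipping matters because the residual can push a predicted sample outside the 8-bit range (e.g., 250 + 10 saturates at 255 rather than wrapping).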

FIG. 13 is a diagram illustrating information related to zoom in / zoom out in a bitstream, and FIG. 14 is a diagram illustrating offsets for partially cropping an upsampled reference picture.

Referring to the figure, in the encoding process, a resampling flag (a) for zoom in / zoom out correction is first signaled at a first syntax level 1300. The resampling flag (a) may be defined as "use_seq_resampled_ref_flag", which indicates whether the video sequence uses a resampled reference picture in inter prediction. The first syntax level 1300 may be a sequence level; in particular, it may be the "sequence parameter set" of H.264.

Next, if the resampling flag (a) indicates that resampling is active for at least one reference picture corresponding to the current picture in the video sequence (use_seq_resampled_ref_flag = 1), at least one resampling ratio (c) for zoom in / zoom out correction of the current picture is signaled at a second syntax level 1310 lower than the first syntax level 1300 in the video sequence. The resampling ratio (c) may be defined as "resampling_ratio[i]", indicating the resampling ratio between the i-th reference picture and the current picture. The second syntax level 1310 may be a slice level, in particular a slice header. The zoom in / zoom out correction resamples the at least one reference picture based on the at least one resampling ratio (c) to generate a resampled reference picture, and motion estimation and compensation are performed based on the generated resampled reference picture.

Meanwhile, when the resampling flag (a) indicates that resampling is active (use_seq_resampled_ref_flag = 1), a second resampling flag (b) for zoom in / zoom out correction is signaled at the second syntax level 1310, lower than the first syntax level 1300, in the video sequence. The second resampling flag (b) may be defined as "use_slice_resampled_ref_flag", which indicates whether the current slice uses the resampled reference picture in inter prediction. The resampling ratio (c) may be signaled when both the resampling flag (a) and the second resampling flag (b) are activated (use_seq_resampled_ref_flag = 1, use_slice_resampled_ref_flag = 1).

Meanwhile, when the resampling flag (a) and the second resampling flag (b) are activated (use_seq_resampled_ref_flag = 1 and use_slice_resampled_ref_flag = 1), a third resampling flag (e) for zoom in / zoom out correction may be signaled at a third syntax level 1320 lower than the second syntax level 1310 in the video sequence. The third resampling flag (e) may be defined as "use_block_resampled_ref_flag", which indicates whether the current block uses a resampled reference picture in inter prediction. The third syntax level 1320 may be a macroblock or block level.
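The encoder-side hierarchy described above (sequence flag, then slice flag and ratios, then block flags, each conditional on the enclosing flag) can be sketched as follows. Emitting syntax elements as `(name, value)` pairs rather than actual entropy-coded bits, and the `slices` input shape, are assumptions for illustration.

```python
def signal_zoom_syntax(use_seq_flag, slices):
    """Emit the three-level flag hierarchy as ordered (name, value) pairs:
    the sequence-level flag always; per-slice flags, resampling ratios,
    and per-block flags only while every enclosing flag is set."""
    elems = [("use_seq_resampled_ref_flag", int(use_seq_flag))]
    if use_seq_flag:
        for sl in slices:
            elems.append(("use_slice_resampled_ref_flag", int(sl["use"])))
            if sl["use"]:
                for i, ratio in enumerate(sl["ratios"]):
                    elems.append((f"resampling_ratio[{i}]", ratio))
                for block_uses in sl["blocks"]:
                    elems.append(("use_block_resampled_ref_flag", int(block_uses)))
    return elems
```

The design choice here is pure overhead reduction: when the sequence-level flag is 0, no per-slice or per-block syntax is spent at all.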

Meanwhile, at the second syntax level 1310, offset data (d) for cropping at least a portion of the upsampled or downsampled reference picture may be further signaled. The offset data (d) may be defined as "left_offset", "right_offset", "top_offset", and "bottom_offset". FIG. 14A shows each offset ("left_offset", "right_offset", "top_offset", "bottom_offset") for cropping an upsampled reference picture relative to the current picture. Using the offsets shown, the upsampled reference picture is cropped to the same size as the current picture of FIG. 14B.
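The cropping with the four signaled offsets can be sketched as below; the function name and the list-of-rows picture representation are assumptions for illustration.

```python
def crop_resampled(picture, left, right, top, bottom):
    """Crop an upsampled reference picture (a list of pixel rows) by the
    four signaled offsets so it matches the current picture's dimensions."""
    height, width = len(picture), len(picture[0])
    return [row[left:width - right] for row in picture[top:height - bottom]]
```

For example, cropping a 4×4 upsampled picture by one pixel on every side yields the 2×2 interior, which would then be the reference actually used for motion compensation.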

Meanwhile, in the decoding process, the bitstream shown in FIG. 13 is received. First, a resampling flag (a) for zoom in / zoom out correction is received and processed at the first syntax level 1300 in the video sequence. Then, when the resampling flag (a) at the first syntax level 1300 indicates that resampling is activated for at least one reference picture corresponding to the current picture in the video sequence, at least one resampling ratio (c) for zoom in / zoom out correction of the current picture is received and processed at the second syntax level 1310, lower than the first syntax level 1300, in the video sequence. The zoom in / zoom out correction resamples the at least one reference picture based on the at least one resampling ratio (c) to generate a resampled reference picture.

Meanwhile, when the resampling flag (a) indicates that resampling is active, a second resampling flag (b) for zoom in / zoom out correction may be received and further processed at the second syntax level 1310 lower than the first syntax level 1300 in the video sequence. The resampling ratio (c) may be processed when both the resampling flag (a) and the second resampling flag (b) are activated.

In addition, when the resampling flag (a) and the second resampling flag (b) are activated, a third resampling flag (e) for zoom in / zoom out correction may be received and further processed at a third syntax level 1320 lower than the second syntax level 1310 in the video sequence.
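The decoder-side counterpart of this conditional hierarchy can be sketched as follows: lower-level elements are read only when the enclosing flags were set, mirroring the encoder. Reading plain values from a flat list instead of entropy-decoded bits, and the element counts passed as parameters, are assumptions for illustration.

```python
def parse_zoom_syntax(stream, num_refs, num_blocks):
    """Read the conditional flag hierarchy from a flat list of values in
    bitstream order: the sequence flag always; the slice flag, resampling
    ratios, and block flags only when the enclosing flags are set."""
    pos = 0
    def read():
        nonlocal pos
        value = stream[pos]
        pos += 1
        return value
    info = {"use_seq_resampled_ref_flag": read()}
    if info["use_seq_resampled_ref_flag"]:
        info["use_slice_resampled_ref_flag"] = read()
        if info["use_slice_resampled_ref_flag"]:
            info["resampling_ratio"] = [read() for _ in range(num_refs)]
            info["use_block_resampled_ref_flag"] = [read() for _ in range(num_blocks)]
    return info
```

Because presence is implied by the flags, encoder and decoder must apply exactly the same conditions, or the two sides lose bitstream synchronization.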

In addition, at the second syntax level 1310, offset data (d) for cropping at least a portion of the resampled reference picture may be received and further processed. The offset data are as shown in FIG. 14.

As described above, the first to third syntax levels may be a sequence level, a slice level, and a macro block or a block level, respectively.

Meanwhile, the video encoding method or the video decoding method of the present invention may be embodied as a processor readable code on a processor readable recording medium included in the video encoding apparatus or the video decoding apparatus. The processor-readable recording medium includes all kinds of recording devices that store data that can be read by the processor. Examples of the processor-readable recording medium include ROM, RAM, CD-ROM, magnetic tape, floppy disk, optical data storage device, and the like, and also include a carrier wave such as transmission through the Internet. The processor-readable recording medium can also be distributed over network coupled computer systems so that the processor-readable code is stored and executed in a distributed fashion.

While preferred embodiments of the present invention have been shown and described, the present invention is not limited to the specific embodiments described above. Various modifications can be made by those skilled in the art without departing from the scope of the present invention as claimed in the claims, and such modifications should not be understood separately from the technical idea or prospect of the present invention.

FIG. 1 is a diagram comparing a reference picture and a current picture during zoom in / zoom out.

FIG. 2 is a diagram comparing a resampled reference picture, obtained by resampling the reference picture of FIG. 1, with the current picture.

FIG. 3 is a flowchart illustrating an embodiment of a video encoding method of the present invention.

FIG. 4 is a flowchart illustrating an example of obtaining the resampled reference picture of FIG. 3.

FIG. 5 is a diagram illustrating upsampling for a pixel.

FIG. 6 is a diagram illustrating filter coefficients used in the upsampling of FIG. 5.

FIG. 7 is a diagram illustrating downsampling for a pixel.

FIG. 8 is a diagram illustrating filter coefficients used in the downsampling of FIG. 7.

FIG. 9 is a flowchart illustrating an embodiment of a video decoding method of the present invention.

FIG. 10 is a block diagram showing an embodiment of a video encoding apparatus of the present invention.

FIG. 11 is a block diagram illustrating an example of the zoom in / zoom out corrector of FIG. 10.

FIG. 12 is a block diagram showing an embodiment of a video decoding apparatus of the present invention.

FIG. 13 is a diagram illustrating information related to zoom in / zoom out in a bitstream.

FIG. 14 is a diagram illustrating offsets for partially cropping an upsampled reference picture.

<Explanation of symbols for the main parts of the drawings>

900: storage unit 910: zoom in / zoom out correction unit

920: the motion estimation unit 930: the motion compensation unit

1000: location information calculation unit 1010: filter coefficient selection unit

1020: resampling filter unit

Claims (14)

1. A video decoding method comprising: obtaining at least one resampled reference picture by performing one of downsampling and upsampling on a reference picture based on zoom in / zoom out information extracted from a received bitstream; obtaining a predictive picture by motion-compensating the resampled reference picture; and obtaining a decoded picture by adding the predictive picture and a decoded residual signal.

2. The method of claim 1, wherein obtaining the resampled reference picture comprises: calculating position information of a resampling pixel to be generated, based on a resampling ratio included in the zoom in / zoom out information; selecting predetermined filter coefficients based on the position information; and calculating the resampling pixel value based on the selected filter coefficients and pixel values in the reference picture.

3. The method of claim 2, wherein the resampled reference picture is: generated based on a combination of pixels in the reference picture and resampling pixels generated by the upsampling when the resampling ratio is greater than 1; and generated based on the resampling pixels generated by the downsampling when the resampling ratio is less than 1.

4. A video encoding method comprising: obtaining zoom in / zoom out information of a current picture based on at least one reference picture; obtaining at least one resampled reference picture by downsampling or upsampling the reference picture according to the information; obtaining a motion vector of the current picture based on the resampled reference picture; and obtaining a predictive picture by motion-compensating the resampled reference picture based on the motion vector.

5. The method of claim 4, wherein obtaining the resampled reference picture comprises: calculating position information of a resampling pixel to be generated, based on a resampling ratio included in the information; selecting predetermined filter coefficients based on the position information; and calculating the resampling pixel value based on the selected filter coefficients and pixel values in the reference picture.

6. The method of claim 5, wherein the resampled reference picture is: generated based on a combination of pixels in the reference picture and resampling pixels generated by the upsampling when the resampling ratio is greater than 1; and generated based on the resampling pixels generated by the downsampling when the resampling ratio is less than 1.

7. A video decoding method comprising: receiving and processing a resampling flag for zoom in / zoom out correction at a first syntax level in a video sequence; and receiving and processing at least one resampling ratio for zoom in / zoom out correction of a current picture at a second syntax level lower than the first syntax level in the video sequence when the resampling flag at the first syntax level indicates that resampling is activated for at least one reference picture corresponding to the current picture in the video sequence, wherein the zoom in / zoom out correction resamples the at least one reference picture based on the at least one resampling ratio to generate a resampled reference picture.

8. The method of claim 7, further comprising: receiving and processing a second resampling flag for zoom in / zoom out correction at the second syntax level in the video sequence when the resampling flag indicates that the resampling is active; and receiving and processing a third resampling flag for zoom in / zoom out correction at a third syntax level lower than the second syntax level in the video sequence when the resampling flag and the second resampling flag are active.

9. A video encoding method comprising: signaling a resampling flag for zoom in / zoom out correction at a first syntax level in a video sequence; and signaling at least one resampling ratio for zoom in / zoom out correction of a current picture at a second syntax level lower than the first syntax level in the video sequence when the resampling flag at the first syntax level indicates that resampling is activated for at least one reference picture corresponding to the current picture in the video sequence, wherein the zoom in / zoom out correction resamples the at least one reference picture based on the at least one resampling ratio to generate a resampled reference picture.

10. The method of claim 9, further comprising: signaling a second resampling flag for zoom in / zoom out correction at the second syntax level in the video sequence when the resampling flag indicates that the resampling is active; and signaling a third resampling flag for zoom in / zoom out correction at a third syntax level lower than the second syntax level in the video sequence when the resampling flag and the second resampling flag indicate that the resampling is active.

11. A video decoding apparatus comprising: a decoder which receives a bitstream and extracts and decodes zoom in / zoom out information, a motion vector, and a residual signal; a zoom in / zoom out corrector configured to generate at least one resampled reference picture by downsampling or upsampling a reference picture based on the decoded zoom in / zoom out information; a motion compensator configured to obtain a predictive picture by motion-compensating the resampled reference picture based on the motion vector; and a synthesizer configured to generate a decoded picture by synthesizing the predictive picture and the decoded residual signal.

12. A video encoding apparatus comprising: a zoom in / zoom out corrector configured to obtain zoom in / zoom out information of a current picture based on at least one reference picture and to generate at least one resampled reference picture by downsampling or upsampling the reference picture according to the information; a motion estimator configured to obtain a motion vector of the current picture based on the resampled reference picture; a motion compensator configured to obtain a predictive picture by motion-compensating the resampled reference picture based on the motion vector; and an encoder configured to generate a residual signal from the current picture and the predictive picture and to encode the generated residual signal.

13. A processor-readable recording medium having recorded thereon a program for executing the video decoding method of claim 1.

14. A processor-readable recording medium having recorded thereon a program for executing the video encoding method of claim 4.
KR1020070081343A 2007-08-13 2007-08-13 Method for encoding and decoding video and apparatus thereof KR20090016970A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
KR1020070081343A KR20090016970A (en) 2007-08-13 2007-08-13 Method for encoding and decoding video and apparatus thereof


Publications (1)

Publication Number Publication Date
KR20090016970A true KR20090016970A (en) 2009-02-18

Family

ID=40685863

Family Applications (1)

Application Number Title Priority Date Filing Date
KR1020070081343A KR20090016970A (en) 2007-08-13 2007-08-13 Method for encoding and decoding video and apparatus thereof

Country Status (1)

Country Link
KR (1) KR20090016970A (en)


Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2015002444A1 (en) * 2013-07-01 2015-01-08 삼성전자 주식회사 Video encoding and decoding method accompanied with filtering, and device thereof
KR20150004292A (en) * 2013-07-01 2015-01-12 삼성전자주식회사 Method and apparatus for video encoding with filtering, method and apparatus for video decoding with filtering
US10003805B2 (en) 2013-07-01 2018-06-19 Samsung Electronics Co., Ltd. Video encoding and decoding method accompanied with filtering, and device thereof


Legal Events

Date Code Title Description
WITN Withdrawal due to no request for examination