KR20090016970A - Method for encoding and decoding video and apparatus thereof - Google Patents

Method for encoding and decoding video and apparatus thereof

Info

Publication number
KR20090016970A
Authority
KR
South Korea
Prior art keywords
zoom
resampling
reference picture
picture
flag
Prior art date
Application number
KR1020070081343A
Other languages
Korean (ko)
Inventor
박승욱 (Seung-wook Park)
전병문 (Byeong-moon Jeon)
Original Assignee
엘지전자 주식회사 (LG Electronics Inc.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 엘지전자 주식회사 (LG Electronics Inc.)
Priority to KR1020070081343A
Publication of KR20090016970A

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/103 Selection of coding mode or of prediction mode
    • H04N19/105 Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
    • H04N19/117 Filters, e.g. for pre-processing or post-processing
    • H04N19/132 Sampling, masking or truncation of coding units, e.g. adaptive resampling, frame skipping, frame interpolation or high-frequency transform coefficient masking
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, the unit being an image region, e.g. an object
    • H04N19/172 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, the region being a picture, frame or field
    • H04N19/18 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, the unit being a set of transform coefficients
    • H04N19/189 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding
    • H04N19/196 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, being specially adapted for the computation of encoding parameters, e.g. by averaging previously computed encoding parameters
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51 Motion estimation or motion compensation
    • H04N19/513 Processing of motion vectors

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computing Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The present invention relates to a video encoding and decoding method and an apparatus therefor. When video data is zoomed in or out, motion compensation is performed using the zoom-in/zoom-out information, enabling efficient encoding and decoding of zoomed video data.

Description

Method for encoding and decoding video and apparatus thereof

The present invention relates to a video encoding and decoding method and apparatus, and more particularly, to a video encoding and decoding method and apparatus that perform motion compensation with zoom-in/zoom-out correction when video data is zoomed in or out.

Video data is generally processed and transmitted in the form of bitstreams. Typical video encoding and decoding methods achieve high compression efficiency by forming a predictive picture for the current picture from a reference picture and encoding only the difference between the current picture and the predictive picture. The more strongly the predictive picture correlates with the current picture, the fewer bits are needed to encode the difference, and the higher the compression efficiency; it is therefore desirable to form the best possible predictive picture. In many video compression standards, including Moving Picture Experts Group (MPEG)-1, MPEG-2, and MPEG-4, motion-estimated versions of previous reference pictures are used as predictive pictures for the current picture. With single-picture prediction ("P" pictures), the reference picture is not scaled when forming the predictive picture; with bi-directional picture prediction ("B" pictures), two different reference pictures are averaged with the same weighting factor to form a single predictive picture.

An object of the present invention is to provide a video encoding and decoding method and apparatus that achieve efficient encoding and decoding by performing motion compensation with zoom-in/zoom-out correction when video data is zoomed in or out.

According to an aspect of the present invention, a video encoding method includes: obtaining zoom-in/zoom-out information of a current picture based on at least one reference picture; obtaining at least one resampled reference picture by downsampling or upsampling the reference picture according to the information; obtaining a motion vector of the current picture based on the resampled reference picture; and obtaining a predictive picture by motion-compensating the resampled reference picture based on the motion vector.

Meanwhile, a video decoding method according to the present invention includes: receiving a bitstream and extracting zoom-in/zoom-out information, a motion vector, and a residual signal; decoding the zoom-in/zoom-out information, the motion vector, and the residual signal; obtaining at least one resampled reference picture by downsampling or upsampling a reference picture based on the decoded information; and obtaining a predictive picture by motion-compensating the resampled reference picture based on the decoded motion vector.

Meanwhile, a video encoding method according to the present invention includes, in the encoding process: signaling a resampling flag for zoom-in/zoom-out correction at a first syntax level in the video sequence; and, when the resampling flag at the first syntax level indicates that resampling is enabled for at least one reference picture corresponding to the current picture in the video sequence, signaling at least one resampling ratio for zoom-in/zoom-out correction of the current picture at a second syntax level lower than the first syntax level. The zoom-in/zoom-out correction may resample the at least one reference picture to generate a resampled reference picture based on the at least one resampling ratio.
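
The two-level signaling described above can be illustrated with a minimal sketch. This is not the patent's actual bitstream syntax; the field names (resampling_flag, resampling_ratio) and the dictionary-based "headers" are assumptions for illustration only.

```python
# Hypothetical sketch of two-level signaling: a sequence-level (first syntax
# level) flag, and a per-picture (second syntax level) resampling ratio that
# is present only when the sequence-level flag enables resampling.

def write_sequence_header(resampling_flag):
    """First (higher) syntax level: one flag for the whole sequence."""
    return {"resampling_flag": resampling_flag}

def write_picture_header(seq_header, resampling_ratio):
    """Second (lower) syntax level: the per-picture ratio is signaled only
    when the sequence-level flag is set."""
    hdr = {}
    if seq_header["resampling_flag"]:
        # ratio > 1 -> the reference picture will be upsampled (zoom in)
        # ratio < 1 -> the reference picture will be downsampled (zoom out)
        hdr["resampling_ratio"] = resampling_ratio
    return hdr

seq = write_sequence_header(True)
pic = write_picture_header(seq, 2.0)   # ratio carried at the picture level
```

When the sequence-level flag is off, no per-picture ratio is transmitted at all, which is the bit-saving point of placing the flag at the higher syntax level.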

Meanwhile, a video decoding method according to the present invention includes, in the decoding process: receiving and processing a resampling flag for zoom-in/zoom-out correction at a first syntax level in the video sequence; and, when the resampling flag at the first syntax level indicates that resampling is enabled for at least one reference picture corresponding to the current picture in the video sequence, receiving and processing at least one resampling ratio for zoom-in/zoom-out correction of the current picture at a second syntax level lower than the first syntax level. The zoom-in/zoom-out correction may resample the at least one reference picture to generate a resampled reference picture based on the at least one resampling ratio.

Meanwhile, a video encoding apparatus according to the present invention may include: a storage unit in which a reference picture is stored; a zoom-in/zoom-out correction unit that obtains zoom-in/zoom-out information of a current picture based on at least one stored reference picture and downsamples or upsamples the reference picture according to the information to generate at least one resampled reference picture; a motion estimator that obtains a motion vector of the current picture based on the resampled reference picture; and a motion compensator that obtains a predictive picture by motion-compensating the resampled reference picture based on the motion vector.

Meanwhile, a video decoding apparatus according to the present invention may include: a decoder that receives a bitstream and extracts and decodes zoom-in/zoom-out information, a motion vector, and a residual signal; a storage unit in which a reference picture is stored; a zoom-in/zoom-out correction unit that generates at least one resampled reference picture by downsampling or upsampling the reference picture based on the decoded zoom-in/zoom-out information; and a motion compensator that obtains a predictive picture by motion-compensating the resampled reference picture based on the decoded motion vector.

In order to achieve the above object, the present invention can provide a processor-readable recording medium having recorded thereon a program for executing the video encoding method in a processor.

In order to achieve the above object, the present invention can provide a processor-readable recording medium having recorded thereon a program for executing the video decoding method in a processor.

The video encoding and decoding method and apparatus according to the present invention perform motion compensation with zoom-in/zoom-out correction when video data is zoomed in or out, thereby enabling efficient encoding and decoding.

Hereinafter, the present invention will be described in more detail with reference to the drawings.

FIG. 1 is a diagram comparing a reference picture and a current picture during zoom in/zoom out.

Referring to the drawings, FIG. 1A shows an example of a reference picture and FIG. 1B an example of a current picture. Comparing the two, it can be seen that the current picture is zoomed in relative to the reference picture: block A of FIG. 1B is larger than block B of FIG. 1A.

Many video compression standards, including Moving Picture Experts Group (MPEG)-1, MPEG-2, and MPEG-4, perform motion prediction and compensation on a block-by-block basis. However, in a zoom-in/zoom-out situation such as that of FIG. 1, a conventional video encoding method that neither scales the reference picture nor applies weighted prediction simply performs motion estimation and compensation block by block. The difference between the prediction block and the current block therefore grows, the residual data increases, and coding efficiency suffers.

FIG. 2 is a diagram comparing the current picture with a reference picture resampled from the reference picture of FIG. 1. In the zoom-in situation of FIG. 1, the reference picture is resampled, specifically upsampled, so that block A of FIG. 2B and block B of FIG. 2A have the same size. As a result of upsampling the reference picture, the entire upsampled reference picture is larger than the current picture.

FIG. 3 is a flowchart illustrating an embodiment of a video encoding method of the present invention. Referring to the drawing, zoom-in/zoom-out information of the current picture is first obtained based on at least one reference picture (S300). The zoom-in/zoom-out information may include whether the current picture is zoomed in or out relative to the reference picture and, if so, the resampling ratio of the reference picture.

Next, at least one resampled reference picture is obtained by resampling the reference picture according to the zoom-in/zoom-out information (S310). Resampling means upsampling or downsampling: if the resampling ratio is greater than 1, the reference picture is upsampled; if it is less than 1, it is downsampled. An upsampled reference picture consists of the original pixels of the reference picture combined with the pixels generated by upsampling, whereas a downsampled reference picture consists only of the pixels generated by downsampling.
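
Step S310 can be sketched on a single row of pixels. This toy example uses plain linear interpolation in place of the patent's phase-dependent filter tables (described with FIGS. 5 to 8 below), so the filtering itself is an assumption; only the ratio-driven choice between upsampling and downsampling follows the text.

```python
# Toy illustration of S310: resample a 1-D row of pixels by a ratio.
# ratio > 1 -> upsampling (zoom in), ratio < 1 -> downsampling (zoom out).

def resample_row(row, ratio):
    """Return the row resampled by `ratio` using linear interpolation."""
    n_out = max(1, round(len(row) * ratio))
    out = []
    for i in range(n_out):
        pos = i / ratio                       # position on the source grid
        left = min(int(pos), len(row) - 1)    # nearest original pixel to the left
        right = min(left + 1, len(row) - 1)
        frac = pos - left                     # fractional offset (the "phase")
        out.append(round((1 - frac) * row[left] + frac * row[right]))
    return out

row = [10, 20, 30, 40]
up = resample_row(row, 2.0)    # 8 pixels: upsampled (zoom-in) reference
down = resample_row(row, 0.5)  # 2 pixels: downsampled (zoom-out) reference
```

Note how the upsampled row interleaves the original pixels with newly generated ones, while the downsampled row retains only generated samples, matching the distinction drawn in the paragraph above.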

Next, a motion vector of the current picture is obtained based on the resampled reference picture (S320). For each block of the current picture, the most similar block in the resampled reference picture is found, and the motion vector is calculated accordingly.

Next, a predictive picture is obtained by motion-compensating the resampled reference picture based on the motion vector (S330). The motion-compensated resampled reference picture is called the predictive picture.
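
Steps S320 and S330 can be sketched together on 1-D "pictures". The full-search block matching by sum of absolute differences (SAD) below is a common technique assumed for illustration; the patent does not prescribe a particular matching criterion.

```python
# Sketch of S320 (motion estimation) and S330 (motion compensation) in 1-D.

def best_match(ref, cur_block, search_range):
    """S320: offset in the resampled reference whose block best matches
    cur_block, by minimum sum of absolute differences."""
    n = len(cur_block)
    best_off, best_sad = 0, float("inf")
    for off in range(0, min(search_range, len(ref) - n) + 1):
        sad = sum(abs(r - c) for r, c in zip(ref[off:off + n], cur_block))
        if sad < best_sad:
            best_off, best_sad = off, sad
    return best_off

def motion_compensate(ref, offset, n):
    """S330: the prediction block is the matched region of the reference."""
    return ref[offset:offset + n]

ref = [0, 0, 10, 20, 30, 0, 0]       # resampled reference row
cur = [10, 20, 30]                   # current block
mv = best_match(ref, cur, 4)         # 1-D motion vector (offset)
pred = motion_compensate(ref, mv, 3) # prediction block
```

Because matching is done against the resampled reference, a pure zoom produces small residuals here, which is exactly the efficiency gain the method targets.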

Although not shown in the figure, after the predictive picture is obtained, a step of encoding the difference between the current picture and the predictive picture may be further performed. The difference may be referred to as a residual picture or residual signal, and the encoding step may be performed in units of blocks. Specifically, the encoding may include transforming the residual signal into the frequency domain, quantizing the transformed residual signal, and entropy-encoding the quantized, frequency-transformed residual signal.

Also, although not shown in the figure, a step of encoding the motion vector may be further performed; in detail, the motion vector may be entropy encoded. Likewise, a step of encoding the zoom-in/zoom-out information may be further performed, and the zoom-in/zoom-out information may be entropy encoded. Meanwhile, each of the steps illustrated in FIG. 3, as well as the residual-signal, motion-vector, and zoom-in/zoom-out information encoding steps, may be performed in various block units such as 16×16, 8×8, 8×4, 4×8, and 4×4.

FIG. 4 is a flowchart illustrating an example of obtaining the resampled reference picture of FIG. 3. FIG. 5 illustrates upsampling of a pixel, FIG. 6 the filter coefficients used in the upsampling of FIG. 5, FIG. 7 downsampling of a pixel, and FIG. 8 the filter coefficients used in the downsampling of FIG. 7.

Referring to the drawings, in obtaining the resampled reference picture of FIG. 4, the position information of each pixel to be generated is first calculated from the resampling ratio included in the zoom-in/zoom-out information (S400). It is assumed that the current picture is zoomed in or out relative to the reference picture, and the resampling ratio may be derived by comparing the reference picture with the current picture. Once the resampling ratio is determined, the position of each generated pixel is expressed relative to the original pixels of the reference picture. In FIG. 5 (upsampling) and FIG. 7 (downsampling), this position information is quantized into 16 phases, from 0 to 15. For example, when the resampling ratio is 2, the phase is 8 and the generated pixel lies midway between two original pixels. FIG. 5 uses 5 as the position information (phase), and FIG. 7 uses 8.
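
Step S400 can be sketched directly from the description above: for each generated pixel, take its position on the original pixel grid and quantize the fractional offset into one of the 16 phases. The rounding convention is an assumption; the patent only fixes the 16-level grid.

```python
# Sketch of S400: phase (position information, 0..15) of each generated pixel.

def pixel_phase(out_index, ratio):
    """Quantize the generated pixel's fractional grid offset into 16 phases."""
    pos = out_index / ratio          # position on the original pixel grid
    frac = pos - int(pos)            # fractional offset from the left original pixel
    return int(round(frac * 16)) % 16

# Ratio 2 (zoom in by 2x): every other output pixel sits midway between
# original pixels, giving the phase-8 case described in the text.
phases = [pixel_phase(i, 2.0) for i in range(4)]
```

The phase alone then indexes the coefficient tables of FIG. 6 or FIG. 8 in the next step, so the filter need not be recomputed per pixel.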

Next, predetermined filter coefficients are selected based on the position information (S410). For upsampling, FIG. 6 shows four filter coefficients determined by the position information; for downsampling, FIG. 8 shows twelve filter coefficients determined by the position information.

Next, the value of each resampled pixel is calculated from the selected filter coefficients and the pixel values in the reference picture (S420). During upsampling, filtering is performed according to the formula shown in FIG. 5 (B' = a*A + b*B + c*C + d*D), and during downsampling according to the formula shown in FIG. 7 (F' = a*A + b*B + ... + h*H + ... + l*L), finally yielding the generated pixel value.
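
The 4-tap upsampling formula of FIG. 5 transcribes directly to code. The coefficient values used below are an illustrative assumption (a bilinear half-pixel filter); the actual phase-dependent tables of FIG. 6 are not reproduced in this text.

```python
# Direct transcription of FIG. 5's formula: B' = a*A + b*B + c*C + d*D.
# Coefficients (a, b, c, d) come from a phase-indexed table; the bilinear
# pair (0, 1/2, 1/2, 0) used here is only an illustrative half-pixel choice.

def filter_4tap(coeffs, pixels):
    """Weighted sum of four neighboring reference pixels."""
    a, b, c, d = coeffs
    A, B, C, D = pixels
    return a * A + b * B + c * C + d * D

value = filter_4tap((0, 0.5, 0.5, 0), (10, 20, 30, 40))
```

The 12-tap downsampling formula of FIG. 7 has the same shape with twelve coefficient/pixel pairs; the longer filter acts as a low-pass filter to avoid aliasing when pixels are discarded.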

Meanwhile, as described above, during upsampling the final pixels consist of the original pixels of the reference picture together with the pixels generated according to the equation of FIG. 5, and an upsampled reference picture containing these pixels is generated. Such upsampling increases the number of pixels and thus the size of the resampled picture (block).

In downsampling, the final pixels consist only of the pixels generated according to the equation of FIG. 7. Such downsampling reduces the number of pixels and thus the size of the resampled picture (block).

FIG. 9 is a flowchart illustrating an embodiment of a video decoding method according to the present invention.

Referring to the drawings, a bitstream is first received, and the zoom-in/zoom-out information, motion vector, and residual signal contained in it are extracted (S800). The zoom-in/zoom-out information may indicate whether zoom in/zoom out occurred and, if so, the resampling ratio.

Next, the zoom-in/zoom-out information, the motion vector, and the residual signal are decoded (S810). Each extracted element is decoded if it was encoded; if not, no decoding is needed. The decoding step is the inverse of the encoding method described above and includes entropy decoding, inverse quantization, and inverse transformation.

Next, a resampled reference picture is obtained by resampling the reference picture according to the zoom-in/zoom-out information (S820). When zoom in/zoom out occurred, the resampled reference picture is generated by resampling the reference picture at the resampling ratio included in the decoded zoom-in/zoom-out information.

Next, a predictive picture is obtained based on the decoded motion vector (S830). The resampled reference picture is motion-compensated using the decoded motion vector, and the motion-compensated result is the predictive picture.

Next, the decoded picture is obtained by adding the predictive picture and the decoded residual signal (S840).
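
Step S840 is a simple pixel-wise addition. The clipping to the valid sample range shown below is a standard decoder detail assumed here, not something the text specifies.

```python
# Sketch of S840: decoded picture = prediction + decoded residual,
# clipped to the valid 8-bit pixel range (clipping is an assumed detail).

def reconstruct(pred, residual, bit_depth=8):
    lo, hi = 0, (1 << bit_depth) - 1
    return [min(hi, max(lo, p + r)) for p, r in zip(pred, residual)]

decoded = reconstruct([100, 250, 5], [10, 10, -10])
```

The reconstructed picture can then be stored as a reference picture for decoding subsequent pictures, as the apparatus description below explains.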

FIG. 10 is a block diagram showing an embodiment of a video encoding apparatus of the present invention.

Referring to the drawings, the video encoding apparatus of the present invention may include a storage unit 900, a zoom-in/zoom-out correction unit 910, a motion estimator 920, a motion compensator 930, an inverse quantizer 970, an inverse transform unit 980, and an encoder 990. The encoder 990 may include a transform unit 940, a quantizer 950, and an entropy encoder 960.

The storage unit 900 stores the reconstructed (decoded) reference picture for use in predicting a subsequent picture.

The zoom-in/zoom-out correction unit 910 obtains zoom-in/zoom-out information of the current picture using at least one reference picture stored in the storage unit 900, and downsamples or upsamples the reference picture according to that information to generate at least one resampled reference picture. The zoom-in/zoom-out information may include whether the current picture is zoomed in or out, the reference picture serving as the basis for the zoom, and the resampling ratio used to downsample or upsample that reference picture. When the current picture is zoomed in or out, a resampled reference picture is generated by resampling the corresponding reference picture at the resampling ratio.

The motion estimator 920 compares the resampled reference picture with the current picture to estimate the motion of the current picture and calculate a motion vector. That is, the motion of a given block in the current picture is estimated with reference to blocks in the resampled reference picture, yielding the motion vector.

The motion compensator 930 calculates a predictive picture by motion-compensating the resampled reference picture based on the motion vector. The difference between the predictive picture and the current picture is called the residual picture or residual signal.

Meanwhile, the encoder 990 encodes the residual signal. The encoder 990 may include a transformer 940, a quantizer 950, and an entropy encoder 960.

First, the transform unit 940 converts video data in the spatial domain into frequency-domain (spectral) data. When motion estimation and compensation are performed, it is the residual signal that is transformed into the frequency domain. The transform unit 940 may use various techniques such as the discrete cosine transform (DCT), wavelet transform, Hadamard transform, or integer DCT.

Next, the quantizer 950 quantizes the transformed frequency domain data. When motion estimation and compensation are performed, the transformed residual signal is quantized. Quantization may be performed by various techniques such as scalar quantization, vector quantization, adaptive quantization, and non-adaptive quantization.

Next, the entropy encoder 960 compresses the output of the quantizer 950 as well as other additional information (motion vectors, zoom-in/zoom-out information). Entropy coding may be implemented with techniques such as Huffman coding, run-length coding, LZ coding, dictionary coding, exponential Golomb coding (Exp-Golomb), context-adaptive binary arithmetic coding (CABAC), and context-adaptive variable-length coding (CAVLC). The entropy encoder 960 may use different coding techniques for different kinds of information (e.g., DC coefficients, AC coefficients, and other kinds of additional information), and may select from among a plurality of code tables within a specific coding technique.

Meanwhile, for motion estimation and compensation of subsequent pictures, inverse quantization and inverse transformation are performed by the inverse quantizer 970 and the inverse transform unit 980. After the inverse transform, the reconstructed picture is stored in the storage unit 900 as a reference picture. Although not shown in the figure, a loop filter unit may be disposed after the inverse transform unit 980 to adaptively smooth discontinuities at block boundaries of the reconstructed picture. These operations may be performed in various block units such as 16×16, 8×8, 8×4, 4×8, and 4×4.

FIG. 11 is a block diagram illustrating an example of the zoom-in/zoom-out correction unit of FIG. 10.

Referring to the drawings, the zoom-in/zoom-out correction unit 910 may include a position information calculator 1000, a filter coefficient selector 1010, and a resampling filter unit 1020.

The position information calculator 1000 calculates the position information of each pixel to be generated by resampling, based on the resampling ratio included in the zoom-in/zoom-out information and the original pixels of the reference picture. As in FIG. 5 and FIG. 7, the positions between original pixels are quantized into 16 phases, from 0 to 15. For example, when the resampling ratio is 2, the generated pixel lies at the center between original pixels, corresponding to phase 8. FIG. 5 uses 5 as the position information (phase), and FIG. 7 uses 8.

The filter coefficient selector 1010 selects predetermined filter coefficients based on the position information. For upsampling, with a resampling ratio greater than 1, four filter coefficients are selected according to the position information, as shown in FIG. 6. For downsampling, with a resampling ratio less than 1, twelve filter coefficients are selected according to the position information, as shown in FIG. 8.

The resampling filter unit 1020 calculates each resampled pixel value from the selected filter coefficients and the values of the original pixels in the reference picture. That is, a new generated pixel is produced according to the equation of FIG. 5 (B' = a*A + b*B + c*C + d*D) or the equation of FIG. 7 (F' = a*A + b*B + ... + l*L). When a reference picture is upsampled, the final pixels consist of the original pixels and the generated pixels, yielding an upsampled reference picture. When a reference picture is downsampled, the final pixels consist of the generated pixels, yielding a downsampled reference picture.

FIG. 12 is a block diagram showing an embodiment of a video decoding apparatus of the present invention.

Referring to the drawings, the video decoding apparatus of the present invention may include a decoder 1100, a storage 1130, a zoom in / zoom out corrector 1140, and a motion compensator 1150.

The decoder 1100 includes an entropy decoder 1105, an inverse quantizer 1115, and an inverse transform unit 1125. The entropy decoder 1105 decodes entropy-encoded data by inverting the entropy encoding performed by the encoding apparatus; that is, it decodes entropy-encoded data such as the residual signal and the additional information (motion vectors, zoom-in/zoom-out information). As described for the entropy encoder 960, entropy decoding may be implemented with various techniques, may use different decoding techniques for different kinds of information (e.g., DC coefficients, AC coefficients, and other kinds of additional information), and may select from a plurality of code tables within a particular decoding technique. The zoom-in/zoom-out information may include whether the current picture is zoomed in or out, the reference picture serving as the basis for the zoom, and the resampling ratio used to downsample or upsample the reference picture.

The inverse quantizer 1115 dequantizes the entropy-decoded data; in other words, the entropy-decoded residual signal is inversely quantized. Inverse quantization may be performed by various techniques such as scalar, vector, adaptive, or non-adaptive inverse quantization.

The inverse transform unit 1125 converts the inversely quantized frequency-domain data back into spatial-domain video data; that is, the inversely quantized residual signal is inverse-transformed to recover the residual signal, using the inverse of the various transform techniques described above.

The storage unit 1130 stores a previously reconstructed picture for use as a reference picture.

The zoom in / zoom out corrector 1140 may generate at least one resampled reference picture by downsampling or upsampling a reference picture, based on the at least one reference picture stored in the storage 1130 and the decoded zoom in / zoom out information. When the current picture is zoomed in or zoomed out, the resampled reference picture is generated by resampling the corresponding reference picture using the resampling ratio.

As illustrated in FIG. 11, the zoom in / zoom out corrector 1140 includes a position information calculator 1000, a filter coefficient selector 1010, and a resampling filter unit 1020. The position information calculator 1000 calculates the position information of a resampling pixel to be generated, based on the resampling ratio included in the zoom in / zoom out information; the filter coefficient selector 1010 selects filter coefficients based on the phase of the position information; and the resampling filter unit 1020 calculates the resampling pixel value based on the selected filter coefficients and the pixel values in the reference picture. When the resampling ratio is greater than 1, the zoom in / zoom out corrector 1140 generates the resampled reference picture based on a combination of the pixels in the reference picture and the resampling pixels generated by upsampling; when the resampling ratio is less than 1, the resampled reference picture is generated based on the resampling pixels generated by downsampling.
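The three steps above (position calculation, phase-dependent coefficient selection, and filtering) can be sketched for a single row of pixels. The two-tap linear-interpolation filter below is an assumption for illustration; the patent does not fix the actual filter length or coefficient values.

```python
def resample_row(pixels, ratio):
    """Resample one row of pixels by `ratio` (>1 upsamples, <1 downsamples).
    For each output pixel: compute its position in the reference row, take
    the fractional part as the phase, and use (1-phase, phase) as the
    two-tap filter coefficients applied to the neighboring pixels."""
    n_out = round(len(pixels) * ratio)
    out = []
    for j in range(n_out):
        pos = j / ratio                      # position in the reference row
        i = int(pos)
        phase = pos - i                      # selects the filter coefficients
        i2 = min(i + 1, len(pixels) - 1)     # clamp at the right picture edge
        out.append((1.0 - phase) * pixels[i] + phase * pixels[i2])
    return out
```

With ratio 2.0 the output interleaves original and interpolated pixels (the "combination" case described above); with ratio 0.5 every output pixel is a resampled value.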

The motion compensator 1150 obtains a predictive picture by motion-compensating the resampled reference picture based on the decoded motion vector.

Meanwhile, the video decoding apparatus according to an embodiment of the present invention may further include a synthesizer configured to generate a decoded picture by synthesizing the predictive picture and the decoded residual signal. The decoded picture generated by the synthesizer is stored in the storage 1130. Although not shown in the drawing, a loop filter unit may be disposed between the synthesizer and the storage 1130 in order to adaptively smooth discontinuities between the blocks of the decoded picture. Meanwhile, these operations may be performed in various block units, such as 16×16, 8×8, 8×4, 4×8, and 4×4.
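The synthesis step can be sketched as adding the residual to the prediction and clipping each sample to the valid range. The function name and the nested-list block representation are illustrative assumptions.

```python
def reconstruct_block(pred, resid, bit_depth=8):
    """Synthesize a decoded block: add the decoded residual to the
    predictive block and clip each sample to [0, 2^bit_depth - 1]."""
    max_val = (1 << bit_depth) - 1
    return [[min(max(p + r, 0), max_val) for p, r in zip(prow, rrow)]
            for prow, rrow in zip(pred, resid)]
```

Clipping matters because the residual can push a predicted sample outside the 8-bit range (e.g., 250 + 10 saturates at 255 rather than wrapping).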

FIG. 13 is a diagram illustrating information related to zoom in / zoom out in a bitstream, and FIG. 14 is a diagram illustrating offsets for partially cropping an upsampled reference picture.

Referring to the figure, in the encoding process, a resampling flag (a) for zoom in / zoom out correction is first signaled at a first syntax level 1300. The resampling flag (a) may be defined as "use_seq_resampled_ref_flag", which indicates whether the video sequence uses a resampled reference picture in inter prediction. The first syntax level 1300 may be a sequence level; in particular, it may be the "sequence parameter set" of H.264.

Next, if the resampling flag (a) indicates that resampling is active for at least one reference picture corresponding to the current picture in the video sequence (use_seq_resampled_ref_flag = 1), at least one resampling ratio (c) for zoom in / zoom out correction of the current picture is signaled at a second syntax level 1310 lower than the first syntax level 1300 in the video sequence. The resampling ratio (c) may be defined as "resampling_ratio[i]", indicating the resampling ratio between the i-th reference picture and the current picture. The second syntax level 1310 may be a slice level, in particular a slice header. The zoom in / zoom out correction resamples the at least one reference picture based on the at least one resampling ratio (c) to generate a resampled reference picture, and motion estimation and compensation are performed based on the generated resampled reference picture.

Meanwhile, when the resampling flag (a) indicates that resampling is active (use_seq_resampled_ref_flag = 1), a second resampling flag (b) for zoom in / zoom out correction is signaled at the second syntax level 1310, lower than the first syntax level 1300, in the video sequence. The second resampling flag (b) may be defined as "use_slice_resampled_ref_flag", which indicates whether the current slice uses the resampled reference picture in inter prediction. The resampling ratio (c) may be signaled when both the resampling flag (a) and the second resampling flag (b) are activated (use_seq_resampled_ref_flag = 1, use_slice_resampled_ref_flag = 1).

Meanwhile, when the resampling flag (a) and the second resampling flag (b) are activated (use_seq_resampled_ref_flag = 1 and use_slice_resampled_ref_flag = 1), a third resampling flag (e) for zoom in / zoom out correction may be signaled at a third syntax level 1320 lower than the second syntax level 1310 in the video sequence. The third resampling flag (e) may be defined as "use_block_resampled_ref_flag", which indicates whether the current block uses a resampled reference picture in inter prediction. The third syntax level 1320 may be a macroblock or block level.
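The encoder-side hierarchy described above (sequence flag, then slice flag and ratios, then block flags, each conditional on the enclosing flag) can be sketched as follows. Emitting syntax elements as `(name, value)` pairs rather than actual entropy-coded bits, and the `slices` input shape, are assumptions for illustration.

```python
def signal_zoom_syntax(use_seq_flag, slices):
    """Emit the three-level flag hierarchy as ordered (name, value) pairs:
    the sequence-level flag always; per-slice flags, resampling ratios,
    and per-block flags only while every enclosing flag is set."""
    elems = [("use_seq_resampled_ref_flag", int(use_seq_flag))]
    if use_seq_flag:
        for sl in slices:
            elems.append(("use_slice_resampled_ref_flag", int(sl["use"])))
            if sl["use"]:
                for i, ratio in enumerate(sl["ratios"]):
                    elems.append((f"resampling_ratio[{i}]", ratio))
                for block_uses in sl["blocks"]:
                    elems.append(("use_block_resampled_ref_flag", int(block_uses)))
    return elems
```

The design choice here is pure overhead reduction: when the sequence-level flag is 0, no per-slice or per-block syntax is spent at all.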

Meanwhile, at the second syntax level 1310, offset data (d) for cropping at least a portion of the upsampled or downsampled reference picture may be further signaled. The offset data (d) may be defined as "left_offset", "right_offset", "top_offset", and "bottom_offset". FIG. 14A shows each offset ("left_offset", "right_offset", "top_offset", "bottom_offset") for cropping an upsampled reference picture relative to the current picture. Using the offsets shown, the upsampled reference picture is cropped to the same size as the current picture of FIG. 14B.
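The cropping with the four signaled offsets can be sketched as below; the function name and the list-of-rows picture representation are assumptions for illustration.

```python
def crop_resampled(picture, left, right, top, bottom):
    """Crop an upsampled reference picture (a list of pixel rows) by the
    four signaled offsets so it matches the current picture's dimensions."""
    height, width = len(picture), len(picture[0])
    return [row[left:width - right] for row in picture[top:height - bottom]]
```

For example, cropping a 4×4 upsampled picture by one pixel on every side yields the 2×2 interior, which would then be the reference actually used for motion compensation.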

Meanwhile, in the decoding process, the bitstream shown in FIG. 13 is received. First, a resampling flag (a) for zoom in / zoom out correction is received and processed at the first syntax level 1300 in the video sequence. Then, when the resampling flag (a) at the first syntax level 1300 indicates that resampling is activated for at least one reference picture corresponding to the current picture in the video sequence, at least one resampling ratio (c) for zoom in / zoom out correction of the current picture is received and processed at the second syntax level 1310, lower than the first syntax level 1300, in the video sequence. The zoom in / zoom out correction resamples the at least one reference picture based on the at least one resampling ratio (c) to generate a resampled reference picture.

Meanwhile, when the resampling flag (a) indicates that resampling is active, a second resampling flag (b) for zoom in / zoom out correction may be received and further processed at the second syntax level 1310 lower than the first syntax level 1300 in the video sequence. The resampling ratio (c) may be processed when both the resampling flag (a) and the second resampling flag (b) are activated.

In addition, when the resampling flag (a) and the second resampling flag (b) are activated, a third resampling flag (e) for zoom in / zoom out correction may be received and further processed at a third syntax level 1320 lower than the second syntax level 1310 in the video sequence.
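The decoder-side counterpart of this conditional hierarchy can be sketched as follows: lower-level elements are read only when the enclosing flags were set, mirroring the encoder. Reading plain values from a flat list instead of entropy-decoded bits, and the element counts passed as parameters, are assumptions for illustration.

```python
def parse_zoom_syntax(stream, num_refs, num_blocks):
    """Read the conditional flag hierarchy from a flat list of values in
    bitstream order: the sequence flag always; the slice flag, resampling
    ratios, and block flags only when the enclosing flags are set."""
    pos = 0
    def read():
        nonlocal pos
        value = stream[pos]
        pos += 1
        return value
    info = {"use_seq_resampled_ref_flag": read()}
    if info["use_seq_resampled_ref_flag"]:
        info["use_slice_resampled_ref_flag"] = read()
        if info["use_slice_resampled_ref_flag"]:
            info["resampling_ratio"] = [read() for _ in range(num_refs)]
            info["use_block_resampled_ref_flag"] = [read() for _ in range(num_blocks)]
    return info
```

Because presence is implied by the flags, encoder and decoder must apply exactly the same conditions, or the two sides lose bitstream synchronization.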

In addition, at the second syntax level 1310, offset data (d) for cropping at least a portion of the resampled reference picture may be received and further processed. The offset data are as shown in FIG. 14.

As described above, the first to third syntax levels may be a sequence level, a slice level, and a macro block or a block level, respectively.

Meanwhile, the video encoding method or the video decoding method of the present invention may be embodied as a processor readable code on a processor readable recording medium included in the video encoding apparatus or the video decoding apparatus. The processor-readable recording medium includes all kinds of recording devices that store data that can be read by the processor. Examples of the processor-readable recording medium include ROM, RAM, CD-ROM, magnetic tape, floppy disk, optical data storage device, and the like, and also include a carrier wave such as transmission through the Internet. The processor-readable recording medium can also be distributed over network coupled computer systems so that the processor-readable code is stored and executed in a distributed fashion.

While preferred embodiments of the present invention have been shown and described, the present invention is not limited to the specific embodiments described above. Various modifications can be made by those skilled in the art without departing from the scope of the present invention as claimed in the claims, and such modifications should not be understood separately from the technical idea or prospect of the present invention.

FIG. 1 is a diagram comparing a reference picture and a current picture during zoom in / zoom out.

FIG. 2 is a diagram comparing a resampled reference picture, obtained by resampling the reference picture of FIG. 1, with the current picture.

FIG. 3 is a flowchart illustrating an embodiment of a video encoding method of the present invention.

FIG. 4 is a flowchart illustrating an example of obtaining the resampled reference picture of FIG. 3.

FIG. 5 is a diagram illustrating upsampling for a pixel.

FIG. 6 is a diagram illustrating filter coefficients used in the upsampling of FIG. 5.

FIG. 7 is a diagram illustrating downsampling for a pixel.

FIG. 8 is a diagram illustrating filter coefficients used in the downsampling of FIG. 7.

FIG. 9 is a flowchart illustrating an embodiment of a video decoding method of the present invention.

FIG. 10 is a block diagram showing an embodiment of a video encoding apparatus of the present invention.

FIG. 11 is a block diagram illustrating an example of the zoom in / zoom out corrector of FIG. 10.

FIG. 12 is a block diagram showing an embodiment of a video decoding apparatus of the present invention.

FIG. 13 is a diagram illustrating information related to zoom in / zoom out in a bitstream.

FIG. 14 is a diagram illustrating offsets for partially cropping an upsampled reference picture.

<Explanation of symbols for the main parts of the drawings>

900: storage unit 910: zoom in / zoom out correction unit

920: the motion estimation unit 930: the motion compensation unit

1000: location information calculation unit 1010: filter coefficient selection unit

1020: resampling filter unit

Claims (14)

1. A video decoding method comprising: obtaining at least one resampled reference picture by performing one of downsampling and upsampling on a reference picture based on zoom in / zoom out information extracted from a received bitstream; obtaining a predictive picture by motion-compensating the resampled reference picture; and obtaining a decoded picture by adding the predictive picture and a decoded residual signal.

2. The method of claim 1, wherein obtaining the resampled reference picture comprises: calculating position information of a resampling pixel to be generated, based on a resampling ratio included in the zoom in / zoom out information; selecting predetermined filter coefficients based on the position information; and calculating the resampling pixel value based on the selected filter coefficients and pixel values in the reference picture.

3. The method of claim 2, wherein the resampled reference picture is: generated based on a combination of pixels in the reference picture and resampling pixels generated by the upsampling when the resampling ratio is greater than 1; and generated based on the resampling pixels generated by the downsampling when the resampling ratio is less than 1.

4. A video encoding method comprising: obtaining zoom in / zoom out information of a current picture based on at least one reference picture; obtaining at least one resampled reference picture by downsampling or upsampling the reference picture according to the information; obtaining a motion vector of the current picture based on the resampled reference picture; and obtaining a predictive picture by motion-compensating the resampled reference picture based on the motion vector.

5. The method of claim 4, wherein obtaining the resampled reference picture comprises: calculating position information of a resampling pixel to be generated, based on a resampling ratio included in the information; selecting predetermined filter coefficients based on the position information; and calculating the resampling pixel value based on the selected filter coefficients and pixel values in the reference picture.

6. The method of claim 5, wherein the resampled reference picture is: generated based on a combination of pixels in the reference picture and resampling pixels generated by the upsampling when the resampling ratio is greater than 1; and generated based on the resampling pixels generated by the downsampling when the resampling ratio is less than 1.

7. A video decoding method comprising: receiving and processing a resampling flag for zoom in / zoom out correction at a first syntax level in a video sequence; and receiving and processing at least one resampling ratio for zoom in / zoom out correction of a current picture at a second syntax level lower than the first syntax level in the video sequence when the resampling flag at the first syntax level indicates that resampling is activated for at least one reference picture corresponding to the current picture in the video sequence, wherein the zoom in / zoom out correction resamples the at least one reference picture based on the at least one resampling ratio to generate a resampled reference picture.

8. The method of claim 7, further comprising: receiving and processing a second resampling flag for zoom in / zoom out correction at the second syntax level in the video sequence when the resampling flag indicates that the resampling is active; and receiving and processing a third resampling flag for zoom in / zoom out correction at a third syntax level lower than the second syntax level in the video sequence when the resampling flag and the second resampling flag are active.

9. A video encoding method comprising: signaling a resampling flag for zoom in / zoom out correction at a first syntax level in a video sequence; and signaling at least one resampling ratio for zoom in / zoom out correction of a current picture at a second syntax level lower than the first syntax level in the video sequence when the resampling flag at the first syntax level indicates that resampling is activated for at least one reference picture corresponding to the current picture in the video sequence, wherein the zoom in / zoom out correction resamples the at least one reference picture based on the at least one resampling ratio to generate a resampled reference picture.

10. The method of claim 9, further comprising: signaling a second resampling flag for zoom in / zoom out correction at the second syntax level in the video sequence when the resampling flag indicates that the resampling is active; and signaling a third resampling flag for zoom in / zoom out correction at a third syntax level lower than the second syntax level in the video sequence when the resampling flag and the second resampling flag indicate that the resampling is active.

11. A video decoding apparatus comprising: a decoder which receives a bitstream and extracts and decodes zoom in / zoom out information, a motion vector, and a residual signal; a zoom in / zoom out corrector configured to generate at least one resampled reference picture by downsampling or upsampling a reference picture based on the decoded zoom in / zoom out information; a motion compensator configured to obtain a predictive picture by motion-compensating the resampled reference picture based on the motion vector; and a synthesizer configured to generate a decoded picture by synthesizing the predictive picture and the decoded residual signal.

12. A video encoding apparatus comprising: a zoom in / zoom out corrector configured to obtain zoom in / zoom out information of a current picture based on at least one reference picture and to generate at least one resampled reference picture by downsampling or upsampling the reference picture according to the information; a motion estimator configured to obtain a motion vector of the current picture based on the resampled reference picture; a motion compensator configured to obtain a predictive picture by motion-compensating the resampled reference picture based on the motion vector; and an encoder configured to generate a residual signal from the current picture and the predictive picture and to encode the generated residual signal.

13. A processor-readable recording medium having recorded thereon a program for executing the video decoding method of claim 1.

14. A processor-readable recording medium having recorded thereon a program for executing the video encoding method of claim 4.
KR1020070081343A 2007-08-13 2007-08-13 Method for encoding and decoding video and apparatus thereof KR20090016970A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
KR1020070081343A KR20090016970A (en) 2007-08-13 2007-08-13 Method for encoding and decoding video and apparatus thereof


Publications (1)

Publication Number Publication Date
KR20090016970A true KR20090016970A (en) 2009-02-18

Family

ID=40685863

Family Applications (1)

Application Number Title Priority Date Filing Date
KR1020070081343A KR20090016970A (en) 2007-08-13 2007-08-13 Method for encoding and decoding video and apparatus thereof

Country Status (1)

Country Link
KR (1) KR20090016970A (en)


Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2015002444A1 (en) * 2013-07-01 2015-01-08 삼성전자 주식회사 Video encoding and decoding method accompanied with filtering, and device thereof
KR20150004292A (en) * 2013-07-01 2015-01-12 삼성전자주식회사 Method and apparatus for video encoding with filtering, method and apparatus for video decoding with filtering
US10003805B2 (en) 2013-07-01 2018-06-19 Samsung Electronics Co., Ltd. Video encoding and decoding method accompanied with filtering, and device thereof


Legal Events

Date Code Title Description
WITN Withdrawal due to no request for examination