WO2016068630A1 - Method and apparatus for encoding and decoding a video signal - Google Patents
Method and apparatus for encoding and decoding a video signal
- Publication number
- WO2016068630A1 (PCT/KR2015/011518)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- transform
- transform coefficient
- coefficient
- signal
- prediction
- Prior art date
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/17—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
- H04N19/176—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/103—Selection of coding mode or of prediction mode
- H04N19/105—Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/60—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/103—Selection of coding mode or of prediction mode
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/103—Selection of coding mode or of prediction mode
- H04N19/107—Selection of coding mode or of prediction mode between spatial and temporal predictive coding, e.g. picture refresh
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/12—Selection from among a plurality of transforms or standards, e.g. selection between discrete cosine transform [DCT] and sub-band transform or selection between H.263 and H.264
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/134—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
- H04N19/136—Incoming video signal characteristics or properties
- H04N19/137—Motion inside a coding unit, e.g. average field, frame or block difference
- H04N19/139—Analysis of motion vectors, e.g. their magnitude, direction, variance or reliability
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/18—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a set of transform coefficients
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/189—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding
- H04N19/19—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding using optimisation based on Lagrange multipliers
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/44—Decoders specially adapted therefor, e.g. video decoders which are asymmetric with respect to the encoder
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/70—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/503—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
- H04N19/51—Motion estimation or motion compensation
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/60—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
- H04N19/61—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding in combination with predictive coding
Definitions
- the present invention relates to a method and apparatus for encoding and decoding a video signal, and more particularly, to a technique for a conditional non-linear transform (hereinafter referred to as 'CNT') of a spatio-temporal volume of a video signal.
- Compression coding refers to a series of signal processing techniques for transmitting digitized information through a communication line or for storing in a form suitable for a storage medium.
- Media such as video, image, and voice may be subjected to compression encoding.
- a technique of performing compression encoding on an image is called video image compression.
- hybrid coding
- In predictive coding, no statistical dependence among the prediction error samples is exploited. That is, predictive coding predicts signal components using already coded portions of the same signal and codes the difference between the predicted and actual values. This follows from information theory: a more accurately predicted signal can be compressed more efficiently, so better compression is obtained by improving the consistency and accuracy of the prediction. Because predictive coding relies on causal statistical relationships, it is well suited to smooth or regular signals, while it is inefficient for signals with large variation. In addition, since quantization is applied to the original video signal, predictive coding has the disadvantage that the limitations of the human audiovisual system cannot be exploited.
- Transform coding is a technique that decomposes a signal into a set of elements so that the most critical data can be identified; after quantization, most of the transform coefficients become zero.
- the compression efficiency may be improved by considering the inter-pixel correlation on the transform domain.
- in the present invention, a method of applying the CNT technique to the temporal volume of a video signal is proposed.
- it is also proposed to apply the CNT technique to the three-dimensional spatio-temporal volume of the video signal.
- the present invention provides a conditional non-linear transform ('CNT') method that takes into account inter-pixel correlations on the transform domain.
- the present invention proposes a method of applying the CNT technique to the spatio-temporal volume of the video signal.
- the present invention proposes a method of designing a CNT for inter-frame coding by performing prediction using transform coefficients.
- the present invention can obtain an optimized transform coefficient by considering all signals that are already reconstructed when performing the prediction process.
- the present invention may utilize all previously reconstructed signals and context signals to obtain an optimized transform coefficient, where the context signal is at least one of a previously reconstructed signal, a previously reconstructed intra-coded signal, a reconstructed portion of the current frame, or information that the encoder sends to the decoder in connection with the decoding of the signal to be reconstructed.
- the present invention may find a candidate function that minimizes the sum of the distortion measure and the rate measure to obtain an optimal transform coefficient.
- the present invention can improve the compression efficiency by using a conditional non-linear transform that takes into account the correlation between pixels on the transform domain.
- the present invention can significantly reduce complexity, while maintaining the efficiency of the conditional non-linear transform, by converting the original optimization problem over the spatio-temporal volume of pixels in the video signal into a one-dimensional temporal trajectory.
- the present invention fuses predictive coding and transform coding so that the advantages of each coding scheme are preserved. That is, by using all of the signals that have already been reconstructed, more precise and improved prediction can be performed, and the statistical dependence of the prediction error samples can be exploited.
- a high quality image including a non-smooth or non-stationary signal can be coded more efficiently.
- signal adaptive decoding can be performed without the need for additional information, and when compared with a conventional hybrid coder, high quality prediction can be performed and prediction errors can be reduced.
- the present invention provides an improved method of spatio-temporal video compression, thereby enabling efficient coding even for an image having strong motion dependence or spatial boundary characteristics.
- FIGS. 1 and 2 show schematic block diagrams of an encoder and a decoder, respectively, in which media coding is performed.
- FIGS. 3 and 4 are embodiments to which the present invention is applied and show schematic block diagrams of an encoder and a decoder, respectively, to which an improved coding method is applied.
- FIG. 5 is an embodiment to which the present invention is applied and shows a schematic flowchart for explaining an improved video coding method.
- FIG. 6 is an embodiment to which the present invention is applied and is a flowchart for explaining an improved video coding method for generating an optimal prediction signal.
- FIG. 7 is an embodiment to which the present invention is applied and is a flowchart illustrating a process of generating an optimal prediction signal.
- FIG. 8 is a flowchart illustrating a method of obtaining an optimal transform coefficient according to an embodiment to which the present invention is applied.
- FIGS. 9 and 10 are embodiments to which the present invention is applied and are conceptual views illustrating a method of applying a spatio-temporal transform to a group of pictures (GOP).
- FIG. 11 is an embodiment to which the present invention is applied and shows blocks in a frame forming a temporal trajectory of the same object in an IPPP type temporal prediction structure.
- FIGS. 12 and 13 illustrate embodiments to which the present invention is applied.
- FIG. 12 illustrates blocks in a frame for explaining prediction on the transform domain in an IPPP type temporal prediction structure.
- FIG. 13 illustrates a corresponding set of transform coefficients for which prediction on the transform domain is performed in an IPPP type temporal prediction structure.
- FIGS. 14 and 15 illustrate block diagrams of encoders and decoders that perform IPPP type CNTs according to embodiments to which the present invention is applied.
- FIG. 16 is an embodiment to which the present invention is applied and shows a corresponding set of transform coefficients for which prediction on a transform domain is performed in an IBBBP type temporal prediction structure.
- FIG. 17 is an embodiment to which the present invention is applied and shows a flowchart of encoding a video signal based on inter-pixel correlation on the transform domain.
- FIG. 18 illustrates an embodiment to which the present invention is applied and shows a flowchart of decoding a video signal based on a conditionally nonlinear transform considering inter-pixel correlation on a transform domain.
- a method of encoding a video signal based on correlations between pixels on a transform domain comprises: obtaining a first transform coefficient by performing a transform on pixel values of a target block in a current frame; reconstructing a second transform coefficient for a corresponding block in a previous frame; and obtaining a predicted value of the first transform coefficient based on the reconstructed second transform coefficient and a correlation coefficient.
- the second transform coefficient may be restored based on all the transform coefficients previously restored and the first transform coefficient.
- the correlation coefficient is characterized in that it represents a correlation between the restored second transform coefficient and the first transform coefficient.
- the correlation coefficient is characterized in that it changes based on the frequency index of the transform coefficients.
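As a hedged illustration of this per-frequency coefficient prediction, the following Python sketch scales the reconstructed coefficients of the corresponding block in the previous frame by a correlation coefficient that varies with the frequency index; the numeric values and the name `predict_coefficients` are hypothetical, not taken from the patent.

```python
import numpy as np

def predict_coefficients(prev_coeffs, rho):
    """Predict the current block's transform coefficients from the
    reconstructed coefficients of the corresponding block in the
    previous frame, scaled by a per-frequency correlation coefficient."""
    return rho * prev_coeffs

# Hypothetical values: 4 frequency bins, correlation decaying with frequency.
prev = np.array([100.0, 40.0, -12.0, 3.0])   # reconstructed second transform coefficients
rho  = np.array([0.95, 0.80, 0.60, 0.40])    # correlation coefficient per frequency index
curr = np.array([ 98.0, 35.0, -10.0, 2.0])   # first transform coefficients (current block)

pred = predict_coefficients(prev, rho)
residual = curr - pred                        # only this residual need be coded
```

Only the residual between the actual and predicted coefficients then needs to be quantized and entropy coded, which is the source of the claimed compression gain.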
- the present invention further comprises obtaining an optimal transform coefficient by using an optimal function, wherein the optimal function is based on the first transform coefficient and the second transform coefficient, and the optimal transform coefficient represents the minimum value of the optimal function.
- the corresponding block in the previous frame is characterized in that it is the block corresponding to the target block in the current frame.
- the present invention also provides a method of decoding a video signal, the method comprising: receiving a video signal including a first transform coefficient of a target block in a current frame; acquiring a spatial transform coefficient by performing a temporal inverse transform on the first transform coefficient, wherein the temporal inverse transform represents an inverse of a transform applied based on a temporal trajectory; reconstructing the spatial transform coefficient by using a second transform coefficient of a corresponding block in a previous frame; and reconstructing the video signal by performing a spatial inverse transform on the spatial transform coefficient.
- the first transform coefficient may represent a spatio-temporal transform coefficient obtained based on an optimal function.
- the present invention provides an apparatus for encoding a video signal based on inter-pixel correlation on the transform domain, the apparatus comprising: a spatial transform unit for obtaining a first transform coefficient by performing a transform on pixel values of a target block in a current frame; and an optimization unit for reconstructing a second transform coefficient for a corresponding block in a previous frame and obtaining a predicted value of the first transform coefficient by applying the reconstructed second transform coefficient to a correlation coefficient.
- the optimization unit obtains an optimal transform coefficient by using an optimal function, the optimal function being based on the first transform coefficient and the second transform coefficient,
- the optimal transform coefficient is characterized in that it represents the minimum value of the optimal function.
- the present invention also provides an apparatus for decoding a video signal, comprising: an entropy decoding unit for receiving a video signal including a first transform coefficient of a target block in a current frame; And obtaining a spatial transform coefficient by performing a temporal inverse transform on the first transform coefficient, restoring the spatial transform coefficient by using a second transform coefficient of a corresponding block in a previous frame, and performing the spatial transform coefficient. And an inverse transform unit for restoring the video signal by performing a spatial inverse transform on the inverse transform, wherein the temporal inverse transform represents an inverse transform of a transform applied based on a temporal trajectory.
- FIGS. 1 and 2 show schematic block diagrams of an encoder and a decoder, respectively, in which media coding is performed.
- the encoder 100 of FIG. 1 includes a transform unit 110, a quantization unit 120, an inverse quantization unit 130, an inverse transform unit 140, a delay unit 150, a prediction unit 160, and an entropy encoding unit 170.
- the decoder 200 of FIG. 2 includes an entropy decoding unit 210, an inverse quantization unit 220, an inverse transform unit 230, a delay unit 240, and a prediction unit 250.
- the encoder 100 receives an original video signal and subtracts the prediction signal output from the prediction unit 160 from the original video signal to produce a prediction error.
- the generated prediction error is transmitted to the transform unit 110, and the transform unit 110 generates a transform coefficient by applying a transform scheme to the prediction error.
- the transform techniques may include a block-based transform method and an image-based transform method.
- examples of the block-based transform method include the discrete cosine transform (DCT), the Karhunen-Loève transform, and the like.
- the discrete cosine transform (DCT) refers to decomposing a signal in the spatial domain into two-dimensional frequency components.
- the frequency components form a pattern in which lower frequencies appear toward the top left and higher frequencies toward the bottom right.
- among the 64 two-dimensional frequency components, only the one at the top left is the direct current (DC) component, with a frequency of 0; the remaining components are alternating current (AC) components.
- the discrete cosine transform is simply a transform used to represent the components of an original video signal; the original video signal is completely restored from its frequency components upon inverse transform. That is, only the representation of the image is changed, and all information contained in the original image, including redundant information, is preserved.
- when the discrete cosine transform (DCT) is applied to the original video signal, unlike the amplitude distribution of the original video signal, the DCT coefficients are concentrated at values near zero, so a high compression effect can be obtained.
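The energy-compaction behaviour described above can be demonstrated with a small Python sketch. This is an illustrative 8x8 orthonormal DCT-II built by hand (not a transform mandated by any standard), applied to a smooth block; after coarse quantization nearly all coefficients vanish and the energy sits near the DC position.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis matrix (rows are basis vectors)."""
    k = np.arange(n)
    m = np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    m[0] *= 1 / np.sqrt(2)
    return m * np.sqrt(2 / n)

D = dct_matrix(8)
block = np.tile(np.linspace(100, 130, 8), (8, 1))  # smooth 8x8 block (horizontal ramp)
coeffs = D @ block @ D.T                           # 2D DCT: low frequencies at top left

# After coarse quantization most coefficients are zero; energy is near DC.
quantized = np.round(coeffs / 16)
```

For this smooth block only a handful of the 64 quantized coefficients survive, which is exactly why the transform representation compresses well.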
- the quantization unit 120 quantizes a transform coefficient and transmits the transform coefficient to the entropy encoding unit 170.
- the entropy encoding unit 170 entropy codes and outputs the quantized signal.
- the quantized signal output from the quantization unit 120 may be used to generate a prediction signal.
- the quantized signal may be restored to a prediction error by applying inverse quantization and inverse transformation through an inverse quantization unit 130 and an inverse transformation unit 140 in a loop.
- a reconstructed signal may be generated by adding the reconstructed prediction error to a prediction signal output from the predictor 160.
- the delay unit 150 stores the reconstructed signal for future reference by the prediction unit 160, and the prediction unit 160 generates the prediction signal using the previously reconstructed signal stored in the delay unit 150.
- the decoder 200 of FIG. 2 receives a signal output from the encoder 100 of FIG. 1, and the received signal is entropy decoded through the entropy decoding unit 210.
- the inverse quantization unit 220 obtains a transform coefficient from the entropy decoded signal using the quantization step size information, and the inverse transform unit 230 inverse transforms the transform coefficient to obtain a prediction error.
- a reconstructed signal is generated by adding the obtained prediction error to a prediction signal output from the prediction unit 250.
- the delay unit 240 stores the reconstructed signal for future reference by the prediction unit 250, and the prediction unit 250 generates the prediction signal using the previously reconstructed signal stored in the delay unit 240.
- the encoder 100 of FIG. 1 and the decoder 200 of FIG. 2 may perform predictive coding, transform coding, and hybrid coding. A scheme combining the advantages of predictive coding and transform coding is called hybrid coding.
- Predictive coding can be applied to individual samples one at a time, and in fact the most powerful prediction method has a recursive structure. This recursive structure exploits the fact that the best prediction is obtained when the nearest value is used. That is, the best prediction can be performed when a predicted value is coded and then immediately used to predict the next value.
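The recursive structure described above can be sketched as a minimal closed-loop (DPCM-style) coder in Python. This is an illustrative sketch, not the patent's method: each sample is predicted from the previously reconstructed value, and the reconstruction is reused immediately for the next prediction, so encoder and decoder stay in sync.

```python
def dpcm_encode(samples, q_step=4):
    """Closed-loop predictive coding: predict each sample from the
    previously *reconstructed* value and code the quantized error."""
    recon_prev = 0
    codes = []
    for s in samples:
        pred = recon_prev                  # nearest reconstructed value as predictor
        code = round((s - pred) / q_step)  # quantized prediction error
        codes.append(code)
        recon_prev = pred + code * q_step  # immediately reused for the next prediction
    return codes

def dpcm_decode(codes, q_step=4):
    recon_prev, out = 0, []
    for code in codes:
        recon_prev = recon_prev + code * q_step
        out.append(recon_prev)
    return out

codes = dpcm_encode([10, 12, 15, 15, 40])
recon = dpcm_decode(codes)
```

Because the encoder predicts from its own reconstruction rather than the original samples, quantization error does not accumulate across the sequence.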
- prediction and transformation are separated in two orthogonal dimensions. For example, in video coding, prediction is applied in the time domain and transform is applied in the spatial domain.
- prediction is performed only from data in blocks that are already coded. This can eliminate error propagation, but it forces the prediction process to use only some data samples in the block, whose statistical correlation is weaker, which reduces performance.
- the present invention seeks to solve this problem by removing the limitations on the data that can be used in the prediction process, and by enabling a new form of hybrid coding that incorporates the advantages of predictive coding and transform coding.
- the present invention is to improve the compression efficiency by providing a conditional non-linear transformation method that considers the correlation between pixels on the transform domain.
- FIGS. 3 and 4 are embodiments to which the present invention is applied and show schematic block diagrams of an encoder and a decoder, respectively, to which an improved coding method is applied.
- N residual data are obtained at once after subtracting N prediction data from N original data.
- transform coding is then applied to the prediction error.
- the prediction process and the transformation process are performed sequentially.
- the present invention proposes a method of obtaining a transform coefficient using a previously reconstructed signal and a context signal.
- the encoder 300 of FIG. 3 includes an optimizer 310, a quantizer 320, and an entropy encoder 330.
- the decoder 400 of FIG. 4 includes an entropy decoding unit 410, an inverse quantization unit 420, an inverse transform unit 430, and a reconstruction unit 440.
- the optimizer 310 obtains an optimized transform coefficient.
- the optimizer 310 may apply the following embodiments to obtain an optimized transform coefficient.
- a reconstruction function for reconstructing a signal may be defined as follows.
- S represents a reconstruction signal
- c represents a decoded transform coefficient
- y represents a context signal
- R (c, y) represents a non-linear reconstruction function that uses c and y to generate a reconstruction signal.
- the prediction signal may be defined as a relation between the values that are already reconstructed and the transform coefficients. That is, the encoder and the decoder to which the present invention is applied may generate an optimized prediction signal by considering all signals that have already been reconstructed when performing the prediction process.
- a non-linear prediction function
- Each decoded transform coefficient thus affects the overall reconstruction process and enables control of the prediction error contained in the prediction error vector.
- the prediction error signal may be defined as follows.
- e represents a prediction error signal
- c represents a decoded transform coefficient
- T represents a transform matrix
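The relation between e, c, and T can be sketched in Python. The text does not fix the matrix convention, so the sketch assumes T is an orthonormal analysis transform (here a hand-built 4-point DCT-II), so that c = T e and the prediction error is recovered as e = Tᵀ c; this convention is an assumption for illustration.

```python
import numpy as np

# Assumed convention: T is an orthonormal analysis transform matrix,
# so c = T e and the prediction error is recovered as e = T^T c.
n = 4
k = np.arange(n)
T = np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n)) * np.sqrt(2 / n)
T[0] *= 1 / np.sqrt(2)                 # 4-point orthonormal DCT-II

c = np.array([8.0, -3.0, 0.0, 1.0])    # decoded transform coefficients
e = T.T @ c                            # reconstructed prediction error signal

assert np.allclose(T @ e, c)           # orthonormality: coefficients recovered exactly
```

With an orthonormal T the mapping between the prediction error vector and the coefficient vector is lossless, so all loss comes from quantizing c.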
- the recovery signal may be defined as follows.
- n represents the nth reconstruction signal
- e n represents the nth prediction error signal
- y represents a context signal
- R n represents a non-linear reconstruction function that uses e n and y to generate a reconstruction signal.
- the nonlinear recovery function R n may be defined as follows.
- P n denotes a non-linear prediction function consisting of the parameters used to generate a prediction signal.
- the non-linear prediction function may be, for example, a median function, a combination of linear functions, a rank order filter, or a combination of non-linear functions.
- each non-linear prediction function P n may be a different non-linear function.
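The candidate functions mentioned above can be sketched in Python. This is an illustrative repository of two hypothetical candidates, a median predictor and a range-clipped linear combination (clipping makes it non-linear); the function names, weights, and sample values are assumptions, not taken from the patent.

```python
import statistics

def median_predictor(context):
    """A simple non-linear prediction function: the median of
    already reconstructed neighbouring samples (one candidate P_n)."""
    return statistics.median(context)

def clipped_linear_predictor(context):
    """Another candidate: a linear combination clipped to the context
    range; the clipping makes the overall function non-linear."""
    pred = 0.5 * context[0] + 0.5 * context[1]
    return min(max(pred, min(context)), max(context))

# A repository of candidate prediction functions, as the text describes.
candidates = [median_predictor, clipped_linear_predictor]

neighbours = [100, 104, 97]            # hypothetical reconstructed samples
preds = [f(neighbours) for f in candidates]
```

An optimizer could evaluate every candidate in the repository and keep the one yielding the best rate-distortion trade-off, which is the selection step the following paragraphs describe.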
- the encoder 300 and the decoder 400 to which the present invention is applied may include a repository of candidate functions for selecting the non-linear prediction function.
- the optimizer 310 may select an optimal nonlinear prediction function to generate an optimized transform coefficient.
- the optimal non-linear prediction function may be selected from candidate functions stored in the repository. This will be described in more detail with reference to FIGS. 7 and 8. As described above, by selecting an optimal non-linear prediction function, the optimizer 310 may generate an optimized transform coefficient.
- the output transform coefficient is transmitted to the quantization unit 320, and the quantization unit 320 quantizes the transform coefficient and transmits the transform coefficient to the entropy encoding unit 330.
- the entropy encoding unit 330 may entropy encode the quantized transform coefficients to output a compressed bitstream.
- the decoder 400 of FIG. 4 may receive the bitstream output from the encoder of FIG. 3, perform entropy decoding through the entropy decoding unit 410, and perform inverse quantization through the inverse quantization unit 420. In this case, the signal output from the inverse quantization unit 420 may mean an optimized transform coefficient.
- the inverse transform unit 430 receives the optimized transform coefficients to perform an inverse transform process, and generates a prediction error signal through the inverse transform process.
- the reconstruction unit 440 generates a reconstruction signal by adding the prediction error signal and the prediction signal.
- various embodiments described with reference to FIG. 3 may be applied.
- FIG. 5 is an embodiment to which the present invention is applied and shows a schematic flowchart for explaining an improved video coding method.
- the encoder may generate a reconstruction signal based on at least one of all previously reconstructed signals and context signals (S510).
- the context signal may include at least one of a previously reconstructed signal, a previously reconstructed intra coded signal, an already reconstructed portion of the current frame, or other information related to decoding of a signal to be reconstructed.
- the reconstruction signal may include a sum of a prediction signal and a prediction error signal, and each of the prediction signal and the prediction error signal may be generated based on at least one of a previously reconstructed signal and a context signal.
- the encoder may obtain an optimal transform coefficient that minimizes the optimization function (S520).
- the optimization function may include a distortion component, a rate component, and a Lagrange multiplier λ.
- the distortion component may consist of the difference between the original video signal and the reconstruction signal, and the rate component may comprise a previously obtained transform coefficient.
- λ represents a real number that balances the distortion component and the rate component.
- the obtained transform coefficient is transmitted to the decoder through quantization and entropy encoding (S530).
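The rate-distortion search in steps S510 to S530 can be sketched as follows (a toy Python illustration: the 4-point DCT, the uniform quantizer, and the nonzero-count rate proxy are assumptions for the sketch, not the codec's actual transform or entropy model):

```python
import numpy as np

def rd_cost(x, x_hat, c_quant, lam):
    """Toy Lagrangian cost J = D + lambda * R (the Equation 5 form)."""
    distortion = float(np.sum((x - x_hat) ** 2))   # D: squared error
    rate = float(np.count_nonzero(c_quant))        # R: nonzero-count proxy
    return distortion + lam * rate

def best_coefficients(x, T, lam, step=4.0):
    """Pick the quantized coefficient vector minimizing the RD cost by
    greedily testing, per coefficient, whether zeroing it lowers the cost."""
    c = T @ x                            # forward transform
    c_q = step * np.round(c / step)      # uniform quantization
    for i in range(len(c_q)):
        trial = c_q.copy()
        trial[i] = 0.0                   # try dropping this coefficient
        if rd_cost(x, T.T @ trial, trial, lam) < rd_cost(x, T.T @ c_q, c_q, lam):
            c_q = trial
    return c_q

# 4-point orthonormal DCT-II matrix as the transform T
n = 4
k = np.arange(n)
T = np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
T[0] *= np.sqrt(1.0 / n)
T[1:] *= np.sqrt(2.0 / n)

x = np.array([10.0, 10.5, 11.0, 11.5])      # smooth input block
c_star = best_coefficients(x, T, lam=50.0)  # RD-optimized coefficients
x_hat = T.T @ c_star                        # inverse transform (orthonormal T)
```

With a large λ the rate term dominates and most coefficients are zeroed; the smooth input here is represented by its DC coefficient alone.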
- the decoder receives the transmitted transform coefficients and obtains a prediction error vector through entropy decoding, inverse quantization, and inverse transform.
- the prediction unit in the decoder generates a prediction signal using all available samples that have already been reconstructed, and can reconstruct the video signal based on the prediction signal and the reconstructed prediction error vector.
- the embodiments described in the encoder may be applied to the process of generating the prediction signal.
- FIG. 6 is a flowchart illustrating a video coding method using an already reconstructed signal and a context signal to generate an optimal transform coefficient according to an embodiment to which the present invention is applied.
- the present invention may generate a prediction signal using the previously reconstructed signals (x1, x2, ..., x(n-1)) and a context signal (S610).
- the previously reconstructed signals may mean x1, x2, ..., x(n-1) as defined in Equation 3 above.
- a nonlinear prediction function may be applied to generate the prediction signal, and different nonlinear prediction functions may be applied to each prediction signal.
- the prediction signal is added to the received prediction error signal e (i) (S620) to generate a reconstruction signal (S630).
- step S620 may be performed through an adder (not shown).
- the generated reconstruction signal may be stored for future reference (S640). This stored signal can then be used to generate the next prediction signal.
- FIG. 7 is a flowchart illustrating the process (step S610) of generating a prediction signal used to generate an optimal transform coefficient.
- the present invention may generate a prediction signal p(i) by using the previously reconstructed signals (x1, x2, ..., x(n-1)) and a context signal (S710).
- selection of an optimal prediction function f(k) may be necessary to generate the prediction signal.
- a reconstruction signal may be generated using the generated prediction signal (S720), and the generated reconstruction signal may be stored for future reference (S730).
- all previously reconstructed signals x1, x2, ..., x(n-1) and context signals may be used to select the optimal prediction function.
- the present invention may select an optimal prediction function by finding a candidate function that minimizes the sum of the distortion measurement and the rate measurement (S740).
- the distortion measurement value indicates a value obtained by measuring the distortion between the original image signal and the reconstructed signal
- the rate measurement value indicates the rate required for transmitting or storing a transform coefficient
- the present invention can obtain the optimal prediction function by selecting a candidate function that minimizes Equation 5 below.
- c* denotes the value of c that minimizes Equation 5, i.e., the decoded transform coefficient.
- D (x, x (c)) represents a distortion measurement value between the original video signal and its reconstruction signal
- R (c) represents a rate measurement value necessary for transmitting or storing the conversion coefficient c.
- R(c) may represent the number of bits used to store the transform coefficient c when an entropy coder such as a Huffman coder or an arithmetic coder is used.
- R(c) may also be estimated using a Laplacian or Gaussian probability model.
- λ represents a Lagrange multiplier used in encoder optimization.
- λ may represent a real number that balances the distortion measurement value with the rate measurement value.
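The candidate selection of step S740 (the Equation 5 form) can be sketched with a toy candidate repository (the three candidate functions, the scalar quantizer, and the magnitude-based rate proxy below are illustrative assumptions, not the patent's stored candidates):

```python
import numpy as np

# Candidate prediction functions (hypothetical repository): each maps the
# previously reconstructed sample to a prediction of the next sample.
candidates = {
    "copy":   lambda prev: prev,        # f(x) = x
    "damped": lambda prev: 0.5 * prev,  # f(x) = 0.5 x
    "zero":   lambda prev: 0.0,         # no prediction
}

def select_prediction_function(x, lam, q_step=1.0):
    """Pick the candidate minimizing D + lambda * R over a 1-D signal,
    where R is approximated by the magnitude of the quantized residuals."""
    best_name, best_cost = None, np.inf
    for name, f in candidates.items():
        recon, prev, rate = [], 0.0, 0.0
        for sample in x:
            pred = f(prev)
            err_q = q_step * round((sample - pred) / q_step)  # quantized residual
            prev = pred + err_q                               # reconstruction
            recon.append(prev)
            rate += abs(err_q) / q_step                       # toy rate measure
        distortion = float(np.sum((np.asarray(x) - np.asarray(recon)) ** 2))
        cost = distortion + lam * rate
        if cost < best_cost:
            best_name, best_cost = name, cost
    return best_name

signal = [5.0, 5.0, 5.1, 5.2, 5.1]  # slowly varying signal
```

For such a slowly varying signal, the "copy" candidate yields near-zero residuals after the first sample and wins the cost comparison.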
- FIG. 8 is a flowchart illustrating a method of obtaining an optimal transform coefficient according to an embodiment to which the present invention is applied.
- the present invention can provide an improved coding method by obtaining an optimal transform coefficient that minimizes the sum of the distortion measure and the rate measure.
- the encoder may obtain an optimal transform coefficient that minimizes the sum of the distortion measurement value and the rate measurement value (S810).
- Equation 5 may be applied to the sum of the distortion measurement value and the rate measurement value.
- as input, at least one of the original image signal (x), the already reconstructed signal, previously obtained transform coefficients, and a Lagrange multiplier (λ) may be used.
- the already reconstructed signal may be obtained based on a previously obtained transform coefficient.
- the optimal transform coefficient (c) is inversely transformed through an inverse transform process (S820), and a prediction error signal is obtained (S830).
- the encoder generates a reconstruction signal X using the obtained error signal (S840).
- a context signal may be used to generate the reconstruction signal.
- the generated reconstructed signal may in turn be used to obtain an optimal transform coefficient that minimizes the sum of the distortion measurement and the rate measurement.
- the optimal transform coefficients are updated and can be used to obtain new optimized transform coefficients again through a reconstruction process.
- This process may be performed by the optimizer 310 of the encoder 300.
- the optimizer 310 outputs a newly obtained transform coefficient, and the output transform coefficient is compressed and transmitted through a quantization and entropy encoding process.
- a prediction signal is used to obtain an optimal transform coefficient, and the prediction signal may be defined as a relationship between already reconstructed signals and transform coefficients.
- the transform coefficients may be described by Equation 2, and as in Equations 2 and 3, each transform coefficient may affect the entire reconstruction process and can enable a wide range of control over the prediction error included in the prediction error vector.
- the reconstruction process may be limited to a linear one. In such a case, the reconstruction signal may be defined as in Equation 6 below.
- X represents a reconstruction signal
- c represents a decoded transform coefficient
- y represents a context signal
- F and H represent n x n matrices.
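The linear reconstruction of Equation 6 can be sketched numerically as follows (the matrix values below are arbitrary illustrations, not trained parameters):

```python
import numpy as np

n = 4
rng = np.random.default_rng(42)

# Illustrative (not trained) n x n matrices for Equation 6: x_hat = F c + H y
F = rng.standard_normal((n, n)) * 0.1  # maps decoded transform coefficients
H = rng.standard_normal((n, n)) * 0.1  # maps the context signal
c = rng.standard_normal(n)             # decoded transform coefficients
y = rng.standard_normal(n)             # context signal (e.g. neighboring samples)

x_hat = F @ c + H @ y                  # linear reconstruction (Equation 6 form)
```

Because the reconstruction is linear in c, doubling the coefficients shifts the output by exactly one extra F c term, which makes the decoder-side behavior easy to analyze.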
- an n X n matrix S may be used to control the quantization error included in the transform coefficient.
- the reconstruction signal may be defined as follows.
- the matrix S for controlling the quantization error may be obtained by using the following minimization process of Equation 8.
- T represents a training signal
- the transform coefficient c is arranged as an n-dimensional vector.
- each transform coefficient component may have a different quantization step size.
- the n x n matrices F, S, and H of Equation 7 may be jointly optimized for the training signal.
- the joint optimization may be performed by minimizing Equation 9 below.
- GOP: group of pictures
- the present invention can apply a space-time transform to a group of pictures (GOP) including V frames.
- the prediction error signal and the reconstruction signal may be defined as follows.
- T st represents a spatiotemporal transform matrix and c includes decoded transform coefficients for the entire picture group.
- the error vector e may include all error values for the entire group of pictures (GOP) having V frames.
- FIG. 9 is a diagram for describing a transform method in a general spatial domain
- FIG. 10 is a diagram for explaining a method of applying a space-time transform to a picture group.
- transform codes in the spatial domain are generated independently for the error values of I frames and P frames.
- the present invention provides a new method for improving the efficiency of the compression algorithm using CNT techniques that consider inter-pixel correlation on the transform domain.
- the CNT combines the transform and prediction steps together in an optimal way by taking into account the quantization effects of the samples.
- the present invention can combine a completely arbitrary prediction method with any transform, taking into account the propagation effect of quantization error.
- High compression efficiency can be obtained by applying different design parameter selections to different forms of traditional prediction-transformation scenarios such as intra and inter prediction.
- the selection of other design parameters may include the geometry of the neighbor pixel set considered in each CNT operation.
- the present invention describes a method of designing a CNT for inter frame coding of a video signal.
- the present invention can convert the original optimization problem for a spatio-temporal volume of pixels in a video signal into a one-dimensional temporal trajectory. This can significantly reduce the complexity while maintaining the efficiency of the CNT technology.
- the present invention provides a CNT technique for space-time volume of video.
- CNTs can be applied independently to each of the three dimensions of the space-time video volume.
- the present invention may first apply a spatial transform, such as DCT, to each coding unit (CU or PU) in a frame to obtain corresponding transform coefficients that are spatially uncorrelated.
- the CNTs can be designed using transform coefficients along the one-dimensional temporal motion trajectory found by the inter-frame motion estimation.
- the problem of designing CNTs for inter-frame coding that must process three-dimensional space-time pixel volumes can be reformulated as a one-dimensional CNT design problem.
- the present invention proposes a method of designing a CNT for inter frame coding without incurring high computational complexity, so that long-term temporal correlation can be effectively considered within the CNT framework.
- Another embodiment of the invention is directed to a method of generating a special form of CNT that is applied to a group of video frames (or GOP).
- temporal matching blocks are located within a given group of pictures (GOP) to form a temporal trajectory.
- a spatial transform such as DCT (S-Transform) is applied to each CU in the frame, whereby the transform coefficients of the CU are de-correlated.
- Matching of the first stage may be performed in the pixel domain as in a general codec, or may be performed on the transform coefficient domain obtained after spatial transform.
- CNT parameters are designed for coding transform coefficients of the same frequency in a CU according to a temporal trajectory.
- the parameters may refer to the F and H matrices of Equation 6.
- Temporal prediction dependencies such as IPPP or IBBBP can be considered when deriving CNT parameters.
- the correlation coefficients between the transform coefficient values of temporally corresponding blocks can vary based on the frequency or time index of the correlation coefficient.
- the S-Transform represents a spatial transform for calculating a transform coefficient for each frame, as shown in Equation 15, and the Temporal Transform (T-Transform) represents the temporal transform required for the CNT operation along the temporal trajectory.
- FIG. 11 is an embodiment to which the present invention is applied and shows blocks in a frame forming a temporal trajectory of the same object in an IPPP type temporal prediction structure.
- FIG. 11 shows a temporal prediction structure of a typical IPPP type.
- Four frames from Frame (i-1) to Frame (i + 2) are illustrated, and the four frames may be I, P, P, and P frames, respectively.
- the blocks b(i-1), b(i), b(i+1), and b(i+2) in the frames are connected by motion vectors to form a temporal trajectory in which the temporal motion of the same object can be identified.
- each in-frame block b(i-1), ..., b(i+2) is assumed to be a 2x2 block, but the present invention is not limited thereto.
- the first-order Gauss-Markov model predictor is expressed by Equation 13 below.
- in Equation 13, the variable represents the pixel value at the n-th position in the block of the i-th frame, and the correlation coefficient is assumed to be 1.
- in Equation 14, the variable represents the pixel value of a spatially neighboring pixel, and ρ represents the correlation coefficient associated with it.
- Equation 13 or Equation 14 may be used based on complexity or modeling accuracy.
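The two predictor forms can be sketched as follows (Equation 13 as a temporal co-located predictor, Equation 14 as a spatial-neighbor variant; the block values and the left-neighbor geometry below are illustrative assumptions):

```python
import numpy as np

def temporal_predict(prev_block, rho=1.0):
    """First-order Gauss-Markov temporal predictor (Equation 13 form):
    each pixel is predicted from the co-located pixel of the matched block
    in the previous frame, scaled by the correlation coefficient rho."""
    return rho * prev_block

def spatial_predict(block, rho=0.95):
    """Spatial variant (Equation 14 form, sketched): predict each pixel
    from its left neighbor within the block; the first column gets 0."""
    pred = np.zeros_like(block)
    pred[:, 1:] = rho * block[:, :-1]
    return pred

prev = np.array([[100.0, 102.0], [101.0, 103.0]])  # matched 2x2 block, frame i-1
cur  = np.array([[101.0, 103.0], [102.0, 104.0]])  # 2x2 block in frame i

residual = cur - temporal_predict(prev, rho=1.0)   # small residual to be coded
```

The choice between the two forms trades complexity for modeling accuracy, as noted above: the temporal form needs only the matched block, while the spatial form needs per-neighbor correlation coefficients.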
- FIGS. 12 and 13 illustrate embodiments to which the present invention is applied; FIG. 12 shows blocks in a frame for explaining prediction on a transform domain in an IPPP type temporal prediction structure, and FIG. 13 shows a corresponding set of transform coefficients on which prediction on the transform domain is performed in the IPPP type temporal prediction structure.
- a block in a frame may be divided into four subblocks, and f represents the transform coefficient of each subblock.
- the block bi in frame i includes four subblocks, and f0, f1, f2, and f3 represent the transform coefficients of the respective subblocks.
- the blocks b(i-1), ..., b(i+2) are assumed to be 2x2 blocks, but the present invention is not limited thereto.
- the correlation between specific pixel values in a frame and transform coefficients may be represented by Equation 15 below, and the transform coefficients of each subblock may be defined as transform coefficient sets.
- F ' represents a set of transform coefficients of a block in a frame
- X' represents a set of pixel values of a block in a frame
- T represents a transform matrix
- F 'and X' can be represented by Equation 16 below.
- n denotes an index for the transform coefficient
- i denotes a frame number
- ρ denotes a correlation coefficient between the n-th transform coefficients of matching blocks in the (i-1)-th frame and the i-th frame.
- transform coefficient prediction may be performed on the transform domain.
- the transform coefficient f(i) in the current i-th frame can be predicted from the transform coefficient f(i-1) of the previous frame.
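The two-stage idea above — spatial decorrelation per block, then prediction of same-frequency coefficients along the temporal trajectory — can be sketched like this (the 2x2 orthonormal DCT and the per-frequency ρ values are illustrative assumptions, not the patent's trained values):

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal n-point DCT-II matrix (stands in for the spatial S-Transform)."""
    k, m = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
    T = np.cos(np.pi * (2 * m + 1) * k / (2 * n))
    T[0] *= np.sqrt(1.0 / n)
    T[1:] *= np.sqrt(2.0 / n)
    return T

T = dct_matrix(2)

def s_transform(block):
    """2-D spatial transform of a 2x2 block: F' = T X' T^t (Equation 15 form)."""
    return T @ block @ T.T

prev_block = np.array([[100.0, 102.0], [104.0, 106.0]])  # matched block, frame i-1
cur_block = prev_block + 1.0                              # same object, slightly brighter

f_prev = s_transform(prev_block)  # spatially decorrelated coefficients, frame i-1
f_cur = s_transform(cur_block)    # coefficients in frame i

# Per-frequency correlation coefficients (illustrative): DC tracks strongly,
# higher frequencies less so, matching the frequency-dependent rho in the text.
rho = np.array([[1.0, 0.9], [0.9, 0.8]])
f_pred = rho * f_prev             # predict f(i) from f(i-1), frequency by frequency
residual = f_cur - f_pred         # residual coded by the temporal CNT stage
```

Since adding a constant brightness only moves the DC coefficient, the residual at the DC position equals the brightness change scaled by the block size, while the AC residuals stay near zero.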
- One of the gists of the CNT algorithm to which the present invention is applied is that both prediction and transform can be applied one-dimensionally for better decorrelation.
- TDP: temporal direction prediction
- the present invention provides a method of performing CNTs on space-time pixel volumes without increasing computational complexity.
- an IPPP type CNT using a first-order Gauss-Markov model may be performed as follows.
- the prediction dependency between the transform coefficients of the frames may be defined as in Equations 18 to 20.
- an equation for predicting a specific transform coefficient in a frame may be calculated as in Equation 21 below.
- Equation 21 involves a set of transform coefficient prediction values, a reconstructed sample set of transform coefficients, and a transform coefficient of a previous frame; for example, these can be expressed as in Equations 22 to 24 below.
- a reconstruction function such as Equation 25 can be obtained.
- Equation 24 has a form corresponding to H = (I - F0)^(-1) G.
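This closed form can be checked numerically: if the reconstruction obeys the causal recursion x = F0 x + G e, then x = (I - F0)^(-1) G e (the strictly lower-triangular F0 and identity G below are illustrative choices, not the patent's derived matrices):

```python
import numpy as np

n = 4
F0 = np.tril(np.full((n, n), 0.5), k=-1)  # strictly lower-triangular: causal prediction
G = np.eye(n)                             # error-injection matrix (illustrative)
e = np.array([1.0, 0.0, 2.0, -1.0])       # prediction error vector

# Closed form: x = (I - F0)^(-1) G e
H = np.linalg.inv(np.eye(n) - F0) @ G
x_closed = H @ e

# Same result by running the causal recursion sample by sample
x_rec = np.zeros(n)
for i in range(n):
    x_rec[i] = F0[i] @ x_rec + (G @ e)[i]
```

The matrix inverse exists whenever F0 is strictly lower triangular, since I - F0 is then unit lower triangular; this is why the causal prediction structure guarantees a well-defined closed-form reconstruction.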
- the temporal transform represents the transform to be applied based on the temporal trajectory, and may include a DCT, a DST, and the like.
- the CNT optimization equation may be changed as in Equation 26 below to find an optimal transform coefficient vector.
- D() represents a distortion component
- R() represents a rate component
- λ represents a Lagrange multiplier.
- D() may represent a distortion measure such as the l2 norm
- R() represents a bit measure needed to transmit related side information such as a quantization index corresponding to c*.
- λ represents a real number that balances the distortion component and the rate component.
- FIGS. 14 and 15 illustrate block diagrams of an encoder and a decoder for performing an IPPP type CNT according to embodiments to which the present invention is applied.
- the encoder 1400 to which the present invention is applied may include a spatial transform unit 1410, an optimizer 1420, a quantization unit 1430, an entropy encoding unit 1440, an inverse transform unit 1450, and a DPB 1460.
- the spatial transform unit 1410 may include a plurality of sub-space transform units applied to each frame.
- the spatial transform unit 1410 may include a plurality of spatial transform units, such as the (i+2)th spatial transform unit 1411, the (i+1)th spatial transform unit 1412, the (i)th spatial transform unit 1413, and the (i-1)th spatial transform unit 1414.
- the (i-1) th spatial transform unit 1414 may be separately performed as shown in FIG. 14. However, this is expressed for the sake of understanding and may be performed in one transform unit in the encoder.
- the spatial transform unit 1410 may receive a pixel value or a pixel value set in the pixel domain for each frame, and output a transform coefficient or a transform coefficient set by applying a spatial transform matrix to the frame. For example, the spatial transform unit 1410 may acquire a first transform coefficient by performing a transform on the pixel value of the target block in the current frame.
- the optimizer 1420 may calculate an optimal transform coefficient by using an optimization function.
- the optimization function includes a distortion component, a rate component, and a Lagrange multiplier; for example, Equation 26 may be used.
- the optimal transform coefficients represent transform coefficients that minimize the optimization function.
- the optimal transform coefficient may be obtained based on at least one of a transform coefficient prediction value, a reconstructed sample of the transform coefficient, and a correlation coefficient between the transform coefficients.
- the optimizer 1420 reconstructs a second transform coefficient for the corresponding block in the previous frame, and may obtain a prediction value of the first transform coefficient based on the reconstructed second transform coefficient and the correlation coefficient.
- the corresponding block in the previous frame refers to a block corresponding to the target block in the current frame.
- the correlation coefficient represents a correlation between the reconstructed second transform coefficient and the first transform coefficient.
- the inverse transform unit 1450, the DPB 1460, and the (i-1)th spatial transform unit 1414 are shown as separate units for convenience of description, but may be included in the optimizer 1420.
- the optimal transform coefficients output from the optimizer 1420 are quantized through the quantization unit 1430, entropy encoded through the entropy encoding unit 1440, and transmitted to the decoder.
- the decoder 1500 to which the present invention is applied may include an entropy decoding unit 1510, an inverse quantization unit 1520, a temporal inverse transform unit 1530, a spatial inverse transform unit (not shown), a DPB 1550, and a transform unit 1560.
- the spatial inverse transform unit (not shown) may include an (i+2)th spatial inverse transform unit 1540, an (i+1)th spatial inverse transform unit 1541, and an (i)th spatial inverse transform unit 1542.
- although the temporal inverse transform unit 1530 and the spatial inverse transform unit are separately illustrated for convenience of description, they may be included in one inverse transform unit.
- the entropy decoding unit 1510 receives the optimal transform coefficients transmitted from the encoder 1400 and performs entropy decoding.
- the inverse quantization unit 1520 dequantizes the entropy-decoded transform coefficients, and the temporal inverse transform unit 1530 outputs a transform coefficient or a transform coefficient set for each frame. For example, a transform coefficient or a transform coefficient set obtained by transforming the pixel values of a target block may be output.
- the transform coefficients output from the temporal inverse transform unit 1530 may be transmitted to a spatial inverse transform unit (not shown) together with the transform coefficients of the corresponding block in the previous frame.
- for example, the transform coefficient of the target block in the (i+2)th frame may be transmitted to the (i+2)th spatial inverse transform unit 1540 together with the transform coefficient of the corresponding block in the previous frame.
- the spatial inverse transform unit may reconstruct the pixel value of the corresponding block by performing spatial inverse transform on the received transform coefficient.
- the (i+2)th spatial inverse transform unit 1540 may reconstruct the pixel value X(i+2) of the target block of the (i+2)th frame based on the transform coefficients output from the temporal inverse transform unit 1530 and the transform coefficients of the corresponding block in the previous frame.
- the pixel value of the target block in the (i)th frame reconstructed by the (i)th spatial inverse transform unit 1542 may be stored in the DPB 1550 and then used to reconstruct pixel values of blocks in subsequent frames.
- 16 is a diagram illustrating a corresponding set of transform coefficients for which prediction on a transform domain is performed in a temporal prediction structure of an IBBBP type, according to an embodiment to which the present invention is applied.
- the prediction dependency between frames may be defined as in Equations 27 to 30.
- an equation for predicting the transform coefficient of an in-frame block may be calculated as in Equation 31 below.
- the transform coefficient predicted value set may be expressed as in Equations 32 to 34
- X represents a reconstructed sample set of transform coefficients
- Y represents a transform coefficient of a previous frame.
- a reconstruction function such as Equation 25 may be obtained, and F0 and G in the IBBBP prediction structure may be newly defined by Equation 31.
- the CNT optimization equation for finding the optimal transform coefficient vector may be changed based on Equation 26.
- FIG. 17 shows a flowchart of encoding a video signal based on inter-pixel correlation on a transform domain in an embodiment to which the present invention is applied.
- the present invention provides a method of encoding a video signal based on inter-pixel correlation on a transform domain.
- a first transform coefficient may be obtained (S1710).
- a second transform coefficient of the corresponding block in the previous frame may be restored.
- the corresponding block in the previous frame indicates a block corresponding to the target block in the current frame.
- a predicted value of the first transform coefficient may be obtained based on the reconstructed second transform coefficient and the correlation coefficient (S1730).
- the correlation coefficient represents an inter-pixel correlation between the reconstructed second transform coefficient and the first transform coefficient.
- the correlation coefficient may change based on a frequency index of transform coefficients.
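One way such frequency-dependent correlation coefficients could be obtained in practice is a per-frequency least-squares fit over matched block pairs (a sketch under synthetic data; the estimator below is an assumption for illustration, not the patent's derivation):

```python
import numpy as np

def estimate_rho(f_prev, f_cur):
    """Estimate a per-frequency correlation coefficient from pairs of
    matched-block transform coefficients: rho_n = sum(f_cur * f_prev) / sum(f_prev^2).
    f_prev, f_cur: arrays of shape (num_pairs, num_frequencies)."""
    num = np.sum(f_cur * f_prev, axis=0)
    den = np.sum(f_prev * f_prev, axis=0)
    return num / den

rng = np.random.default_rng(7)
f_prev = rng.standard_normal((1000, 4))           # coefficients in frame i-1
true_rho = np.array([0.95, 0.8, 0.6, 0.3])        # stronger at low frequencies
noise = rng.standard_normal((1000, 4)) * 0.1      # innovation noise
f_cur = true_rho * f_prev + noise                 # coefficients in frame i

rho_hat = estimate_rho(f_prev, f_cur)             # per-frequency estimates
```

With enough matched pairs the estimate converges to the underlying per-frequency correlation, reproducing the pattern described above: high correlation at low frequencies, weaker at high frequencies.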
- the encoder may obtain an optimal transform coefficient by using an optimization function (S1740).
- the optimization function is based on the first transform coefficient and the second transform coefficient, and the optimal transform coefficient indicates a transform coefficient that minimizes the optimization function.
- Equation 26 may be used as the optimization function.
- FIG. 18 shows a flowchart for decoding a video signal based on a conditionally nonlinear transform considering inter-pixel correlation on a transform domain, in an embodiment to which the present invention is applied.
- the present invention provides a method for decoding a video signal based on a conditionally nonlinear transform that takes into account inter-pixel correlations on the transform domain.
- the decoder may receive a video signal including a first transform coefficient of a target block in the current frame (S1810).
- the first transform coefficient indicates a space-time transform coefficient obtained based on an optimization function.
- the decoder may obtain a spatial transform coefficient by performing an inverse temporal transform on the first transform coefficient (S1820).
- the temporal inverse transform represents an inverse transform of the transform applied based on a temporal trajectory.
- the spatial transform coefficient may mean a transform coefficient or a transform coefficient set obtained by transforming the pixel values of a target block.
- the decoder may restore the spatial transform coefficients by using the second transform coefficients of the corresponding block in the previous frame (S1830).
- the decoder may reconstruct the video signal by performing a spatial inverse transform on the spatial transform coefficients (S1840). For example, the pixel value of the target block in the current frame may be restored based on the spatial transform coefficient and the transform coefficient of the corresponding block in the previous frame.
- the pixel value of the reconstructed target block in the current frame may be stored in the DPB and then used to reconstruct pixel values of blocks in subsequent frames.
- the embodiments described herein may be implemented and performed on a processor, microprocessor, controller, or chip.
- the functional units illustrated in FIGS. 1 to 4 and 14 to 15 may be implemented by a computer, a processor, a microprocessor, a controller, or a chip.
- the decoder and encoder to which the present invention is applied may be included in a multimedia broadcasting transmitting and receiving device, a mobile communication terminal, a home cinema video device, a digital cinema video device, a surveillance camera, a video chat device, a real-time communication device such as video communication, a mobile streaming device, a storage medium, a camcorder, a video on demand (VoD) service providing device, an internet streaming service providing device, a three-dimensional (3D) video device, a video telephony device, a medical video device, and the like, and may be used to process video signals and data signals.
- the processing method to which the present invention is applied can be produced in the form of a program executed by a computer, and stored in a computer-readable recording medium.
- Multimedia data having a data structure according to the present invention can also be stored in a computer-readable recording medium.
- the computer readable recording medium includes all kinds of storage devices for storing computer readable data.
- the computer-readable recording medium may include, for example, a Blu-ray Disc (BD), a Universal Serial Bus (USB), a ROM, a RAM, a CD-ROM, a magnetic tape, a floppy disk, and an optical data storage device.
- the computer-readable recording medium also includes media implemented in the form of a carrier wave (for example, transmission over the Internet).
- the bit stream generated by the encoding method may be stored in a computer-readable recording medium or transmitted through a wired or wireless communication network.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020177012784A KR20170075754A (ko) | 2014-10-29 | 2015-10-29 | 비디오 신호의 인코딩, 디코딩 방법 및 그 장치 |
US15/523,424 US10051268B2 (en) | 2014-10-29 | 2015-10-29 | Method for encoding, decoding video signal and device therefor |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201462072417P | 2014-10-29 | 2014-10-29 | |
US62/072,417 | 2014-10-29 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2016068630A1 true WO2016068630A1 (ko) | 2016-05-06 |
Family
ID=55857855
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/KR2015/011518 WO2016068630A1 (ko) | 2014-10-29 | 2015-10-29 | 비디오 신호의 인코딩, 디코딩 방법 및 그 장치 |
Country Status (3)
Country | Link |
---|---|
US (1) | US10051268B2 (ko) |
KR (1) | KR20170075754A (ko) |
WO (1) | WO2016068630A1 (ko) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7680190B2 (en) * | 2000-06-21 | 2010-03-16 | Microsoft Corporation | Video coding system and method using 3-D discrete wavelet transform and entropy coding with motion information |
KR20100097286A (ko) * | 2009-02-26 | 2010-09-03 | 에스케이 텔레콤주식회사 | 영상 부호화/복호화 장치 및 방법 |
US20140044166A1 (en) * | 2012-08-10 | 2014-02-13 | Google Inc. | Transform-Domain Intra Prediction |
WO2014109826A1 (en) * | 2012-11-13 | 2014-07-17 | Intel Corporation | Video codec architecture for next generation video |
Family Cites Families (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2007116351A (ja) * | 2005-10-19 | 2007-05-10 | Ntt Docomo Inc | 画像予測符号化装置、画像予測復号装置、画像予測符号化方法、画像予測復号方法、画像予測符号化プログラム、及び画像予測復号プログラム |
WO2008120577A1 (ja) * | 2007-03-29 | 2008-10-09 | Kabushiki Kaisha Toshiba | 画像符号化及び復号化の方法及び装置 |
US20090154567A1 (en) * | 2007-12-13 | 2009-06-18 | Shaw-Min Lei | In-loop fidelity enhancement for video compression |
KR101302660B1 (ko) * | 2009-09-14 | 2013-09-03 | 에스케이텔레콤 주식회사 | 고해상도 동영상의 부호화/복호화 방법 및 장치 |
CN107071437B (zh) * | 2010-07-02 | 2019-10-25 | 数码士有限公司 | 用于帧内预测的解码图像的方法 |
US9445093B2 (en) * | 2011-06-29 | 2016-09-13 | Qualcomm Incorporated | Multiple zone scanning order for video coding |
KR101943049B1 (ko) * | 2011-06-30 | 2019-01-29 | 에스케이텔레콤 주식회사 | 영상 부호화/복호화 방법 및 장치 |
US9380298B1 (en) * | 2012-08-10 | 2016-06-28 | Google Inc. | Object-based intra-prediction |
CN104685884B (zh) * | 2012-10-05 | 2017-10-24 | 华为技术有限公司 | 用于视频编码的方法及设备、用于视频解码的方法 |
US20140098880A1 (en) * | 2012-10-05 | 2014-04-10 | Qualcomm Incorporated | Prediction mode information upsampling for scalable video coding |
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7680190B2 (en) * | 2000-06-21 | 2010-03-16 | Microsoft Corporation | Video coding system and method using 3-D discrete wavelet transform and entropy coding with motion information |
KR20100097286A (ko) * | 2009-02-26 | 2010-09-03 | SK Telecom Co., Ltd. | Video encoding/decoding apparatus and method |
US20140044166A1 (en) * | 2012-08-10 | 2014-02-13 | Google Inc. | Transform-Domain Intra Prediction |
WO2014109826A1 (en) * | 2012-11-13 | 2014-07-17 | Intel Corporation | Video codec architecture for next generation video |
Non-Patent Citations (1)
Title |
---|
Andrew Naftel et al.: "Motion trajectory learning in the DFT-coefficient feature space", IEEE International Conference on Computer Vision Systems (ICVS), 2006, pages 1-8 *
Also Published As
Publication number | Publication date |
---|---|
US10051268B2 (en) | 2018-08-14 |
US20170302921A1 (en) | 2017-10-19 |
KR20170075754A (ko) | 2017-07-03 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US20210160508A1 (en) | Method and apparatus for encoding and decoding video using skip mode | |
KR101974261B1 (ko) | Encoding method and apparatus and decoding method and apparatus including a CNN-based in-loop filter | |
WO2016167538A1 (ko) | Method and apparatus for encoding and decoding a video signal | |
KR101901355B1 (ko) | Method and apparatus for performing graph-based prediction using an optimization function | |
KR20210010633A (ko) | Inter prediction method based on history-based motion vector, and apparatus therefor | |
CN105850124B (zh) | Method and apparatus for encoding and decoding a video signal using additional control of quantization error | |
WO2016076659A1 (ko) | Method and apparatus for performing graph-based transform using generalized graph parameters | |
US20190268619A1 (en) | Motion vector selection and prediction in video coding systems and methods | |
JP2016536859A (ja) | Method for encoding and decoding a media signal and apparatus using the same | |
US10652569B2 (en) | Motion vector selection and prediction in video coding systems and methods | |
WO2016068630A1 (ko) | Method and apparatus for encoding and decoding a video signal | |
KR100978465B1 (ko) | Bi-prediction encoding method and apparatus, bi-prediction decoding method and apparatus, and recording medium | |
KR20230067492A (ko) | Image encoding apparatus and image decoding apparatus using AI, and image encoding and decoding method performed thereby | |
JP2024510433A (ja) | Temporal structure-based conditional convolutional neural network for video compression |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | EP: the EPO has been informed by WIPO that EP was designated in this application |
Ref document number: 15854245 Country of ref document: EP Kind code of ref document: A1 |
|
WWE | WIPO information: entry into national phase |
Ref document number: 15523424 Country of ref document: US |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
ENP | Entry into the national phase |
Ref document number: 20177012784 Country of ref document: KR Kind code of ref document: A |
|
122 | EP: PCT application non-entry in European phase |
Ref document number: 15854245 Country of ref document: EP Kind code of ref document: A1 |