WO2010078759A1 - Image time-domain and spatial-domain resolution processing method based on code rate control - Google Patents

Image time-domain and spatial-domain resolution processing method based on code rate control

Info

Publication number
WO2010078759A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
resolution
code rate
spatial
cost function
Prior art date
Application number
PCT/CN2009/073590
Other languages
English (en)
French (fr)
Inventor
马国强
Original Assignee
深圳市融创天下科技发展有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳市融创天下科技发展有限公司 filed Critical 深圳市融创天下科技发展有限公司
Priority to US12/746,175 priority Critical patent/US8311097B2/en
Priority to EP09833910.4A priority patent/EP2234401A4/en
Publication of WO2010078759A1 publication Critical patent/WO2010078759A1/zh

Links

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/103Selection of coding mode or of prediction mode
    • H04N19/107Selection of coding mode or of prediction mode between spatial and temporal predictive coding, e.g. picture refresh
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/146Data rate or code amount at the encoder output
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/146Data rate or code amount at the encoder output
    • H04N19/147Data rate or code amount at the encoder output according to rate distortion criteria
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/172Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/176Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/189Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding
    • H04N19/19Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding using optimisation based on Lagrange multipliers
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/30Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability
    • H04N19/31Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability in the temporal domain
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/59Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving spatial sub-sampling or interpolation, e.g. alteration of picture size or resolution
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/60Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
    • H04N19/61Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding in combination with predictive coding
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/80Details of filtering operations specially adapted for video compression, e.g. for pixel interpolation
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/85Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression
    • H04N19/87Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression involving scene cut or scene change detection in combination with video compression

Definitions

  • the present invention relates to the field of video image processing, and more particularly to an image processing method using an adaptive time-domain and spatial-domain resolution framework.
  • the quantization parameter Qp is too high, so the quantization step size is too large to capture detail changes in the video image. As a result, the high-frequency part of the video image is heavily distorted and image detail is seriously lost.
  • the present invention proposes a new video image processing method, AstRF (RDO based Adaptive spatial-temporal Resolution Frame), that is, an image processing method using an adaptive time-domain and spatial-domain resolution framework based on rate-distortion optimization.
  • when the encoder detects that the allocated code rate is below the critical point, it automatically finds, according to the principle of rate-distortion optimization, the most suitable temporal resolution and spatial resolution for the specified target code rate; after the decoder has decoded the stream, the resolution of the input image is restored by a suitable algorithm.
  • the image processing method provided by the present invention can significantly reduce the visible damage to subjective video quality caused by over-compression when video images are transmitted at a low code rate.
  • the technical problem to be solved by the present invention is that over-compression in the prior art seriously degrades the subjective and objective quality of video images; to address it, the invention provides an image processing method using an adaptive time-domain and spatial-domain resolution framework based on rate-distortion optimization.
  • the technical solution adopted by the present invention to solve the technical problem is to construct an image processing method based on rate distortion optimization adaptive time domain and spatial domain resolution framework, and the method comprises the following steps:
  • A. the encoder calculates a critical code rate condition value from the input video image and the change in the target allocated code rate;
  • B. the encoder compares the critical code rate condition value with a threshold; if the value is less than the threshold, that is, the target allocated code rate is below the critical code rate, an image resolution adapted to the target allocated code rate is obtained by downsampling; otherwise conventional encoding is performed;
  • C. the decoder restores the original resolution of the received image by upsampling and smooths it.
  • the formula for calculating the critical code rate condition value in step A is:
  • $Risual_{x,y,k} = H \times (C_{x,y,k} - P_{x,y,k}) \times H^{T}$
  • H is the spatial-frequency domain transformation matrix specified by any encoder
  • $Risual_{x,y,k}$ represents the frequency-domain distribution of the residual obtained after encoding
  • Score is the critical code rate condition value.
  • the range of the threshold value in the step B is from 0.1 to 0.95. Further preferably, the step B further comprises:
  • step B1 further includes:
  • the downsampling in step B2 reduces the temporal resolution by dropping frames.
  • step C further includes: C1. using spatial resampling to restore the original spatial resolution of the received image;
  • C2. smoothing the image by time-window smoothing and scene-switch detection.
  • the spatial resampling in step C1 uses a bicubic convolution interpolation algorithm
  • the bicubic convolution interpolation algorithm includes a one-dimensional cubic convolution interpolation algorithm and a two-dimensional cubic convolution interpolation algorithm, where
  • the one-dimensional cubic convolution interpolation algorithm is:
  • the step of determining the scene switching in the step C2 includes:
  • $\mathrm{IntraCostlnMb} = \mathrm{SATD} + \lambda \times \mathrm{Rbit}(\mathrm{Intra})$
  • $\mathrm{InterCostlnMb} = \mathrm{SAD} + \lambda \times \mathrm{Rbit}(\mathrm{Inter})$
  • IntraCostlnMb is the value of the intra-frame coding cost function
  • InterCostlnMb is the value of the inter-frame coding cost function, and λ is the Lagrangian factor
  • SATD is the sum of the absolute values of the prediction residuals of the 4 × 4 block after the Hadamard transform
  • SAD is the sum of absolute errors
  • Rbit is the number of coded output bits produced with the corresponding coding parameters
  • C22. the intra-frame and inter-frame coding cost function values of every macroblock in the current image are accumulated; once the complete frame has been analyzed, this yields the total intra-frame coding cost function value and the total inter-frame coding cost function value.
  • a scene switch is judged to have occurred when the total inter-frame coding cost function value is greater than the product of the total intra-frame coding cost function value and the sensitivity coefficient; otherwise no scene switch has occurred. The sensitivity coefficient ranges from 0.1 to 0.9.
  • with the image processing method provided by the present invention, which uses an adaptive time-domain and spatial-domain resolution framework based on rate-distortion optimization, when the encoder detects that the allocated code rate is below the critical code rate it automatically finds, according to the principle of rate-distortion optimization, the most suitable temporal resolution and spatial resolution for the specified target code rate; after decoding, the spatial resolution of the input image is restored by a suitable algorithm.
  • the image processing method provided by the present invention can significantly improve the apparent damage to the subjective effect of the video image caused by over-compression of the video image transmitted at a low code rate and can reduce the computational complexity while still ensuring the image quality.
  • FIG. 1 is a flow chart of an image processing method for an adaptive time domain and spatial domain resolution framework in accordance with a preferred embodiment of the present invention
  • FIG. 2 is a schematic diagram of a time window for smoothing an image in one embodiment of the present invention
  • FIG. 3 is a flowchart of a scene switching determination method for smoothing an image by using a scene switching determination manner in an embodiment of the present invention
  • FIG. 4 is a schematic diagram of reference grid points required for one-dimensional interpolation and two-dimensional interpolation of spatial resampling in one embodiment of the present invention
  • Figure 5 is a diagram showing the relationship between spatial resampling cube convolution interpolation function and distance in one embodiment of the present invention
  • Figure 6 is a view showing a comparison of the original resolution encoding of the Akiyo sequence image and the method rate distortion performance of the present invention in one embodiment of the present invention
  • Figure 7 is a schematic diagram showing the comparison of the original resolution encoding of the Foreman sequence image and the rate distortion performance of the method of the present invention in one embodiment of the present invention
  • FIG. 8 is a schematic diagram showing a comparison between the original resolution encoding of the Mobile sequence image and the method rate distortion performance of the present invention in an embodiment of the present invention
  • Figure 9 is a diagram comparing the rate-distortion performance of original-resolution encoding with that of the method of the present invention for the Tempete sequence in one embodiment of the present invention.
  • FIG. 1 is a flow chart showing an image processing method of an adaptive time domain and spatial domain resolution framework according to a preferred embodiment of the present invention. The process is as follows:
  • step S100 the original input video image is received at the encoder end
  • step S105 the critical code rate condition value is calculated at the encoder side as follows: let $S_{x,y,k}$ denote the image at time k, t the target code rate, and $S^{r}_{x,y,k} = R(S_{x,y,k})$ the image generated by resampling, where R() is the downsampling function:
  • $C_{x,y,k}$ denotes the image obtained by encoding and then decoding $S_{x,y,k}$.
  • $D_{x,y,k}$ denotes the distortion produced by encoding $S_{x,y,k}$:
  • $D_{x,y,k} = S_{x,y,k} - C_{x,y,k}$;
  • N is the number of pixels in the region taking part in the calculation
  • a weighting matrix moderately increases the weight of the high-frequency components; its values are: $\begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & 2 & 8 \\ 1 & 2 & 16 & 8 \\ 1 & 8 & 8 & 16 \end{pmatrix}$
  • Score is the critical code rate condition value; in step S110, it is determined whether the Score calculated by the encoder is smaller than the threshold L.
  • in the scenario used by the present invention, the image resolution is 320 × 240 at 10 FPS, and the code rate is 50 kbps.
  • the channel is about 60 kbps, and L can be set to 0.85. If the scenario changes, for example the code rate changes, the threshold L must be re-tuned experimentally; L ranges from 0.1 to 0.95. If the critical code rate condition value Score is smaller than the threshold L, step S115 is performed; otherwise step S135 is performed;
  • step S115 the spatial resolution adapted to the target allocation code rate is calculated at the encoder side using downsampling, as follows:
  • the specific templates can be set according to the actual resolution. After the template set has been determined, the best template is found among them according to the principle of RDO.
  • the Lagrangian linear approximation method can be used to implement RDO, according to the formula:
  • J = R(p) + pD(p); the coding modes in the template set are traversed to compute J_i for each, and the one that minimizes the above expression is found.
  • the mode corresponding to min(J_i) is R-D optimal, and its resolution is the optimal resolution;
  • step S120 the time resolution adapted to the target allocation code rate is calculated at the encoder side using downsampling, as follows:
  • variable temporal resolution is achieved primarily through temporal layering (temporal scalability).
  • in a bitstream that provides temporal scalability, the pictures can be partitioned into a base layer and one or more enhancement layers with the following characteristics. The temporal layers are identified by a temporal level: the base layer has temporal level 0, and each further layer has the next higher level and a correspondingly higher temporal resolution.
  • a lower temporal resolution at level k can be obtained by removing from the bitstream all temporal (enhancement) layers whose temporal level is greater than k.
  • in general, temporal scalability requires that the reference frames for motion-compensated prediction of the current frame be restricted to temporal layers lower than or equal to that of the current frame.
  • a temporally scalable structure with N dyadic (second-order) temporal enhancement layers can be obtained with the hierarchical B picture tool of the encoder.
  • the temporal base layer is independent of all other frames and starts with an IDR access unit; each of its frames is either intra-coded or uses the previous base layer picture as its reference frame. The coding order and display order of the base layer are the same. A picture of a temporal enhancement layer at temporal level x always lies between two consecutive pictures of temporal levels lower than x.
  • the temporal enhancement layers are encoded as B frames, and their reference frame lists List 0 and List 1 are restricted to the two pictures whose temporal level is lower than x. Each set of temporal layers can be decoded independently of all enhancement layers of higher temporal level;
  • the hierarchical prediction structure of the temporal layers described above can be combined with the encoder's multiple-reference-frame method: the reference frame lists may contain more than one reference frame and may include pictures of the same temporal level as the predicted frame,
  • the hierarchical prediction structure may be non-dyadic, and it may be modified freely according to actual coding requirements;
  • the coding efficiency of the hierarchical prediction structure depends largely on how the quantization parameters of the different temporal levels are chosen.
  • the base layer must be coded with the highest fidelity because it serves, directly or indirectly, as the motion-compensated prediction reference for all other pictures.
  • the quantization parameters of the other temporal layers can be increased layer by layer, because their reconstruction quality affects fewer pictures.
  • the base layer quantization parameter can be selected by conventional rate-distortion analysis. For the QP of the enhancement layers, however, the following method can be used to avoid complicated calculations; experiments show that it is robust for video signals with different characteristics.
  • the quantization parameter of the base layer is QP.
  • step S125 the original spatial resolution of the video image is restored by upsampling at the decoder side.
  • the upsampling here is spatial resampling that restores the original spatial resolution of the video image with the bicubic convolution interpolation algorithm.
  • the bicubic convolution interpolation algorithm includes a one-dimensional interpolation algorithm and a two-dimensional interpolation algorithm;
  • step S130 the processed image is smoothed at the decoder side, by time-window smoothing and by scene-switch detection; the method flow ends after smoothing;
  • step S135 when the critical code rate condition value Score is not less than the threshold L, the original input video image undergoes conventional encoding at the encoder side;
  • step S140 the received encoded video image undergoes conventional decoding at the decoder side, corresponding to step S135.
  • Fig. 2 is a timing diagram showing a time window for smoothing an image in an embodiment of the present invention.
  • the spatial resolution is lowered smoothly, so that the visual change is a gradual transition.
  • Figure 2 shows a continuous time window: statistics are gathered over the whole window, and the entire window is then re-encoded.
  • FIG. 3 is a flowchart of a method for determining a scene switching method for smoothing an image by using a scene switching determination manner in an embodiment of the present invention, including the following steps:
  • step 300 the current macroblock is first analyzed, and the optimal intra and inter coding modes are selected.
  • step 302 IntraCostlnMb and InterCostlnMb are calculated using the Lagrangian rate-distortion model; they are the cost function values of intra-frame coding and inter-frame coding respectively, calculated as follows:
  • $\mathrm{IntraCostlnMb} = \mathrm{SATD} + \lambda \times \mathrm{Rbit}(\mathrm{Intra})$
  • $\mathrm{InterCostlnMb} = \mathrm{SAD} + \lambda \times \mathrm{Rbit}(\mathrm{Inter})$
  • λ is the Lagrangian factor
  • SATD is the sum of the absolute values of the prediction residuals of the 4 × 4 block after the Hadamard transform
  • SAD is the sum of absolute errors
  • Rbit is the number of coded output bits produced with the corresponding coding parameters
  • step 304 the intra-frame coding cost function value (IntraCostlnMb) and the inter-frame coding cost function value (InterCostlnMb) of each macroblock in the current image are accumulated to obtain the total intra-frame coding cost function value (IntraTotalCost) and the total inter-frame coding cost function value (InterTotalCost):
  • $\mathrm{IntraTotalCost} = \sum \mathrm{IntraCostlnMb}$
  • $\mathrm{InterTotalCost} = \sum \mathrm{InterCostlnMb}$
  • step 306 it is determined whether the current frame has been fully processed; if not, the flow returns to step 300; if so, step 308 is performed;
  • step 308 after the complete frame has been analyzed, whether the current image is a scene switch is decided from IntraTotalCost and InterTotalCost: if $\mathrm{InterTotalCost} > \mathrm{IntraTotalCost} \times \mathrm{Fbias}$,
  • the image is judged to be a scene switch (310); otherwise the result is that there is no scene switch (312).
  • Fbias is the sensitivity coefficient for scene-switch detection, with a range of 0.1 to 0.9. If it is too small, it causes false detections;
  • if it is too large, detection is slow to respond to scene switches. For example, when transmitting a QVGA (320x240), 10 fps scene over a 50 kbps channel, a value of 0.25 is recommended. Other scenarios can be tuned as needed.
  • the spatial resampling of the present invention uses bicubic convolution interpolation.
  • the gray value of the interpolated point is a weighted average of the gray values of the 16 nearest grid points of the original image.
  • the whole interpolation process consists of one pass of one-dimensional interpolation in the horizontal direction and one pass in the vertical direction.
  • in the one-dimensional cubic convolution interpolation formula, $P'(x)$ is the point to be interpolated and $P(x_i)$ are the reference grid points. If a reference grid point falls outside the image, it is replaced by the nearest pixel on the image edge; the two-dimensional interpolation is separable in the two directions.
  • in the two-dimensional formula, $P'(x, y)$ is the point to be interpolated
  • Figure 6 is a diagram showing the comparison of the original resolution encoding of the Akiyo sequence image and the method rate distortion performance of the present invention in one embodiment of the present invention.
  • Figure 7 is a diagram showing the comparison of the original resolution encoding of the Foreman sequence image and the rate distortion performance of the method of the present invention in one embodiment of the present invention.
  • Figure 8 is a diagram showing the comparison of the original resolution encoding of the Mobile sequence image with the method rate distortion performance of the present invention in one embodiment of the present invention.
  • Figure 9 is a diagram showing the comparison of the original resolution encoding of the Tempete sequence image with the rate distortion performance of the method of the present invention in one embodiment of the present invention.
  • Figure 6, Figure 7, Figure 8, and Figure 9 show the rate-distortion performance for Akiyo, Foreman, Mobile, and Tempete, respectively, using the original resolution and the method of the present invention.
  • at the original resolution, the PSNR is calculated from the difference between the decoded image and the original image; for the method of the present invention, the PSNR is calculated from the difference between the decoded image and the resolution-adjusted image. As these figures show, the two curves (original resolution and the method of the present invention) diverge at a certain point, which is the critical code rate; beyond that point, the distortion between the decoded image and the resampled image under the method of the present invention is much smaller than the distortion between the original-resolution decoded image and the original input image.
  • the above is only the preferred embodiment of the present invention and is not intended to limit it. Any modification, equivalent substitution, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.
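The temporal layering described for step S120 can be illustrated with a small sketch. It assumes a strictly dyadic hierarchy in which a group of pictures holds 2^n frames and frame 0 of each group belongs to the base layer; the function names `temporal_level` and `keep_up_to_level` are hypothetical and do not come from the patent.

```python
def temporal_level(frame_index: int, num_enh_layers: int) -> int:
    """Temporal level of a frame in a dyadic hierarchy (assumed structure).

    Level 0 is the base layer; higher levels are enhancement layers.
    """
    gop_size = 1 << num_enh_layers           # 2^n frames per group
    pos = frame_index % gop_size
    if pos == 0:
        return 0                             # base layer picture
    # Number of trailing zero bits of pos: more zeros means a lower level
    trailing_zeros = (pos & -pos).bit_length() - 1
    return num_enh_layers - trailing_zeros


def keep_up_to_level(frame_indices, num_enh_layers: int, max_level: int):
    """Drop every frame whose temporal level is greater than max_level."""
    return [i for i in frame_indices
            if temporal_level(i, num_enh_layers) <= max_level]


levels = [temporal_level(i, 3) for i in range(9)]
kept = keep_up_to_level(range(9), 3, 1)
```

Removing all layers above level k halves the frame rate once per removed layer, which is the way the text describes obtaining a lower temporal resolution from the same bitstream.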

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Description

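Image time-domain and spatial-domain resolution processing method based on code rate control
Technical Field
The present invention relates to the field of video image processing, and more specifically to an image processing method using an adaptive time-domain and spatial-domain resolution framework.
Background Art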
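In narrowband video communication applications, limited transmission bandwidth makes it necessary to lower the coding rate of video images. Images transmitted at low coding rates are very likely to be over-compressed, and the resulting blocking artifacts, quantization noise and similar effects visibly damage the subjective quality of the video. In the encoder, over-compression is caused mainly by over-quantization, which shows up in two ways:
(1) The quantization parameter Qp is too high, so the quantization step size is too large to capture detail changes in the video image. As a result, the high-frequency part of the image is heavily distorted and image detail is seriously lost.
(2) Macroblock boundary effects. Within one frame, the macroblocks on either side of an edge may use different coding modes, and each macroblock may choose a different quantization parameter. This breaks the continuity of energy across block boundaries: after compression, the boundaries of adjacent image blocks are discontinuous, which produces a visible blocking effect. Current mainstream video compression standards such as H261/H263/H264 and MPEG4 generally use the discrete cosine transform (DCT) to transform spatial-domain image information into the frequency domain, and then quantize and encode the small number of DCT coefficients that remain after the transform. With these conventional compression methods, the blocking effect that arises in low-code-rate video communication is an important cause of image distortion, and it seriously degrades both the subjective quality and the objective quality (PSNR) of video communication.
Experiments show that, for a given input image, as the code rate allocated during encoding decreases there is a critical point: when the allocated code rate is below it, the reconstructed image cannot retain enough image texture information at the original temporal and spatial resolution. The present invention proposes a new video image processing method, AstRF (RDO based Adaptive spatial-temporal Resolution Frame), that is, an image processing method using an adaptive time-domain and spatial-domain resolution framework based on rate-distortion optimization. When the encoder detects that the allocated code rate is below the critical point, it automatically finds, according to the principle of rate-distortion optimization, the most suitable temporal resolution and spatial resolution for the allocated target code rate; after the decoder has decoded the stream, the resolution of the input image is restored by a suitable algorithm. The image processing method provided by the present invention can significantly reduce the visible damage to subjective video quality caused by over-compression when video images are transmitted at a low code rate.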
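Summary of the Invention
The technical problem to be solved by the present invention is that over-compression in the prior art seriously degrades the subjective and objective quality of video images; to address it, the invention provides an image processing method using an adaptive time-domain and spatial-domain resolution framework based on rate-distortion optimization.
The technical solution adopted by the present invention is to construct an image processing method using an adaptive time-domain and spatial-domain resolution framework based on rate-distortion optimization, the method comprising the following steps:
A. The encoder calculates a critical code rate condition value from the input video image and the change in the target allocated code rate;
B. The encoder compares the critical code rate condition value with a threshold; if the value is less than the threshold, that is, the target allocated code rate is below the critical code rate, an image resolution adapted to the target allocated code rate is obtained by downsampling; otherwise conventional encoding is performed;
C. The decoder restores the original resolution of the received image by upsampling and smooths it.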
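Preferably, the formula used in step A to calculate the critical code rate condition value is:
$Risual_{x,y,k} = H \times (C_{x,y,k} - P_{x,y,k}) \times H^{T}$
Figure imgf000004_0001
where $C_{x,y,k}$ denotes the reconstructed image obtained by encoding and then decoding $S_{x,y,k}$, $S_{x,y,k}$ denotes the image at time k, H is the spatial-frequency domain transformation matrix specified by any encoder, and $Risual_{x,y,k}$ denotes the frequency-domain distribution of the residual obtained after encoding;
Qp is the quantization parameter, N is the number of pixels in the region taking part in the calculation, and a weighting matrix moderately increases the weight of the high-frequency components, with the following values:
Figure imgf000005_0001
$H = \begin{pmatrix} 1 & 1 & 1 & 1 \\ 2 & 1 & -1 & -2 \\ 1 & -1 & -1 & 1 \\ 1 & -2 & 2 & -1 \end{pmatrix}, \quad H^{T} = \begin{pmatrix} 1 & 2 & 1 & 1 \\ 1 & 1 & -1 & -2 \\ 1 & -1 & -1 & 2 \\ 1 & -2 & 1 & -1 \end{pmatrix}$
Score is the critical code rate condition value.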
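The frequency-domain residual of step A, H × (C − P) × H^T, can be illustrated for a single 4 × 4 block. This is a minimal sketch: the matrix H is the one listed in the text, while the function names and the sample block values are hypothetical, and P is simply treated as a reference block subtracted from the reconstructed block C. The Score formula itself is not reproduced, because it is only available as a figure.

```python
# 4x4 spatial-to-frequency transform matrix H as listed in the text
H = [
    [1,  1,  1,  1],
    [2,  1, -1, -2],
    [1, -1, -1,  1],
    [1, -2,  2, -1],
]


def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(row) for row in zip(*m)]


def matmul(a, b):
    """Plain matrix product of two lists-of-rows matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def residual_spectrum(c_block, p_block):
    """Compute H x (C - P) x H^T for two 4x4 blocks."""
    diff = [[c - p for c, p in zip(c_row, p_row)]
            for c_row, p_row in zip(c_block, p_block)]
    return matmul(matmul(H, diff), transpose(H))


# Illustrative blocks only: a flat difference of 1 everywhere
c_block = [[5] * 4 for _ in range(4)]
p_block = [[4] * 4 for _ in range(4)]
spectrum = residual_spectrum(c_block, p_block)
```

A flat difference puts all of its energy in the top-left (DC) coefficient, so any non-zero value elsewhere in `spectrum` indicates higher-frequency residual, which is the part the weighting matrix emphasizes.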
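Preferably, the threshold in step B varies with the scenario within the range 0.1 to 0.95.
Further preferably, step B further comprises:
B1. obtaining, by downsampling, the image spatial resolution adapted to the target allocated code rate;
B2. obtaining, by downsampling, the image temporal resolution adapted to the target allocated code rate.
Further preferably, step B1 further comprises:
B11. setting up a set of image spatial resolution templates;
B12. using a rate-distortion optimization method to select from the template set the best template as the image spatial resolution adapted to the target allocated code rate.
Further preferably, the rate-distortion optimization method in step B12 implements RDO with the Lagrangian linear approximation method, using the formula J = R(p) + pD(p), where the coding modes in the template set are traversed to compute J_i for each and the one that minimizes the expression is found. The mode corresponding to min(J_i) is R-D optimal, and its image resolution is the image spatial resolution.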
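The template search in steps B11 and B12 can be sketched as a simple minimization over candidate resolutions. Everything below is an illustrative assumption: `pick_resolution`, the `weight` parameter that scales distortion against rate, and the toy rate and distortion numbers are hypothetical and are not taken from the patent.

```python
from math import inf


def pick_resolution(templates, rate_of, distortion_of, weight):
    """Return (best_template, best_cost) minimizing J = R + weight * D.

    rate_of and distortion_of are callables that map a template to its
    measured rate and distortion (assumed to be supplied by the encoder).
    """
    best_template, best_cost = None, inf
    for template in templates:
        cost = rate_of(template) + weight * distortion_of(template)
        if cost < best_cost:
            best_template, best_cost = template, cost
    return best_template, best_cost


# Toy numbers, invented purely for illustration
templates = [(320, 240), (240, 180), (160, 120)]
rates = {(320, 240): 80.0, (240, 180): 55.0, (160, 120): 30.0}
distortions = {(320, 240): 10.0, (240, 180): 14.0, (160, 120): 40.0}

best, cost = pick_resolution(templates, rates.get, distortions.get, weight=2.0)
```

With these toy values the middle template wins: the largest resolution costs too many bits and the smallest one loses too much quality, which is the trade-off the cost J is meant to balance.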
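Preferably, the downsampling in step B2 reduces the temporal resolution by dropping frames.
Preferably, step C further comprises:
C1. using spatial resampling to restore the original spatial resolution of the received image;
C2. smoothing the image by time-window smoothing and scene-switch detection.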
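Preferably, the spatial resampling in step C1 uses a bicubic convolution interpolation algorithm, which includes a one-dimensional cubic convolution interpolation algorithm and a two-dimensional cubic convolution interpolation algorithm. The one-dimensional cubic convolution interpolation algorithm is:
Figure imgf000006_0001
where $P'(x)$ is the point to be interpolated and $P(x_i)$ are the reference grid points,
and $\alpha = -0.5$;
Figure imgf000006_0002
where $W(x)$ is the closest fitting expression of the function $H(x)$. The two-dimensional cubic convolution interpolation algorithm is:
Figure imgf000006_0003
where $P'(x, y)$ is the point to be interpolated and $P(x_i, y_j)$ are the reference grid points.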
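Preferably, the scene-change detection in step C2 comprises:
C21. Analyzing the current macroblock, selecting the best intra and inter coding modes, and using the Lagrangian rate-distortion model to calculate the intra-coding cost function value and the inter-coding cost function value as follows:
IntraCostInMb = SATD + λ × Rbit(Intra)
InterCostInMb = SAD + λ × Rbit(Inter)
where IntraCostInMb is the intra-coding cost function value, InterCostInMb is the inter-coding cost function value, λ is the Lagrange factor, SATD is the sum of the absolute values of the Hadamard-transformed prediction residual of a 4×4 block, SAD is the sum of absolute differences, and Rbit is the number of output bits produced by encoding with the corresponding coding parameters;
C22. Accumulating the intra-coding and inter-coding cost function values of every macroblock in the current image. Once the whole frame has been analyzed, this yields the total intra-coding cost function value and the total inter-coding cost function value. If the total inter-coding cost is greater than the product of the total intra-coding cost and a sensitivity coefficient, a scene change has occurred; otherwise no scene change has occurred. The sensitivity coefficient ranges from 0.1 to 0.9.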
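The per-macroblock costs in step C21 can be illustrated with a small Python sketch. It computes SATD as the sum of absolute values of a 4×4 Hadamard-transformed residual, SAD as a plain sum of absolute values, and the cost as distortion plus λ times the bit count. The helper names are hypothetical, and no normalization of the SATD is applied, which is an assumption (encoders often scale it).

```python
# 4x4 Hadamard matrix; it is symmetric, so it equals its own transpose.
HADAMARD_4 = [
    [1, 1, 1, 1],
    [1, 1, -1, -1],
    [1, -1, -1, 1],
    [1, -1, 1, -1],
]


def satd_4x4(residual):
    """Sum of absolute values of the Hadamard-transformed 4x4 residual."""
    # Left multiplication: tmp = HADAMARD_4 x residual.
    tmp = [[sum(HADAMARD_4[i][k] * residual[k][j] for k in range(4))
            for j in range(4)] for i in range(4)]
    # Right multiplication: out = tmp x HADAMARD_4.
    out = [[sum(tmp[i][k] * HADAMARD_4[k][j] for k in range(4))
            for j in range(4)] for i in range(4)]
    return sum(abs(v) for row in out for v in row)


def sad(residual):
    """Sum of absolute differences of a residual block."""
    return sum(abs(v) for row in residual for v in row)


def mb_cost(distortion, lam, bits):
    """Lagrangian cost: distortion + lam * bits."""
    return distortion + lam * bits
```

A call such as `mb_cost(satd_4x4(intra_residual), lam, intra_bits)` would then give the intra cost of one 4×4 block, where `intra_residual` and `intra_bits` stand for values supplied by the encoder.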
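With the image processing method for an adaptive spatial-temporal resolution frame based on rate-distortion optimization provided by the invention, when the encoder detects that the allocated bit rate is lower than the critical bit rate, it automatically finds, according to the RDO principle, the temporal resolution and spatial resolution best suited to the specific allocated target bit rate; after the decoder has decoded the stream, the image is restored to the spatial resolution of the input image by a suitable algorithm. The method markedly reduces the visible damage to subjective quality that over-compression causes when video is transmitted at low bit rates, and it preserves image quality while reducing computational complexity.
Brief Description of the Drawings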
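The invention is further described below with reference to the drawings and embodiments. In the drawings:
Figure 1 is a flowchart of an image processing method for an adaptive spatial-temporal resolution frame according to a preferred embodiment of the invention;
Figure 2 is a schematic diagram of the time window used for smoothing images in one embodiment of the invention;
Figure 3 is a flowchart of the scene-change detection method used when smoothing images by scene-change detection in one embodiment of the invention;
Figure 4 is a schematic diagram of the reference grid points required for one-dimensional and two-dimensional interpolation in spatial resampling in one embodiment of the invention;
Figure 5 is a schematic diagram of the relationship between the cubic convolution interpolation function used in spatial resampling and distance in one embodiment of the invention;
Figure 6 compares the rate-distortion performance of original-resolution coding and the method of the invention for the Akiyo sequence in one embodiment of the invention;
Figure 7 compares the rate-distortion performance of original-resolution coding and the method of the invention for the Foreman sequence in one embodiment of the invention;
Figure 8 compares the rate-distortion performance of original-resolution coding and the method of the invention for the Mobile sequence in one embodiment of the invention;
Figure 9 compares the rate-distortion performance of original-resolution coding and the method of the invention for the Tempete sequence in one embodiment of the invention.
Detailed Description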
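To make the objectives, technical solutions and advantages of the invention clearer, the invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here merely explain the invention and do not limit it.
Figure 1 shows a flowchart of an image processing method for an adaptive spatial-temporal resolution frame according to a preferred embodiment of the invention. The process is as follows:
In step S100, the encoder receives the original input video image;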
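In step S105, the encoder calculates and obtains the critical bit rate condition value as follows. Let $S_{x,y,k}$ denote the image at time k, t the target bit rate, and $S'_{x',y',k'}$ the image generated after resampling:

$$S'_{x',y',k'} = R(S_{x,y,k})$$

where $R(\cdot)$ denotes the downsampling function. Let $C'_{x',y',k'}$ denote the reconstructed image obtained by encoding and then decoding $S'_{x',y',k'}$, and let $C_{x,y,k}$ denote the image obtained by encoding and then decoding $S_{x,y,k}$. $D_{x,y,k}$ denotes the distortion produced by encoding $S_{x,y,k}$:

$$D_{x,y,k} = S_{x,y,k} - C_{x,y,k}$$

Let $D'_{x,y,k}$ denote the distortion obtained by encoding $S'_{x',y',k'}$ and then restoring the resolution:

$$D'_{x,y,k} = S_{x,y,k} - U(C'_{x',y',k'})$$

where $U(\cdot)$ denotes the upsampling function.
Let $\mathit{Risual}_{x,y,k}$ denote the frequency-domain distribution of the residual obtained after encoding. Its value is given by Equation 4, where H may be the spatial-to-frequency-domain transform matrix specified by any encoder:

$$\mathit{Risual}_{x,y,k} = H \times (C_{x,y,k} - P_{x,y,k}) \times H^{T} \tag{4}$$

[Figure imgf000008_0001: equation for Score]

where Qp is the quantization parameter, N is the number of pixels in the region taking part in the calculation, and the weighting matrix moderately increases the weight of the high-frequency components. Its values are:

$$\begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & 2 & 8 \\ 1 & 2 & 16 & 8 \\ 1 & 8 & 8 & 16 \end{bmatrix}$$

$$H = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 2 & 1 & -1 & -2 \\ 1 & -1 & -1 & 1 \\ 1 & -2 & 2 & -1 \end{bmatrix}, \qquad H^{T} = \begin{bmatrix} 1 & 2 & 1 & 1 \\ 1 & 1 & -1 & -2 \\ 1 & -1 & -1 & 2 \\ 1 & -2 & 1 & -1 \end{bmatrix}$$

Score is the critical bit rate condition value;
In step S110, the critical bit rate condition value Score calculated by the encoder is compared with the threshold L. In the scenario used by the invention, the image resolution is 320×240 at 10 fps, the bit rate is 50 kbps, and the channel is about 60 kbps; L may then be set to 0.85. If the scenario changes, for example the bit rate changes, the threshold L must be re-tuned experimentally. L ranges from 0.1 to 0.95. If Score is less than L, step S115 is performed; otherwise step S135 is performed;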
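In step S115, the encoder uses downsampling to calculate the spatial resolution adapted to the target allocated bit rate, as follows:
First, templates are set for the spatial resolution:
- Vertical reduction, 1/2
- Vertical reduction, 1/3
- Vertical reduction, 1/4
- Horizontal reduction, 1/2
- Horizontal reduction, 1/3
- Horizontal reduction, 1/4
The specific templates can be set according to the actual size of the original resolution. Once the template set has been determined, the best template is sought among them according to the RDO principle. RDO can be implemented with the Lagrangian linear approximation method, according to:

$$J = R(p) + \lambda D(p)$$

Each coding mode $p_i$ in the template set is traversed to compute $J_i$ so that J in the formula is minimized. The mode corresponding to $\min(J_i)$ is R-D optimal, and its resolution is the optimal resolution;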
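The template set of step S115 can be turned into concrete candidate resolutions with a few lines of Python. This is only a sketch under stated assumptions: a template "1/n" is read here as scaling that axis to 1/n of its original size, and the result is rounded down to an even number of pixels. Neither convention is specified in the text, and the function name is hypothetical.

```python
from fractions import Fraction

# Reduction factors per axis, as listed in the template set.
VERTICAL = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 4)]
HORIZONTAL = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 4)]


def candidate_resolutions(width, height):
    """Return the (width, height) pairs produced by applying one template each."""

    def scale(size, factor):
        scaled = int(size * factor)           # truncate toward zero
        return max(2, scaled - scaled % 2)    # assumed: keep dimensions even

    candidates = [(width, scale(height, f)) for f in VERTICAL]
    candidates += [(scale(width, f), height) for f in HORIZONTAL]
    return candidates
```

For the 320×240 scenario mentioned in the text, this yields 320×120, 320×80, 320×60, 160×240, 106×240 and 80×240 as candidates, each of which would then be evaluated with the cost J.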
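In step S120, the encoder uses downsampling to calculate the temporal resolution adapted to the target allocated bit rate, as follows:
Variable temporal resolution is achieved mainly through temporal layering (or temporal scalability). In a bitstream that provides temporal scalability, the pictures can be partitioned into one base layer and one or more enhancement layers with the following properties. The temporal layers are identified by temporal levels: the base layer has temporal level 0, and the other temporal layers are numbered in increasing order, with the temporal resolution increasing accordingly. For a natural number k, a lower temporal resolution identified by k is obtained by removing from the bitstream all temporal (enhancement) layers whose temporal level is greater than k. In hybrid video codecs, a necessary condition for temporal scalability is generally that the motion-compensated prediction reference frames of the frame being predicted are restricted to temporal layers lower than or equal to that of the current frame. With the hierarchical B picture tool of the encoder, N temporally scalable dyadic temporal enhancement layers can be obtained.
The temporal base layer $T_0$ is coded independently of all other frames and starts with an IDR access unit; each of its frames is either intra coded or uses earlier base-layer pictures as reference frames. The coding order and the display order of the base layer are the same. A temporal enhancement-layer picture of temporal level $T_x$ always lies between two consecutive pictures of temporal level lower than x. Temporal enhancement layers are coded as B frames, and their reference lists List 0 and List 1 are restricted to the two preceding and following pictures of temporal level lower than x. Each set of temporal layers $\{T_0, \ldots, T_x\}$ can be decoded independently of all enhancement layers of temporal level Y > X;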
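To make the layering concrete, the sketch below assigns temporal levels in a dyadic hierarchical structure and drops every layer above a chosen level k. It assumes a GOP size that is a power of two and frames indexed in display order, with frames at multiples of the GOP size forming the base layer. These conventions and the function names are illustrative and are not taken from the text.

```python
def temporal_level(frame_index, gop_size):
    """Temporal level of a frame in a dyadic hierarchical structure.

    Frames at multiples of gop_size are level 0 (base layer); the frame
    midway between two base-layer frames is level 1, and so on.
    """
    if gop_size < 1 or gop_size & (gop_size - 1):
        raise ValueError("gop_size must be a power of two")
    levels = gop_size.bit_length() - 1
    pos = frame_index % gop_size
    if pos == 0:
        return 0
    trailing_zeros = (pos & -pos).bit_length() - 1
    return levels - trailing_zeros


def drop_layers(frame_indices, gop_size, max_level):
    """Keep only the frames whose temporal level is at most max_level."""
    return [i for i in frame_indices if temporal_level(i, gop_size) <= max_level]
```

With a GOP size of 8, keeping levels up to 1 leaves frames 0, 4 and 8 out of the first nine, which halves the frame rate twice relative to the full sequence.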
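The hierarchical prediction structure for temporal scalability described above can be combined with the encoder's multiple-reference-frame method: the reference lists may use more than one reference frame and may contain pictures of the same temporal level as the frame being predicted. The hierarchical prediction structure may also be non-dyadic, and the prediction structure can be modified freely according to actual coding needs;
In the temporal layering structure above, temporal references are clearly restricted, so the rate-distortion performance of the encoder is inevitably affected. The following describes how to mitigate this to some extent.
The coding efficiency of a hierarchical prediction structure depends to a large extent on how the quantization parameters of the different temporal layers are chosen. The base layer must be coded with the highest fidelity, because it serves directly or indirectly as the motion-compensated prediction reference for all other pictures. The quantization parameters of the other temporal layers can increase layer by layer, because the quality of their reconstructed pictures affects fewer pictures. The base-layer quantization parameter can be selected by conventional rate-distortion analysis. For the QP of the enhancement layers, however, the following method can be used to avoid complex computation; experiments show that it is fairly robust for video signals with different characteristics.
Assuming the quantization parameter of the base layer is $QP_0$, the quantization parameter of an enhancement layer of temporal level k > 0 can be chosen as $QP_k = QP_0 + 3 + k$.
Although this method causes large PSNR fluctuations within a GOP, experiments show that the reconstructed result is still fairly smooth.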
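The enhancement-layer rule QP_k = QP_0 + 3 + k is simple enough to show directly. In the sketch below, the clamp to a maximum QP of 51 is an added assumption (it is the upper limit in H.264, not something stated in the text), and the function name is hypothetical.

```python
def layer_qp(base_qp, level, max_qp=51):
    """Quantization parameter for a temporal layer.

    Level 0 (base layer) keeps base_qp; level k > 0 uses base_qp + 3 + k.
    The result is clamped to max_qp, which is an assumption of this sketch.
    """
    if level < 0:
        raise ValueError("level must be non-negative")
    if level == 0:
        return base_qp
    return min(base_qp + 3 + level, max_qp)
```

For a base-layer QP of 28, the three enhancement layers of a GOP of 8 would use QPs of 32, 33 and 34.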
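In step S125, the decoder restores the original spatial resolution of the video image by upsampling. Upsampling here means spatial resampling, that is, using the bicubic convolution interpolation algorithm to restore the original spatial resolution of the video image; the bicubic convolution interpolation algorithm comprises a one-dimensional interpolation algorithm and a two-dimensional interpolation algorithm;
In step S130, the decoder smooths the processed image, using both time-window smoothing and smoothing based on scene-change detection; the method ends when smoothing is complete;
In step S135, which applies when the critical bit rate condition value Score is not less than the threshold L, the encoder performs conventional encoding on the original input video image;
In step S140, corresponding to step S135, the decoder performs conventional decoding on the received encoded video image.
Figure 2 shows a schematic diagram of the time window used for smoothing images in one embodiment of the invention. Within the time window, the reduction in spatial resolution is applied as a smooth transition so that the viewer's vision adapts gradually. Figure 2 shows consecutive time windows: the overall statistics of a window can be gathered first, and the whole window is then re-encoded.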
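Figure 3 shows a flowchart of the scene-change detection method used when smoothing images by scene-change detection in one embodiment of the invention. It comprises the following steps:
In step 300, the current macroblock is analyzed and the best intra and inter coding modes are selected;
In step 302, IntraCostInMb and InterCostInMb are calculated using the Lagrangian rate-distortion model. IntraCostInMb and InterCostInMb are the cost functions of intra coding and inter coding respectively, calculated as follows:
IntraCostInMb = SATD + λ × Rbit(Intra)
InterCostInMb = SAD + λ × Rbit(Inter)
where λ is the Lagrange factor, SATD is the sum of the absolute values of the Hadamard-transformed prediction residual of a 4×4 block, SAD is the sum of absolute differences, and Rbit is the number of output bits produced by encoding with the corresponding coding parameters;
In step 304, the intra-coding cost function value (IntraCostInMb) and the inter-coding cost function value (InterCostInMb) of every macroblock in the current image are accumulated to obtain the total intra-coding cost function value (IntraTotalCost) and the total inter-coding cost function value (InterTotalCost):
IntraTotalCost = Σ IntraCostInMb
InterTotalCost = Σ InterCostInMb
In step 306, it is determined whether the current frame is finished. If not, step 300 is performed again; if the current frame is finished, step 308 is performed;
In step 308, once the whole frame has been analyzed, the total intra-coding cost function value (IntraTotalCost) and the total inter-coding cost function value (InterTotalCost) are used to decide whether the current image is a scene change. If InterTotalCost > IntraTotalCost × Fbias, the image is judged to be a scene change (310); otherwise it is judged not to be a scene change (312). Fbias is the sensitivity coefficient for scene-change detection and ranges from 0.1 to 0.9. Too small a value causes false detections, and too large a value makes the detector slow to react to scene changes. For example, when transmitting QVGA (320×240) at 10 fps over a 50 kbps channel, a value of 0.25 is recommended; other scenarios can be adjusted as needed.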
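Steps 304 to 308 amount to summing the per-macroblock costs and comparing the totals. A minimal Python sketch follows, assuming the per-macroblock intra and inter costs have already been computed elsewhere; the function name and the tuple layout are hypothetical, and the default of 0.25 is the value the text recommends for QVGA at 10 fps over a 50 kbps channel.

```python
def is_scene_change(mb_costs, f_bias=0.25):
    """Decide whether a frame is a scene change.

    mb_costs: iterable of (intra_cost, inter_cost) pairs, one per macroblock.
    Returns True when InterTotalCost > IntraTotalCost * f_bias.
    """
    intra_total = 0.0
    inter_total = 0.0
    for intra_cost, inter_cost in mb_costs:
        intra_total += intra_cost
        inter_total += inter_cost
    return inter_total > intra_total * f_bias
```

When inter prediction is cheap relative to intra coding, as within a continuous shot, the inter total stays below the scaled intra total and no scene change is reported.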
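Figure 4 shows a schematic diagram of the reference grid points required for one-dimensional and two-dimensional interpolation in spatial resampling in one embodiment of the invention. The spatial resampling of the invention uses bicubic convolution interpolation. The gray value of the interpolated point is the weighted average of the gray values of the 16 nearest grid points of the original image. The whole interpolation process consists of a one-dimensional interpolation in the horizontal direction and a one-dimensional interpolation in the vertical direction. Each one-dimensional interpolation needs 4 reference grid points, two on each side, so the two-dimensional interpolation needs 4×4 = 16 reference grid points. The kernel of the one-dimensional cubic convolution interpolation is:

$$W(x) = \begin{cases} (a+2)|x|^{3} - (a+3)|x|^{2} + 1, & 0 \le |x| < 1 \\ a|x|^{3} - 5a|x|^{2} + 8a|x| - 4a, & 1 \le |x| < 2 \\ 0, & \text{otherwise} \end{cases}$$

where x is the distance between the point to be interpolated and the reference grid point, and the parameter a = −0.5. This kernel is the closest-fitting approximation of the function $H(x) = \dfrac{\sin(x \cdot \pi)}{x \cdot \pi}$, whose graph is shown in Figure 5. The one-dimensional cubic convolution interpolation formula is:

$$P'(x) = \sum_{i=0}^{3} W(x - x_i)\, P(x_i)$$

where $P'(x)$ is the point to be interpolated and $P(x_i)$ are the reference grid points. If a reference grid point falls outside the image, the nearest pixel on the image edge is used instead. The two-dimensional interpolation is a combination of separable one-dimensional interpolations in the two directions:

$$P'(x, y) = \sum_{j=0}^{3} W(y - y_j) \cdot \left( \sum_{i=0}^{3} W(x - x_i)\, P(x_i, y_j) \right)$$

Figure 5 shows the relationship between the cubic convolution interpolation function used in spatial resampling and distance in one embodiment of the invention. The function plotted is $H(x) = \dfrac{\sin(x \cdot \pi)}{x \cdot \pi}$.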
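The bicubic resampling described for Figure 4 can be sketched in Python as follows. The kernel uses the standard cubic convolution form with a = −0.5 and is zero for distances of 2 or more; reference points outside the image are clamped to the nearest edge pixel. Pixel centres on an integer grid, the `image[row][col]` layout, and the function names are assumptions of this sketch.

```python
import math


def cubic_kernel(x, a=-0.5):
    """Cubic convolution kernel W(x) with parameter a."""
    x = abs(x)
    if x < 1:
        return (a + 2) * x**3 - (a + 3) * x**2 + 1
    if x < 2:
        return a * x**3 - 5 * a * x**2 + 8 * a * x - 4 * a
    return 0.0


def sinc(x):
    """H(x) = sin(pi x) / (pi x), the function the kernel approximates."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)


def interpolate_2d(image, x, y):
    """Interpolate image[row][col] at column x and row y from 4x4 neighbours."""
    height, width = len(image), len(image[0])
    x0, y0 = math.floor(x), math.floor(y)
    total = 0.0
    for j in range(y0 - 1, y0 + 3):
        row = image[min(max(j, 0), height - 1)]      # clamp rows to the edge
        row_value = 0.0
        for i in range(x0 - 1, x0 + 3):
            # Horizontal 1D pass; columns are clamped to the edge as well.
            row_value += cubic_kernel(x - i) * row[min(max(i, 0), width - 1)]
        total += cubic_kernel(y - j) * row_value      # vertical 1D pass
    return total
```

At integer coordinates the kernel weights collapse to 1 for the sample itself and 0 for its neighbours, so the original pixel value is returned unchanged.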
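Figure 6 compares the rate-distortion performance of original-resolution coding and the method of the invention for the Akiyo sequence in one embodiment of the invention.
Figure 7 compares the rate-distortion performance of original-resolution coding and the method of the invention for the Foreman sequence in one embodiment of the invention.
Figure 8 compares the rate-distortion performance of original-resolution coding and the method of the invention for the Mobile sequence in one embodiment of the invention.
Figure 9 compares the rate-distortion performance of original-resolution coding and the method of the invention for the Tempete sequence in one embodiment of the invention.
Figures 6, 7, 8 and 9 show the rate-distortion performance of Akiyo, Foreman, Mobile and Tempete, respectively, at the original resolution and with the method of the invention. Note that at the original resolution the PSNR is computed from the difference between the decoded image and the original image, whereas for the method of the invention the PSNR is computed from the difference between the decoded image and the resolution-adjusted image. The figures show that the two curves begin to diverge at a certain point, which is the critical bit rate. Beyond that point, the distortion between the output of the method and the resampled image is far smaller than the distortion between the original-resolution coded image and the original input image.
The above are merely preferred embodiments of the invention and do not limit it. Any modification, equivalent replacement or improvement made within the spirit and principles of the invention shall fall within the scope of protection of the invention.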
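The PSNR comparison described for Figures 6 to 9 relies on the usual definition PSNR = 10·log10(peak² / MSE). A minimal Python sketch is given here, assuming 8-bit samples (peak value 255) and images stored as lists of rows; the function name is hypothetical. For original-resolution coding the reference would be the original image, and for the method of the invention it would be the resolution-adjusted image.

```python
import math


def psnr(reference, decoded, peak=255):
    """PSNR in dB between two equally sized images given as lists of rows."""
    squared_error = 0
    count = 0
    for ref_row, dec_row in zip(reference, decoded):
        for r, d in zip(ref_row, dec_row):
            squared_error += (r - d) ** 2
            count += 1
    if squared_error == 0:
        return math.inf          # identical images
    mse = squared_error / count
    return 10 * math.log10(peak ** 2 / mse)
```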

Claims

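1. An image processing method for an adaptive spatial-temporal resolution frame, characterized by comprising the following steps:
A. The encoder calculates and obtains a critical bit rate condition value according to the input video image and the change in the target allocated bit rate;
B. The encoder compares the critical bit rate condition value with a threshold. If the condition value is less than the threshold, that is, the target allocated bit rate is lower than the critical bit rate, an image resolution adapted to the target allocated bit rate is obtained by downsampling; otherwise conventional encoding is performed;
C. The decoder restores the original resolution of the received image by upsampling and performs smoothing.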
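2. The image processing method for an adaptive spatial-temporal resolution frame according to claim 1, characterized in that the formula used in step A to calculate and obtain the critical bit rate condition value is:

$$\mathit{Risual}_{x,y,k} = H \times (C_{x,y,k} - P_{x,y,k}) \times H^{T}$$

[Figure imgf000015_0001: equation for Score]

where $C_{x,y,k}$ denotes the reconstructed image obtained by encoding and then decoding $S_{x,y,k}$; $S_{x,y,k}$ denotes the image at time k; $P_{x,y,k}$ denotes the predicted image; H is the spatial-to-frequency-domain transform matrix specified by any encoder; and $\mathit{Risual}_{x,y,k}$ denotes the frequency-domain distribution of the residual obtained after encoding.
Qp is the quantization parameter, N is the number of pixels in the region taking part in the calculation, and the weighting matrix moderately increases the weight of the high-frequency components; its values are as follows:

[Figure imgf000015_0002: weighting matrix]

$$H = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 2 & 1 & -1 & -2 \\ 1 & -1 & -1 & 1 \\ 1 & -2 & 2 & -1 \end{bmatrix}, \qquad H^{T} = \begin{bmatrix} 1 & 2 & 1 & 1 \\ 1 & 1 & -1 & -2 \\ 1 & -1 & -1 & 2 \\ 1 & -2 & 1 & -1 \end{bmatrix}$$

Score is the critical bit rate condition value.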
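3. The image processing method for an adaptive spatial-temporal resolution frame according to claim 1, characterized in that the threshold in step B varies with the scenario within the range 0.1 to 0.95.
4. The image processing method for an adaptive spatial-temporal resolution frame according to claim 1, characterized in that step B further comprises:
B1. Obtaining, by downsampling, an image spatial resolution adapted to the target allocated bit rate;
B2. Obtaining, by downsampling, an image temporal resolution adapted to the target allocated bit rate.
5. The image processing method for an adaptive spatial-temporal resolution frame according to claim 4, characterized in that step B1 further comprises:
B11. Setting a set of image spatial resolution templates;
B12. Obtaining, by a rate-distortion optimization method, the best template in the template set as the image spatial resolution adapted to the target allocated bit rate.
6. The image processing method for an adaptive spatial-temporal resolution frame according to claim 5, characterized in that the rate-distortion optimization method in step B12 implements RDO with the Lagrangian linear approximation method, using the formula:

$$J = R(p) + \lambda D(p)$$

where each coding mode $p_i$ in the template set is traversed to compute $J_i$ so that J in the formula is minimized. The mode corresponding to $\min(J_i)$ is R-D optimal, and its image resolution is the image spatial resolution.
7. The image processing method for an adaptive spatial-temporal resolution frame according to claim 4, characterized in that the downsampling in step B2 reduces the temporal resolution by dropping frames.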
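8. The image processing method for an adaptive spatial-temporal resolution frame according to claim 1, characterized in that step C further comprises:
C1. Restoring the original spatial resolution of the received image by spatial resampling;
C2. Smoothing the image using time-window smoothing and scene-change detection.
9. The image processing method for an adaptive spatial-temporal resolution frame according to claim 8, characterized in that the spatial resampling in step C1 uses the bicubic convolution interpolation algorithm, which comprises a one-dimensional cubic convolution interpolation algorithm and a two-dimensional cubic convolution interpolation algorithm, where the one-dimensional cubic convolution interpolation algorithm is:

$$P'(x) = \sum_{i=0}^{3} W(x - x_i)\, P(x_i)$$

where $P'(x)$ is the point to be interpolated and $P(x_i)$ are the reference grid points, and

$$W(x) = \begin{cases} (a+2)|x|^{3} - (a+3)|x|^{2} + 1, & 0 \le |x| < 1 \\ a|x|^{3} - 5a|x|^{2} + 8a|x| - 4a, & 1 \le |x| < 2 \\ 0, & \text{otherwise} \end{cases} \qquad a = -0.5$$

where $W(x)$ is the closest-fitting approximation of the function $H(x) = \dfrac{\sin(x \cdot \pi)}{x \cdot \pi}$;
the two-dimensional cubic convolution interpolation algorithm is:

$$P'(x, y) = \sum_{j=0}^{3} W(y - y_j) \cdot \left( \sum_{i=0}^{3} W(x - x_i)\, P(x_i, y_j) \right)$$

where $P'(x, y)$ is the point to be interpolated and $P(x_i, y_j)$ are the reference grid points.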
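10. The image processing method for an adaptive spatial-temporal resolution frame according to claim 8, characterized in that the scene-change detection in step C2 comprises:
C21. Analyzing the current macroblock, selecting the best intra and inter coding modes, and using the Lagrangian rate-distortion model to calculate the intra-coding cost function value and the inter-coding cost function value as follows:
IntraCostInMb = SATD + λ × Rbit(Intra)
InterCostInMb = SAD + λ × Rbit(Inter)
where IntraCostInMb is the intra-coding cost function value, InterCostInMb is the inter-coding cost function value, λ is the Lagrange factor, SATD is the sum of the absolute values of the Hadamard-transformed prediction residual of a 4×4 block, SAD is the sum of absolute differences, and Rbit is the number of output bits produced by encoding with the corresponding coding parameters;
C22. Accumulating the intra-coding and inter-coding cost function values of every macroblock in the current image. Once the whole frame has been analyzed, this yields the total intra-coding cost function value and the total inter-coding cost function value. If the total inter-coding cost is greater than the product of the total intra-coding cost and a sensitivity coefficient, a scene change has occurred; otherwise no scene change has occurred. The sensitivity coefficient ranges from 0.1 to 0.9.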
PCT/CN2009/073590 2009-01-09 2009-08-28 Method for processing the temporal-domain and spatial-domain resolution of images based on bit rate control WO2010078759A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
US12/746,175 US8311097B2 (en) 2009-01-09 2009-08-28 Image processing method for adaptive spatial-temporal resolution frame
EP09833910.4A EP2234401A4 (en) 2009-01-09 2009-08-28 BASED ON CODER DATA CONTROL PROCESS FOR PROCESSING TEMPORARY AND SPATIAL IMMEDIATE RESOLUTIONS

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN2009101048685A CN101778275B (zh) 2009-01-09 2009-01-09 Image processing method for adaptive spatial-temporal resolution frame
CN200910104868.5 2009-01-09

Publications (1)

Publication Number Publication Date
WO2010078759A1 true WO2010078759A1 (zh) 2010-07-15

Family

ID=42316223

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2009/073590 WO2010078759A1 (zh) 2009-01-09 2009-08-28 Method for processing the temporal-domain and spatial-domain resolution of images based on bit rate control

Country Status (4)

Country Link
US (1) US8311097B2 (zh)
EP (1) EP2234401A4 (zh)
CN (1) CN101778275B (zh)
WO (1) WO2010078759A1 (zh)

Also Published As

Publication number Publication date
US8311097B2 (en) 2012-11-13
US20110058605A1 (en) 2011-03-10
CN101778275B (zh) 2012-05-02
CN101778275A (zh) 2010-07-14
EP2234401A1 (en) 2010-09-29
EP2234401A4 (en) 2015-02-25

Legal Events

Date Code Title Description
WWE Wipo information: entry into national phase

Ref document number: 12746175

Country of ref document: US

WWE Wipo information: entry into national phase

Ref document number: 2009833910

Country of ref document: EP

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 09833910

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE