CN101883284B - Video encoding/decoding method and system based on background modeling and optional differential mode - Google Patents
Video encoding/decoding method and system based on background modeling and optional differential mode
- Publication number
- CN101883284B, CN201010203823A, CN 201010203823
- Authority
- CN
- China
- Prior art keywords
- data
- background image
- video
- decoding
- encoded
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Links
- 238000000034 method Methods 0.000 title claims abstract description 87
- 239000013598 vector Substances 0.000 claims abstract description 59
- 238000012549 training Methods 0.000 claims description 28
- 238000012545 processing Methods 0.000 abstract description 5
- 230000009286 beneficial effect Effects 0.000 abstract description 4
- 230000006835 compression Effects 0.000 description 16
- 238000007906 compression Methods 0.000 description 16
- 238000005516 engineering process Methods 0.000 description 10
- 238000004364 calculation method Methods 0.000 description 9
- 230000011218 segmentation Effects 0.000 description 7
- 238000001514 detection method Methods 0.000 description 4
- 238000010586 diagram Methods 0.000 description 3
- 230000005540 biological transmission Effects 0.000 description 2
- 238000011160 research Methods 0.000 description 2
- 230000009466 transformation Effects 0.000 description 2
- 238000013461 design Methods 0.000 description 1
- 238000011161 development Methods 0.000 description 1
- 230000018109 developmental process Effects 0.000 description 1
- 238000012423 maintenance Methods 0.000 description 1
- 238000011056 performance test Methods 0.000 description 1
- 238000010187 selection method Methods 0.000 description 1
- 230000003068 static effect Effects 0.000 description 1
- 230000002123 temporal effect Effects 0.000 description 1
- 238000012360 testing method Methods 0.000 description 1
- 238000012546 transfer Methods 0.000 description 1
- 238000010977 unit operation Methods 0.000 description 1
Images
Landscapes
- Compression Or Coding Systems Of Tv Signals (AREA)
Abstract
Description
Technical Field
The present invention relates to video compression technology in the field of digital media processing, and more particularly to a video encoding/decoding method and system based on background modeling and an optional differential mode.
Background Art
Video compression (also known as video coding) is one of the key technologies in applications such as digital media storage and transmission; its purpose is to reduce the amount of data to be stored or transmitted by eliminating redundant information. All current mainstream video compression standards adopt a block-based hybrid prediction/transform coding framework, which removes statistical redundancy in video images (including spatial redundancy, temporal redundancy and information-entropy redundancy) through prediction, transform, entropy coding and related methods, so as to reduce the amount of data. Because the scene in video data stays unchanged over a certain period of time, background modeling technology has, with its recent development and progress, been applied more and more in video coding; used properly, a background generated by modeling can further remove information redundancy from the video and thereby achieve better compression performance.
Regarding the use of a modeled background, current methods fall into two main categories. The first abandons the block-based hybrid prediction/transform coding framework and adopts object-based video compression: a modeling algorithm generates the background, and techniques such as object detection, object tracking and foreground/background segmentation separate the individual objects in the video; by compressing different objects in different ways, more of the information redundancy in the video can be exploited and the compression efficiency improved. Object-based video compression is therefore an important research direction for the video compression problem, but it suffers from two problems. First, object detection and segmentation in video remain unsolved problems in computer vision and image processing; the correctness and accuracy of existing detection and segmentation methods are still unsatisfactory, which makes them a bottleneck for object-based video compression. Second, such detection and segmentation methods are computationally expensive, which hinders encoder implementation.
Consequently, the other category, namely coding methods that remain within the block-based hybrid prediction/transform coding framework, has become another research direction attracting much attention.
Summary of the Invention
The present invention provides a video encoding/decoding method and system based on background modeling and an optional differential mode. The present invention can better improve coding performance.
In one aspect, the present invention discloses a video encoding method based on background modeling and an optional differential mode, the method comprising the following steps: a background modeling step, in which a background image is generated by modeling the input video sequence and a reconstructed background image is obtained after encoding; a global motion estimation step, in which global motion estimation with pixel or sub-pixel precision is performed on each input image to obtain a global motion vector; and a mode selection step, in which, based on the reconstructed background image and the global motion vector, each video block is encoded by selectively using an original mode or a differential mode.
In the above video encoding method, preferably, in the background modeling step, generating the background image by modeling the input video sequence comprises: for each pixel position, collecting the set of pixel values of that position in the training set and traversing it; for each pixel value, judging the difference between the current pixel value and the next adjacent pixel value in the set against a dynamic threshold generated at the current moment, and if the absolute value of the difference is greater than the threshold, ending the current data segment and starting the next one; thereby dividing the whole set of pixel values of the current pixel position into several data segments; assigning to each segment a weight equal to the size of that segment's data set; and computing the background pixel value of the pixel position based on the weights.
In the above video encoding method, preferably, the background modeling step further comprises a step of periodically reselecting the training set to update the background image.
In the above video encoding method, preferably, in the background modeling step, when the background image is encoded to obtain the reconstructed background image, the encoding method comprises: encoding the modeled background image with a lossy or lossless image or video encoding method; or treating all background images as one sequence and encoding them with a video encoding method, the encoding method being MPEG-1/2/4, H.263, H.264/AVC, VC1, AVS, JPEG, JPEG2000 or MJPEG.
In the above video encoding method, preferably, in the global motion estimation step, the global motion estimation comprises: taking data blocks as the basic unit, performing a global integer-pixel or sub-pixel motion search for the current image with the reconstructed background image as the reference image, and taking the median, the largest cluster or the average of the resulting set of motion vectors as the global motion vector of the current image.
In the above video encoding method, preferably, in the mode selection step, the selection between the original mode and the differential mode for encoding each video block is made by comparing the rate-distortion results of the original mode and the differential mode.
In the above video encoding method, preferably, the original mode encoding is: according to the global motion vector, finding the data in the background image that corresponds to the prediction reference data of the data to be encoded; if the prediction reference data was encoded in the differential mode, taking the decoded superposition of the two as the reference, otherwise taking the decoded original value of the prediction reference data directly as the reference, and encoding the data to be encoded directly.
In the above video encoding method, preferably, the differential mode encoding is: according to the global motion vector, matching the data in the background image that corresponds to the prediction reference data of the data to be encoded; if the prediction reference data was encoded in the original mode, taking the decoded difference of the two as the reference, otherwise taking the decoded original value of the prediction reference pixels directly as the reference, and encoding the difference between the data to be encoded and the corresponding data in the background image.
In the above video encoding method, preferably, periodically reselecting the training set to update the background image specifically means updating the background image per video segment, where a video segment is a portion of the input video sequence encoded with the same reconstructed background image and the entire input video sequence can be regarded as consisting of several video segments joined end to end; during encoding, a training image set is selected from the current video segment for background modeling, and a background image is generated for use in encoding the next video segment, so that the current video segment is encoded with the background image generated while the previous video segment was being encoded.
In another aspect, the present invention also discloses a video encoding system based on background modeling and an optional differential mode, comprising: a background modeling module for generating a background image by modeling the input video sequence, a reconstructed background image being obtained after encoding; a global motion estimation module for performing global motion estimation with pixel or sub-pixel precision on each input image to obtain a global motion vector; and a mode selection module for encoding each video block by selectively using an original mode or a differential mode based on the reconstructed background image and the global motion vector.
In the above video encoding system, preferably, the background modeling module comprises a submodule for generating the background image by modeling the input video sequence, which: for each pixel position, collects the set of pixel values of that position in the training set and traverses it; for each pixel value, judges the difference between the current pixel value and the next adjacent pixel value in the set against a dynamic threshold generated at the current moment, and if the absolute value of the difference is greater than the threshold, ends the current data segment and starts the next one; thereby divides the whole set of pixel values of the current pixel position into several data segments; assigns to each segment a weight equal to the size of that segment's data set; and computes the background pixel value of the pixel position based on the weights.
In the above video encoding system, preferably, the background modeling module further comprises a submodule for periodically reselecting the training set to update the background image.
In the above video encoding system, preferably, in the background modeling module, when the background image is encoded to obtain the reconstructed background image, the encoding method comprises: encoding the modeled background image with a lossy or lossless image or video encoding method; or treating all background images as one sequence and encoding them with a video encoding method, the encoding method being MPEG-1/2/4, H.263, H.264/AVC, VC1, AVS, JPEG, JPEG2000 or MJPEG.
In the above video encoding system, preferably, the global motion estimation module further comprises a submodule for taking data blocks as the basic unit, performing a global integer-pixel or sub-pixel motion search for the current image with the reconstructed background image as the reference image, and taking the median, the largest cluster or the average of the resulting set of motion vectors as the global motion vector of the current image.
In the above video encoding system, preferably, in the mode selection module, the selection between the original mode and the differential mode for encoding each video block is made by comparing the rate-distortion results of the original mode and the differential mode.
In the above video encoding system, preferably, the original mode encoding is: according to the global motion vector, finding the data in the background image that corresponds to the prediction reference data of the data to be encoded; if the prediction reference data was encoded in the differential mode, taking the decoded superposition of the two as the reference, otherwise taking the decoded original value of the prediction reference data directly as the reference, and encoding the data to be encoded directly.
In the above video encoding system, preferably, the differential mode encoding is: according to the global motion vector, matching the data in the background image that corresponds to the prediction reference data of the data to be encoded; if the prediction reference data was encoded in the original mode, taking the decoded difference of the two as the reference, otherwise taking the decoded original value of the prediction reference pixels directly as the reference, and encoding the difference between the data to be encoded and the corresponding data in the background image.
In the above video encoding system, preferably, periodically reselecting the training set to update the background image specifically means updating the background image per video segment, where a video segment is a portion of the input video sequence encoded with the same reconstructed background image and the entire input video sequence can be regarded as consisting of several video segments joined end to end; during encoding, a training image set is selected from the current video segment for background modeling, and a background image is generated for use in encoding the next video segment, so that the current video segment is encoded with the background image generated while the previous video segment was being encoded.
In another aspect, the present invention also discloses a video decoding method corresponding to the above video encoding method, comprising: decoding the background image and the global motion vector; and decoding each video block in the original mode or the differential mode.
In the above video decoding method, preferably, the original mode decoding comprises: if the data to be decoded was encoded in the original mode, obtaining, according to the global motion vector, the data in the background image corresponding to the prediction reference data; if the prediction reference data was encoded in the differential mode, taking the decoded superposition of the two as the reference, otherwise taking the decoded original value of the prediction reference data directly as the reference, and decoding the data to be decoded directly.
In the above video decoding method, preferably, the differential mode decoding comprises: if the data to be decoded was encoded in the differential mode, obtaining, according to the global motion vector, the data in the background image corresponding to the prediction reference data; if the prediction reference data was encoded in the original mode, taking the decoded difference of the two as the reference, otherwise taking the decoded original value of the prediction reference data directly as the reference, to decode the current data to be decoded; the decoded data is then superposed with the corresponding data in the background image.
In another aspect, the present invention also discloses a video decoding system corresponding to the above video encoding system, comprising: a module for decoding the background image and the global motion vector; and a module for decoding each video block in the original mode or the differential mode.
In the above video decoding system, preferably, the original mode decoding comprises: if the data to be decoded was encoded in the original mode, obtaining, according to the global motion vector, the data in the background image corresponding to the prediction reference data; if the prediction reference data was encoded in the differential mode, taking the decoded superposition of the two as the reference, otherwise taking the decoded original value of the prediction reference data directly as the reference, and decoding the data to be decoded directly.
In the above video decoding system, preferably, the differential mode decoding comprises: if the data to be decoded was encoded in the differential mode, obtaining, according to the global motion vector, the data in the background image corresponding to the prediction reference data; if the prediction reference data was encoded in the original mode, taking the decoded difference of the two as the reference, otherwise taking the decoded original value of the prediction reference data directly as the reference, to decode the current data to be decoded; the decoded data is then superposed with the corresponding data in the background image to obtain the final decoded data.
Compared with the prior art, the present invention has the following features: first, no object or foreground/background segmentation is performed; second, encoding is carried out in units of blocks or macroblocks; third, global motion compensation is added; fourth, mode selection is used to choose the better of the two types of coding modes so as to guarantee coding efficiency. The present invention can improve coding performance, introduces no additional coding delay, and the bitstream itself contains the background image, which facilitates further processing.
Brief Description of the Drawings
Fig. 1 is a flow chart of the steps of an embodiment of the video encoding method based on background modeling and optional differential mode according to the present invention;
Fig. 2 is a block diagram for implementing the video encoding method of the present invention;
Fig. 3 shows the correspondence between data in the current image and the background image under global motion estimation;
Fig. 4 is a schematic diagram of the background modeling process;
Fig. 5 is a schematic diagram of training set selection;
Fig. 6 is an encoding example in which the current data to be encoded will be encoded in the differential mode and the prediction reference has been encoded in the original mode;
Fig. 7 is an encoding example in which the current data to be encoded will be encoded in the differential mode and the prediction reference has been encoded in the differential mode;
Fig. 8 is an encoding example in which the current data to be encoded will be encoded in the original mode and the prediction reference has been encoded in the differential mode;
Fig. 9 is a decoding example in which the current data to be decoded has been encoded in the differential mode and the prediction reference has been encoded in the original mode;
Fig. 10 is a decoding example in which the current data to be decoded has been encoded in the differential mode and the prediction reference has been encoded in the differential mode;
Fig. 11 is a decoding example in which the current data to be decoded has been encoded in the original mode and the prediction reference has been encoded in the differential mode.
Detailed Description of Embodiments
To make the above objects, features and advantages of the present invention clearer and easier to understand, the present invention is described in further detail below with reference to the accompanying drawings and specific embodiments.
In the present invention, the fact that the scene in a video sequence stays fixed for a certain period of time is exploited: for the relatively fixed scene in the video, a background modeling method is adopted and a background image is generated to describe it. During encoding, by establishing and updating a background image that describes the relatively fixed scene in the video, a newly introduced differential-mode encoding method is selectively used to encode each data block, so that redundancy in the video sequence is removed to a greater extent and better compression performance is obtained. Correspondingly, the basic idea of the encoding method is as follows: first, a background image is established and updated to describe the fixed scene information of the video images; then, optionally, global motion estimation is performed to obtain the global motion vector of each image; finally, the optimal coding mode is selected between the original video coding mode, which encodes the original data directly, and the differential video coding mode, which encodes the difference between the original data and the corresponding background data. In this encoding scheme, the background image is obtained from the original input video images through background modeling, and the generated background image must be encoded into the bitstream; if global motion estimation is used, the global motion vector must also be written into the bitstream. The design of the decoder must match the encoder: when the bitstream is decoded, the background image and the optionally encoded global motion vector are decoded directly first.
Referring to Fig. 1, Fig. 1 is a flow chart of the steps of an embodiment of the video encoding method based on background modeling and optional differential mode according to the present invention, comprising the following steps:
Background modeling step S1: a background image is generated by modeling the input video sequence, and a reconstructed background image is obtained after the background image has been encoded and decoded. Global motion estimation step S2: global motion estimation with pixel or sub-pixel precision is performed on each input image to obtain a global motion vector. Mode selection step S3: based on the reconstructed background image and the global motion vector, each video block is encoded by selectively using the original mode or the differential mode.
The above embodiment has the following features: first, no object or foreground/background segmentation is performed; second, encoding is carried out in units of blocks or macroblocks; third, global motion compensation is added; fourth, mode selection is used to choose the better of the two types of coding modes so as to guarantee coding efficiency. The present invention can improve coding performance, introduces no additional coding delay, and the bitstream itself contains the background image, which facilitates further processing. Moreover, video compression based on background modeling is particularly suitable for video surveillance, video conferencing, smart rooms and the like; the video in these application scenarios keeps the same scene for a long time and switches shots extremely rarely, which is favorable for using background modeling to improve compression efficiency.
Referring to Fig. 2, the present invention can be implemented within the encoding/decoding framework shown in Fig. 2. On the encoder side, the framework comprises seven functional units (background modeling, optional global motion estimation, global motion compensation, background image encoding, background image decoding, differential mode encoding and original mode encoding), which respectively carry out the background modeling algorithm, the optional global motion estimation algorithm, the data compensation operation against the background image, the background image encoding algorithm, the background image decoding algorithm, the data encoding method in the differential mode, and the data encoding method in the original mode. The corresponding decoder consists of a differential mode decoding unit, an original mode decoding unit, a background image decoding unit and a difference-and-background superposition unit, which respectively realize decoding of data encoded in the differential mode, decoding of data encoded in the original mode, decoding of the background image, and superposition of the differential decoding result with the corresponding data in the background image.
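The per-block choice between the two encoding units is made by comparing their rate-distortion results, as stated in the mode selection step above. A minimal sketch of such a decision is given below (in Python, used here purely for illustration); the Lagrangian cost J = D + lambda*R, the callables encode_original/encode_differential and the parameter lam are assumptions introduced for the example, since the patent only states that the rate-distortion results of the two modes are compared.

```python
def choose_mode(encode_original, encode_differential, block, lam):
    """Pick the coding mode for one block by comparing Lagrangian RD costs.

    encode_original / encode_differential are assumed callables returning
    (distortion, bits, payload) for the block; J = D + lam * R is an assumed
    cost function, not one mandated by the patent.
    """
    d_o, r_o, payload_o = encode_original(block)
    d_d, r_d, payload_d = encode_differential(block)
    if d_o + lam * r_o <= d_d + lam * r_d:
        return "original", payload_o
    return "differential", payload_d
```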
As shown in Fig. 2, on the encoder side this embodiment comprises: a background image modeling operation that receives the input video sequence; a background image encoding operation that receives the background image output by the background image modeling operation; a background image decoding operation connected to the output of the background image encoding operation; an optional global motion estimation operation that receives the input video sequence and the output of the background image decoding module; a global motion compensation operation that receives the optional global motion vector, the input sequence and the decoded background image; a differential mode encoding operation that receives the input video sequence, the global motion vector output by the global motion estimation module and the differential data output by the global motion compensation operation; and an original mode encoding operation that receives the input video sequence and the optional global motion vector output by the global motion estimation module; finally, mode selection chooses either the coding information and coded bitstream output by the differential mode encoding operation or the coding information and coded bitstream output by the original mode encoding operation. On the decoder side, it comprises: a background image decoding operation connected to the coded bitstream from the encoder; a differential mode decoding operation that selectively receives the input bitstream information according to the mode selection flag; an original mode decoding operation that selectively receives the input bitstream information according to the mode selection flag; and a difference-and-background superposition operation that receives the output of differential block decoding and the output of background image decoding. The background-model-based video encoding/decoding method of Fig. 2 and the function and implementation of each operation involved are described in detail below:
1. Background image modeling
For the input video sequence, a training image set is selected, and a background image is generated by modeling and passed to the video encoding module for compression. Before the first background image has been generated, the output of this module is the first frame. Taking the luminance component as an example, the modeling method used includes, but is not limited to, the background modeling algorithm shown in Fig. 3, which comprises the following steps:
Step S11: initialize the average value and the threshold for the current pixel position;
Step S12: create a data segment and initialize its weight and average value;
Step S13: read a new pixel from the training set;
Step S14: judge whether the squared difference between temporally adjacent pixels is greater than the threshold; if so, go to step S15a, otherwise go to step S15b;
Step S15a: update the average value of the current data segment;
Step S16a: update the threshold under the current weight;
Step S17a: increase the weight of the current data segment by 1;
Step S15b: update the total average value and the threshold under the weight of the current data segment;
Step S16b: create a new data segment and initialize its weight and average value;
Step S18: after step S17a or step S16b has been performed, judge whether the training set has been exhausted; if so, go to step S19; if not, go to step S13;
Step S19: update the total average value under the weight of the current data segment, and take the total average value as the background pixel value.
In Fig. 3, the total average value and the average value and weight of each newly created data segment are all initialized to 0. The threshold is initialized to the pixel-wise average of the squared difference between the first two frames of the training set. Both the total average and the threshold are updated based on the weights. Let the weight of data segment i be Wi, the sum of its pixel values be Sumi, and the sum of squared differences between all adjacent pixels within data segment i be Ti; the total average AVG and the threshold Th are then computed as given in formulas (1) and (2).
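The per-pixel procedure of steps S11 to S19 can be put into a short sketch. Formulas (1) and (2) are referenced above but not reproduced in this text, so the concrete update rules used below for AVG and Th (plain weight-based averages over the data segments) are placeholders chosen for illustration rather than the patent's definitions; the segmentation rule itself follows the background modeling step described earlier, in which a new data segment starts whenever the difference between adjacent samples exceeds the current dynamic threshold.

```python
def background_pixel(values):
    """Sketch of the segment-based background estimate for one pixel position.

    values: pixel values of this position across the training images, in
    temporal order. The AVG/Th update rules are assumed weight-based averages
    standing in for formulas (1) and (2), which are not reproduced here.
    """
    if len(values) < 2:
        return values[0] if values else 0

    base_th = (values[1] - values[0]) ** 2   # threshold initialized from the first two frames
    th = base_th
    segments = []                            # closed segments, each as [W_i, Sum_i, T_i]
    seg = [1, values[0], 0]                  # open the first data segment with the first sample

    for prev, cur in zip(values, values[1:]):
        d2 = (cur - prev) ** 2
        if d2 > th:                          # segment boundary: close the current segment
            segments.append(seg)
            seg = [1, cur, 0]
        else:                                # the sample extends the current segment
            seg[0] += 1
            seg[1] += cur
            seg[2] += d2
        # assumed weight-based threshold update (stand-in for formula (2))
        total_w = sum(s[0] for s in segments) + seg[0]
        total_t = sum(s[2] for s in segments) + seg[2]
        th = max(base_th, total_t / total_w)

    segments.append(seg)
    total_w = sum(w for w, _, _ in segments)
    # assumed weight-based background value (stand-in for formula (1))
    return sum(s for _, s, _ in segments) / total_w
```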
Corresponding to the sequence structure required for background image updating described above, background modeling is performed anew each time a training image set is input, generating a background image and thereby completing the background image update. The modeling process for the chrominance components is the same.
2. Background image encoding operation
The background image generated by the background modeling module is encoded and compressed, and the encoding result is written into the bitstream and passed to the background image reconstruction module. The encoder used includes, but is not limited to, MPEG-1/2/4, H.263, H.264/AVC, VC1, AVS, JPEG, JPEG2000 and MJPEG. The encoder configuration includes, but is not limited to, intra-coding each background image independently, treating all background images as a background image sequence coded with an IPPP structure, and losslessly compressing each background image. In the specific embodiment, AVS coding extended to a 9-bit input bit depth is used to encode the background image with QP = 0.
3. Background image decoding operation
The background image bitstream output by the background image encoding module is decoded and reconstructed; to keep the encoder and decoder matched, the reconstructed background image is passed to the reconstructed-background compensation operation. In the specific embodiment, AVS decoding extended to a 9-bit bitstream is used to decode the background image, and the decoded output is also 9 bits.
4. Optional global motion estimation operation
Global motion estimation is applied to the reconstructed background image output by the background image decoding operation and the current input video image to obtain the global motion vector. This includes, but is not limited to: taking data blocks as the basic unit, performing a global integer-pixel or sub-pixel motion search for the current image with the reconstructed background image as the reference image, and taking the median, the largest cluster or the average of the motion vectors as the global motion vector of the current image. The resulting global motion vector must be written into the bitstream. If the global motion vector is zero, a corresponding flag bit can be written into the bitstream instead of the global motion vector.
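A minimal sketch of the block-based derivation just described is given below: an integer-pixel SAD search of each block against the reconstructed background, followed by a component-wise median over the per-block motion vectors. The 16x16 block size, the +/-8 search range and the SAD cost are illustrative assumptions; the patent allows integer- or sub-pixel search and the median, largest cluster or average of the vector set.

```python
import numpy as np

def global_motion_vector(cur, bg, block=16, search=8):
    """Integer-pixel global MV: median of per-block MVs against the background.

    cur and bg are 2-D luma arrays of the same size; block size, search range
    and the SAD cost are illustrative choices.
    """
    h, w = cur.shape
    mvs = []
    for y in range(0, h - block + 1, block):
        for x in range(0, w - block + 1, block):
            blk = cur[y:y + block, x:x + block].astype(np.int32)
            best_sad, best_mv = None, (0, 0)
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    yy, xx = y + dy, x + dx
                    if yy < 0 or xx < 0 or yy + block > h or xx + block > w:
                        continue
                    ref = bg[yy:yy + block, xx:xx + block].astype(np.int32)
                    sad = int(np.abs(blk - ref).sum())
                    if best_sad is None or sad < best_sad:
                        best_sad, best_mv = sad, (dx, dy)
            mvs.append(best_mv)
    if not mvs:
        return 0, 0
    mvs = np.array(mvs)
    return int(np.median(mvs[:, 0])), int(np.median(mvs[:, 1]))  # component-wise median
```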
5. Global motion compensation operation
Using the global motion vector produced by global motion estimation, the background image data corresponding to the original data currently to be encoded is matched in the background image, as shown in Fig. 4. A difference operation is performed between the matched original data to be encoded and the background image data, and the difference result is output to the differential mode encoding operation for encoding.
6. Differential mode encoding operation
The differential data blocks output by the reconstructed-background compensation operation are encoded and compressed. The chosen encoding algorithm must match the format of the differential data blocks; for example, when the differential image output is 9 bits, a block compression algorithm that can be configured for 9 bits should be chosen, as described for the reconstructed-background compensation operation. In this mode, both the prediction reference data used for encoding the data block and the data of the current data block are differenced against the corresponding data in the reconstructed background image, so that all the data referenced by this data block are differential. Taking the luminance difference computation with a global motion vector of zero as an example, let s(x,y) be the luminance pixel value at position (x,y) of the data to be differenced, and b(x,y) the luminance pixel value at position (x,y) of the final reconstructed background image; the differential data can be computed by, but is not limited to, the two algorithms given by the following formulas:
r(x,y) = Clip1(s(x,y) - b(x,y) + 256),  (3)
r(x,y) = Clip2(((s(x,y) - b(x,y)) >> 1) + 128),  (4)
In these formulas, Clip1 limits the result to [0, 511] and Clip2 limits it to [0, 255]; if a result is out of range, the value at the nearest boundary is taken. Combined with the global motion compensation operation, when the prediction reference data of the current data block was encoded in the original mode, the operation of this unit is as shown in Fig. 5; when the prediction reference data of the current data block was encoded in the differential mode, the operation of this unit is as shown in Fig. 6. For intra prediction, the reference image in Figs. 5 and 6 can be taken to be the current image.
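Formulas (3) and (4) and the two clipping operators can be written out directly. The sketch below assumes scalar pixel values and a global motion vector of zero, as in the example above; the parenthesization of the shift in formula (4) follows the matching decoder formula (8).

```python
def clip1(v):          # limit to [0, 511], taking the nearest boundary when out of range
    return max(0, min(511, v))

def clip2(v):          # limit to [0, 255], taking the nearest boundary when out of range
    return max(0, min(255, v))

def diff_9bit(s, b):   # formula (3): difference kept in a 9-bit range centred at 256
    return clip1(s - b + 256)

def diff_8bit(s, b):   # formula (4): halved difference mapped into the 8-bit range
    return clip2(((s - b) >> 1) + 128)
```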
7. Original mode encoding operation
The input data block is encoded and compressed. The chosen encoding algorithm must match the input data block; for example, when the input data block is 9 bits, a block compression algorithm that can be configured for 9 bits should be chosen, as described for the differential image computation. In this mode, the prediction reference data used for encoding the data block is superposed with the corresponding data in the reconstructed background image, so that all the data referenced by this data block are non-differential. Taking the luminance computation with a global motion vector of zero as an example, let s(x,y) be the decoded luminance pixel value at position (x,y) of the differentially encoded data, and b(x,y) the luminance pixel value at position (x,y) of the decoded background image; the final value r(x,y) of the reference pixel can be computed by, but is not limited to, the two algorithms given by the following formulas:
r(x,y) = Clip1(s(x,y) - 256 + b(x,y)),  (5)
r(x,y) = Clip2(((s(x,y) - 128) << 1) + b(x,y)),  (6)
When the prediction reference data of the current data block was encoded in the original mode, this unit encodes against that reference directly. When the prediction reference data of the current data block was encoded in the differential mode, the operation of this unit is as shown in Fig. 7. For intra prediction, the reference image in Fig. 7 can be taken to be the current image.
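The corresponding inverse mappings, formulas (5) and (6), which turn a differential-mode prediction reference back into non-differential form for an original-mode block, can be sketched in the same way, reusing the clip1/clip2 helpers above; the grouping of the shift in formula (6) is taken to match the decoder-side formula (8).

```python
def ref_from_diff_9bit(s, b):   # formula (5): undo the 9-bit centred difference
    return clip1(s - 256 + b)

def ref_from_diff_8bit(s, b):   # formula (6): undo the halved 8-bit difference
    return clip2(((s - 128) << 1) + b)
```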
In implementing the above encoding method, the periodic background image update can be described and realized through a new video sequence structure, the video segment. A video segment is a fairly long stretch of the input video sequence (several hundred frames or more), and the entire input video sequence can be regarded as consisting of video segments joined end to end. Each video segment uses the same reconstructed background image to compute its difference images. During encoding, the background modeling module selects a training image set from the current video segment, performs background modeling and generates a background image for use in encoding the next video segment. Seen from the other direction, the current video segment is encoded with the background image generated while the previous video segment was being encoded, so the encoding method as a whole incurs no extra delay from background image generation. For the first video segment, the first several images can be encoded with conventional video coding techniques (including but not limited to MPEG-1/2/4, H.263, H.264/AVC, VC1, AVS, JPEG, JPEG2000 and MJPEG). While these images are being encoded, the background modeling module selects a training image set from them, as shown in Fig. 8, generates the first background image and transmits it to the decoder. For the remaining images of the first video segment, the reconstructed version of this first background image can be used to generate difference images for encoding. This approach guarantees that no extra delay due to background modeling arises even when encoding of the whole sequence starts.
Corresponding to the above sequence structure, the first thing encoded into the bitstream is the directly coded first training image set, followed by the coded bitstream of the first background image. Next comes the differential image bitstream corresponding to the part of the first video segment outside the first training image set. Thereafter, the background image and the differential images corresponding to each video segment are alternately encoded into the final bitstream.
The bitstream produced by the above encoding method and system can be decoded with the four operations on the decoder side shown in Fig. 2:
1. Background image decoding operation
The background image bitstream is decoded, and the decoded background image is passed to the differential image compensation module.
2. Difference and background superposition operation
The differential data output by the differential mode block decoding operation is superposed with the corresponding background data in the background image under the global motion vector, and the superposition result is output.
3. Differential mode decoding operation
The differential-mode image bitstream written by the encoder is decoded. During decoding, if the referenced pixels were encoded in the original mode, the reference pixels used for decoding the current block must first be obtained according to formulas (3) and (4). After the current block data has been decoded, the final decoded pixels still need to be reconstructed. Taking the luminance component with a global motion vector of zero as an example, let b′(x,y) and r′(x,y) be the pixel values at position (x,y) of the decoded background pixel and of the reference pixel, respectively; the pixel value d′(x,y) of the output image at that position can then be computed by the following formulas:
d′(x,y) = Clip1(b′(x,y) + r′(x,y) - 256),  (7)
d′(x,y) = Clip2(b′(x,y) + ((r′(x,y) - 128) << 1)),  (8)
Formulas (7) and (8) match formulas (3) and (4) on the encoder side, respectively. Combined with the difference-and-background superposition operation, when the prediction reference data of the data block to be decoded was encoded in the original mode, the operation of this unit is as shown in Fig. 9; when the prediction reference data of the current data block is in the differential mode, the operation of this unit is as shown in Fig. 10. For intra prediction, the reference image in Figs. 9 and 10 can be taken to be the current image.
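The decoder-side superposition of formulas (7) and (8) mirrors the encoder-side difference formulas (3) and (4). The sketch below again assumes a zero global motion vector and reuses the clip1/clip2 helpers defined earlier.

```python
def reconstruct_9bit(b_rec, r_rec):   # formula (7): decoder-side match of formula (3)
    return clip1(b_rec + r_rec - 256)

def reconstruct_8bit(b_rec, r_rec):   # formula (8): decoder-side match of formula (4)
    return clip2(b_rec + ((r_rec - 128) << 1))
```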
4. Original mode block decoding operation
The original-mode image bitstream written by the encoder is decoded. During decoding, if the referenced pixels were encoded in the differential mode, the reference pixels used for decoding the current block must first be obtained according to formulas (5) and (6); the decoded current block data is then the final reconstructed decoded image. When the prediction reference data of the data to be decoded was encoded in the original mode, the bitstream is decoded directly; when the prediction reference data of the data to be decoded is in the differential mode, the operation of this unit is as shown in Fig. 11. For intra prediction, the reference image in Fig. 11 can be taken to be the current image.
An example is given below to illustrate one possible implementation of the method of the present invention. The input video is set to the YUV 4:2:0 format, and the video segment length is set to 990 frames. The input data pixel values need to be extended to 9 bits by adding 256. The background modeling operation uses the segment-weight-based modeling method shown in Fig. 11 and formulas (1) and (2).
Specifically, for the training set, 118 images evenly distributed over each video segment can be selected as the training image set, and background modeling is performed separately on the luminance and each chrominance component to generate the background image for the next video segment. In addition, we introduce video segment 0 as the initial video segment, and the first image serves as the background image of video segment 0. The pixel values of these generated background images are extended to 9 bits by adding 256 and then encoded directly as I-frames with QP = 0 using the AVS-S encoder RM0903 extended to 9 bits. The background image decoding operation uses the decoder of the 9-bit-extended AVS-S RM0903 to decode the background image. Global motion estimation is not used and no global motion vector is encoded. The differential mode encoding uses the method of formula (3) above for the difference computation and the 9-bit-extended AVS-S encoding method to encode the differenced current block. The original mode encoding uses the method of formula (5) above for the superposition computation and the 9-bit-extended AVS-S encoding method to encode the current block to be encoded. The contents of the bitstream of background images and differential images are, in order: the directly encoded first 118 images, the first background image, the first video segment, the second background image, the second video segment.
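The frame selection and bit-depth extension used in this implementation example can be sketched as follows. The even-spacing rule is an assumption (the example only states that the 118 training images are evenly distributed over the 990-frame segment); the +256 extension matches the description above.

```python
def training_frame_indices(segment_start, segment_len=990, train_count=118):
    """Indices of train_count frames spread evenly over one video segment.

    The 990-frame segment length and 118 training frames follow the example
    above; the exact spacing rule is an illustrative assumption.
    """
    step = segment_len / train_count
    return [segment_start + int(round(i * step)) for i in range(train_count)]

def extend_to_9bit(frame):
    """Shift 8-bit samples into the 9-bit range by adding 256, as in the example."""
    return [[p + 256 for p in row] for row in frame]
```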
The following performance test was carried out for the above implementation. Eight 3088-frame static-camera sequences of indoor/outdoor scenes were selected for testing and compared against the Shenzhan Profile of the AVS reference encoder RM0903 in its common configuration. The implementation example of the method of the present invention achieves a performance gain of 0.92-1.53 dB on the SD sequences in the bitrate range of 1 Mbps to 4 Mbps, corresponding to bitrate savings of 40.1%-74.76%, and a performance gain of 1.27-1.87 dB on the CIF sequences in the bitrate range of 128 kbps to 768 kbps, corresponding to bitrate savings of 36.61%-85.77%.
The video encoding/decoding method and system based on background modeling and optional differential mode provided by the present invention have been described in detail above. Specific examples have been used herein to explain the principles and embodiments of the present invention; the description of the above embodiments is only intended to help understand the system of the present invention and its core idea. Meanwhile, for those of ordinary skill in the art, there will be changes in the specific embodiments and the scope of application in accordance with the idea of the present invention. In summary, the contents of this specification should not be construed as limiting the present invention.
Claims (24)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN 201010203823 CN101883284B (en) | 2010-06-21 | 2010-06-21 | Video encoding/decoding method and system based on background modeling and optional differential mode |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN 201010203823 CN101883284B (en) | 2010-06-21 | 2010-06-21 | Video encoding/decoding method and system based on background modeling and optional differential mode |
Publications (2)
Publication Number | Publication Date |
---|---|
CN101883284A CN101883284A (en) | 2010-11-10 |
CN101883284B true CN101883284B (en) | 2013-06-26 |
Family
ID=43055156
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN 201010203823 Active CN101883284B (en) | 2010-06-21 | 2010-06-21 | Video encoding/decoding method and system based on background modeling and optional differential mode |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN101883284B (en) |
Families Citing this family (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102333221B (en) * | 2011-10-21 | 2013-09-04 | 北京大学 | Panoramic background prediction video coding and decoding method |
CN102665077A (en) * | 2012-05-03 | 2012-09-12 | 北京大学 | Rapid and efficient encoding-transcoding method based on macro block classification |
CN102868891B (en) * | 2012-09-18 | 2015-02-18 | 哈尔滨商业大学 | Multi-angle view video chromatic aberration correction method based on support vector regression |
CN105847793B (en) * | 2015-01-16 | 2019-10-22 | 杭州海康威视数字技术股份有限公司 | Video coding-decoding method and its device |
CN104702956B (en) * | 2015-03-24 | 2017-07-11 | 武汉大学 | A kind of background modeling method towards Video coding |
CN106331700B (en) | 2015-07-03 | 2019-07-19 | 华为技术有限公司 | Reference image encoding and decoding method, encoding device and decoding device |
CN107396138A (en) | 2016-05-17 | 2017-11-24 | 华为技术有限公司 | A kind of video coding-decoding method and equipment |
CN110062235B (en) * | 2019-04-08 | 2023-02-17 | 上海大学 | Background frame generation and update method, system, device and medium |
CN112702602B (en) * | 2020-12-04 | 2024-08-02 | 浙江智慧视频安防创新中心有限公司 | A video encoding and decoding method and storage medium |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101742319A (en) * | 2010-01-15 | 2010-06-16 | 北京大学 | Method and system for static camera video compression based on background modeling |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP4697500B2 (en) * | 1999-08-09 | 2011-06-08 | ソニー株式会社 | TRANSMISSION DEVICE, TRANSMISSION METHOD, RECEPTION DEVICE, RECEPTION METHOD, AND RECORDING MEDIUM |
Patent Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101742319A (en) * | 2010-01-15 | 2010-06-16 | 北京大学 | Method and system for static camera video compression based on background modeling |
Also Published As
Publication number | Publication date |
---|---|
CN101883284A (en) | 2010-11-10 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN101883284B (en) | Video encoding/decoding method and system based on background modeling and optional differential mode | |
CN113411577B (en) | Coding method and device | |
US10142654B2 (en) | Method for encoding/decoding video by oblong intra prediction | |
CN101204094B (en) | Method for scalably encoding and decoding video signal | |
CN101742319B (en) | Background modeling-based static camera video compression method and background modeling-based static camera video compression system | |
CN112465698B (en) | Image processing method and device | |
CN103248893B (en) | From H.264/AVC standard to code-transferring method and transcoder thereof the fast frame of HEVC standard | |
CN110891180B (en) | Video decoding method and video decoder | |
CN101729891B (en) | Method for encoding multi-view depth video | |
CN103442228B (en) | Code-transferring method and transcoder thereof in from standard H.264/AVC to the fast frame of HEVC standard | |
CN101984665A (en) | Method and system for evaluating video transmission quality | |
CN110855998B (en) | Fusion candidate list construction method and device, and fusion candidate list editing/decoding method and device | |
CN101404766A (en) | Multi-view point video signal encoding method | |
CN101291436B (en) | Video coding/decoding method and device thereof | |
WO2022078339A1 (en) | Reference pixel candidate list constructing method and apparatus, device and storage medium | |
US20070133689A1 (en) | Low-cost motion estimation apparatus and method thereof | |
CN111866502B (en) | Image prediction method, apparatus, and computer-readable storage medium | |
US20160050431A1 (en) | Method and system for organizing pixel information in memory | |
CN102196272A (en) | A P-frame encoding method and device | |
CN103546754A (en) | Spatially scalable transcoding method and system from H.264/AVC to SVC | |
CN102595132A (en) | Distributed video encoding and decoding method applied to wireless sensor network | |
CN101854554A (en) | Video Codec System Based on Image Restoration Prediction | |
WO2020063687A1 (en) | Video decoding method and video decoder | |
CN100586185C (en) | A Mode Selection Method for H.264 Video Reduced Resolution Transcoding | |
CN114079782A (en) | Video image reconstruction method, device, computer equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant |