CN101883284B - Video encoding/decoding method and system based on background modeling and optional differential mode - Google Patents

Video encoding/decoding method and system based on background modeling and optional differential mode

Info

Publication number
CN101883284B
Authority
CN
China
Prior art keywords
data
background image
video
decoding
encoded
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN 201010203823
Other languages
Chinese (zh)
Other versions
CN101883284A (en)
Inventor
高文 (Wen Gao)
张贤国 (Xianguo Zhang)
梁路宏 (Luhong Liang)
黄铁军 (Tiejun Huang)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Peking University
Original Assignee
Peking University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Peking University filed Critical Peking University
Priority to CN 201010203823 priority Critical patent/CN101883284B/en
Publication of CN101883284A publication Critical patent/CN101883284A/en
Application granted granted Critical
Publication of CN101883284B publication Critical patent/CN101883284B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Abstract

The invention discloses a video encoding/decoding method and system based on background modeling and an optional differential mode. The encoding method comprises the following steps: modeling the input video sequence to generate a background image, and encoding it to obtain a reconstructed background image; carrying out global motion estimation of pixel or sub-pixel accuracy on each input image to obtain a global motion vector; and encoding each video block by selectively using an original mode or a differential mode on the basis of the reconstructed background image and the global motion vector. The invention can improve coding performance, does not increase coding delay and, as the bitstream contains the background image itself, is beneficial to further processing.

Description

Video encoding/decoding method and system based on background modeling and an optional differential mode
Technical field
The present invention relates to video compression technology in the field of digital media processing, and more specifically to a video encoding/decoding method and system based on background modeling and an optional differential mode.
Background technology
Video compression (also referred to as video coding) is one of the key technologies in applications such as digital media storage and transmission; its objective is to reduce the amount of data to be stored and transmitted by eliminating redundant information. All current mainstream video compression standards adopt a block-based prediction-transform hybrid coding framework, which eliminates the statistical redundancy in video images (including spatial redundancy, temporal redundancy and information-entropy redundancy) through prediction, transform, entropy coding and similar methods, so as to reduce the data volume. Because video data tends to keep an unchanged scene over a certain period of time, background modeling techniques have, with their development and progress in recent years, been applied more and more to video coding; by making reasonable use of the background generated by modeling, the information redundancy in the video can be further eliminated and better compression performance obtained.
For the use of the modeled background, current methods fall into two classes. The first class departs from the block-based prediction-transform hybrid coding framework and adopts object-based video compression: a modeling algorithm generates the background, and techniques such as object detection, object tracking and foreground/background segmentation separate the individual objects in the video; by applying different compression schemes to different objects, the information redundancy in the video is further exploited and compression efficiency improved. Object-based video compression is therefore an important research direction for the video compression problem, but it faces two problems. First, object detection and segmentation in video remains an open problem in computer vision and image processing; existing methods are still not satisfactory in the precision and accuracy of detection and segmentation, which has become a bottleneck of object-based video compression. Second, the computational complexity of the above detection and segmentation methods is high, which hinders encoder implementation.
The second class of methods, which keeps the block-based prediction-transform hybrid coding framework, has therefore become another research direction that attracts much attention.
Summary of the invention
The present invention provides a video encoding/decoding method and system based on background modeling and an optional differential mode, with which coding performance can be improved.
In one aspect, the invention discloses a video coding method based on background modeling and an optional differential mode, comprising the following steps: a background modeling step, in which a background image is generated by modeling the input video sequence and a reconstructed background image is obtained after encoding; a global motion estimation step, in which global motion estimation of pixel or sub-pixel precision is performed on each input image to obtain a global motion vector; and a mode selection step, in which, based on the reconstructed background image and the global motion vector, each video block is encoded by selectively using an original mode or a differential mode.
In the above video coding method, preferably, in the background modeling step, generating the background image by modeling the input video sequence comprises: for each pixel position, finding the set of pixel values of that position in the training set and then traversing it; for each pixel value, judging the difference between the current pixel value and the next adjacent pixel value in the pixel set against a dynamic threshold generated at the current time, and, if the absolute difference is greater than the threshold, ending the current data segment and starting the next one, so that the whole pixel set of the current pixel position is divided into several data segments; assigning each segment a weight equal to the size of its data set; and calculating the background pixel value of the pixel position based on these weights.
In the above video coding method, preferably, the background modeling step further comprises a step of periodically re-selecting the training set to update the background image.
In the above video coding method, preferably, in the background modeling step, the background image is encoded to obtain the reconstructed background image, and the encoding method comprises: encoding the background image generated by modeling with a lossy or lossless image or video coding method; or treating all background images as one sequence and encoding it with a video coding method such as MPEG-1/2/4, H.263, H.264/AVC, VC1, AVS, JPEG, JPEG2000 or MJPEG.
In the above video coding method, preferably, in the global motion estimation step, the global motion estimation comprises: taking a data block as the basic unit, performing a global integer-pixel or sub-pixel motion search on the current image with the reconstructed background image as the reference picture, and taking the median, the largest cluster, or the mean of the resulting set of motion vectors as the global motion vector of the current image.
In the above video coding method, preferably, in the mode selection step, the selection between the original mode and the differential mode used to encode each video block is made by comparing the rate-distortion results of the two modes.
In the above video coding method, preferably, the original-mode encoding is: according to the global motion vector, finding the data in the background image corresponding to the prediction reference data of the data to be encoded; if the prediction reference data has been encoded in the differential mode, taking the decoded superposition of the reference data and its corresponding background data as the reference; otherwise directly taking the decoded original value of the prediction reference data as the reference to encode the data to be encoded.
In the above video coding method, preferably, the differential-mode encoding is: according to the global motion vector, finding the data in the background image corresponding to the prediction reference data of the data to be encoded; if the prediction reference data has been encoded in the original mode, taking the decoded difference between the reference data and its corresponding background data as the reference; otherwise directly taking the decoded original value of the prediction reference data as the reference to encode the difference between the data to be encoded and the corresponding data in the background image.
In the above video coding method, preferably, periodically re-selecting the training set to update the background image is specifically: the background image is updated per video segment, a video segment being a section of the input video sequence that is encoded using the same reconstructed background image, the whole input video sequence being regarded as a concatenation of consecutive video segments; during encoding, a training image set is selected from the current video segment and background modeling is performed to generate a background image for encoding the next video segment, so that when the current video segment is encoded, the background image generated during the encoding of the previous video segment is used.
In another aspect, the invention also discloses a video coding system based on background modeling and an optional differential mode, comprising: a background modeling module for generating a background image by modeling the input video sequence, a reconstructed background image being obtained after encoding; a global motion estimation module for performing global motion estimation of pixel or sub-pixel precision on each input image to obtain a global motion vector; and a mode selection module for encoding each video block by selectively using an original mode or a differential mode, based on the reconstructed background image and the global motion vector.
In the above video coding system, preferably, in the background modeling module, generating the background image by modeling the input video sequence is performed by a submodule configured to: for each pixel position, find the set of pixel values of that position in the training set and then traverse it; for each pixel value, judge the difference between the current pixel value and the next adjacent pixel value in the pixel set against a dynamic threshold generated at the current time and, if the absolute difference is greater than the threshold, end the current data segment and start the next one, so that the whole pixel set of the current pixel position is divided into several data segments; assign each segment a weight equal to the size of its data set; and calculate the background pixel value of the pixel position based on these weights.
In the above video coding system, preferably, the background modeling module further comprises a submodule for periodically re-selecting the training set to update the background image.
In the above video coding system, preferably, in the background modeling module, the background image is encoded to obtain the reconstructed background image, and the encoding method comprises: encoding the background image generated by modeling with a lossy or lossless image or video coding method; or treating all background images as one sequence and encoding it with a video coding method such as MPEG-1/2/4, H.263, H.264/AVC, VC1, AVS, JPEG, JPEG2000 or MJPEG.
In the above video coding system, preferably, the global motion estimation module further comprises a submodule for taking a data block as the basic unit, performing a global integer-pixel or sub-pixel motion search on the current image with the reconstructed background image as the reference picture, and taking the median, the largest cluster, or the mean of the resulting set of motion vectors as the global motion vector of the current image.
In the above video coding system, preferably, in the mode selection module, the selection between the original mode and the differential mode used to encode each video block is made by comparing the rate-distortion results of the two modes.
In the above video coding system, preferably, the original-mode encoding is: according to the global motion vector, finding the data in the background image corresponding to the prediction reference data of the data to be encoded; if the prediction reference data has been encoded in the differential mode, taking the decoded superposition of the reference data and its corresponding background data as the reference; otherwise directly taking the decoded original value of the prediction reference data as the reference to encode the data to be encoded.
In the above video coding system, preferably, the differential-mode encoding is: according to the global motion vector, finding the data in the background image corresponding to the prediction reference data of the data to be encoded; if the prediction reference data has been encoded in the original mode, taking the decoded difference between the reference data and its corresponding background data as the reference; otherwise directly taking the decoded original value of the prediction reference data as the reference to encode the difference between the data to be encoded and the corresponding data in the background image.
In the above video coding system, preferably, periodically re-selecting the training set to update the background image is specifically: the background image is updated per video segment, a video segment being a section of the input video sequence that is encoded using the same reconstructed background image, the whole input video sequence being regarded as a concatenation of consecutive video segments; during encoding, a training image set is selected from the current video segment and background modeling is performed to generate a background image for encoding the next video segment, so that when the current video segment is encoded, the background image generated during the encoding of the previous video segment is used.
In a further aspect, the invention also discloses a video decoding method corresponding to the above video coding method, comprising: decoding the background image and the global motion vector; and decoding each video block in the original mode or in the differential mode.
In the above video decoding method, preferably, the original-mode decoding comprises: if the data to be decoded has been encoded in the original mode, obtaining, according to the global motion vector, the data in the background image corresponding to the prediction reference data; if the prediction reference data has been encoded in the differential mode, taking the decoded superposition of the reference data and its corresponding background data as the reference, otherwise directly taking the decoded original value of the prediction reference data as the reference to decode the data to be decoded.
In the above video decoding method, preferably, the differential-mode decoding comprises: if the data to be decoded has been encoded in the differential mode, obtaining, according to the global motion vector, the data in the background image corresponding to the prediction reference data; if the prediction reference data has been encoded in the original mode, taking the decoded difference between the reference data and its corresponding background data as the reference, otherwise directly taking the decoded original value of the prediction reference data as the reference to decode the current data, the decoded data then being superposed with the corresponding data in the background image.
In a further aspect, the invention also discloses a video decoding system corresponding to the above video coding system, comprising: a module for decoding the background image and the global motion vector; and a module for decoding each video block in the original mode or in the differential mode.
In the above video decoding system, preferably, the original-mode decoding comprises: if the data to be decoded has been encoded in the original mode, obtaining, according to the global motion vector, the data in the background image corresponding to the prediction reference data; if the prediction reference data has been encoded in the differential mode, taking the decoded superposition of the reference data and its corresponding background data as the reference, otherwise directly taking the decoded original value of the prediction reference data as the reference to decode the data to be decoded.
In the above video decoding system, preferably, the differential-mode decoding comprises: if the data to be decoded has been encoded in the differential mode, obtaining, according to the global motion vector, the data in the background image corresponding to the prediction reference data; if the prediction reference data has been encoded in the original mode, taking the decoded difference between the reference data and its corresponding background data as the reference, otherwise directly taking the decoded original value of the prediction reference data as the reference to decode the current data, the decoded data then being superposed with the corresponding data in the background image to obtain the final decoded data.
Compared with the prior art, the present invention has the following features: first, no object or foreground/background segmentation is performed; second, coding is carried out in units of blocks or macroblocks; third, global motion compensation is added; fourth, a mode-selection mechanism chooses the better of the two classes of coding modes to guarantee coding efficiency. The present invention can improve coding performance, does not increase coding delay and, because the bitstream itself contains the background image, is advantageous for further processing.
Description of drawings
Fig. 1 is a flow chart of the steps of an embodiment of the video coding method based on background modeling and an optional differential mode according to the present invention;
Fig. 2 is a block diagram of an encoding/decoding framework for implementing the video coding method of the present invention;
Fig. 3 shows the correspondence between data in the current image and data in the background image under global motion estimation;
Fig. 4 is a schematic diagram of the background modeling process;
Fig. 5 is a schematic diagram of the selection of the training set;
Fig. 6 is an example of encoding in which the current data to be encoded is encoded in the differential mode and the prediction reference has been encoded in the original mode;
Fig. 7 is an example of encoding in which the current data to be encoded is encoded in the differential mode and the prediction reference has been encoded in the differential mode;
Fig. 8 is an example of encoding in which the current data to be encoded is encoded in the original mode and the prediction reference has been encoded in the differential mode;
Fig. 9 is an example of decoding in which the current data to be decoded has been encoded in the differential mode and the prediction reference has been encoded in the original mode;
Fig. 10 is an example of decoding in which the current data to be decoded has been encoded in the differential mode and the prediction reference has been encoded in the differential mode;
Fig. 11 is an example of decoding in which the current data to be decoded has been encoded in the original mode and the prediction reference has been encoded in the differential mode.
Embodiment
To make the above objects, features and advantages of the present invention more apparent, the present invention is described in further detail below in conjunction with the drawings and specific embodiments.
The present invention exploits the characteristic that the scene in a video sequence remains fixed over a certain period of time: a background modeling method is used to describe the relatively static scene in the video and to generate a background image. During encoding, by building and updating this background image, each data block is encoded by optionally using a newly introduced differential coding mode, so that the redundancy in the video sequence is eliminated to a greater extent and better compression performance is obtained. Accordingly, the basic idea of the coding method is as follows: model and update the background image that describes the static scene information of the video images; optionally perform global motion estimation to obtain the global motion vector of each image; then, for each block, select the optimal coding mode between the original coding mode, which encodes the original data directly, and the differential coding mode, which encodes the difference between the original data and the corresponding background data. In this scheme, the background image is obtained from the original input video by background modeling, and the generated background image must be written into the bitstream; if global motion estimation is used, the global motion vector must also be written into the bitstream. The decoder is designed to match the encoder: when decoding the bitstream, the background image and the optionally coded global motion vector are decoded first.
Referring to Fig. 1, which is a flow chart of the steps of an embodiment of the video coding method based on background modeling and an optional differential mode according to the present invention, the method comprises the following steps:
Background modeling step S1: a background image is generated by modeling the input video sequence, and a reconstructed background image is obtained after the background image is encoded and decoded. Global motion estimation step S2: global motion estimation of pixel or sub-pixel precision is performed on each input image to obtain a global motion vector. Mode selection step S3: based on the reconstructed background image and the global motion vector, each video block is encoded by selectively using the original mode or the differential mode.
The above embodiment has the following features: first, no object or foreground/background segmentation is performed; second, coding is carried out in units of blocks or macroblocks; third, global motion compensation is added; fourth, a mode-selection mechanism chooses the better of the two classes of coding modes for each block, to guarantee coding efficiency (a sketch of this selection follows). The present invention can improve coding performance, does not increase coding delay, and, because the bitstream itself contains the background image, is advantageous for further processing. Moreover, video compression based on background modeling is particularly suitable for video surveillance, video conferencing, smart rooms and similar applications, whose videos are characterized by long-lasting scenes and a very low shot-switching frequency, which favors using background modeling to improve compression efficiency.
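The mode selection of step S3 compares the rate-distortion results of the two modes. The following is a minimal sketch, assuming a Lagrangian cost J = D + λ·R; the function names and return conventions are illustrative assumptions, not definitions taken from the patent.

```python
# A minimal sketch (Python) of per-block mode selection by rate-distortion comparison.
# The Lagrangian cost J = D + lambda * R and the helper names are assumptions.
def select_mode(block, background_block, encode_original, encode_differential, lam):
    """encode_* return (distortion, bits, payload) for the block under the given mode."""
    d_o, r_o, payload_o = encode_original(block)
    d_d, r_d, payload_d = encode_differential(block, background_block)
    cost_original = d_o + lam * r_o          # rate-distortion cost of the original mode
    cost_differential = d_d + lam * r_d      # rate-distortion cost of the differential mode
    if cost_differential < cost_original:
        return "differential", payload_d
    return "original", payload_o
```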
Referring to Fig. 2, the present invention can be carried out under the encoding/decoding framework shown in Fig. 2. On the encoder side, the framework comprises seven functional units: background modeling, optional global motion estimation, global motion compensation, background image encoding, background image decoding, differential-mode encoding and original-mode encoding, which respectively perform the background modeling algorithm, the optional global motion estimation algorithm and the data compensation against the background image, the background image encoding algorithm, the background image decoding algorithm, the data encoding algorithm in the differential mode, and the data encoding algorithm in the original mode. The corresponding decoder is composed of differential-mode decoding, original-mode decoding, background image decoding, and difference-plus-background superposition units, which respectively decode data coded in the differential mode, decode data coded in the original mode, decode the background image, and superpose the differential decoding result with the corresponding data in the background image.
As shown in Fig. 2, on the encoder side the present embodiment comprises: a background image modeling operation that receives the input video sequence; a background image encoding operation that receives the output of the background image modeling operation; a background image decoding operation connected to the output of the background image encoding operation; an optional global motion estimation operation that receives the input video sequence and the output of the background image decoding; a global motion compensation operation that receives the optional global motion vector, the input sequence and the output of the background image decoding; a differential-mode encoding operation that receives the differential data output by the global motion compensation operation; an original-mode encoding operation that receives the input video sequence and the optional global motion vector; and finally, through mode selection, the coded information and bitstream output by either the differential-mode encoding operation or the original-mode encoding operation are selected. On the decoder side, the embodiment comprises: a background image decoding operation that receives the bitstream produced by the encoder; a differential-mode decoding operation that optionally receives input bitstream information according to the mode-selection flag; an original-mode decoding operation that optionally receives input bitstream information according to the mode-selection flag; and a difference-plus-background superposition operation that receives the output of the differential block decoding and the output of the background image decoding. Each function and operation of the video encoding/decoding method of the present invention shown in Fig. 2 is described in detail below:
One, background image modeling
1. For the input video sequence, a training image set is selected, and a background image is generated by modeling and passed to the video encoding module for compression. Before the first background image is generated, this module outputs the first frame. Taking the luminance component as an example, the modeling methods that may be used include, but are not limited to, the background modeling algorithm shown in Fig. 3, which comprises the following steps:
Step S11: initialize the mean value and the threshold for the current pixel position;
Step S12: create a data segment and initialize its weight and mean value;
Step S13: read a new pixel value from the training set;
Step S14: judge whether the squared difference between temporally adjacent pixel values is greater than the threshold; if not, go to step S15a; if so, go to step S15b;
Step S15a: update the mean value of the current data segment;
Step S16a: update the threshold under the current weight;
Step S17a: increase the weight of the current data segment by 1;
Step S15b: update the overall mean and the threshold under the weight of the current data segment;
Step S16b: create a new data segment and initialize its weight and mean value;
Step S18: after step S17a or step S16b has been executed, judge whether the training set has been fully traversed; if so, go to step S19; if not, return to step S13;
Step S19: update the overall mean under the weight of the current data segment, and take the overall mean as the background value of the pixel.
In Fig. 3, the overall mean and the mean value and weight of each new data segment are all initialized to 0. The threshold is initialized to the pixel-level mean of the squared differences between the first two frames of the training set. Both the overall mean and the threshold are updated based on the weights. Let the weight of data segment i be W_i, its pixel sum be Sum_i, and the sum of squared differences of all adjacent pixels within data segment i be T_i; the overall mean AVG and the threshold Th are calculated as in formulas (1) and (2):
AVG = Σ_{j=1..i} (Sum_j × W_j) / Σ_{j=1..i} (W_j²),   (1)
Th = Σ_{j=1..i} (T_j × W_j) / Σ_{j=1..i} (W_j²).   (2)
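The following is a minimal per-pixel sketch of the segment-weighted modeling of steps S11-S19 and formulas (1) and (2), written in Python. The function name is an assumption, and recomputing the threshold after every sample, as well as initializing it from the single pixel's first two values, are simplifications of the description above rather than the patent's exact procedure.

```python
# A minimal sketch (assumed Python/NumPy) of the segment-weighted background value of one
# pixel position, following steps S11-S19 and formulas (1) and (2).
import numpy as np

def model_background_pixel(samples):
    """samples: 1-D array of the values of one pixel position over the training set."""
    samples = np.asarray(samples, dtype=np.float64)
    # Simplified threshold init: squared difference of the first two samples of this pixel
    # (the text uses the pixel-level mean of the squared difference of the first two frames).
    th = (samples[1] - samples[0]) ** 2 if len(samples) > 1 else 0.0
    seg_sum, seg_sq, seg_w = [samples[0]], [0.0], [1.0]   # Sum_j, T_j, W_j of the segments
    for prev, cur in zip(samples[:-1], samples[1:]):
        d2 = (cur - prev) ** 2
        if d2 > th:                              # current segment ends, start a new one
            seg_sum.append(cur); seg_sq.append(0.0); seg_w.append(1.0)
        else:                                    # continue the current segment
            seg_sum[-1] += cur; seg_sq[-1] += d2; seg_w[-1] += 1.0
        w = np.array(seg_w)
        # Dynamic threshold, formula (2): Th = sum(T_j * W_j) / sum(W_j^2)
        th = float(np.dot(seg_sq, w) / np.dot(w, w))
    w = np.array(seg_w)
    # Background value, formula (1): AVG = sum(Sum_j * W_j) / sum(W_j^2)
    return float(np.dot(seg_sum, w) / np.dot(w, w))
```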
Corresponding to the sequential structure required by the background image updating described above, each time a training image set is input, background modeling is performed anew and a new background image is generated, completing the background image update. The modeling process for the chrominance components is identical.
Two, background image encoding operation
The background image generated by the background modeling module is compression-encoded, the coding result is written into the bitstream, and it is passed to the background image reconstruction module. The encoders that may be used include, but are not limited to, MPEG-1/2/4, H.263, H.264/AVC, VC1, AVS, JPEG, JPEG2000 and MJPEG. The encoder configurations include, but are not limited to, independent intra coding of each background image, treating all background images as a sequence and coding it with an IPPP structure, and lossless compression of each background image. In a specific embodiment, the background image is encoded with QP = 0 using an AVS coding technique extended to a 9-bit input bit width.
Three, background image decode operation
The background image bitstream output by the background image encoding module is decoded and reconstructed and, in order to keep the encoder and decoder matched, the reconstructed background image is passed to the reconstructed-background compensation operation. In a specific embodiment, the AVS decoding technique extended to 9 bits is used to decode the background image bitstream, and the decoded output is also 9 bits.
Four, optional global motion estimation operation
Global motion estimation is applied between the reconstructed background image output by the background image decoding operation and the current input video image, yielding a global motion vector. Methods include, but are not limited to: taking a data block as the basic unit, performing a global integer-pixel or sub-pixel motion search on the current image with the reconstructed background image as the reference picture, and taking the median, the largest cluster, or the mean of the resulting motion vectors as the global motion vector of the current image. The resulting global motion vector must be written into the bitstream; if the global motion vector is zero, a corresponding flag may be written into the bitstream instead.
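The following is a minimal sketch of the block-based variant described above: full-search integer-pixel block matching against the reconstructed background, followed by the component-wise median of the block motion vectors. It is written in Python with assumed function and parameter names; the patent equally allows sub-pixel search and the largest-cluster or mean statistics.

```python
# A minimal sketch (assumed Python/NumPy) of the optional global motion estimation:
# block-wise SAD search against the reconstructed background, median of the block vectors.
import numpy as np

def global_motion_vector(cur, background, block=16, search=8):
    """cur, background: 2-D arrays of the same size (e.g. the luminance plane)."""
    h, w = cur.shape
    mvs = []
    for y in range(0, h - block + 1, block):
        for x in range(0, w - block + 1, block):
            blk = cur[y:y + block, x:x + block].astype(np.int64)
            best, best_mv = None, (0, 0)
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    ry, rx = y + dy, x + dx
                    if ry < 0 or rx < 0 or ry + block > h or rx + block > w:
                        continue
                    ref = background[ry:ry + block, rx:rx + block].astype(np.int64)
                    sad = int(np.abs(blk - ref).sum())   # sum of absolute differences
                    if best is None or sad < best:
                        best, best_mv = sad, (dy, dx)
            mvs.append(best_mv)
    if not mvs:
        return (0, 0)
    # Median per component of the block motion vectors (largest cluster or mean also allowed).
    return tuple(int(v) for v in np.median(np.array(mvs), axis=0))
```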
Five, global motion compensation operation
Using the global motion vector generated by the global motion estimation, the data in the background image corresponding to the original data currently to be encoded is located, as shown in Fig. 4. The matched background data is subtracted from the original data to be encoded, and the difference result is passed to the differential-mode encoding operation for encoding.
Six, differential-mode encoding operation
The differential data block output by the reconstructed-background compensation operation is compression-encoded. The selected coding algorithm must match the format of the differential data block; for example, when the differential image is output as 9 bits, a block compression algorithm configured for 9 bits should be selected, as described for the reconstructed-background compensation operation. In this mode, both the prediction reference data used for encoding this data block and the current data of the block are differenced against the corresponding data in the reconstructed background image, so that all data referenced by this block are differences. Taking the luminance difference with a global motion vector of 0 as an example, let s(x, y) be the luminance pixel value of the data to be differenced at position (x, y), and b(x, y) the luminance pixel value of the final reconstructed background image at position (x, y); the differential data can be computed by, including but not limited to, either of the following two formulas:
r(x,y) = Clip1(s(x,y) - b(x,y) + 256),   (3)
r(x,y) = Clip2(((s(x,y) - b(x,y)) >> 1) + 128),   (4)
where Clip1 limits the computed result to [0, 511] and Clip2 limits it to [0, 255]; in both cases an out-of-range value is clamped to the nearest bound. In combination with the global motion compensation operation, when the prediction reference data of the current data block was coded in the original mode, this unit operates as shown in Fig. 5; when the prediction reference data of the current data block was coded in the differential mode, it operates as shown in Fig. 6. For intra prediction, the reference picture in Figs. 5 and 6 can be taken to be the current image.
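As an illustration of formulas (3) and (4), a minimal Python sketch follows. The helper names clip1, clip2 and the diff_pixel_* functions are assumptions for illustration; only the per-pixel arithmetic is shown.

```python
# A minimal sketch (Python; names are assumptions) of the differential-pixel computations of
# formulas (3) and (4) for 9-bit sample values s (current) and b (reconstructed background).
def clip1(v):                      # limit to [0, 511]
    return max(0, min(511, v))

def clip2(v):                      # limit to [0, 255]
    return max(0, min(255, v))

def diff_pixel_9bit(s, b):
    """Formula (3): full-precision difference, re-centered by 256."""
    return clip1(s - b + 256)

def diff_pixel_8bit(s, b):
    """Formula (4): difference halved and re-centered to an 8-bit range."""
    return clip2(((s - b) >> 1) + 128)
```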
Seven, original-mode encoding operation
The input data block is compression-encoded. The selected coding algorithm must match the input data block; for example, when the input data block is 9 bits, a block compression algorithm configured for 9 bits should be selected, as described for the differential-image computation. In this mode, the corresponding data in the reconstructed background image is added back to the prediction reference data used for encoding this data block, so that all data referenced by this block are non-differential. Taking the luminance computation with a global motion vector of 0 as an example, let s(x, y) be the decoded luminance pixel value at position (x, y) of differentially coded data, and b(x, y) the luminance pixel value of the decoded background image at position (x, y); the final reference value r(x, y) can be computed by, including but not limited to, either of the following two formulas:
r(x,y) = Clip1(s(x,y) - 256 + b(x,y)),   (5)
r(x,y) = Clip2(((s(x,y) - 128) << 1) + b(x,y)),   (6)
When the prediction reference data of the current data block was coded in the original mode, this unit encodes the data block directly against the decoded reference. When the prediction reference data of the current data block was coded in the differential mode, this unit operates as shown in Fig. 7. For intra prediction, the reference picture in Fig. 7 can be taken to be the current image.
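A minimal Python sketch of formulas (5) and (6) follows, reusing the assumed clip1/clip2 helpers above; it shows only how a non-differential reference value is rebuilt from a differentially coded sample s and the decoded background sample b.

```python
# A minimal sketch (Python; names are assumptions) of formulas (5) and (6).
def restore_reference_9bit(s, b):
    """Formula (5): inverse of the full-precision difference of formula (3)."""
    return clip1(s - 256 + b)

def restore_reference_8bit(s, b):
    """Formula (6): inverse of the halved, re-centered difference of formula (4)."""
    return clip2(((s - 128) << 1) + b)
```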
In implementing the above coding method, the periodic background image updating can be described and realized through a new video sequence structure: the video segment. A video segment is a long section of the input video sequence (several hundred frames or more), and the whole input video sequence can be regarded as a concatenation of consecutive video segments. Each video segment uses the same reconstructed background image to compute difference images. During encoding, the background modeling module selects a training image set from the current video segment and performs background modeling to generate a background image used for encoding the next video segment. Viewed from the other direction, when the current video segment is encoded, the background image generated during the encoding of the previous video segment is used; the whole coding method therefore incurs no extra delay due to background image generation. For the first video segment, its first few images can be encoded with conventional video coding techniques (including but not limited to MPEG-1/2/4, H.263, H.264/AVC, VC1, AVS, JPEG, JPEG2000, MJPEG). While these images are being coded, the background modeling module selects a training image set from them as shown in Fig. 8, generates the first background image, and transmits it to the decoder. The remaining images of the first video segment can then use the reconstruction of this first background image to generate difference images for encoding. In this way no extra delay is produced by background modeling even at the start of the whole sequence; a small pipeline sketch follows.
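The following is a minimal sketch of the video-segment pipeline, in Python with assumed callback names: segment k is encoded against the background modeled from segment k-1, while a training set drawn from segment k yields the background for segment k+1. The half-and-half split of the first segment is an arbitrary placeholder, not a value from the patent.

```python
# A minimal sketch (Python; callback and split choices are assumptions) of the video-segment
# structure: no extra delay, because each segment's background comes from the previous one.
def encode_sequence(segments, model_background, encode_with_background, encode_conventional):
    """segments: list of lists of frames; the callbacks stand in for the operations above."""
    bitstream = []
    background = None
    for segment in segments:
        if background is None:
            # First segment: its leading images are coded conventionally while the first
            # background image is modeled from them; the remaining images already use it.
            head, tail = segment[:len(segment) // 2], segment[len(segment) // 2:]
            bitstream.append(encode_conventional(head))
            background = model_background(head)
            bitstream.append(encode_with_background(tail, background))
        else:
            bitstream.append(encode_with_background(segment, background))
        # The background for the next segment is modeled from the current segment.
        background = model_background(segment)
    return bitstream
```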
Corresponding to the above sequence structure, the first content written into the bitstream is the directly coded first training image set, followed by the coded bitstream of the first background image. Next come the coded difference images for the part of the first video segment outside the first training image set. Thereafter, the coded background image and the coded difference images of each subsequent video segment are written into the final bitstream in alternation.
The bitstream produced by the above coding method and system can be decoded with the four decoder-side operations shown in Fig. 2:
One, background image decode operation
The background image bitstream is decoded, and the decoded background image is passed to the difference-image compensation module.
Two, difference-plus-background superposition operation
The differential data output by the differential-mode block decoding operation is superposed with the background data that corresponds to it in the background image under the global motion vector, and the superposition result is output.
Three, differential-mode decoding operation
The differential-mode image bitstream written by the encoder is decoded. During decoding, if a referenced pixel was coded in the original mode, the reference pixel used for decoding the current block is first obtained according to formulas (3) and (4). After the current block data has been decoded, the final decoded pixels must still be reconstructed. Taking the luminance component with a global motion vector of 0 as an example, let b′(x, y) and r′(x, y) be, respectively, the decoded background pixel value and the decoded differential pixel value at position (x, y); the pixel value d′(x, y) of the output image at this position can be computed according to the following formulas:
d′(x,y) = Clip1(b′(x,y) + r′(x,y) - 256),   (7)
d′(x,y) = Clip2(b′(x,y) + ((r′(x,y) - 128) << 1)).   (8)
Formulas (7) and (8) are the counterparts of formulas (3) and (4) at the encoder, respectively. In combination with the difference-plus-background superposition operation, when the prediction reference data of the data block to be decoded was coded in the original mode, this unit operates as shown in Fig. 9; when the prediction reference data of the current data block was coded in the differential mode, it operates as shown in Fig. 10. For intra prediction, the reference picture in Figs. 9 and 10 can be taken to be the current image.
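A minimal Python sketch of the decoder-side superposition of formulas (7) and (8) follows, reusing the assumed clip1/clip2 helpers above; r_dec is the decoded differential value and b_dec the decoded background value.

```python
# A minimal sketch (Python; names are assumptions) of formulas (7) and (8).
def reconstruct_pixel_9bit(r_dec, b_dec):
    """Formula (7), the counterpart of encoder-side formula (3)."""
    return clip1(b_dec + r_dec - 256)

def reconstruct_pixel_8bit(r_dec, b_dec):
    """Formula (8), the counterpart of encoder-side formula (4)."""
    return clip2(b_dec + ((r_dec - 128) << 1))
```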
Four, original-mode block decoding operation
The original-mode image bitstream written by the encoder is decoded. During decoding, if a referenced pixel was coded in the differential mode, the reference pixel used for decoding the current block is first obtained according to formulas (5) and (6); the decoded current block data is then the final reconstructed decoded image. When the prediction reference data of the data to be decoded was coded in the original mode, the bitstream is decoded directly; when it was coded in the differential mode, this unit operates as shown in Fig. 11. For intra prediction, the reference picture in Fig. 11 can be taken to be the current image.
A concrete example is given below to illustrate one possible implementation of the method of the present invention. The input video is in YUV 4:2:0 format, and the video segment length is set to 990 frames. Input pixel values are extended to 9 bits by adding 256. The background modeling operation adopts the segment-weight-based modeling method shown in Fig. 11 and formulas (1) and (2).
Specifically, for each video segment, 118 images evenly distributed within the segment can be selected as the training image set, and background modeling is performed separately on the luminance component and each chrominance component to generate the background image for the next video segment. In addition, video segment 0 is introduced as the initial video segment, and the first image serves as the background image of video segment 0. The pixel values of the generated background images are extended to 9 bits by adding 256, and the AVS-S encoder RM0903, extended to 9 bits, encodes each of them directly as an I frame with QP = 0. The background image decoding operation uses the RM0903 decoder extended to 9 bits to decode the background images. Global motion estimation is not used, and no global motion vector is coded. The differential-mode coding uses the method of formula (3) for the difference calculation and the AVS-S coding method extended to 9 bits to encode the differenced current block. The original-mode coding uses the method of formula (5) for the superposition calculation and the AVS-S coding method extended to 9 bits to encode the current block to be coded. The contents of the bitstream of background images and difference images are, in order: the directly coded first 118 images, the first background image, the first video segment, the second background image, the second video segment, and so on.
For the above implementation, the following performance test was carried out. Eight static-camera sequences of indoor/outdoor scenes, each 3088 frames long, were selected for testing and compared with the Shenzhan Profile of the AVS reference encoder RM0903 in its generic configuration. In the bitrate range of 1 Mbps to 4 Mbps, the embodiment of the method of the present invention achieves a performance gain of 0.92-1.53 dB on SD sequences, corresponding to bitrate savings of 40.1%-74.76%; in the bitrate range of 128 kbps to 768 kbps, it achieves a performance gain of 1.27-1.87 dB on CIF sequences, corresponding to bitrate savings of 36.61%-85.77%.
A video encoding/decoding method and system based on background modeling and an optional differential mode provided by the present invention have been described in detail above. Specific examples have been used herein to explain the principles and implementations of the present invention, and the description of the above embodiments is only intended to help understand the system of the present invention and its core idea. At the same time, for those of ordinary skill in the art, changes may be made to the specific implementation and to the scope of application in accordance with the idea of the present invention. In summary, the contents of this description should not be construed as limiting the present invention.

Claims (24)

1. A video coding method based on background modeling and an optional differential mode, characterized in that the method comprises the following steps:
a background modeling step: generating a background image by modeling the input video sequence, a reconstructed background image being obtained after the background image is encoded and decoded;
a global motion estimation step: performing global motion estimation of pixel or sub-pixel precision on each input image to obtain a global motion vector;
a mode selection step: based on the reconstructed background image and the global motion vector, selectively using an original mode or a differential mode to encode each video block.
2. The video coding method according to claim 1, characterized in that, in the background modeling step, generating the background image by modeling the input video sequence comprises:
for each pixel position, finding the set of pixel values of that position in the training set and then traversing it;
for each pixel value, judging, by means of a dynamic threshold generated at the current time, the difference between the current pixel value and the next adjacent pixel value in the pixel set; if the absolute difference is greater than the threshold, judging that the current data segment ends and starting the next data segment, whereby the whole pixel set of the current pixel position is divided into several data segments;
assigning each segment a weight, the weight being the size of the data set of that segment; and calculating the background pixel value of the pixel position based on the weights.
3. The video coding method according to claim 2, characterized in that the background modeling step further comprises a step of periodically re-selecting the training set to update the background image.
4. The video coding method according to claim 1, characterized in that, in the background modeling step, the method of encoding the background image to obtain the reconstructed background image comprises:
encoding the background image generated by modeling with a lossy or lossless image or video coding method; or treating all background images as one sequence and encoding the sequence with MPEG-1/2/4, H.263, H.264/AVC, VC1, AVS, JPEG, JPEG2000 or MJPEG.
5. The video coding method according to claim 1, characterized in that, in the global motion estimation step, the global motion estimation comprises:
taking a data block as the basic unit, performing a global integer-pixel or sub-pixel motion search on the current image with the reconstructed background image as the reference picture, and taking the median, the largest cluster, or the mean of the resulting set of motion vectors as the global motion vector of the current image.
6. The video coding method according to claim 1, characterized in that, in the mode selection step, the selection between the original mode and the differential mode used to encode each video block is made by comparing the rate-distortion results of the original mode and the differential mode.
7. The video coding method according to claim 6, characterized in that the original-mode encoding is:
according to the global motion vector, finding the data in the background image corresponding to the prediction reference data of the data to be encoded; if the prediction reference data has been encoded in the differential mode, taking the decoded superposition of this prediction reference data and its corresponding data in the background image as the reference; otherwise directly taking the decoded original value of the prediction reference data as the reference to encode the data to be encoded.
8. The video coding method according to claim 7, characterized in that the differential-mode encoding is:
according to the global motion vector, finding the data in the background image corresponding to the prediction reference data of the data to be encoded; if the prediction reference data has been encoded in the original mode, taking the decoded difference between this prediction reference data and its corresponding data in the background image as the reference; otherwise directly taking the decoded original value of the prediction reference data as the reference to encode the difference between the data to be encoded and the corresponding data in the background image.
9. The video coding method according to claim 3, characterized in that periodically re-selecting the training set to update the background image is specifically:
updating the background image per video segment, a video segment being a section of the input video sequence encoded using the same reconstructed background image, the whole input video sequence being regarded as a concatenation of consecutive video segments; during encoding, selecting a training image set from the current video segment and performing background modeling to generate a background image for encoding the next video segment, so that when the current video segment is encoded, the background image generated during the encoding of the previous video segment is used.
10. A video coding system based on background modeling and an optional differential mode, characterized in that it comprises:
a background modeling module for generating a background image by modeling the input video sequence, a reconstructed background image being obtained after the background image is encoded and decoded;
a global motion estimation module for performing global motion estimation of pixel or sub-pixel precision on each input image to obtain a global motion vector;
a mode selection module for selectively using an original mode or a differential mode to encode each video block, based on the reconstructed background image and the global motion vector.
11. The video coding system according to claim 10, characterized in that, in the background modeling module, generating the background image by modeling the input video sequence comprises:
a submodule for: for each pixel position, finding the set of pixel values of that position in the training set and then traversing it; for each pixel value, judging, by means of a dynamic threshold generated at the current time, the difference between the current pixel value and the next adjacent pixel value in the pixel set, and, if the absolute difference is greater than the threshold, judging that the current data segment ends and starting the next data segment, whereby the whole pixel set of the current pixel position is divided into several data segments; assigning each segment a weight, the weight being the size of the data set of that segment; and calculating the background pixel value of the pixel position based on the weights.
12. The video coding system according to claim 11, characterized in that the background modeling module further comprises a submodule for periodically re-selecting the training set to update the background image.
13. The video coding system according to claim 10, characterized in that, in the background modeling module, in the submodule that encodes the background image to obtain the reconstructed background image, the encoding method comprises:
encoding the background image generated by modeling with a lossy or lossless image or video coding method; or
treating all background images as one sequence and encoding the sequence with MPEG-1/2/4, H.263, H.264/AVC, VC1, AVS, JPEG, JPEG2000 or MJPEG.
14. The video coding system according to claim 10, characterized in that the global motion estimation module further comprises:
a submodule for taking a data block as the basic unit, performing a global integer-pixel or sub-pixel motion search on the current image with the reconstructed background image as the reference picture, and taking the median, the largest cluster, or the mean of the resulting set of motion vectors as the global motion vector of the current image.
15. The video coding system according to claim 10, characterized in that, in the mode selection module, the selection between the original mode and the differential mode used to encode each video block is made by comparing the rate-distortion results of the original mode and the differential mode.
16. The video coding system according to claim 15, characterized in that the original-mode encoding is:
according to the global motion vector, finding the data in the background image corresponding to the prediction reference data of the data to be encoded; if the prediction reference data has been encoded in the differential mode, taking the decoded superposition of this prediction reference data and its corresponding data in the background image as the reference; otherwise directly taking the decoded original value of the prediction reference data as the reference to encode the data to be encoded.
17. The video coding system according to claim 16, characterized in that the differential-mode encoding is:
according to the global motion vector, finding the data in the background image corresponding to the prediction reference data of the data to be encoded; if the prediction reference data has been encoded in the original mode, taking the decoded difference between this prediction reference data and its corresponding data in the background image as the reference; otherwise directly taking the decoded original value of the prediction reference data as the reference to encode the difference between the data to be encoded and the corresponding data in the background image.
18. The video coding system according to claim 12, characterized in that periodically re-selecting the training set to update the background image is specifically:
updating the background image per video segment, a video segment being a section of the input video sequence encoded using the same reconstructed background image, the whole input video sequence being regarded as a concatenation of consecutive video segments; during encoding, selecting a training image set from the current video segment and performing background modeling to generate a background image for encoding the next video segment, so that when the current video segment is encoded, the background image generated during the encoding of the previous video segment is used.
19. A video decoding method for a bitstream generated by the video coding method according to claim 1, characterized in that it comprises:
decoding and reconstructing the background image and the global motion vector;
decoding each video block in the original mode or in the differential mode.
20. The video decoding method according to claim 19, characterized in that the original-mode decoding comprises:
if the data to be decoded has been encoded in the original mode, obtaining, according to the global motion vector, the data in the background image corresponding to the prediction reference data; if the prediction reference data has been encoded in the differential mode, taking the decoded superposition of this prediction reference data and its corresponding data in the background image as the reference; otherwise directly taking the decoded original value of the prediction reference data as the reference to decode the data to be decoded.
21. The video decoding method according to claim 19, characterized in that the differential-mode decoding comprises:
if the data to be decoded has been encoded in the differential mode, obtaining, according to the global motion vector, the data in the background image corresponding to the prediction reference data; if the prediction reference data has been encoded in the original mode, taking the decoded difference between this prediction reference data and its corresponding data in the background image as the reference; otherwise directly taking the decoded original value of the prediction reference data as the reference to decode the current data to be decoded, the decoded data then being superposed with the corresponding data in the background image.
22. A video decoding system for a bitstream generated by the video coding system according to claim 10, characterized in that it comprises:
a module for decoding the background image and the global motion vector;
a module for decoding each video block in the original mode or in the differential mode.
23. The video decoding system according to claim 22, characterized in that the original-mode decoding comprises:
if the data to be decoded has been encoded in the original mode, obtaining, according to the global motion vector, the data in the background image corresponding to the prediction reference data; if the prediction reference data has been encoded in the differential mode, taking the decoded superposition of this prediction reference data and its corresponding data in the background image as the reference; otherwise directly taking the decoded original value of the prediction reference data as the reference for direct decoding, to obtain the final decoded data.
24. The video decoding system according to claim 22, characterized in that the differential-mode decoding comprises:
if the data to be decoded has been encoded in the differential mode, obtaining, according to the global motion vector, the data in the background image corresponding to the prediction reference data; if the prediction reference data has been encoded in the original mode, taking the decoded difference between this prediction reference data and its corresponding data in the background image as the reference; otherwise directly taking the decoded original value of the prediction reference data as the reference to decode the current data to be decoded, the decoded data then being superposed with the corresponding data in the background image to obtain the final decoded data.
CN 201010203823 2010-06-21 2010-06-21 Video encoding/decoding method and system based on background modeling and optional differential mode Active CN101883284B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 201010203823 CN101883284B (en) 2010-06-21 2010-06-21 Video encoding/decoding method and system based on background modeling and optional differential mode

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN 201010203823 CN101883284B (en) 2010-06-21 2010-06-21 Video encoding/decoding method and system based on background modeling and optional differential mode

Publications (2)

Publication Number Publication Date
CN101883284A CN101883284A (en) 2010-11-10
CN101883284B true CN101883284B (en) 2013-06-26

Family

ID=43055156

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 201010203823 Active CN101883284B (en) 2010-06-21 2010-06-21 Video encoding/decoding method and system based on background modeling and optional differential mode

Country Status (1)

Country Link
CN (1) CN101883284B (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102333221B (en) * 2011-10-21 2013-09-04 北京大学 Panoramic background prediction video coding and decoding method
CN102665077A (en) * 2012-05-03 2012-09-12 北京大学 Rapid and efficient encoding-transcoding method based on macro block classification
CN102868891B (en) * 2012-09-18 2015-02-18 哈尔滨商业大学 Multi-angle view video chromatic aberration correction method based on support vector regression
CN105847793B (en) 2015-01-16 2019-10-22 杭州海康威视数字技术股份有限公司 Video coding-decoding method and its device
CN104702956B (en) * 2015-03-24 2017-07-11 武汉大学 A kind of background modeling method towards Video coding
CN106331700B (en) 2015-07-03 2019-07-19 华为技术有限公司 Method, encoding device and the decoding device of reference picture coding and decoding
CN107396138A (en) * 2016-05-17 2017-11-24 华为技术有限公司 A kind of video coding-decoding method and equipment
CN110062235B (en) * 2019-04-08 2023-02-17 上海大学 Background frame generation and update method, system, device and medium
CN112702602A (en) * 2020-12-04 2021-04-23 浙江智慧视频安防创新中心有限公司 Video coding and decoding method and storage medium

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101742319A (en) * 2010-01-15 2010-06-16 北京大学 Background modeling-based static camera video compression method and background modeling-based static camera video compression system

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4697500B2 (en) * 1999-08-09 2011-06-08 ソニー株式会社 TRANSMISSION DEVICE, TRANSMISSION METHOD, RECEPTION DEVICE, RECEPTION METHOD, AND RECORDING MEDIUM

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101742319A (en) * 2010-01-15 2010-06-16 北京大学 Background modeling-based static camera video compression method and background modeling-based static camera video compression system

Also Published As

Publication number Publication date
CN101883284A (en) 2010-11-10

Similar Documents

Publication Publication Date Title
CN101883284B (en) Video encoding/decoding method and system based on background modeling and optional differential mode
CN111405283B (en) End-to-end video compression method, system and storage medium based on deep learning
CN101742319B (en) Background modeling-based static camera video compression method and background modeling-based static camera video compression system
CN101204094B (en) Method for scalably encoding and decoding video signal
CN1316433C (en) Video-information encoding method and video-information decoding method
CN103141092B (en) The method and apparatus carrying out encoded video signal for the super-resolution based on example of video compress use motion compensation
CN104737540A (en) Video codec architecture for next generation video
CN102026000A (en) Distributed video coding system with combined pixel domain-transform domain
CN101272489B (en) Encoding and decoding device and method for video image quality enhancement
CN110691250B (en) Image compression apparatus combining block matching and string matching
CN102137263A (en) Distributed video coding and decoding methods based on classification of key frames of correlation noise model (CNM)
CN106170093A (en) A kind of infra-frame prediction performance boost coded method
CN103002283A (en) Multi-view distributed video compression side information generation method
CN105100814A (en) Methods and devices for image encoding and decoding
WO2018120019A1 (en) Compression/decompression apparatus and system for use with neural network data
CN111726614A (en) HEVC (high efficiency video coding) optimization method based on spatial domain downsampling and deep learning reconstruction
CN102316323B (en) Rapid binocular stereo-video fractal compressing and uncompressing method
CN113068041B (en) Intelligent affine motion compensation coding method
CN101998117B (en) Video transcoding method and device
CN109379590B (en) Pulse sequence compression method and system
CN109474825B (en) Pulse sequence compression method and system
CN112001854A (en) Method for repairing coded image and related system and device
Jilani et al. JPEG image compression using FPGA with Artificial Neural Networks
CN102333220B (en) Video coding and decoding method capable of selectively finishing predictive coding in transform domain
CN105359508A (en) Multi-level spatial-temporal resolution increase of video

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant