CN1717056A

CN1717056A - Intra-frame prediction for high-pass temporally filtered frames in wavelet video coding

Info

Publication number: CN1717056A
Application number: CN 200510081875
Authority: CN
Inventors: L·切普林斯基; J·卡巴尔; S·甘巴里
Original assignee: Mitsubishi Electric Corp
Current assignee: Mitsubishi Electric Corp
Priority date: 2004-07-02
Filing date: 2005-07-04
Publication date: 2006-01-04

Abstract

A method of encoding a sequence of frames using 3-D decomposition that includes temporal filtering, and using intra-frame prediction/interpolation, is applied to the intra-frame prediction of high-pass temporally filtered frames in wavelet video coding. The method comprises: (a) a first stage of intra-frame prediction/interpolation in which any neighbouring block may be used; (b) evaluating the intra-frame prediction/interpolation of step (a) for each block, to identify the blocks to be used for intra-frame prediction; and (c) a second stage of intra-frame prediction/interpolation in which the blocks identified in step (b) are not used for the intra-frame prediction/interpolation of other blocks.

Description

Intra-frame prediction for high-pass temporally filtered frames in wavelet video coding

Technical field

The present invention relates to the encoding and decoding of video sequences using 3-D (t+2D or 2D+t) wavelet coding. More specifically, it proposes an improved method of performing intra-frame prediction for parts (blocks) of the high-pass frames generated during temporal decomposition.

Background art

The following papers provide background on 3-D subband coding: Jens-Rainer Ohm, "Three-Dimensional Subband Coding with Motion Compensation", and Choi and Woods, "Motion-Compensated 3-D Subband Coding of Video". In brief, a sequence of images in a video sequence, such as a group of pictures (GOP), is decomposed into spatio-temporal subbands by motion-compensated (MC) temporal analysis followed by a spatial wavelet transform. In an alternative approach, the temporal and spatial analysis steps can be swapped. The resulting subband coefficients are then further encoded for transmission.

A well-known problem in motion-compensated wavelet video coding arises when temporal filtering cannot be performed because motion estimation for a particular region/block of a frame fails or gives unsatisfactory quality. In the prior art, this problem is addressed by not applying temporal filtering in the generation of the low-pass frames, while still performing motion-compensated prediction in the generation of the high-pass frames. The problem with the latter is that the resulting blocks in the high-pass frame tend to have relatively high energy (high-valued coefficients), which has a negative effect on the subsequent compression steps. In an earlier patent application (EP Appl. No. 03255624.3, the contents of which are incorporated herein by reference), we introduced the idea of using intra-frame prediction to improve the generation of the problem blocks of the high-pass frames. In that invention, the blocks are predicted not from temporally neighbouring frames but from the spatial neighbourhood within the current frame. Different prediction modes can be used, several of which are described in the above-mentioned patent application.

Most video coding systems that use intra-frame prediction (e.g. MPEG-4 Part 10/H.264) restrict the prediction to blocks that have already been processed in the block scanning order (i.e. the prediction is causal). This restriction is not always necessary in the case of wavelet coding. This is discussed in the above-mentioned application and investigated further in the paper "Directional Spatial I-blocks for MC-EZBC Video Coder" by Woods and Wu (ICASSP 2004, May 2004, previously submitted to MPEG in December 2003). One of the novel elements of that paper is the use of interpolation as well as prediction to form high-pass frame blocks. Fig. 1 shows an example of such interpolation, in which the interpolation is performed between the blocks to the left and to the right of the current block.
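The horizontal case of Fig. 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `interpolate_horizontal`, the use of a single boundary column from each neighbouring block, and the linear distance weighting are all hypothetical, since the weighting is not specified here. When the right-hand block is unavailable, the sketch falls back to prediction from the left block alone.

```python
def interpolate_horizontal(left_col, right_col, width):
    """Build a predicted block of the given width, one row per entry of left_col.

    left_col  -- boundary pixels of the block to the left (one per row)
    right_col -- boundary pixels of the block to the right, or None if that
                 block is not available (prediction only)
    """
    block = []
    for r, left in enumerate(left_col):
        if right_col is None:
            # Prediction: extend the left neighbour's pixel across the row.
            block.append([float(left)] * width)
            continue
        right = right_col[r]
        # Interpolation: pixel k lies k+1 steps from the left boundary pixel
        # and width-k steps from the right one (assumed linear weighting).
        block.append([((width - k) * left + (k + 1) * right) / (width + 1)
                      for k in range(width)])
    return block


print(interpolate_horizontal([10, 0], [50, 8], 3))
# prints: [[20.0, 30.0, 40.0], [2.0, 4.0, 6.0]]
```

With both neighbours present each row ramps smoothly from the left value to the right value; with `right_col=None` the row is a flat copy of the left pixel.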

For prediction/interpolation directions other than horizontal and vertical, the situation becomes more complicated, and the number of blocks that need to be used can be large. This is illustrated in Fig. 2, which also shows that in this case part of the block (light grey) has to be predicted rather than interpolated, because some of the blocks to the right of the candidate block are unavailable.

As discussed in that paper, the use of non-causal directions in prediction and interpolation requires careful consideration of block availability, to avoid a situation in which two blocks are each predicted from the other, and to ensure consistency between encoding and decoding. Given the scanning direction of the image (usually left to right and top to bottom), using a causal direction means using only information that is already known as a result of the scan. The solution proposed in the paper is to use a two-sweep process:

1. In the first sweep, only non-causal blocks in DEFAULT mode (i.e. blocks for which motion compensation is considered to have succeeded) are used as predictors. The MSE resulting from intra-frame prediction is compared with the MSE of motion compensation, and the blocks for which intra-frame prediction gives the lower MSE are marked as intra-predicted.

2. In the second sweep, all non-causal blocks that were not marked as intra-predicted in the first sweep are used as predictors. This means that more neighbouring blocks can be used for the prediction/interpolation of the intra-predicted blocks, which tends to reduce the MSE of the high-pass blocks.

The technique described above has several problems. One is the propagation of quantization errors when intra-predicted blocks are repeatedly used for intra-frame prediction. Another is the sub-optimality of the two-sweep process adopted by Woods and Wu: in the first sweep of the algorithm, all non-DEFAULT blocks are excluded from use as predictors, even though only some of them may end up being intra-predicted.

Summary of the invention

Aspects of the invention are set out in the claims.

The first of the above problems is addressed by applying a "block restriction": we do not allow intra-predicted blocks to be used again for prediction. In Woods and Wu, candidate I-blocks are unavailable for interpolation/prediction in the first sweep; these blocks include P-BLOCK and REVERSE blocks. They apply this restriction only to non-causal blocks, so some error propagation can still occur.

We have also designed an improved three-pass mode selection algorithm that relies on the "block restriction". With this restriction in place, more blocks can be allowed in the first pass of mode selection, and their number is only partially restricted in the second pass. A third pass is then applied in the same manner as the second, to ensure consistency between the encoder and the decoder.

Description of drawings

With reference to the following drawings embodiments of the invention are described:

Fig. 1 is a diagram of intra-frame interpolation in the horizontal direction;

Fig. 2 is a diagram of intra-frame interpolation in a diagonal direction;

Fig. 3 is a diagram of the third pass in the method of an embodiment;

Fig. 4 is a diagram of modified interpolation in a diagonal direction;

Fig. 5 is a diagram of whole-block prediction;

Fig. 6 is a diagram of whole-block interpolation;

Fig. 7 is a block diagram of a coding system.

Embodiments

The techniques of the present invention build on prior art techniques such as those described in the prior art documents mentioned above, which are incorporated herein by reference.

In the method according to a first embodiment of the invention (the "block restriction"), when the current block is processed, only those intra-frame prediction/interpolation modes are tried that do not use intra-predicted blocks as predictors. This restriction can be applied when the prediction involves only causal directions (in which case no multi-pass processing is needed) as well as when non-causal directions are in use.
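A minimal sketch of the "block restriction" as a mode filter is given below. The mode names, the block identifiers and the function `allowed_modes` are hypothetical and serve only to show the idea: a mode is tried only if none of its predictor blocks is itself intra-predicted.

```python
def allowed_modes(modes, intra_blocks):
    """Return the names of the modes that respect the block restriction.

    modes        -- dict: mode name -> iterable of predictor block ids
    intra_blocks -- set of block ids already marked as intra-predicted
    """
    return [name for name, predictors in modes.items()
            if not set(predictors) & intra_blocks]


# Hypothetical modes for one block with neighbours L (left), T (top), R (right).
modes = {
    "predict_from_left": ["L"],
    "predict_from_top": ["T"],
    "interpolate_left_right": ["L", "R"],
}
print(allowed_modes(modes, intra_blocks={"R"}))
# prints: ['predict_from_left', 'predict_from_top']
```

Because the right-hand block is intra-predicted in this example, the interpolation mode that depends on it is not tried.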

The method according to a second embodiment of the invention is a three-pass mode selection algorithm that also uses the "block restriction".

This algorithm can be summarized as follows:

1. In the first mode selection pass, the "block restriction" is switched off: we identify the blocks that could benefit from intra-frame prediction without imposing any restriction on whether the predictor blocks are themselves intra-predicted. This means that some of the blocks identified here might not be decodable correctly (i.e. "two blocks are used to predict each other"). We refer to this problem below as "mutual prediction".

2. In the second pass, the "block restriction" is switched on and the candidates identified in the previous pass are re-evaluated. Using the "block restriction" guarantees that the resulting set of intra-predicted blocks is usable (i.e. problems like the one mentioned in item 1 above cannot occur). This is similar to the first sweep in Woods and Wu, with one important difference: the restriction is applied only to the blocks identified in step 1 as potentially intra-predicted, which allows a larger number of blocks to be used.

3. In the third pass, the parts of the high-pass frame corresponding to the intra-predicted blocks are recalculated using the final block modes produced in the second pass. This pass is essential to guarantee consistency between the encoder and the decoder.
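The three passes above can be sketched as follows. This is a simplified illustration, not the reference implementation: blocks are reduced to identifiers, the error of an intra mode is supplied by a hypothetical callback `intra_error(block, usable_neighbours)`, and the error values in the toy example are made up. The toy data describes a row of three blocks in which the middle block ends up being interpolated from both sides.

```python
import math


def three_pass_mode_selection(blocks, neighbours, mc_error, intra_error):
    """Return (candidates, intra, predictors) for blocks listed in scan order.

    neighbours  -- dict: block id -> list of neighbouring block ids
    mc_error    -- dict: block id -> error of motion-compensated prediction
    intra_error -- hypothetical callback (block, usable_neighbours) -> error
    """
    # Pass 1: restriction off, every neighbour may act as a predictor.
    candidates = {b for b in blocks
                  if intra_error(b, set(neighbours[b])) < mc_error[b]}

    # Pass 2: restriction on. Neighbours currently flagged as intra are
    # unusable; a flag is dropped as soon as its block falls back to
    # motion compensation.
    intra = set(candidates)
    for b in blocks:
        if b not in candidates:
            continue
        usable = {n for n in neighbours[b] if n not in intra}
        if not intra_error(b, usable) < mc_error[b]:
            intra.discard(b)

    # Pass 3: with the final modes known, recompute which neighbours each
    # intra block really uses, as a decoder would.
    predictors = {b: {n for n in neighbours[b] if n not in intra}
                  for b in blocks if b in intra}
    return candidates, intra, predictors


def toy_intra_error(block, usable):
    # Made-up errors: interpolation (2 neighbours) beats prediction (1).
    return {0: math.inf, 1: 10.0, 2: 4.0}[len(usable)]


blocks = ["L", "M", "R"]
neighbours = {"L": ["M"], "M": ["L", "R"], "R": ["M"]}
mc_error = {"L": 1.0, "M": 12.0, "R": 12.0}
candidates, intra, predictors = three_pass_mode_selection(
    blocks, neighbours, mc_error, toy_intra_error)
print(sorted(candidates), sorted(intra), sorted(predictors["M"]))
# prints: ['M', 'R'] ['M'] ['L', 'R']
```

In the second pass the middle block M can only use L, because R is still flagged as a candidate. R then loses its flag, so the third pass recomputes M using both L and R.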

In step 1, the candidate blocks for intra-frame prediction are identified, for example using a technique such as those in the prior art mentioned above. These candidate blocks are then intra-predicted/interpolated, with all neighbouring blocks allowed as predictors. The intra-predicted blocks are then evaluated, for example using MSE or MAD (mean squared error or mean absolute difference), to determine whether the error is lower than with motion compensation. If intra-frame prediction is better than motion compensation, the block is identified in step 1 as a block that may benefit from intra-frame prediction.
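The comparison in step 1 can be illustrated with a short sketch. Blocks are represented as flat lists of pixel values, and the function names are hypothetical; only the use of MSE or MAD and the "lower error wins" rule come from the text.

```python
def mse(block, prediction):
    """Mean squared error between a block and its prediction."""
    return sum((a - b) ** 2 for a, b in zip(block, prediction)) / len(block)


def mad(block, prediction):
    """Mean absolute difference between a block and its prediction."""
    return sum(abs(a - b) for a, b in zip(block, prediction)) / len(block)


def may_benefit_from_intra(block, intra_pred, mc_pred, measure=mse):
    """True if the intra prediction has a lower error than motion compensation."""
    return measure(block, intra_pred) < measure(block, mc_pred)


block = [10, 12, 14, 16]
intra_pred = [10, 12, 15, 16]   # error only in one pixel
mc_pred = [8, 12, 18, 16]       # poor motion-compensated prediction
print(mse(block, intra_pred), mse(block, mc_pred))
# prints: 0.25 5.0
print(may_benefit_from_intra(block, intra_pred, mc_pred, measure=mad))
# prints: True
```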

The third pass is preferred because, although the decoder has full information about the intra-predicted blocks, i.e. it knows exactly which blocks are intra-predicted and therefore unavailable as predictors, the encoder must sometimes assume in the second pass that a block is unavailable even though it later turns out to be available. An example is given in Fig. 3.

In this example, the middle block uses intra-frame interpolation/prediction in the horizontal direction. In the second pass of the encoder mode selection, the block on the right has not yet been processed, and it was marked as potentially intra-predicted on the basis of the MSE comparison from the first pass, so it cannot be used for the prediction/interpolation of the current block. However, it may eventually turn out that, because of the restriction applied in the second pass, the right-hand block is not intra-predicted after all. Without the third pass, the process of forming the high-pass coefficients for the middle block in the decoder would then differ from the one used in the encoder, which would cause drift in the reconstructed frame.

Further specific implementations and variations of the above embodiments are described below. These variations and specific embodiments may be combined as appropriate.

In some cases it may be undesirable to use interpolation. One reason is the additional computational and memory overhead it involves. It is also possible that the content of a particular frame or group of frames favours prediction. To address this, we propose switching between interpolation and prediction modes on a per-frame or per-sequence basis. This can be implemented by introducing a signalling mechanism at the appropriate level (e.g. frame, group of pictures, sequence) to inform the decoder which variant is in use.

It is also possible that, for a particular frame or even block, interpolation does not improve performance compared with prediction, especially when the additional restrictions on the allowed directions are taken into account. To address the latter problem, we propose switching on a per-block basis without explicit signalling. In a first solution, interpolation is used only if it can be applied to the whole block (see Fig. 2 for an example where this is not possible); otherwise prediction is used for the whole block. Another solution is to modify the mode decision process so that, in addition to the mean absolute error or mean squared error typically used, it employs an additional measure of the uniformity of the prediction error (such as the maximum absolute difference). This particularly helps visual quality, because it avoids introducing sharp edges inside a block.
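The effect of adding a uniformity measure can be shown with a small sketch. How the two measures are combined is not specified here, so the weighted sum in `mode_cost`, the `weight` parameter and the sample values are assumptions made purely for illustration.

```python
def mean_abs_error(block, prediction):
    return sum(abs(a - b) for a, b in zip(block, prediction)) / len(block)


def max_abs_error(block, prediction):
    return max(abs(a - b) for a, b in zip(block, prediction))


def mode_cost(block, prediction, weight):
    # Assumed combination: average error plus a weighted peak error.
    return mean_abs_error(block, prediction) + weight * max_abs_error(block, prediction)


def choose_mode(block, candidates, weight=0.5):
    """Pick the candidate (name -> predicted pixels) with the lowest cost."""
    return min(candidates, key=lambda name: mode_cost(block, candidates[name], weight))


block = [10, 10, 10, 10]
candidates = {
    "interpolation": [10, 10, 10, 18],  # small mean error, one large outlier
    "prediction": [13, 13, 13, 13],     # larger mean error, but uniform
}
print(choose_mode(block, candidates, weight=0.0))  # prints: interpolation
print(choose_mode(block, candidates, weight=0.5))  # prints: prediction
```

With the peak error ignored, the candidate with the single large outlier wins; once the peak error is taken into account, the uniformly distributed error is preferred.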

A further solution, which involves implicit signalling, is to introduce three separate block modes for each direction: one for interpolation and two for prediction. One of the three modes is selected on the basis of the minimum of the error measure, in the same way as for the intra/inter mode decision.

For directions other than horizontal and vertical, we almost always have to apply prediction on the diagonal, even when the remainder of the block can be interpolated. To address this, we propose using a combination of available pixels at the positions of the pixels missing on the diagonal. This is shown in Fig. 4, where the average of pixels a and b is used at the position of pixel x.
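A minimal sketch of this replacement is shown below, assuming that the missing pixel is marked with `None` and that pixels a and b are its horizontally and vertically adjacent available pixels. The function name and the block layout are hypothetical.

```python
def fill_missing(block):
    """Replace each None pixel by the mean of its available 4-neighbours."""
    rows, cols = len(block), len(block[0])
    filled = [row[:] for row in block]
    for r in range(rows):
        for c in range(cols):
            if block[r][c] is not None:
                continue
            neighbours = [
                block[rr][cc]
                for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                if 0 <= rr < rows and 0 <= cc < cols and block[rr][cc] is not None
            ]
            if neighbours:
                filled[r][c] = sum(neighbours) / len(neighbours)
    return filled


# The top-right pixel x sits at the end of the diagonal and is unavailable;
# its neighbours are a = 2 (to the left) and b = 6 (below).
block = [[1, 2, None],
         [4, 5, 6],
         [7, 8, 9]]
print(fill_missing(block)[0])
# prints: [1, 2, 4.0]
```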

An idea similar to the second pass above can be applied in the non-interpolation case, where it can form the basis of a single-pass operation. In this case the mutual prediction problem does not arise, but we have observed that using previously intra-predicted blocks as the basis for further intra-frame prediction can cause excessive error propagation and hence a serious loss of performance. More precisely, the "block restriction" can be used in the case of causal-direction prediction to stop error propagation within the frame.

When only causal prediction is performed, using a whole block of pixels as the predictor, as shown in Fig. 5, usually gives better performance than using only a single line.

One possible explanation for this is that the effect of quantization error propagation is less pronounced when the same pixel is not used to predict multiple pixels. Whole-block prediction can also be combined with interpolation, in which case two neighbouring blocks can serve as candidates for predicting the block between them. An example of such prediction is shown in Fig. 6, where the first half of the block is predicted from the second half of the block above, and the second half is predicted from the first half of the block below.
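The arrangement of Fig. 6 can be sketched as below. The sketch assumes that "predicted from" means a direct pixel-by-pixel copy and that blocks are square lists of rows with an even number of rows; both are assumptions, and the function name is hypothetical.

```python
def predict_from_half_blocks(upper, lower):
    """Predict a block from the bottom half of `upper` and the top half of `lower`."""
    size = len(upper)
    if size != len(lower) or size % 2:
        raise ValueError("blocks must have the same, even number of rows")
    half = size // 2
    top_half = [row[:] for row in upper[half:]]     # bottom half of upper block
    bottom_half = [row[:] for row in lower[:half]]  # top half of lower block
    return top_half + bottom_half


upper = [[1, 1], [2, 2], [3, 3], [4, 4]]
lower = [[5, 5], [6, 6], [7, 7], [8, 8]]
print(predict_from_half_blocks(upper, lower))
# prints: [[3, 3], [4, 4], [5, 5], [6, 6]]
```

Each predictor pixel is used for exactly one predicted pixel, which is the property the text associates with reduced propagation of quantization errors.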

Another conclusion that can be drawn from the good performance observed for whole-block prediction is that using whole blocks as predictors tends to restrict intra-frame prediction to more uniformly textured regions. We therefore propose modifying the mode selection criterion for "line-based" prediction/interpolation to include a measure of the flatness of the relevant region around the block to be intra-predicted. This can be implemented by calculating the variance of the pixel values in the region of the block that would be used for whole-block prediction.
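A variance-based flatness measure of this kind can be sketched as follows. The threshold value and the function names are hypothetical; only the use of the pixel variance as the flatness measure comes from the text.

```python
from statistics import pvariance


def region_flatness(region):
    """Population variance of all pixel values in a region (list of rows)."""
    return pvariance([p for row in region for p in row])


def is_flat(region, threshold=25.0):
    """Assumed rule: a region counts as flat if its variance is below a threshold."""
    return region_flatness(region) <= threshold


flat_region = [[100, 101], [99, 100]]
edge_region = [[0, 0], [200, 200]]
print(is_flat(flat_region), is_flat(edge_region))
# prints: True False
```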

The mode selection criterion can be adapted to take into account the temporal decomposition level at which the high-pass frame under consideration is formed. We have carried out experiments in which we introduced a bias, depending on the temporal level, into the comparison between the inter-frame prediction error and the intra-frame prediction error. The results show a slight improvement in performance when intra-frame prediction is favoured more strongly at the finer decomposition levels.
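One possible form of such a level-dependent bias is a multiplicative factor applied to the intra error before the comparison. The sketch below assumes this form; the factor values, the level numbering (level 1 taken here as the level at which intra prediction is favoured most) and the function name are all hypothetical.

```python
# Hypothetical bias factors per temporal decomposition level.
# A factor below 1.0 favours intra-frame prediction at that level.
BIAS_BY_LEVEL = {1: 0.8, 2: 0.9, 3: 1.0}


def choose_intra(intra_error, inter_error, level, bias_by_level=BIAS_BY_LEVEL):
    """True if the biased intra error is lower than the inter-frame error."""
    return intra_error * bias_by_level[level] < inter_error


# The same pair of errors leads to different decisions at different levels.
print(choose_intra(10.5, 10.0, level=1))  # prints: True
print(choose_intra(10.5, 10.0, level=3))  # prints: False
```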

Another way of exploiting the dependence on the temporal decomposition level is to adapt the entropy coding of the block mode decisions. It has been found that intra-frame prediction modes occur more frequently at the lower decomposition levels; coding efficiency should therefore improve if the variable-length codes are designed accordingly, i.e. if shorter codes are assigned to the intra-frame prediction modes at the finer temporal decomposition levels. For example, this approach may be effective if a larger total number of block modes is used.
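The idea of level-dependent variable-length codes can be illustrated with Huffman coding, which is used here only as a familiar way of deriving code lengths from mode frequencies; the text does not name a particular code design. The mode names and frequency tables are made up for the example.

```python
import heapq
from itertools import count


def huffman_code_lengths(freqs):
    """Return the Huffman code length (in bits) of each symbol in freqs."""
    tie = count()  # unique tie-breaker so symbol lists are never compared
    heap = [(f, next(tie), [s]) for s, f in freqs.items()]
    heapq.heapify(heap)
    lengths = {s: 0 for s in freqs}
    while len(heap) > 1:
        f1, _, s1 = heapq.heappop(heap)
        f2, _, s2 = heapq.heappop(heap)
        for s in s1 + s2:
            lengths[s] += 1  # every merge adds one bit to the merged symbols
        heapq.heappush(heap, (f1 + f2, next(tie), s1 + s2))
    return lengths


# Hypothetical block-mode counts at two temporal decomposition levels.
few_intra = {"inter": 70, "intra_h": 20, "intra_v": 10}
many_intra = {"inter": 20, "intra_h": 50, "intra_v": 30}
print(huffman_code_lengths(few_intra))
# prints: {'inter': 1, 'intra_h': 2, 'intra_v': 2}
print(huffman_code_lengths(many_intra))
# prints: {'inter': 2, 'intra_h': 1, 'intra_v': 2}
```

Where intra modes are rare the inter mode gets the one-bit code; where an intra mode dominates, the short code moves to that mode, so a separate table per level saves bits.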

Within an intra-predicted block, the effect of quantization errors can be amplified if a single pixel is used to predict several pixels. The selection of the pixels to be used as predictors therefore preferably takes into account the number of pixels that are predicted from a single predictor pixel.
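The quantity mentioned here, the number of pixels predicted from a single predictor pixel, can be counted directly. The sketch below compares two hypothetical predictor layouts for a 4x4 block; the layouts, the identifiers and the function name are assumptions for illustration.

```python
from collections import Counter


def max_fan_out(predictor_of):
    """Largest number of pixels predicted from any single predictor pixel.

    predictor_of -- dict: predicted pixel (row, col) -> predictor pixel id
    """
    return max(Counter(predictor_of.values()).values())


N = 4
# Line-based vertical prediction: every pixel in column c is predicted from
# the single boundary pixel above that column.
line_based = {(r, c): ("above", N - 1, c) for r in range(N) for c in range(N)}
# Whole-block prediction: each pixel is predicted from the co-located pixel
# of the neighbouring block.
whole_block = {(r, c): ("above", r, c) for r in range(N) for c in range(N)}

print(max_fan_out(line_based), max_fan_out(whole_block))
# prints: 4 1
```

A mode selection rule could prefer the layout with the lower fan-out when the prediction errors are otherwise comparable.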

The invention can be implemented using a system similar to a prior art system, with suitable modifications. For example, the basic components of the coding system may be as shown in Fig. 7, except that the MCTF (motion-compensated temporal filtering) module is modified to perform the processing of the embodiments described above.

In this specification, the term "frame" is used to describe an image unit, including after filtering, but the term also applies to other similar terminology such as image, field, picture, or sub-units or regions of an image, frame etc. The terms pixels and blocks or groups of pixels may be used interchangeably where appropriate. In this specification, the term image means a whole image or a region of an image, except where apparent from the context. Similarly, a region of an image can mean the whole image. An image includes a frame or a field, and relates to a still image, or an image in a sequence of images such as a film or video, or an image in a related group of images.

The image may be a greyscale or colour image, or another type of multi-spectral image (for example IR, UV or other electromagnetic image), or an acoustic image etc.

Except where apparent from the context or as understood by the person skilled in the art, intra-frame prediction can mean interpolation and vice versa, and prediction/interpolation means prediction or interpolation or both. An embodiment of the invention may therefore involve only prediction, only interpolation, or a combination of prediction and interpolation (for intra-coding), as well as motion compensation/inter-frame coding, and a block can mean a pixel or pixels from a block.

The invention can be implemented, for example, in a computer system, with suitable software and/or hardware modifications. For example, the invention can be implemented using a computer or similar apparatus having: control or processing means such as a processor or control device; data storage means, including image storage means, such as memory, magnetic storage, CD, DVD etc.; data output means such as a display, monitor or printer; data input means such as a keyboard; and image input means such as a scanner; or any combination of such components together with additional components. Aspects of the invention can be provided in software and/or hardware form, or in an application-specific apparatus or application-specific modules such as chips. Components of a system in an apparatus according to an embodiment of the invention may be provided remotely from other components, for example over the internet. An encoder is shown in Fig. 7, and a corresponding decoder has, for example, corresponding components for performing the inverse decoding operations.

Other types of 3-D decomposition and transform may be used. For example, the invention could be applied in a decomposition scheme in which spatial filtering is performed first and temporal filtering afterwards.

Claims

1. A method of encoding a sequence of frames using 3-D decomposition including temporal filtering, and using intra-frame prediction/interpolation, the method comprising:

(a) a first stage of intra-frame prediction/interpolation, in which any neighbouring block may be used;

(b) evaluating the intra-frame prediction/interpolation of step (a) for each block, to identify blocks to be used for intra-frame prediction;

(c) a second stage of intra-frame prediction/interpolation, in which the blocks identified in step (b) are not used for the intra-frame prediction/interpolation of other blocks.

2. The method according to claim 1, wherein in step (c) any neighbouring block other than the blocks identified in step (b) may be used for intra-frame prediction/interpolation.

3. The method according to claim 1 or claim 2, further comprising:

(d) evaluating the intra-frame prediction/interpolation of step (c), to identify blocks to be used for intra-frame prediction/interpolation; and

(e) performing a third stage of intra-frame prediction/interpolation on the blocks identified in step (d).

4. A method of encoding a sequence of frames using 3-D decomposition including temporal filtering, and using intra-frame prediction/interpolation, the method comprising: identifying blocks to be used for intra-frame prediction/interpolation, wherein blocks that are intra-predicted/interpolated are not used for the intra-frame prediction/interpolation of other blocks.

5. The method according to claim 4, wherein intra-frame prediction/interpolation is performed using only preceding blocks in the scanning order, and preceding blocks that are themselves intra-predicted/interpolated are not used for the intra-frame prediction/interpolation of other blocks.

6. A method of encoding a sequence of frames using 3-D decomposition including temporal filtering, and using intra-frame prediction and interpolation, the method comprising switching between a plurality of intra-frame prediction/interpolation modes according to a predetermined criterion.

7. The method according to claim 6, comprising switching on the basis of, for example, block, frame, group of pictures and sequence.

8. The method according to claim 7, comprising using intra-frame interpolation for a block only when intra-frame interpolation can be used for the whole block.

9. The method according to claim 6, wherein the plurality of modes comprises line-based prediction/interpolation and block-based prediction/interpolation.

10. The method according to claim 9, wherein the switching is based on a measure of flatness.

11. The method according to claim 7, comprising switching, on a block basis, between interpolation and two corresponding predictions on the basis of the minimum of an error measure.

12. A method of encoding a sequence of frames using 3-D decomposition including temporal filtering, and using intra-frame prediction/interpolation, the method comprising: switching between inter-frame prediction/interpolation and intra-frame prediction/interpolation, wherein the switching depends on the temporal decomposition level.

13. The method according to claim 11, comprising using a bias, based on the temporal decomposition level, in the comparison of prediction errors.

14. A method of encoding a sequence of frames using 3-D decomposition including temporal filtering, and using intra-frame prediction/interpolation, the method comprising: using two or more lines from the block used for prediction/interpolation.

15. The method according to claim 13, comprising using a whole block for prediction.

16. The method according to claim 14, comprising using half blocks for prediction/interpolation.

17. A method of encoding a sequence of frames using 3-D decomposition including temporal filtering, and using intra-frame prediction/interpolation, the method comprising replacing a pixel that is not available for prediction/interpolation by a value based on one or more neighbouring pixels.

18. The method according to claim 16, comprising using a combination of two or more neighbouring pixels.

19. The method according to claim 17, comprising: using the pixels vertically and horizontally adjacent to a pixel located at the end of a diagonal line of the block, and replacing that pixel with the average of those adjacent pixels.

20. A method of encoding a sequence of frames using 3-D decomposition including temporal filtering, and using intra-frame prediction/interpolation, the method comprising: using two or more measures of prediction error to determine whether to use motion compensation (inter-frame) or intra-frame prediction/interpolation, or to determine whether to use intra-frame prediction or intra-frame interpolation.

21. A method of encoding a sequence of frames using 3-D decomposition including temporal filtering, and using intra-frame prediction/interpolation, wherein the type of entropy coding used depends on the temporal decomposition level.

22. A method of encoding a sequence of frames using 3-D decomposition including temporal filtering, and using intra-frame prediction/interpolation, wherein the selection of the pixels to be used as predictors takes into account the number of pixels predicted from a predictor pixel.

23. A method of decoding frames encoded using the method of any preceding claim.

24. Use, including for example transmission or reception, of data encoded using the method according to any one of claims 1-23.

25. A coding and/or decoding apparatus for carrying out the method according to any one of claims 1-23.

26. A computer program, system or computer-readable storage medium for carrying out the method according to any one of claims 1-23.