CN101102503A - Prediction method for motion vector between video coding layers - Google Patents

Prediction method for motion vector between video coding layers

Info

Publication number
CN101102503A
CN101102503A, CN 200610101083, CN200610101083A
Authority
CN
China
Prior art keywords
motion vector
basic layer
time span
frame
standard
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN 200610101083
Other languages
Chinese (zh)
Inventor
谢清鹏
熊联欢
周建同
曾鹏鑫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd filed Critical Huawei Technologies Co Ltd
Priority to CN 200610101083 priority Critical patent/CN101102503A/en
Publication of CN101102503A publication Critical patent/CN101102503A/en
Pending legal-status Critical Current


Landscapes

  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The invention improves the precision of inter-layer motion vector prediction and thereby the compression efficiency of the system. Under the condition that the base layer is in interlaced mode and the enhancement layer is in non-interlaced (progressive) mode, when motion vectors are generated for the virtual base layer, the related motion vectors in the base layer are adjusted to be consistent in time span with the corresponding motion vectors in the virtual base layer, which improves the precision of inter-layer motion vector prediction. Two approaches are used to adjust the related motion vectors in the base layer: the first uses temporal scaling to adjust the motion vector of the related coding block in the base layer to be consistent with the corresponding motion vector in the virtual base layer; the second selects, from a neighboring frame or field, a motion vector whose time span is consistent with that of the corresponding motion vector in the virtual base layer.

Description

Method for predicting motion vectors between video coding layers
Technical field
The present invention relates to the field of video coding, and in particular to inter-layer motion vector prediction in scalable video coding (Scalable Video Coding, "SVC") under interlaced mode.
Background technology
With the rapid development of the computer Internet and mobile communication networks, multimedia compression and communication technologies are applied ever more widely; network broadcasting, movie playback, distance learning, and online news sites all use streaming media technology. Video and audio are currently transmitted over networks mainly in two ways: download and streaming. Streaming transmits the video/audio signal continuously; while the media plays on the client, the remainder continues to download in the background. Streaming itself has two forms: progressive streaming and real-time streaming. Real-time streaming transmits in real time and is particularly suitable for live events; it must match the connection bandwidth, which means picture quality may degrade as network speed drops so as to reduce the demand on transmission bandwidth.
In particular, with the appearance of third-generation mobile communication systems (3rd Generation, "3G") and the rapid development of networks based on the Internet Protocol ("IP"), video communication is gradually becoming one of the main communication services. Two-party and multi-party video communication services, such as video telephony, video conferencing, and mobile multimedia terminal services, place stringent requirements on the transmission and quality of service of multimedia data streams: not only must the network deliver good real-time performance, but video compression coding must also be correspondingly efficient.
SVC is the latest codec technology formulated by MPEG and can change resolution and frame rate according to the application. SVC is positioned as an extension of the H.264 standard, whose commercialization is developing rapidly; its characteristic is that resolution and frame rate can be adapted to the terminal type and wireless conditions. Another advantage of SVC is that a single prepared video source can serve multiple terminals and services. An SVC bitstream comprises one base layer (Base Layer, "BL") and one or more enhancement layers (Enhanced Layer, "EL"). In the SVC extension of the Advanced Video Coding ("AVC") standard jointly formulated by ITU and MPEG, the base layer is specified to be compatible with H.264/AVC.
Fig. 1 shows the basic algorithm block diagram. When video stream data enters the encoder, two-dimensional spatial sampling produces lower-resolution images for the base layer and each lower enhancement layer; evidently, the lower the layer, the lower its temporal or spatial resolution and other indices. Each layer is coded independently, including its own motion estimation. After a lower layer finishes coding, its image is interpolated back to the resolution of the layer above and passed up, so that the upper layer's core encoder can use the lower layer's image for prediction and improve coding efficiency. This is the principle of SVC inter-layer predictive coding. After coding, the layers are multiplexed and sent on the channel, and the receiver can, according to quality-of-service requirements or bandwidth conditions, subscribe to or temporarily select the media data of each layer, efficiently realizing universal media access.
In current SVC coding, inter-layer prediction consists mainly of two parts: texture prediction and motion information prediction. Inter-layer texture prediction uses the texture information of the corresponding block in the base layer or the previous layer as intra prediction information for the current block. As shown in Fig. 2, in order to obtain the base-layer prediction information corresponding to a macroblock in the higher layer, the block at the corresponding position in the base layer must also be deblocked and interpolated. The interpolation ratio is determined by the spatial resolution ratio between the base layer and the enhancement layer. This mode is also called intra base-layer (intra_BL) mode, i.e. prediction from the texture information of the layer below at the same time instant.
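As a rough illustration of the interpolation step in intra_BL prediction, the sketch below upsamples a base-layer block by the spatial resolution ratio. The function name and the nearest-neighbor interpolation are assumptions for clarity; the actual codec applies deblocking and filtered interpolation.

```python
def upsample_block(block, ratio=2):
    """Nearest-neighbor upsampling of a base-layer texture block by the
    base/enhancement spatial resolution ratio (simplified sketch; the
    real codec uses deblocking and a filtered interpolation)."""
    return [[px for px in row for _ in range(ratio)]
            for row in block for _ in range(ratio)]

# A 2x2 base-layer block becomes a 4x4 prediction for the enhancement layer.
print(upsample_block([[1, 2], [3, 4]]))
# [[1, 1, 2, 2], [1, 1, 2, 2], [3, 3, 4, 4], [3, 3, 4, 4]]
```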
Several basic concepts need mentioning here. In ordinary non-layered video coding there are two prediction modes: intra and inter. Intra prediction does not involve the relationship between preceding and following frames on the time axis, while inter prediction predicts from preceding or following frames. SVC newly introduces the concept of layers, but within one layer the traditional video coding modes still apply. Therefore, to distinguish it from the intra mode, intra base-layer (intra_BL) here denotes the inter-layer prediction mode from a lower layer to the layer above at the same time instant.
Unlike texture information, motion information inherently involves the relationship between preceding and following frames on the time axis; but since the frames of the various layers correspond at each time instant, their motion information also corresponds. Inter-layer motion information prediction therefore uses the motion information of the corresponding block in the base layer or the previous layer to predict the motion information of the current block. When coding the motion information of the enhancement layer, two new macroblock modes are added to the original H.264 coding modes: base layer mode and quarter-pel refinement mode.
When base layer mode is used, the current macroblock need not transmit any further motion information; the information is taken over directly without prediction. In this mode the motion/prediction information and macroblock partitioning of the corresponding macroblock in the previous layer are used. When the spatial resolution of the previous layer is smaller than that of the current layer, the motion vectors are magnified. If the corresponding macroblock of the previous layer is an intra macroblock, the current macroblock mode is set to intra_BL mode. For each partition of the current macroblock, the reference frame index used is also consistent with that of the corresponding macroblock partition in the previous layer. Between the enhancement layer and its base layer, the corresponding motion vector is multiplied by a factor related to the spatial resolution ratio.
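The scaling just described can be sketched as follows; the function name and the dyadic ratio in the example are illustrative assumptions, not taken from the patent.

```python
def scale_mv_spatial(mv, ratio):
    """Scale a base-layer motion vector by the enhancement/base spatial
    resolution ratio, as done in base layer mode (sketch)."""
    mvx, mvy = mv
    return (mvx * ratio, mvy * ratio)

# Dyadic case: enhancement layer has twice the base-layer resolution.
print(scale_mv_spatial((3, -2), 2))  # (6, -4)
```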
The quarter-pel refinement mode is generally used when the spatial resolution of the previous layer is smaller. This mode is similar to base layer mode; the macroblock partitioning, reference frame indices, and motion vectors are obtained in the same way. However, an additional quarter-sample refinement is transmitted and added to each motion vector.
The details of the two SVC inter-layer prediction modes, texture information and motion information, have been described above. Note, however, that all of the above inter-layer prediction methods are realized when both the enhancement layer and the base layer are coded in frame mode. When SVC introduces the interlace mode in the new standard, the corresponding inter-layer prediction must be modified in order to remain applicable without reducing coding efficiency.
The principle of the interlace mode that SVC will introduce needs explanation. Interlace mode exploits the similarity of two consecutive frames by merging them into one frame for coding to improve coding efficiency; the effect is very good for static or slowly moving video streams. For example, two consecutive images of the video stream are sampled along the time axis to obtain half-height images whose vertical resolution is halved; the two half images are then interleaved line by line to obtain the interlaced image. This process is called interleaving; the interlaced image is called a frame, and each half image before interleaving is called a field.
In addition, the concept of the macroblock needs introduction: a macroblock is a small image region within a frame, the basic operating unit of the coding process, usually 16 × 16. Under interlace mode, because frame and field differ by a factor of two in vertical resolution, the macroblock correspondence also involves a factor-of-two scaling.
Two coding schemes under interlace mode are introduced below: one is macroblock-level adaptive frame/field coding (MBAFF), the other is picture-level adaptive frame/field coding (PAFF). Two notions must be introduced here: frame coding mode and field coding mode, i.e. joint coding and independent coding. Frame coding mode (joint coding) means the corresponding content of the two fields is coded together, which suits still or slowly moving image streams; field coding mode (independent coding) means the corresponding content of the two fields is coded independently, which suits violently moving images. Frame coding mode and field coding mode merely borrow the terms frame and field and should not be confused with frames and fields themselves.
Thus MBAFF and PAFF differ only in the level at which the coding mode is chosen; the basic principle is identical. PAFF can be regarded as also operating in macroblock units, except that all macroblocks must select the same coding mode, whereas under MBAFF each can choose independently.
Interlace mode can be applied not only in the base layer but also in the enhancement layer. In contrast to interlace mode ("i mode"), the ordinary progressive mode is called "p mode". In addition, after introducing interlace mode, the rates or frame rates of the base layer and the enhancement layer may differ; for example, the base-layer rate may be halved after introducing interlace mode.
In summary, introducing interlace mode can effectively improve the coding efficiency and compression ratio of the media stream. However, because the i mode changes the content structure of its layer (base layer or enhancement layer), the corresponding content or rate may become inconsistent with other layers, making the original inter-layer texture or motion prediction impracticable. Therefore, a method and apparatus for inter-layer motion and texture prediction under interlace mode is needed, to improve inter-layer prediction efficiency under interlace mode and thereby maintain or even improve the compression efficiency of SVC.
For the various combinations of interlaced (i) and progressive (frame, p) modes between the enhancement layer and the base layer, the currently published literature gives an overall scheme of inter-layer prediction under the i mode by constructing a virtual base layer (Virtual Base Layer, "VBL") to perform the conversion between the various modes and the correspondence between layers. The virtual base layer retains the texture and motion information of the base layer while having the same frame/field coding structure as the enhancement layer, which helps complete inter-layer prediction under the i mode without changing the original system framework.
In the current open literature the combinations of base layer and enhancement layer can be i→p, p→i, or i→i. For example, i→p means the base layer is coded in interlace mode (including PAFF and MBAFF), while the enhancement-layer input sequence is progressively scanned and coded in frame mode; the other notations follow analogously. For the different combinations, the open literature gives different inter-layer prediction solutions; the formation of the virtual base layer in the p→i, i→i, p→p, and i→p cases is shown in Fig. 3, Fig. 4, and Fig. 5, respectively.
From the above it can be seen that the virtual base layer is the bridge of inter-layer prediction. The virtual base layer has the same frame/field coding mode as the enhancement layer, so during enhancement-layer coding the information of the virtual base layer can be used directly for inter-layer prediction. At the same time, the virtual base layer is formed on the basis of the base layer by integrating correspondences, preserving as much of the base layer's information as possible and giving higher prediction accuracy. This shows that the formation of the virtual base layer is the key technology for inter-layer prediction under interlaced mode.
Motion vector prediction is a key function of the virtual base layer: when the frame/field mode of the base layer differs from that of the enhancement layer, the virtual base layer must map the base layer's motion vectors into motion vectors under the corresponding enhancement-layer mode while maintaining a certain prediction precision.
According to the correspondence between the two layers, motion vector prediction falls into four modes: field coding → frame coding, frame coding → field coding, field coding → field coding, and frame coding → frame coding. In field→field and frame→frame coding the frame/field modes of the two layers are the same, so motion vectors can simply be copied without any conversion. The key motion vector prediction techniques are therefore the motion vector conversions in the field→frame and frame→field cases. The existing motion vector prediction method is illustrated below taking the MBAFF coding case under interlaced mode as an example:
The mapping relations of frame are arrived shown in Fig. 6 and 7 in the field.Basic layer macro block is to being to encode according to the field, and enhancement layer macro block is to being framing code.The reference frame of virtual basic each fritter of layer macro block centering comes by normalization according to the reference frame of the direction of arrow from the corresponding fritter of basic layer among Fig. 6, because the twice when reference frame index value is the frame coding during coding of field, the reference frame index of field turned towards as 1 and 2 turn to 1,3 and 4 are normalized to 2, so analogize.Provided the corresponding referring-to relation of motion vector among Fig. 7, the motion vector of virtual basic each piece of layer is tried to achieve motion vector from basic layer corresponding blocks by conversion according to the number and the reference frame corresponding relation of B4 (4 * 4 fritter) piece of correspondence, the conversion here mainly is that motion vector need amplify 2 times in vertical direction, because the height of field is corresponding vertical frame dimension degree half, the motion vector that the motion vector of field is converted to frame need multiply by 2.
The frame-to-field mapping is shown in Figs. 8 and 9. The base-layer macroblock pair is frame coded, and the enhancement-layer macroblock pair is field coded. In Fig. 8, the reference frame of each sub-block of the virtual base-layer field-coded macroblock pair is obtained by denormalizing, along the arrows, the reference frame of the corresponding base-layer sub-block. Because reference frame index values in field coding are twice those in frame coding, the frame reference index 1 is denormalized to 1, 2 is denormalized to 3, 3 is denormalized to 5, and so on. Fig. 9 gives the corresponding reference relations of the motion vectors: the motion vector of each block of the virtual base layer is obtained by converting the motion vector of the corresponding base-layer block according to the number of the corresponding B4 (4 × 4) block, the corresponding fill pattern (hatching), and the reference frame correspondence. The conversion consists mainly of shrinking the motion vector by a factor of 2 in the vertical direction: because the field height is half the corresponding frame height, a frame motion vector converted to a field motion vector must be divided by 2.
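The inverse, frame-to-field, direction can be sketched in the same style; names are again assumptions, and the integer halving is a simplification of whatever rounding the codec actually applies.

```python
def frame_to_field(frame_ref_idx, mv):
    """Map a frame-coded block's reference index and motion vector to a
    field-coded virtual base layer. Frame index 1 denormalizes to field
    index 1, 2 to 3, 3 to 5 (i.e. 2k-1); the vertical component halves
    because a field is half the frame height."""
    field_ref_idx = 2 * frame_ref_idx - 1
    mvx, mvy = mv
    return field_ref_idx, (mvx, mvy // 2)

print(frame_to_field(3, (5, -4)))  # (5, (5, -2))
```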
The general inter-layer prediction case is shown in Fig. 10. The base layer has 3 frames and the enhancement layer correspondingly also has 3 frames; each base-layer frame comprises a top field and a bottom field. Examining the correspondence between the base layer and the enhancement layer, one finds that the two fields of a base-layer frame do not both correspond to the corresponding enhancement-layer frame; only one does. Depending on the sampling time points of the top and bottom fields of the original sequence, the field whose time point matches is the one corresponding to the enhancement-layer frame. Here it is assumed that the top field corresponds. Therefore, during inter-layer prediction, using the information of the whole base-layer frame or of the bottom field will not predict well; it is better to use the top-field information.
The same mismatch also exists specifically for motion vector prediction. Fig. 11 shows the case where the base layer and the enhancement layer have the same frame rate, the base layer is interlace coded, the enhancement layer is progressive coded, the base-layer group of pictures (Group of Pictures, "GOP") = 2, and the corresponding enhancement-layer GOP = 4. The top three rows in the figure correspond respectively to the base layer, the virtual base layer, and the enhancement layer. The arrows indicate motion vectors, showing the reference frame and the magnitude of each motion vector. In the base layer, the field-coded middle frame's motion vectors may reference the top and bottom fields of the previous frame as well as the top and bottom fields of the following frame. Because the enhancement layer and the virtual base layer are frame coded, they can only reference the previous and following frames. In forming the virtual base-layer motion vectors of the middle frame, the algorithm in the current open literature takes blocks of the top field and the bottom field respectively. From the foregoing analysis, only the information of the field corresponding to the current frame should be used for greater accuracy; here that is the top field of the middle frame. Even so, inaccurate prediction still occurs. The fourth row in Fig. 11 lays out each base-layer field according to its average time span. When the top field of the middle frame references the bottom field of the previous frame or the bottom field of the following frame (the dotted arrows), the forward and backward motion vector reference indices are both 2 (referencing a bottom field). The current algorithm in the open literature is a normalization algorithm: after normalization the reference frame of the virtual base layer becomes 1 (referencing the top field), and the motion vector is merely magnified by a factor of two in the vertical direction and converted directly into the virtual base-layer motion vector, corresponding to the second row in Fig. 11. Obviously, this conversion is inaccurate. Only when the middle frame references the top fields of the previous and following frames is the conversion reasonably accurate, because then its actual time span is consistent with the time span of the virtual base layer. When it references the bottom fields of the previous and following frames, the actual time span is inconsistent with the time span of the virtual base layer, so direct conversion makes the inter-layer predicted motion vector of the virtual base layer insufficiently accurate, reducing coding efficiency.
Summary of the invention
In view of this, the main purpose of the present invention is to provide a method for predicting motion vectors between video coding layers that improves the precision of inter-layer motion vector prediction and thereby the compression efficiency of the system.
To achieve the above purpose, the invention provides a method for predicting motion vectors between video coding layers, comprising the following steps:
When the base layer is in interlace mode and the enhancement layer is in progressive mode, if the reference frame of the motion vector of a base-layer coding block does not coincide exactly on the time axis with the reference frame of the coding block of the corresponding enhancement-layer frame, then according to the motion vector of the base-layer coding block and/or of the co-located block in its neighboring frame or field, the standard time span between the corresponding enhancement-layer frame and its reference frame, and the time span between the field containing the base-layer coding block and its reference field, a standard motion vector whose time span equals the standard time span is generated, and the inter-layer predicted or actual motion vector of the corresponding enhancement-layer block is generated from this standard motion vector.
The standard motion vector may be generated by the following formula:

standard motion vector = motion vector of the base-layer coding block × standard time span / time span of the motion vector of the base-layer coding block.
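The formula above can be sketched numerically as below; the use of exact rational arithmetic and rounding to integer precision are my assumptions about how the computation would be carried out.

```python
from fractions import Fraction

def temporal_scale(mv, span_mv, span_std):
    """Rescale a base-layer motion vector from its own time span to the
    standard time span of the enhancement layer (sketch)."""
    ratio = Fraction(span_std, span_mv)
    return tuple(int(round(c * ratio)) for c in mv)

# A motion vector spanning 3 field periods, rescaled to a span of 2:
print(temporal_scale((6, -9), 3, 2))  # (4, -6)
```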
Alternatively, in the method, the standard motion vector is generated as follows: from among the motion vectors of the co-located blocks in the neighboring frames or fields of the base-layer coding block, a motion vector having the standard time span is copied as the standard motion vector.
Further, in the method, if the time span of the motion vector of the base-layer coding block is greater than the standard time span, the standard motion vector is generated by the following formula:

standard motion vector = motion vector of the base-layer coding block × standard time span / time span of the motion vector of the base-layer coding block;

if the time span of the motion vector of the base-layer coding block is less than the standard time span, it is further judged whether the time span of the motion vector of the co-located block in the neighboring frame or field equals the standard time span; if so, that motion vector is copied as the standard motion vector; otherwise the standard motion vector is generated by the above formula.
Further, in the method, the step of generating the motion vector of the corresponding enhancement-layer block from the standard motion vector comprises the following sub-steps: first generating the virtual motion vector of the corresponding virtual base-layer block from the standard motion vector, then generating the predicted or actual motion vector of the corresponding enhancement-layer macroblock from this virtual motion vector.
Further, in the method, if the reference frame of the motion vector of the base-layer coding block coincides exactly on the time axis with the enhancement-layer reference frame, the predicted or actual motion vector of the corresponding enhancement-layer macroblock is generated from the motion vector of that base-layer coding block.
Further, in the method, the reference frame of the motion vector of the base-layer coding block does not coincide exactly on the time axis with the enhancement-layer reference frame in the following situation: the reference of the motion vector of the base-layer coding block is the bottom field of another base-layer frame, while the enhancement-layer reference frame corresponds on the time axis to the top field of that base-layer frame.
By comparison, the main difference between the technical scheme of the present invention and the prior art is that, under the condition that the base layer is in interlace mode and the enhancement layer is in progressive mode, when generating motion vectors for the enhancement-layer coding block, the related motion vectors in the base layer are adjusted to be consistent in time span with the related enhancement-layer motion vectors. Because the corresponding motion vectors in the base layer and the enhancement layer are consistent in time span, the precision of inter-layer motion vector prediction is improved, and thus the compression efficiency of the system.
The virtual base layer serves the generation of the enhancement layer, and its frames correspond on the time axis to the frames of the enhancement layer; therefore, when generating the motion vectors of the virtual base layer, the related motion vectors in the base layer must also be adjusted to be consistent in time span with the related motion vectors in the virtual base layer. In the prior art, generation of the virtual base-layer motion vectors used a normalization method, directly using the motion vectors of the top and bottom fields of the base layer; but these base-layer motion vectors may be inconsistent in time span with the corresponding motion vectors in the virtual base layer, which reduces the prediction precision of the inter-layer motion vectors. The present invention adjusts the time span of the related motion vectors in the base layer so that the corresponding motion vectors in the base layer and the virtual base layer are consistent in time span, and therefore achieves higher prediction precision of the inter-layer motion vectors.
When adjusting the related motion vectors in the base layer, there are two basic modes:
The first mode adjusts, by scaling, the time span of the motion vector of the corresponding coding block in the base layer to be consistent with the corresponding motion vector in the virtual base layer. This mode is fairly accurate when the motion vector of the corresponding base-layer coding block has a larger time span than the corresponding motion vector in the virtual base layer, because the scaling is then an interior interpolation rather than an extrapolation.
The second mode selects, from the motion vectors of the co-located coding blocks in neighboring frames or fields, a motion vector consistent in time span with the corresponding motion vector in the virtual base layer. This mode is more accurate when the motion vector of the corresponding base-layer coding block has a smaller time span than the corresponding motion vector in the virtual base layer, because by the consistency of motion, with high probability the motion at the same position in two neighboring frames is consistent.
The above two basic modes can also be combined: when the motion vector of the corresponding base-layer coding block has a smaller time span than the corresponding motion vector in the virtual base layer, the second mode is used first, and if no motion vector with a suitable time span can be found, the first mode is used. The result obtained by this combined scheme is generally better than using either mode alone.
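The combined strategy can be sketched as below; all names, and the representation of neighbors as (motion vector, time span) pairs, are assumptions for illustration only.

```python
def standard_mv(base_mv, base_span, std_span, neighbor_mvs):
    """Combined adjustment: if the base-layer MV's span is at least the
    standard span, shrink it by temporal scaling (interior
    interpolation, generally accurate). If its span is smaller, first
    try to copy a co-located neighbor MV whose span already equals the
    standard span; fall back to scaling otherwise."""
    def scale(mv):
        return tuple(int(round(c * std_span / base_span)) for c in mv)

    if base_span >= std_span:
        return scale(base_mv)          # first mode (scaling)
    for mv, span in neighbor_mvs:      # second mode (copy from neighbor)
        if span == std_span:
            return tuple(mv)
    return scale(base_mv)              # fall back to the first mode

print(standard_mv((6, -9), 3, 2, []))            # (4, -6)
print(standard_mv((2, 2), 1, 2, [((3, 5), 2)]))  # (3, 5)
```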
In the formation of the virtual base-layer motion vectors, different methods can be used for the forward and backward motion vectors, so that good results are obtained in both the forward and backward directions.
Description of drawings
Fig. 1 is a schematic diagram of the SVC algorithm structure;
Fig. 2 is a schematic diagram of the principle of SVC inter-layer texture prediction under non-interlaced mode;
Fig. 3 is a schematic diagram of the formation of the virtual base layer in the prior art when the base layer is p and the enhancement layer is i;
Fig. 4 is a schematic diagram of the formation of the virtual base layer in the prior art when the base layer is i and the enhancement layer is i;
Fig. 5 is a schematic diagram of the formation of the virtual base layer in the prior art when the base layer is i or p and the enhancement layer is p;
Fig. 6 is a schematic diagram of the reference frame correspondence in the field-to-frame mapping in the prior art;
Fig. 7 is a schematic diagram of the motion vector correspondence in the field-to-frame mapping in the prior art;
Fig. 8 is a schematic diagram of the reference frame correspondence in the frame-to-field mapping in the prior art;
Fig. 9 is a schematic diagram of the motion vector correspondence in the frame-to-field mapping in the prior art;
Fig. 10 is a schematic diagram of the general inter-layer prediction case in the prior art;
Fig. 11 is a schematic diagram of the method of inter-layer motion vector prediction in the prior art;
Fig. 12 is a flow chart of the method of SVC inter-layer motion vector prediction according to the first embodiment of the present invention;
Fig. 13 is a schematic diagram of the method of inter-layer motion vector prediction according to the first embodiment of the present invention;
Fig. 14 is a comparison of test results on the Soccer sequence;
Fig. 15 is a comparison of test results on the Parkrun sequence;
Fig. 16 is a comparison of test results on the Crew sequence.
Detailed Description of the Embodiments
To make the objects, technical solutions and advantages of the present invention clearer, the present invention is described in further detail below with reference to the accompanying drawings.
The main point of the present invention is that, in field-to-frame motion vector prediction, the motion vector of the corresponding block is no longer determined by a normalized method; instead, the virtual base layer motion vector is determined according to the time span.
The flow process of first embodiment of the invention as shown in figure 12.In this execution mode, basic layer is (interlace) pattern that interweaves, and enhancement layer is (progressive) pattern line by line.
In step 1201, the system determines whether the reference frame of the motion vector of a base-layer coded block coincides exactly on the time axis with the reference frame of the coded block of the corresponding enhancement-layer frame. If so, the flow proceeds to step 1204; otherwise it proceeds to step 1202.
In step 1202, a standard motion vector whose time span equals the standard time span is generated according to the motion vector of the base-layer coded block and/or of the co-located block in its neighboring frame or field, the standard time span between the corresponding frame of the virtual base layer and its reference frame, and the time span between the field containing the base-layer coded block and its reference field. Because the frame of the virtual base layer must coincide exactly on the time axis with the corresponding enhancement-layer frame, the time span between the corresponding frame of the virtual base layer and its reference frame is simply the time span between the enhancement-layer frame and its reference frame.
Taking the situation in Fig. 13 as an example, the generation of the standard motion vector is described below. MV1 and MV2 are motion vectors referencing the bottom fields of the preceding and following frames. Because the frame of the virtual base layer and the enhancement layer corresponds on the time axis only to the top field of the base layer in Fig. 13, directly using MV1 and MV2, whose reference fields are bottom fields of the base layer, is inaccurate. MV1 and MV2 must be converted into standard motion vectors that take the corresponding top fields as reference. The time span of such a standard motion vector is the span from the top field of the middle base-layer frame in Fig. 13 to the top fields of the preceding and following base-layer frames, that is, the standard time span referred to above. The standard time spans in the two directions may differ; without loss of generality, let the forward standard time span be w and the backward standard time span be m, where w and m are time spans measured in units of the field sampling interval, so that the span between the top and bottom fields of the same base-layer frame is 1. The conversion method used in this embodiment scales MV1 and MV2.
The backward motion vector is scaled first. In the backward direction, the time span of MV2 exceeds the standard time span, and the corresponding standard motion vector MV2′ is obtained from:
MV2′ = MV2 × m / (m + 1)
Because the time span of MV2 exceeds the standard time span, this scaling is an interior interpolation and MV2′ is comparatively accurate.
The forward motion vector is scaled next. In the forward direction, the time span of MV1 is smaller than the standard time span, and the corresponding standard motion vector MV1′ is obtained from:
MV1′ = MV1 × w / (w − 1)
Because the time span of MV1 is smaller than the standard time span, this is an extrapolated estimate; the motion between the two fields of the forward reference frame is uncertain, and when the motion between these two fields is not consistent, the efficiency of this method is not high. In the later embodiments, other methods are used to generate the forward standard motion vector.
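The two scaling formulas above can be sketched as follows. This is a minimal illustration, not taken from the patent: it assumes motion vectors are simple (x, y) integer pairs on the coder's MV grid and that spans are integer multiples of the field sampling interval; the function name and rounding choice are illustrative assumptions.

```python
from fractions import Fraction

def scale_mv(mv, src_span, dst_span):
    """Scale a motion vector (x, y) from its own time span to a target
    time span, assuming linear (constant-velocity) motion."""
    ratio = Fraction(dst_span, src_span)
    # Round so the result stays on the integer MV grid.
    return (round(mv[0] * ratio), round(mv[1] * ratio))

# Backward direction: MV2 spans m+1 field intervals, the standard span is m,
# so scaling is an interior interpolation (MV2' = MV2 * m / (m+1)).
m = 2
mv2 = (12, -6)
mv2_std = scale_mv(mv2, m + 1, m)   # -> (8, -4)

# Forward direction: MV1 spans w-1 field intervals, the standard span is w,
# so scaling is an extrapolation (MV1' = MV1 * w / (w-1)).
w = 2
mv1 = (5, 3)
mv1_std = scale_mv(mv1, w - 1, w)   # -> (10, 6)
```

Using exact rational arithmetic before rounding avoids the drift that repeated floating-point scaling could introduce.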
The flow then enters step 1203, in which the virtual motion vector of the corresponding block of the virtual base layer is generated from the standard motion vectors MV1′ and MV2′ of the base-layer coded block. The flow then proceeds to step 1205.
In step 1204, the virtual motion vector of the corresponding block of the virtual base layer is generated directly from the motion vector of the base-layer coded block. The flow then proceeds to step 1205.
In step 1205, the predicted or actual motion vector of the corresponding enhancement-layer macroblock is generated from the virtual motion vector. The implementation of this step can be found in the relevant prior-art literature.
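The branching of steps 1201 to 1205 can be summarized in a small sketch. The representation of time positions and the helper name are illustrative assumptions; the patent does not prescribe an API, and the full virtual-base-layer derivation of step 1205 is abstracted away here.

```python
def virtual_base_mv(base_mv, base_ref_time, enh_ref_time, base_span, std_span):
    """Steps 1201-1204 in miniature: if the base-layer reference already
    coincides with the enhancement-layer reference on the time axis, reuse
    the base MV as the virtual MV (step 1204); otherwise rescale it to the
    standard time span first (step 1202/1203)."""
    if base_ref_time == enh_ref_time:       # step 1201: references coincide
        return base_mv                       # step 1204: copy directly
    # step 1202: scale the MV to the standard time span
    return tuple(round(c * std_span / base_span) for c in base_mv)

# Coinciding references: the base MV is reused unchanged.
mv_a = virtual_base_mv((6, 2), 0, 0, 3, 2)   # -> (6, 2)
# Non-coinciding references: the base MV is rescaled to the standard span.
mv_b = virtual_base_mv((6, 3), 0, 1, 3, 2)   # -> (4, 2)
```

The virtual motion vector returned here is what step 1205 then turns into the prediction for the corresponding enhancement-layer macroblock.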
In the first embodiment, the scaling makes the corresponding motion vectors in the base layer and the enhancement layer consistent in time span, which improves the accuracy of inter-layer motion vector prediction and thereby the compression efficiency of the system.
The second embodiment of the present invention is largely the same as the first; the difference lies in the method of generating the forward standard motion vector. In this embodiment, among the motion vectors of the co-located block in the neighboring frame or field of the base-layer coded block, a motion vector whose time span equals the standard time span is copied as the forward standard motion vector.
Taking Fig. 13 as an example, MV3 is the motion vector of the co-located block in the next field corresponding to the current block, i.e., the motion vector of the bottom-field block in the macroblock pair containing the current block. The time span of MV3 equals the standard time span. The present invention copies MV3 as the forward standard motion vector MV1′, and then generates the forward virtual motion vector of the corresponding block of the virtual base layer from MV1′. Owing to motion continuity, the motion at the same position in two adjacent frames is with high probability consistent, so this method is also quite efficient.
The method of copying MV3 as MV1′ proposed in the second embodiment is feasible when the reference field of MV3 is the bottom field of the left frame shown in Fig. 13; but if the reference of MV3 is not that bottom field shown in Fig. 13 but some other reference frame, the time spans differ and prediction using MV3 becomes inaccurate. The third embodiment proposes a scheme to further address this problem.
The third embodiment of the present invention is largely the same as the second; the difference lies in the method of generating the forward standard motion vector. The third embodiment combines the schemes for generating the forward standard motion vector of the first and second embodiments: it first determines whether the time span of the motion vector of the co-located block in the neighboring frame or field equals the standard time span; if so, that motion vector is copied as the standard motion vector; otherwise the standard motion vector is generated by scaling the motion vector of the base-layer coded block.
Specifically, in the example of Fig. 13, it is first determined whether MV3 has the same time span as MV1′; if so, MV3 is copied as MV1′; otherwise MV1′ is calculated by the formula MV1′ = MV1 × w / (w − 1).
This scheme combines the advantages of the first and second embodiments, makes full and effective use of the motion vector information of the base layer, and improves the prediction efficiency of the inter-layer motion vector.
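The decision logic of the third embodiment can be sketched as follows. The candidate representation and function name are illustrative assumptions; the sketch only shows the copy-or-scale choice, under the same (x, y) integer-pair MV representation assumed above.

```python
from fractions import Fraction

def forward_standard_mv(mv1, w, candidates):
    """Combined scheme for the forward standard motion vector: prefer a
    co-located candidate whose time span already equals the standard span
    w (second embodiment); otherwise fall back to extrapolating MV1 from
    span w-1 to span w (first embodiment).

    `candidates` is a list of (mv, time_span) pairs taken from co-located
    blocks in the neighboring frame/field.
    """
    for mv, span in candidates:
        if span == w:          # time span already matches: copy (the MV3 case)
            return mv
    # No suitable candidate found: scale MV1 by w / (w - 1).
    r = Fraction(w, w - 1)
    return (round(mv1[0] * r), round(mv1[1] * r))

# A candidate with the standard span is copied unchanged:
mv_copy = forward_standard_mv((4, 4), 2, [((7, -2), 2)])    # -> (7, -2)
# No candidate with span w: fall back to scaling MV1.
mv_scaled = forward_standard_mv((4, 4), 2, [((7, -2), 3)])  # -> (8, 8)
```

Trying the copy first matches the patent's rationale: a co-located vector with the right span avoids extrapolation entirely, and scaling is kept only as the fallback.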
Figs. 14, 15 and 16 show the results of tests on three video sequences: Fig. 14 uses the Soccer sequence, Fig. 15 the Parkrun sequence, and Fig. 16 the Crew sequence. In these tests, both the base layer and the enhancement layer are 4CIF image sequences; the base layer is an interlaced coded sequence, the enhancement layer is a progressive coded sequence, and the two layers have the same frame rate. The base layer is encoded at three rate points R1, R2 and R3, and for each base-layer rate point the enhancement layer is also encoded at the corresponding rate point, so each sequence yields three sets of curves. Each of Figs. 14, 15 and 16 thus contains six curves: "R1_orig" represents the result of the original algorithm at rate point R1 and "R1_new" the result of the improved algorithm at R1, and likewise for R2 and R3. The abscissa is the bit rate and the ordinate is the peak signal-to-noise ratio (PSNR). Under the same conditions, the lower the bit rate for the same PSNR, the higher the compression efficiency; in other words, under identical conditions the curve further to the left is better. As can be seen in Figs. 14, 15 and 16, under the same conditions the results obtained with the algorithm improved by the present invention are all better than those of the original algorithm. Going from the Crew sequence through the Parkrun sequence to the Soccer sequence, the image varies more and more violently, and the advantage of the improved algorithm over the original becomes more and more evident; it can be seen that the scheme of the present invention is particularly suited to scenes with faster picture changes.
Although the present invention has been illustrated and described with reference to certain preferred embodiments thereof, those of ordinary skill in the art will understand that various changes may be made to it in form and detail without departing from the spirit and scope of the present invention.

Claims (7)

1. A method for predicting a motion vector between video coding layers, characterized by comprising the following step:
when the base layer is in interlaced mode and the enhancement layer is in progressive mode, if the reference frame of the motion vector of a base-layer coded block does not coincide exactly on the time axis with the reference frame of the coded block of the corresponding enhancement-layer frame, then generating, according to the motion vector of the base-layer coded block and/or of the co-located block in its neighboring frame or field, the standard time span between the corresponding enhancement-layer frame and its reference frame, and the time span between the field containing the base-layer coded block and its reference field, a standard motion vector whose time span equals said standard time span, and generating the inter-layer predicted or actual motion vector of the corresponding enhancement-layer block according to this standard motion vector.
2. The method for predicting a motion vector between video coding layers according to claim 1, characterized in that said standard motion vector is generated by the following formula:
standard motion vector = motion vector of the base-layer coded block × standard time span / time span of the motion vector of the base-layer coded block.
3. The method for predicting a motion vector between video coding layers according to claim 1, characterized in that said standard motion vector is generated in the following manner:
copying, as said standard motion vector, a motion vector whose time span equals said standard time span from among the motion vectors of the co-located block in the neighboring frame or field of said base-layer coded block.
4. The method for predicting a motion vector between video coding layers according to claim 1, characterized in that, if the time span of the motion vector of said base-layer coded block is greater than said standard time span, said standard motion vector is generated by the following formula:
standard motion vector = motion vector of the base-layer coded block × standard time span / time span of the motion vector of the base-layer coded block;
and if the time span of the motion vector of said base-layer coded block is less than said standard time span, it is further determined whether the time span of the motion vector of the co-located block in said neighboring frame or field equals said standard time span; if so, that motion vector is copied as said standard motion vector; otherwise said standard motion vector is generated by the following formula:
standard motion vector = motion vector of the base-layer coded block × standard time span / time span of the motion vector of the base-layer coded block.
5. The method for predicting a motion vector between video coding layers according to any one of claims 1 to 4, characterized in that the step of generating the motion vector of the corresponding enhancement-layer block according to the standard motion vector further comprises the following sub-steps:
first generating the virtual motion vector of the corresponding block of the virtual base layer according to said standard motion vector, and then generating the predicted or actual motion vector of the corresponding enhancement-layer macroblock according to this virtual motion vector.
6. The method for predicting a motion vector between video coding layers according to any one of claims 1 to 4, characterized in that, if the reference frame of the motion vector of said base-layer coded block coincides exactly on the time axis with said enhancement-layer reference frame, the predicted or actual motion vector of the corresponding enhancement-layer macroblock is generated according to the motion vector of the base-layer coded block.
7. The method for predicting a motion vector between video coding layers according to any one of claims 1 to 4, characterized in that the reference frame of the motion vector of said base-layer coded block does not coincide exactly on the time axis with the enhancement-layer reference frame in the following situation:
the reference of the motion vector of said base-layer coded block is the bottom field of another base-layer frame, while said enhancement-layer reference frame corresponds on the time axis to the top field of that base-layer frame.
CN 200610101083 2006-07-07 2006-07-07 Prediction method for motion vector between video coding layers Pending CN101102503A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 200610101083 CN101102503A (en) 2006-07-07 2006-07-07 Prediction method for motion vector between video coding layers

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN 200610101083 CN101102503A (en) 2006-07-07 2006-07-07 Prediction method for motion vector between video coding layers

Publications (1)

Publication Number Publication Date
CN101102503A true CN101102503A (en) 2008-01-09

Family

ID=39036554

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 200610101083 Pending CN101102503A (en) 2006-07-07 2006-07-07 Prediction method for motion vector between video coding layers

Country Status (1)

Country Link
CN (1) CN101102503A (en)


Cited By (29)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101491099B (en) * 2006-07-11 2011-09-21 汤姆森特许公司 Methods and apparatus using virtual reference pictures
CN101822060B (en) * 2007-10-19 2014-08-06 汤姆森许可贸易公司 Combined spatial and bit-depth scalability
CN102342108A (en) * 2009-03-06 2012-02-01 索尼公司 Image Processing Device And Method
CN102342108B (en) * 2009-03-06 2014-07-09 索尼公司 Image Processing Device And Method
CN101668208B (en) * 2009-09-15 2013-03-27 浙江宇视科技有限公司 Frame coding method and device
CN102088605A (en) * 2011-02-23 2011-06-08 浙江大学 Rapid interlayer prediction selection method for scalable video coding
CN102088605B (en) * 2011-02-23 2012-12-05 浙江大学 Rapid interlayer prediction selection method for scalable video coding
US11089333B2 (en) 2011-09-09 2021-08-10 Kt Corporation Method for deriving a temporal predictive motion vector, and apparatus using the method
CN107483925A (en) * 2011-09-09 2017-12-15 株式会社Kt Method for decoding video signal
US10805639B2 (en) 2011-09-09 2020-10-13 Kt Corporation Method for deriving a temporal predictive motion vector, and apparatus using the method
CN107483925B (en) * 2011-09-09 2020-06-19 株式会社Kt Method for decoding video signal
US10523967B2 (en) 2011-09-09 2019-12-31 Kt Corporation Method for deriving a temporal predictive motion vector, and apparatus using the method
CN108200437A (en) * 2012-01-19 2018-06-22 韩国电子通信研究院 For the method and apparatus of encoding/decoding image
CN108055550B (en) * 2012-01-19 2021-10-01 韩国电子通信研究院 Method and apparatus for image encoding/decoding
CN108055548A (en) * 2012-01-19 2018-05-18 韩国电子通信研究院 For the method and apparatus of encoding/decoding image
US11470342B2 (en) 2012-01-19 2022-10-11 Electronics And Telecommunications Research Institute Method and apparatus for image coding/decoding
CN108055550A (en) * 2012-01-19 2018-05-18 韩国电子通信研究院 For the method and apparatus of encoding/decoding image
CN108055547A (en) * 2012-01-19 2018-05-18 韩国电子通信研究院 For the method and apparatus of encoding/decoding image
CN108055549A (en) * 2012-01-19 2018-05-18 韩国电子通信研究院 For the method and apparatus of encoding/decoding image
US11006142B2 (en) 2012-01-19 2021-05-11 Electronics And Telecommunications Research Institute Method and apparatus for image coding/decoding
CN108200437B (en) * 2012-01-19 2021-11-23 韩国电子通信研究院 Method and apparatus for image encoding/decoding
CN108055548B (en) * 2012-01-19 2021-10-08 韩国电子通信研究院 Method and apparatus for image encoding/decoding
CN108055549B (en) * 2012-01-19 2021-10-01 韩国电子通信研究院 Method and apparatus for image encoding/decoding
CN108055547B (en) * 2012-01-19 2021-10-08 韩国电子通信研究院 Method and apparatus for image encoding/decoding
WO2013174254A1 (en) * 2012-05-21 2013-11-28 Mediatek Singapore Pte. Ltd. Method and apparatus of inter-layer filtering for scalable video coding
US10136144B2 (en) 2012-05-21 2018-11-20 Mediatek Singapore Pte. Ltd. Method and apparatus of inter-layer filtering for scalable video coding
CN108156463A (en) * 2012-08-29 2018-06-12 Vid拓展公司 For the method and apparatus of the motion-vector prediction of gradable video encoding
US11343519B2 (en) 2012-08-29 2022-05-24 Vid Scale. Inc. Method and apparatus of motion vector prediction for scalable video coding
CN108156463B (en) * 2012-08-29 2022-07-01 Vid拓展公司 Method and apparatus for motion vector prediction for scalable video coding

Similar Documents

Publication Publication Date Title
CN100584026C (en) Video layering coding method at interleaving mode
CN100394802C (en) Video signal encoding/decoding method and apparatus, and corresponding wireless communication device
CN101102503A (en) Prediction method for motion vector between video coding layers
CN101601300B (en) Method and apparatus for encoding and/or decoding bit depth scalable video data using adaptive enhancement layer prediction
CN102017628B (en) Coding of depth signal
CN101175213B (en) Video source coding method and device, method and device for decoding video source
CN1856111B (en) Video signal coding/decoding method, coder/decoder and related devices
KR101227601B1 (en) Method for interpolating disparity vector and method and apparatus for encoding and decoding multi-view video
CN102257818A (en) Sharing of motion vector in 3d video coding
US20010019354A1 (en) Method and an apparatus for video mixing of bit streams
JPH11262013A (en) Interlace form information coding method
CN102957893B (en) For the method and system switched between the video flowing in continuous presence conferences
CN105306950B (en) A kind of video compress distance transmission system for feeding back coarse quantization reconstructed frame
CN101202907B (en) Device and method for switching frequency channel
KR20090037886A (en) Method for deriving motion data for high resolution pictures from motion data of low resolution pictures and coding and decoding devices implementing said method
CN101193291A (en) Channel switching method and device
KR101012760B1 (en) System and Method for transmitting and receiving of Multi-view video
JP3956081B2 (en) Encoding method for digital video including gray shape information
JP2006503478A (en) Video encoding method
Kim et al. A novel hybrid 3D video service algorithm based on scalable video coding (SVC) technology
Lim et al. Motion/disparity compensated multiview sequence coding
Ohm Recent, Current and Future Developments in Video Coding.
KR19990069012A (en) Shape information encoding method for progressive scan and apparatus therefor
AU728756B2 (en) Motion estimation and compensation of video object planes for interlaced digital video
Singh et al. Optimization of standards for video compression tools over wireless networks

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication