CN103096078A - Inter-layer prediction method for video signal

Inter-layer prediction method for video signal

Publication number: CN103096078A
Authority: CN (China)
Legal status: Granted
Application number: CN2012105858823A
Other languages: Chinese (zh)
Other versions: CN103096078B (en)
Inventors: 朴胜煜, 全柄文, 朴志皓
Current Assignee: LG Electronics Inc
Original Assignee: LG Electronics Inc
Priority claimed from KR1020060111893A (KR20070075257A)
Priority claimed from KR1020060111895A (KR20070074452A)
Priority claimed from KR1020070001587A (KR20070075293A)
Priority claimed from KR1020070001582A (KR20070095180A)
Application filed by LG Electronics Inc
Publication of CN103096078A
Application granted
Publication of CN103096078B
Status: Active


Abstract

The present invention relates to a method for performing inter-layer texture prediction when encoding or decoding a video signal. The method constructs a pair of frame macroblocks from a single field macroblock, or from two vertically adjacent field macroblocks, of a base layer, and uses the texture information of the constructed frame macroblock pair for inter-layer texture prediction of a frame macroblock pair of a current layer.

Description

Inter-layer prediction method for a video signal
This application is a divisional application of the Chinese patent application entitled "Inter-layer prediction method for video signal", filed on January 9, 2007 under application No. 200780005672.X (international application No. PCT/KR2007/000147).
1. Technical Field
The present invention relates to methods for performing inter-layer prediction when encoding or decoding a video signal.
2. Background Art
A Scalable Video Codec (SVC) encodes video into a picture sequence of the highest image quality while ensuring that a part of the encoded picture sequence (specifically, a partial sequence of frames selected intermittently from the entire frame sequence) can be decoded and used to present the video at a lower image quality.
Although a video can be presented at low image quality by receiving and processing part of a picture sequence encoded according to the scalable scheme, there remains the problem that the image quality drops significantly when the bit rate is lowered. One solution to this problem is to provide an auxiliary picture sequence for the low bit rate, for example a picture sequence with a small screen size and/or a low frame rate, as at least one layer in a hierarchical structure.
When two sequences are assumed to be provided, the auxiliary (lower) picture sequence is called the base layer and the main (upper) picture sequence is called the enhanced or enhancement layer. The video signals of the base layer and the enhancement layer are redundant, since the same video signal source is encoded into the two layers. To increase the coding efficiency of the enhancement layer, the video signal of the enhancement layer is coded using the coded information (motion information or texture information) of the base layer.
Although a single video source 1 can be encoded into a plurality of layers with different transmission rates as shown in Fig. 1a, a plurality of video sources 2b carrying the same content 2a in different scan formats can also be encoded into the respective layers, as shown in Fig. 1b. In this case as well, the encoder that encodes the upper layer can increase the coding gain by performing inter-layer prediction using the coded information of the lower layer, because the two sources 2b provide the same content 2a.
Accordingly, an inter-layer prediction method is needed that takes the scan formats of the video signals into consideration when different sources are encoded into the respective layers. When interlaced video is coded, it can be coded into even and odd fields, and it can also be coded into pairs of odd and even field macroblocks within a frame. Accordingly, the picture type used for coding the interlaced video signal must also be taken into account for inter-layer prediction.
In general, the enhancement layer provides pictures of higher resolution than the base layer. Accordingly, if the pictures of the layers have different resolutions when different sources are encoded into the respective layers, interpolation is also needed to increase the screen resolution (i.e., the picture size). Since the coding rate increases as the image of the base layer picture used for prediction in inter-layer prediction becomes closer to the image of the enhancement layer picture, an interpolation method that takes the scan formats of the video signals of the layers into consideration is also needed.
3. Summary of the Invention
An object of the present invention is to provide a method for performing inter-layer prediction in the case where at least one of two layers has an interlaced video signal component.
Another object of the present invention is to provide a method for performing inter-layer motion prediction, according to picture type, on layers whose pictures have different spatial resolutions (scalability).
Another object of the present invention is to provide a method for performing inter-layer texture prediction on layers whose pictures have different spatial and/or temporal resolutions (scalability).
An inter-layer motion prediction method according to the present invention comprises: setting motion-related information of an intra-mode macroblock to motion-related information of an inter-mode macroblock, the intra-mode and inter-mode macroblocks being two vertically adjacent macroblocks of a base layer; and then deriving motion information of a vertically adjacent macroblock pair from the two vertically adjacent macroblocks for use in inter-layer motion prediction.
Another inter-layer motion prediction method according to the present invention comprises: setting an intra-mode macroblock, which is one of two vertically adjacent intra-mode and inter-mode macroblocks of a base layer, to an inter-mode block having zero motion-related information; and then deriving motion information of a vertically adjacent macroblock pair from the two vertically adjacent macroblocks for use in inter-layer motion prediction.
Another inter-layer motion prediction method according to the present invention comprises: deriving motion information of a single macroblock from motion information of a vertically adjacent frame macroblock pair of a base layer; and using the derived motion information as prediction information for the motion information of a field macroblock of a current layer or for the respective motion information of a field macroblock pair of the current layer.
Another inter-layer motion prediction method according to the present invention comprises: deriving respective motion information of two macroblocks from motion information of a single field macroblock of a base layer or of one macroblock selected from a vertically adjacent field macroblock pair of the base layer; and using the derived respective motion information as prediction information for the respective motion information of a frame macroblock pair of a current layer.
An inter-layer motion prediction method for layers whose pictures have different resolutions according to the present invention comprises: converting a picture of the lower layer into a frame picture of the same resolution by a prediction method that converts it into frame macroblocks, selected according to the picture type and the type of the macroblocks in the picture; upsampling the frame picture so that it has the same resolution as that of the upper layer; and then applying an inter-layer prediction method suited to the types of the frame macroblocks in the upsampled frame picture and the types of the macroblocks in the picture of the upper layer.
Another inter-layer motion prediction method for layers whose pictures have different resolutions according to the present invention comprises: identifying the picture types of the lower and upper layers and/or the types of the macroblocks contained in those pictures; according to the identification result, applying to the lower-layer picture a method of predicting a frame macroblock pair from a single field macroblock so as to construct a virtual picture having the same aspect ratio as the upper-layer picture; upsampling the virtual picture; and then applying inter-layer motion prediction to the upper layer using the upsampled virtual picture.
Another inter-layer motion prediction method for layers whose pictures have different resolutions according to the present invention comprises: identifying the picture types of the lower and upper layers and/or the types of the macroblocks contained in those pictures; according to the identification result, applying to the lower-layer picture a method of predicting a frame macroblock pair from a single field macroblock so as to construct a virtual picture having the same aspect ratio as the upper-layer picture; and applying inter-layer motion prediction to the picture of the upper layer using the constructed virtual picture.
Another inter-layer motion prediction method for layers whose pictures have different resolutions according to the present invention comprises: identifying the picture types of the lower and upper layers; if the lower-layer picture is of field type and the upper-layer picture is progressive, copying the motion information of the blocks of the lower-layer picture to construct a virtual picture; upsampling the virtual picture; and applying a frame macroblock-to-frame macroblock motion prediction method between the upsampled virtual picture and the upper-layer picture.
Another inter-layer motion prediction method for layers whose pictures have different resolutions according to the present invention comprises: identifying the picture types of the lower and upper layers; if the lower-layer picture is of field type and the upper-layer picture is progressive, copying the motion information of the blocks of the lower layer to construct a virtual picture; and applying inter-layer motion prediction to the upper-layer picture using the virtual picture.
In an embodiment of the present invention, the partition mode, the reference indexes and the motion vectors are predicted sequentially in inter-layer motion prediction.
In another embodiment of the present invention, the reference indexes, the motion vectors and the partition mode are predicted sequentially.
In another embodiment of the present invention, the motion information of a field macroblock pair of a virtual base layer used for inter-layer motion prediction is derived from the motion information of a frame macroblock pair of the base layer.
In another embodiment of the present invention, the motion information of a field macroblock of an even or odd field picture of a virtual base layer used for inter-layer motion prediction is derived from the motion information of a frame macroblock pair of the base layer.
In another embodiment of the present invention, a macroblock is selected from a field macroblock pair of the base layer, and the motion information of a frame macroblock pair of a virtual base layer used for inter-layer motion prediction is derived from the motion information of the selected macroblock.
In another embodiment of the present invention, the motion information of a frame macroblock pair of a virtual base layer used for inter-layer motion prediction is derived from the motion information of a field macroblock of an even or odd field picture of the base layer.
In another embodiment of the present invention, the information of a field macroblock of an even or odd field picture of the base layer is copied to construct an additional virtual field macroblock, and the motion information of a frame macroblock pair of a virtual base layer used for inter-layer motion prediction is derived from the motion information of the field macroblock pair constructed in this way.
An inter-layer texture prediction method according to the present invention comprises: constructing a field macroblock pair from a vertically adjacent frame macroblock pair of a base layer; and using the respective texture information of the constructed field macroblock pair as texture prediction information for the respective macroblocks of a field macroblock pair of a current layer.
Another inter-layer texture prediction method according to the present invention comprises: constructing a single field macroblock from a vertically adjacent frame macroblock pair of a base layer; and using the texture information of the constructed single field macroblock as texture prediction information for a field macroblock of a current layer.
Another inter-layer texture prediction method according to the present invention comprises: constructing a frame macroblock pair from a single field macroblock of a base layer or from a vertically adjacent field macroblock pair of the base layer; and using the respective texture information of the constructed frame macroblock pair as texture prediction information for the respective macroblocks of a frame macroblock pair of a current layer.
Another inter-layer texture prediction method according to the present invention comprises: constructing N frame macroblock pairs, where N is an integer greater than 1, from a vertically adjacent field macroblock pair of a base layer; and using the respective texture information of the constructed N frame macroblock pairs as texture prediction information for N frame macroblock pairs located at different temporal positions in a current layer.
Another inter-layer texture prediction method according to the present invention comprises: separating each frame of the lower layer into a plurality of field pictures so that the lower layer has the same temporal resolution as the upper layer; upsampling each separated field picture in the vertical direction so as to expand it vertically; and then using each upsampled field picture for inter-layer texture prediction of a corresponding frame of the upper layer.
Another inter-layer texture prediction method according to the present invention comprises: upsampling each field picture of the lower layer in the vertical direction so as to expand it vertically; and using each upsampled field picture for inter-layer texture prediction of a corresponding frame of the upper layer.
Another inter-layer texture prediction method according to the present invention comprises: separating each frame of the upper layer into a plurality of field pictures; downsampling a picture of the lower layer so as to shrink it in the vertical direction; and then using the downsampled picture for inter-layer texture prediction of the separated field pictures of the upper layer.
A method for encoding a video signal using inter-layer prediction according to the present invention comprises: determining whether to use, in inter-layer texture prediction, the respective texture information of 2N blocks constructed by alternately selecting lines of 2N blocks in an arbitrary field picture of the base layer and arranging the selected lines in the order of selection, or to use the respective texture information of 2N blocks constructed by interpolating one block selected from the 2N blocks of the base layer; and incorporating information indicating the determination into the coded information.
A method for decoding a video signal using inter-layer prediction according to the present invention comprises: checking whether specific indication information is included in a received signal; and determining, based on the result of the check, whether the respective texture information used in inter-layer texture prediction is that of 2N blocks constructed by alternately selecting lines of 2N blocks in an arbitrary field picture of the base layer and arranging the selected lines in the order of selection, or that of 2N blocks constructed by interpolating one block selected from the 2N blocks of the base layer.
In an embodiment of the present invention, each frame of the upper or lower layer is separated into two field pictures.
In an embodiment of the present invention, if the specific indication information is not included in the received signal, this case is treated in the same way as receiving a signal carrying the indication information set to 0 when determining the respective texture information of the blocks to be used for inter-layer prediction.
A method for using the video signal of a base layer for inter-layer texture prediction according to the present invention comprises: separating the interlaced video signal of the base layer into even and odd field components; enlarging the even and odd field components separately in the vertical and/or horizontal direction; and then combining the enlarged even and odd field components for use in inter-layer texture prediction.
Another method for using the video signal of a base layer for inter-layer texture prediction according to the present invention comprises: separating the progressive video signal of the base layer into a group of even lines and a group of odd lines; enlarging the groups of even and odd lines separately in the vertical and/or horizontal direction; and combining the enlarged groups of even and odd lines for use in inter-layer texture prediction.
Another method for using the video signal of a base layer for inter-layer texture prediction according to the present invention comprises: enlarging the interlaced video signal of the base layer in the vertical and/or horizontal direction so that it has the same resolution as the progressive video signal of the upper layer; and performing inter-layer texture prediction of the video signal of the upper layer based on the enlarged video signal.
Another method for using the video signal of a base layer for inter-layer texture prediction according to the present invention comprises: enlarging the progressive video signal of the base layer in the vertical and/or horizontal direction so that it has the same resolution as the interlaced video signal of the upper layer; and performing inter-layer texture prediction of the video signal of the upper layer based on the enlarged video signal.
In one embodiment of the present invention, the separation and enlargement of the video signal are carried out at the macroblock level (that is, on a macroblock basis).
In another embodiment of the present invention, the separation and enlargement of the video signal are carried out at the picture level.
In another embodiment of the present invention, the separation and enlargement of the video signal are carried out if the picture formats of the two layers to which inter-layer texture prediction is applied differ, that is, if one layer contains progressive pictures and the other layer contains interlaced pictures.
In another embodiment of the present invention, the separation and enlargement of the video signal are carried out if the pictures of the two layers to which inter-layer texture prediction is applied are both interlaced.
4. Brief Description of the Drawings
Figs. 1a and 1b illustrate methods of encoding a single video source into a plurality of layers;
Figs. 2a and 2b schematically illustrate configurations of video signal encoding apparatuses to which an inter-layer prediction method according to the present invention is applied;
Figs. 2c and 2d illustrate types of picture sequences used for coding an interlaced video signal;
Figs. 3a and 3b schematically show processes of constructing a base layer picture and performing deblocking filtering for inter-layer texture prediction according to an embodiment of the present invention;
Figs. 4a to 4f schematically show a procedure in which the motion information of field macroblocks of a virtual base layer, used for inter-layer motion prediction of field macroblocks in an MBAFF frame, is derived from the motion information of frame macroblocks according to an embodiment of the present invention;
Fig. 4g schematically shows a procedure in which the texture information of a frame macroblock pair is used for texture prediction of a field macroblock pair in an MBAFF frame according to an embodiment of the present invention;
Fig. 4h illustrates a method of converting a frame macroblock pair into a field macroblock pair according to an embodiment of the present invention;
Figs. 5a and 5b illustrate reference index and motion information derivation procedures according to another embodiment of the present invention;
Figs. 6a to 6c schematically show a procedure in which the motion information of a field macroblock of a virtual base layer is derived from the motion information of frame macroblocks according to an embodiment of the present invention;
Fig. 6d schematically shows a procedure in which the texture information of a frame macroblock pair is used for texture prediction of a field macroblock of a field picture according to an embodiment of the present invention;
Figs. 7a and 7b illustrate reference index and motion information derivation procedures according to another embodiment of the present invention;
Figs. 8a to 8c schematically show a procedure in which the motion information of frame macroblocks of a virtual base layer, used for inter-layer motion prediction, is derived from the motion information of field macroblocks of an MBAFF frame according to an embodiment of the present invention;
Fig. 8d schematically shows a procedure in which the texture information of a field macroblock pair in an MBAFF frame is used for texture prediction of a frame macroblock pair according to an embodiment of the present invention;
Fig. 8e illustrates a method of converting a field macroblock pair into a frame macroblock pair according to an embodiment of the present invention;
Figs. 8f and 8g schematically show procedures in which the texture information of a field macroblock pair in an MBAFF frame is used for inter-layer prediction of a frame macroblock pair when only one macroblock of the field macroblock pair is in inter mode according to an embodiment of the present invention;
Fig. 8h schematically shows a procedure in which the texture information of a field macroblock pair in an MBAFF frame is used for texture prediction of multiple frame macroblock pairs according to an embodiment of the present invention;
Figs. 9a and 9b illustrate reference index and motion information derivation procedures according to another embodiment of the present invention;
Figs. 10a to 10c schematically show a procedure in which the motion information of frame macroblocks of a virtual base layer, used for inter-layer motion prediction, is derived from the motion information of a field macroblock of a field picture according to an embodiment of the present invention;
Fig. 10d schematically shows a procedure in which the texture information of a field macroblock of a field picture is used for texture prediction of a frame macroblock pair according to an embodiment of the present invention;
Fig. 11 illustrates reference index and motion information derivation procedures according to another embodiment of the present invention;
Figs. 12a and 12b schematically show a procedure in which the motion information of frame macroblocks of a virtual base layer, used for inter-layer motion prediction, is derived from the motion information of a field macroblock of a field picture according to another embodiment of the present invention;
Figs. 13a to 13d schematically show procedures in which the motion information of field macroblocks of a virtual base layer, used for inter-layer motion prediction, is derived from the motion information of field macroblocks according to the picture type, according to an embodiment of the present invention;
Figs. 14a to 14k illustrate, for the respective picture types, methods of performing inter-layer motion prediction when the layers have different spatial resolutions according to various embodiments of the present invention;
Figs. 15a and 15b schematically show a procedure in which a picture of a base layer having a different spatial resolution is used for inter-layer texture prediction when the enhancement layer is progressive and the base layer is interlaced according to an embodiment of the present invention;
Figs. 16a and 16b schematically show a procedure in which a macroblock of a picture is separated into field macroblocks and the separated field macroblocks are enlarged so that the picture of the base layer can be used for inter-layer texture prediction according to an embodiment of the present invention;
Figs. 17a and 17b schematically show a procedure in which a picture of a base layer having a different spatial resolution is used for inter-layer texture prediction when the enhancement layer is interlaced and the base layer is progressive according to an embodiment of the present invention;
Fig. 18 schematically shows a procedure in which a picture of a base layer having a different spatial resolution is used for inter-layer prediction when the enhancement layer and the base layer are both interlaced according to an embodiment of the present invention;
Fig. 19a illustrates a procedure for applying inter-layer prediction when the enhancement layer is a progressive frame sequence and the picture types and temporal resolutions of the two layers differ according to an embodiment of the present invention;
Fig. 19b illustrates a procedure for applying inter-layer prediction when the enhancement layer is a progressive frame sequence and the two layers have different picture types but the same resolution according to an embodiment of the present invention;
Fig. 20 illustrates a procedure for applying inter-layer prediction when the base layer is a progressive frame sequence and the picture types and temporal resolutions of the two layers differ according to an embodiment of the present invention; and
Fig. 21 illustrates a procedure for applying inter-layer prediction when the base layer is a progressive frame sequence and the two layers have different picture types but the same resolution according to an embodiment of the present invention.
5. Embodiments
Embodiments of the present invention will now be described in detail with reference to the accompanying drawings.
Fig. 2a schematically shows the components of a video signal encoding apparatus to which an inter-layer prediction method according to the present invention is applied. Although the apparatus of Fig. 2a is implemented so as to encode an input video signal into two layers, the principles of the present invention described below are also applicable to the inter-layer processes when a video signal is encoded into three or more layers.
The inter-layer prediction method according to the present invention is performed in the enhancement layer (EL) encoder 20 of the apparatus of Fig. 2a. Coded information (motion information and texture information) is received from the base layer (BL) encoder 21, and inter-layer texture prediction or motion prediction is performed based on the received information. If necessary, the received information is decoded and the prediction is performed based on the decoded information. Of course, in the present invention, as shown in Fig. 2b, the input video signal may also be coded using a video source 3 of an already encoded base layer; the inter-layer prediction methods described below apply equally in that case.
In the case of Fig. 2a, there are two ways in which the interlaced video signal to be coded by the BL encoder 21, or the already coded video source 3 of Fig. 2b, may have been coded. Specifically, in one method the interlaced video signal is simply coded field by field into a field sequence, as shown in Fig. 2c, and in the other method each frame of the sequence is constructed from macroblocks of the two (even and odd) fields and coded frame by frame into a frame sequence, as shown in Fig. 2d. The upper macroblock of a macroblock pair in a frame coded in this way is called the "top macroblock" and the lower macroblock is called the "bottom macroblock". If the top macroblock is composed of the even (or odd) field image component, the bottom macroblock is composed of the odd (or even) field image component. A frame constructed in this way is called a macroblock-adaptive frame/field (MBAFF) frame. An MBAFF frame may contain not only macroblock pairs each consisting of an odd and an even field macroblock, but also macroblock pairs each consisting of two frame macroblocks.
Accordingly, when a macroblock in a picture has an interlaced image component, it may be a macroblock within a field or a macroblock within a frame. Each macroblock having an interlaced image component is called a field macroblock, and each macroblock having a progressive (scan) image component is called a frame macroblock.
Therefore, the inter-layer prediction method must be chosen by determining whether the type of the macroblock to be coded at the EL encoder 20, and the type of the base layer macroblock to be used for inter-layer prediction of that macroblock, is the frame macroblock type or the field macroblock type. If a macroblock is a field macroblock, the method must further be chosen by determining whether it is a field macroblock in a field picture or a field macroblock in an MBAFF frame.
The methods are described below for each of these cases. Before the description, it is assumed that the resolution of the current layer is equal to the resolution of the base layer, that is, that SpatialScalabilityType() is 0. The case where the resolution of the current layer is higher than that of the base layer is described later. In the following description and drawings, the terms "top" and "even" (or "odd") are used interchangeably, and the terms "bottom" and "odd" (or "even") are used interchangeably.
In order to perform inter-layer prediction for coding or decoding the enhancement layer using the base layer, the base layer must first be decoded; base layer decoding is therefore described first.
When the base layer is decoded, not only the base layer motion information, such as the partition modes, reference indexes and motion vectors, is decoded, but also the texture of the base layer.
When the texture of the base layer is decoded for inter-layer texture prediction, not all of the image sample data of the base layer is decoded, in order to reduce the load on the decoder. The image sample data of intra-mode macroblocks is decoded, whereas for inter-mode macroblocks only the residual data, that is, the error data between image sample data, is decoded, without performing motion compensation using adjacent pictures.
Furthermore, base layer texture decoding for inter-layer texture prediction is carried out not on a macroblock basis but on a picture basis, so as to construct base layer pictures that temporally coincide with enhancement layer pictures. A base layer picture is constructed from the image sample data reconstructed from intra-mode macroblocks and the residual data decoded from inter-mode macroblocks, as mentioned above.
Motion compensation and transforms such as the DCT and quantization, whether in intra mode or inter mode, are carried out on an image block basis, for example on a 16x16 macroblock basis or a 4x4 sub-block basis. This produces blocking artifacts at block boundaries that distort the picture. Deblocking filtering is applied to reduce these blocking artifacts; the deblocking filter smooths the edges of image blocks to improve the quality of the video frame.
Whether deblocking filtering is applied to reduce blocking distortion depends on the strength at the boundary of the image blocks and on the gradient of the pixels around the boundary. The strength, or degree, of the deblocking filter is determined by the quantization parameter, the intra mode, the inter mode, the block partition mode indicating the block size, the motion vectors, the pixel values before deblocking filtering, and so on.
The deblocking filter in inter-layer prediction is applied to the intra-mode macroblocks of the base layer picture, which serves as the basis for texture prediction of intra base mode (intra BL, or inter-layer intra mode) macroblocks of the enhancement layer.
When two layers coded according to the inter-layer prediction method are both coded entirely into field picture sequences as shown in Fig. 2c, both layers are treated as being in frame format, so that a coding/decoding process including deblocking filtering can easily be derived from the coding/decoding process for the frame format.
A method of performing deblocking filtering according to embodiments of the present invention is therefore described below for the cases where the picture format of the base layer differs from that of the enhancement layer, namely: the case where the enhancement layer is in frame (i.e., progressive) format and the base layer is in field (i.e., interlaced) format; the case where the enhancement layer is in field format and the base layer is in frame format; and the case where the enhancement layer and the base layer are both in field format but one of them is coded into field picture sequences as shown in Fig. 2c while the other is coded into MBAFF frames as shown in Fig. 2d.
Figs. 3a and 3b schematically show processes of constructing a base layer picture and performing deblocking filtering for inter-layer texture prediction according to embodiments of the present invention.
Fig. 3a illustrates an embodiment in which the enhancement layer is in frame format and the base layer is in field format, and Fig. 3b illustrates an embodiment in which the enhancement layer is in field format and the base layer is in frame format.
In these embodiments, for inter-layer texture prediction, the texture of the inter-mode and intra-mode macroblocks of the base layer is decoded to construct a base layer picture containing image sample data and residual data, and the deblocking filter is applied to the constructed picture to reduce blocking artifacts before the picture is upsampled according to the ratio of the resolutions (i.e., screen sizes) of the base layer and the enhancement layer.
The first method (Method 1) in Figs. 3a and 3b separates the base layer picture into two field pictures before deblocking filtering. In this method, when an enhancement layer is created from a base layer coded in a different picture format, the base layer picture is separated into an even-line field picture and an odd-line field picture, and the two field pictures are deblocked (that is, filtered for deblocking) and upsampled. The two pictures are then interleaved into a single picture, and inter-layer texture prediction is performed based on this single picture.
This first method consists of the following three steps.
In the separation step (Step 1), the base layer picture is separated into a top (or even) field picture containing the even lines and a bottom (or odd) field picture containing the odd lines. The base layer picture is a video picture composed of residual data (inter-mode data) and image sample data (intra-mode data) reconstructed from the base layer data stream.
In the deblocking step (Step 2), the field pictures separated in the separation step are deblocked by a deblocking filter. A conventional deblocking filter can be used here.
When the resolution of the enhancement layer differs from that of the base layer, the deblocked field pictures are upsampled according to the ratio of the enhancement layer resolution to the base layer resolution.
In the interleaving step (Step 3), the upsampled top field picture and the upsampled bottom field picture are interleaved line by line and combined into a single picture, and texture prediction of the enhancement layer is then performed based on this single picture.
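As an illustration of the three steps of Method 1, the following Python sketch separates a base layer picture into its two field pictures, deblocks and upsamples each one, and interleaves the results back into a single picture. It is only an outline under stated assumptions: NumPy arrays stand in for pictures, and `deblock` and `upsample_vertical` are placeholder names for the conventional deblocking filter and the resolution-ratio upsampling referred to above, not functions of any particular codec implementation.

```python
import numpy as np

def deblock(field):
    # Placeholder for the conventional deblocking filter of Step 2.
    return field

def upsample_vertical(field, factor):
    # Placeholder vertical upsampling by simple line repetition; a real
    # implementation would interpolate according to the resolution ratio.
    return np.repeat(field, factor, axis=0)

def texture_basis_method1(base_picture, vertical_factor=1):
    """Method 1: separate, deblock, upsample, and re-interleave."""
    # Step 1: separate into a top (even-line) and a bottom (odd-line) field picture.
    top_field = base_picture[0::2, :]
    bottom_field = base_picture[1::2, :]
    # Step 2: deblock each field, then upsample if the layer resolutions differ.
    top_field = upsample_vertical(deblock(top_field), vertical_factor)
    bottom_field = upsample_vertical(deblock(bottom_field), vertical_factor)
    # Step 3: interleave the two fields line by line into a single picture.
    combined = np.empty((top_field.shape[0] * 2, top_field.shape[1]),
                        dtype=base_picture.dtype)
    combined[0::2, :] = top_field
    combined[1::2, :] = bottom_field
    return combined
```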
In the second method (Method 2) in Figs. 3a and 3b, when an enhancement layer is created from a base layer coded in a different picture format, the base layer picture is not separated into two field pictures; it is deblocked and upsampled as it is, and inter-layer texture prediction is performed based on the resulting picture.
In this second method, the base layer picture corresponding to the enhancement layer picture to be coded by inter-layer texture prediction is not separated into top and bottom field pictures but is deblocked immediately and then upsampled, after which texture prediction of the enhancement layer is performed based on the upsampled picture.
The deblocking filter applied to the base layer picture constructed for inter-layer prediction is applied only to the regions containing image sample data decoded from intra-mode macroblocks, and not to the regions containing residual data.
In the case where the base layer of Fig. 3a is coded in field format, that is, coded into a field picture sequence as shown in Fig. 2c or into MBAFF frames as shown in Fig. 2d, applying the second method requires a process that interleaves the lines of the top and bottom field pictures alternately to combine them into a single picture (in the case of Fig. 2c), or interleaves the lines of the top and bottom macroblocks of each field macroblock pair alternately to combine them into a single picture (in the case of Fig. 2d). This process is described in detail with reference to Figs. 8d and 8e. The top and bottom field pictures, or top and bottom macroblocks, to be interleaved are field pictures or macroblocks composed of the reconstructed residual data (inter-mode data) and image sample data (intra-mode data).
In addition, in the case where the top and bottom macroblocks of a field macroblock pair in an MBAFF frame of the base layer, as shown in Fig. 2d, are of different modes and the intra-mode block is selected from them for inter-layer texture prediction of a macroblock pair of the enhancement layer (the case of Fig. 8g described later), in the case where no frame (picture) of a base layer coded into MBAFF frames containing field macroblock pairs as shown in Fig. 2d temporally coincides with the enhancement layer picture (the case of Fig. 8h described later), or in the case where the texture of a macroblock pair of the enhancement layer is predicted from a field macroblock of a field picture of the base layer as shown in Fig. 2c (the case of Fig. 10d described later), one field macroblock chosen from the field macroblocks is upsampled into a temporary macroblock pair ("841" in Fig. 8g, "851" and "852" in Fig. 8h) or into two temporary macroblocks ("1021" in Fig. 10d), and the deblocking filter is applied to the intra-mode macroblocks among these macroblocks.
The inter-layer texture prediction described in the various embodiments below is performed based on base layer pictures deblocked as described in the embodiments of Figs. 3a and 3b.
Inter-layer prediction methods are now described for each of the cases classified according to the type of the macroblock of the current layer to be coded and the type of the base layer macroblock to be used for inter-layer prediction of that macroblock. As mentioned above, it is assumed in this description that the spatial resolution of the current layer is equal to that of the base layer.
I. Frame MB -> field MB in an MBAFF frame
In this case, the macroblock of the current layer (EL) is coded as a field macroblock in an MBAFF frame, and the base layer macroblock to be used for inter-layer prediction of that macroblock is coded as a frame macroblock. The video signal components contained in the upper and lower macroblocks of the base layer are the same as the video signal components contained in the co-located macroblock pair of the current layer. The upper and lower (top and bottom) macroblocks are referred to as a macroblock pair, and the term "pair" is used in the following description for a pair of vertically adjoining blocks. Inter-layer motion prediction is described first.
The EL encoder 20 uses the macroblock partition mode obtained by merging the macroblock pair 410 of the base layer into a single macroblock (by compressing it to half its size in the vertical direction) as the partition mode of the current macroblock. Fig. 4a illustrates a specific example of this process. As shown, the corresponding macroblock pair 410 of the base layer is first merged into a single macroblock (S41), the partition mode of the macroblock obtained by the merging is copied to the other macroblock to construct a macroblock pair 411 (S42), and the respective partition modes of the macroblock pair 411 are then applied to the macroblock pair 412 of the virtual base layer (S43).
However, when the corresponding macroblock pair 410 is merged into a single macroblock, partitioned regions that are not permitted by any partition mode may result. To prevent this, the EL encoder 20 determines the partition mode according to the following two rules, which are also summarized in the sketch that follows them.
1) The top and bottom 8x8 blocks of the base layer macroblock pair ("B8_0" and "B8_2" in Fig. 4a) are merged into the area of a single 8x8 block. If neither of the corresponding 8x8 blocks has been divided, they are merged into two 8x4 blocks, and if either of the corresponding 8x8 blocks has been divided, they are merged into four 4x4 blocks ("401" in Fig. 4a).
2) An 8x16 block of the base layer is reduced to an 8x8 block, a 16x8 block is reduced to two adjoining 8x4 blocks, and a 16x16 block is reduced to a 16x8 block.
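The two rules can be restated as the following Python sketch. It is illustrative only: the partition labels and function names are invented for this sketch, and rule 1 is implemented under one reading of the wording above (two undivided 8x8 blocks become two 8x4 blocks, and any divided 8x8 block forces four 4x4 blocks).

```python
# Rule 2: macroblock-level partitions of the base layer shrink to half their
# height; 16x4 is not a permitted shape, so a 16x8 block becomes two
# adjoining 8x4 blocks.
RULE2_PARTITION_MERGE = {
    (16, 16): [(16, 8)],
    (16, 8):  [(8, 4), (8, 4)],
    (8, 16):  [(8, 8)],
}

def merge_macroblock_partition(width, height):
    """Rule 2 lookup: the partition(s) that replace a macroblock-level
    partition of the base layer after vertical compression."""
    return RULE2_PARTITION_MERGE[(width, height)]

def merge_8x8_block_pair(top_divided, bottom_divided):
    """Rule 1 (one reading): the top and bottom 8x8 blocks, e.g. B8_0 and
    B8_2, fill the area of one 8x8 block of the merged macroblock; if
    neither has been divided they are merged into two 8x4 blocks, otherwise
    into four 4x4 blocks."""
    if not (top_divided or bottom_divided):
        return [(8, 4), (8, 4)]
    return [(4, 4)] * 4
```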
If at least one macroblock of the corresponding macroblock pair has been coded in intra mode, the EL encoder 20 first carries out the following process before the merging process.
If only one of the two macroblocks is in intra mode, the motion information of the inter macroblock, such as the macroblock partition mode, reference indexes and motion vectors, is copied to the intra macroblock as shown in Fig. 4b, or the intra macroblock is regarded as a 16x16 inter macroblock having zero motion vectors and a reference index of 0 as shown in Fig. 4c. Alternatively, as shown in Fig. 4d, the reference indexes of the intra macroblock are set by copying those of the inter macroblock, and zero motion vectors are assigned to the intra macroblock. The merging process mentioned above is then carried out, followed by the reference index and motion vector derivation procedures described below.
The EL encoder 20 carries out the following process to derive the reference indexes of the current macroblock pair 412 from the reference indexes of the corresponding macroblock pair 410.
If each block of the pair of base layer 8x8 blocks corresponding to a current 8x8 block has been divided into the same number of parts, the reference index of either one of them (the top or the bottom block) is taken as the reference index of the current 8x8 block. Otherwise, the reference index of the one of the pair that has been divided into the smaller number of parts is taken as the reference index of the current 8x8 block.
In another embodiment of the present invention, the smaller of the reference indexes set for the pair of base layer 8x8 blocks corresponding to a current 8x8 block is taken as the reference index of the current 8x8 block. For the example of Fig. 4e, this determination can be expressed as follows:
reference index of current B8_0 = min(reference index of B8_0 of the base layer top frame MB, reference index of B8_2 of the base layer top frame MB),
reference index of current B8_1 = min(reference index of B8_1 of the base layer top frame MB, reference index of B8_3 of the base layer top frame MB),
reference index of current B8_2 = min(reference index of B8_0 of the base layer bottom frame MB, reference index of B8_2 of the base layer bottom frame MB), and
reference index of current B8_3 = min(reference index of B8_1 of the base layer bottom frame MB, reference index of B8_3 of the base layer bottom frame MB).
The above reference index derivation procedure is applicable to both the top and the bottom field macroblock. The reference index of each 8x8 block determined in this way is multiplied by 2, and the multiplied value is taken as its final reference index. The reason for this multiplication is that, at decoding, the number of field pictures is twice the number of pictures in the frame sequence, since the pictures to which field macroblocks belong are divided into even and odd fields. Depending on the decoding algorithm, the final reference index of the bottom field macroblock may instead be determined by multiplying its reference index by 2 and then adding 1.
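The min rule of Fig. 4e and the field scaling just described can be written out as the following sketch; representing the four 8x8 reference indexes of a macroblock as a list indexed B8_0 to B8_3 is an assumption made only for illustration.

```python
def derive_field_ref_indexes(top_frame_mb_refs, bottom_frame_mb_refs,
                             is_bottom_field_mb=False, add_one_for_bottom=False):
    """Derive the four 8x8 reference indexes of a field macroblock of the
    virtual base layer from the 8x8 reference indexes (B8_0..B8_3) of the
    top and bottom frame macroblocks of the base layer pair (Fig. 4e rule)."""
    refs = [
        min(top_frame_mb_refs[0], top_frame_mb_refs[2]),       # current B8_0
        min(top_frame_mb_refs[1], top_frame_mb_refs[3]),       # current B8_1
        min(bottom_frame_mb_refs[0], bottom_frame_mb_refs[2]),  # current B8_2
        min(bottom_frame_mb_refs[1], bottom_frame_mb_refs[3]),  # current B8_3
    ]
    # The field picture count is twice the frame count, so each derived
    # index is doubled; some decoding algorithms additionally add 1 for the
    # bottom field macroblock.
    offset = 1 if (is_bottom_field_mb and add_one_for_bottom) else 0
    return [2 * r + offset for r in refs]
```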
The procedure by which the EL encoder 20 derives the motion vectors of the macroblock pair of the virtual base layer is as follows.
Motion vectors are determined on a 4x4 block basis, so the corresponding 4x8 block of the base layer is identified, as shown in Fig. 4f. If the corresponding 4x8 block has been divided, the motion vector of its top or bottom 4x4 block is taken as the motion vector of the current 4x4 block; otherwise, the motion vector of the corresponding 4x8 block itself is taken as the motion vector of the current 4x4 block. The determined motion vector is used as the final motion vector of the current 4x4 block after its vertical component is divided by 2. The reason for this division is that the image components contained in two frame macroblocks correspond to the image components of one field macroblock, so that the size of the field image is reduced to half in the vertical direction.
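A minimal sketch of this 4x4 motion vector derivation follows. It assumes the corresponding 4x8 block of the base layer has already been located as in Fig. 4f, that motion vectors are integer pairs, and that plain integer halving of the vertical component is acceptable for illustration; an actual implementation would follow the rounding rules of the codec.

```python
def derive_field_4x4_motion_vector(corresponding_4x8_mvs, is_divided, use_top_half=True):
    """Derive the motion vector of a current 4x4 block of the virtual base
    layer field macroblock from the corresponding 4x8 block of the base
    layer.  `corresponding_4x8_mvs` holds the (mv_x, mv_y) of the top and
    bottom 4x4 halves of that 4x8 block (both equal when it is undivided)."""
    if is_divided:
        # The motion vector of either the top or the bottom 4x4 block is taken.
        mv_x, mv_y = corresponding_4x8_mvs[0 if use_top_half else 1]
    else:
        mv_x, mv_y = corresponding_4x8_mvs[0]
    # One field macroblock covers the image components of two frame
    # macroblocks, so the vertical component is halved.
    return (mv_x, mv_y // 2)
```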
Once the motion information of the field macroblock pair 412 of the virtual base layer has been determined in this way, it is used for inter-layer motion prediction of the target field macroblock pair 413 of the enhancement layer. Likewise, in the following description, once the motion information of a macroblock or macroblock pair of the virtual base layer has been determined, it is used for inter-layer motion prediction of the corresponding macroblock or macroblock pair of the current layer. In the following description it is assumed that this use of the motion information of the macroblock or macroblock pair of the virtual base layer for inter-layer motion prediction of the corresponding macroblock or macroblock pair of the current layer takes place even when it is not mentioned explicitly.
Figs. 5a and 5b schematically show how, according to another embodiment of the present invention, the motion information of the field macroblock pair 500 of the virtual base layer to be used for inter-layer prediction is derived from the motion information of the base layer frame macroblock pair corresponding to the current macroblock pair. In this embodiment, as shown in the figures, the reference index of the top or bottom 8x8 block of the top macroblock of the base layer frame macroblock pair is used as the reference index of the top 8x8 blocks of each macroblock of the field macroblock pair 500 of the virtual base layer, and the reference index of the top or bottom 8x8 block of the bottom macroblock of the base layer is used as the reference index of the bottom 8x8 blocks of each macroblock of the pair 500. As for the motion vectors, as shown in the figures, the motion vector of the topmost 4x4 block of the top macroblock of the base layer frame macroblock pair is used in common for the topmost 4x4 block of each macroblock of the field macroblock pair 500 of the virtual base layer, the motion vector of the third 4x4 block of that top macroblock is used in common for the second 4x4 block of each macroblock of the pair 500, the motion vector of the topmost 4x4 block of the bottom macroblock of the base layer frame macroblock pair is used in common for the third 4x4 block of each macroblock of the pair 500, and the motion vector of the third 4x4 block of that bottom macroblock is used in common for the fourth 4x4 block of each macroblock of the pair 500.
As shown in Fig. 5a, in the field macroblock pair 500 constructed for inter-layer prediction, the top 4x4 block 501 and the bottom 4x4 block 502 within one 8x8 block use the motion vectors of 4x4 blocks belonging to different 8x8 blocks 511 and 512 of the base layer. These motion vectors may use different reference pictures; that is, the different 8x8 blocks 511 and 512 may have different reference indexes. Accordingly, in this case, when constructing the macroblock pair 500 of the virtual base layer, the EL encoder 20 also uses the motion vector of the corresponding 4x4 block 503, selected for the top 4x4 block 501, as the motion vector of the second 4x4 block 502 of the virtual base layer, as shown in Fig. 5b (521).
In the embodiment described with reference to Figs. 4a to 4f, in order to predict the motion information of the current macroblock pair, the EL encoder 20 sequentially derives the partition mode, the reference indexes and the motion vectors from the motion information of the corresponding macroblock pair of the base layer so as to construct the motion information of the virtual base layer. In the embodiment described with reference to Figs. 5a and 5b, however, the EL encoder 20 first derives the reference indexes and motion vectors of the macroblock pair of the virtual base layer from the motion information of the corresponding macroblock pair of the base layer, and then finally determines the partition mode of the macroblock pair of the virtual base layer from the derived values. When the partition mode is determined, 4x4 block units having identical derived motion vectors and reference indexes are combined; if the resulting block mode is a permitted partition mode, the partition mode is set to the combined mode, and otherwise it is set to the mode before combination.
In the above embodiments, if both macroblocks of the corresponding macroblock pair 410 of the base layer are in intra mode, intra base prediction is performed on the current macroblock pair 413. In that case no motion prediction is performed, and naturally no macroblock pair of the virtual base layer is constructed for texture prediction. If only one macroblock of the corresponding pair 410 is in intra mode, the motion information of the inter macroblock is copied to the intra macroblock as shown in Fig. 4b, or the motion vectors and reference indexes of the intra macroblock are set to 0 as shown in Fig. 4c, or the reference indexes of the intra macroblock are set by copying those of the inter macroblock while its motion vectors are set to 0 as shown in Fig. 4d. The motion information of the macroblock pair of the virtual base layer is then derived as described above.
After constructing the macroblock pair of the virtual base layer for inter-layer motion prediction as described above, the EL encoder 20 predicts and codes the motion information of the current field macroblock pair 413 using the motion information of the constructed macroblock pair.
Now inter-layer texture prediction will be described.Fig. 4 g is illustrated in texture Forecasting Methodology between exemplary layer in the situation of " frame MB-〉in the MBAFF frame field MB ".The respective frame macro block that EL encoder 20 identifies basic unit is to 410 block mode.If the respective frame macro block is to two macro blocks in 410 or be all internal schema or be all inter mode, EL encoder 20 becomes interim field macro block to 421 to 410 conversions (conversion) respective macroblock of basic unit, in order to or carry out when the front court macro block and predict (when two frame macro blocks 410 all are internal schema) or carry out in the manner described below its residual prediction (when two frame macro blocks 410 all are inter mode) in to 413 base.When two macro blocks when corresponding macro block in to 410 all are internal schema, this interim field macro block to 421 comprise foregoing in the situation that internal schema is deblocked after completing decoding the data of (that is, carrying out be used to the filtering of deblocking).In following description to various embodiment, for the interim macro block of deriving from the macro block of the basic unit that is used for texture prediction to so same.
Yet, do not carry out inter-layer texture prediction when only having one to be inter mode in these two macro blocks.The macro block that is used for the basic unit of inter-layer texture prediction has the raw image data (or the view data through decoding) of un-encoded to 410 in the situation that macro block is internal schema, and has encoded residual error data (or the residual error data through decoding) in the situation that macro block is inter mode.Following to the description of texture prediction in for the macro block of basic unit to so same.
Fig. 4 h illustrate for the frame macro block to converting the right method of field macro block that will be used for inter-layer texture prediction to.As shown in the figure, sequentially select the even row of a pair of frame macro block A and B with a structure top macro block A', and sequentially select this to the strange row of frame macro block A and B with structure field, end macro block B'.When filling a field macro block with row, at first it fill with idol (or strange) row (A_ occasionally A_ is strange) of jacking block A, then fills with strange (or even) row (B_ occasionally B_ is strange) of sole piece B.
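The Fig. 4h conversion amounts to de-interleaving the 32 lines covered by the frame macroblock pair. The sketch below illustrates this under the assumption that each macroblock is a 16x16 NumPy array of luma samples; chroma would be handled analogously.

```python
import numpy as np

def frame_pair_to_field_pair(mb_a, mb_b):
    """Convert a vertically adjoining frame macroblock pair (A on top of B)
    into a field macroblock pair (A', B') as in Fig. 4h: A' takes the even
    lines of A followed by the even lines of B, and B' takes the odd lines
    of A followed by the odd lines of B."""
    top_field_mb = np.vstack((mb_a[0::2, :], mb_b[0::2, :]))     # A'
    bottom_field_mb = np.vstack((mb_a[1::2, :], mb_b[1::2, :]))  # B'
    return top_field_mb, bottom_field_mb
```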
II. frame MB-〉situation of field MB in a picture
In this case, the macro block in current layer is the field macro block that is encoded in a picture, and the macro block of basic unit of inter-layer prediction that will be used for the macro block of current layer is the framing macro block that is encoded.In even or strange in the included vision signal composition of macro block centering in basic unit and current layer in the macro block of coordination included vision signal composition identical.At first, inter-layer motion prediction is described below.
EL encoder 20 uses by the macro block with basic unit and merger is become the macro-block partition mode of single macro block (by being compressed in vertical direction half size) acquisition as the partition mode of the occasionally strange macro block of virtual basic unit.Fig. 6 a illustrates the detailed example of this process.As shown in the figure, at first the respective macroblock with basic unit becomes single macro block 611 (S61) to 610 merger, and will be applied to by the partition mode that this merger obtains to be used for the macro block (S62) of virtual basic unit of the inter-layer motion prediction of current macro 613.Identical in merger rule and previous situation I.Identical in processing method when corresponding macro block has at least a macro block to encode with internal schema in to 610 and previous situation I.
The procedure for deriving the reference indices and motion vectors is also performed in the same manner as described for case I above. In case I, the same derivation procedure is applied to both the top and bottom macroblocks, because the even and odd field macroblock pair is carried in one frame. However, this case II differs from case I in that the derivation procedure is applied to only a single field macroblock, as shown in Figs. 6b and 6c, because only one macroblock corresponding to the base layer macroblock pair 610 exists in the current field picture to be coded.
In the above embodiment, to predict the motion information of the macroblock of the virtual base layer, the EL encoder 20 sequentially derives the partition mode, reference indices and motion vectors of that macroblock based on the motion information of the corresponding macroblock pair of the base layer.
In another embodiment of the present invention, the EL encoder 20 first derives the reference indices and motion vectors of the macroblock of the virtual base layer based on the motion information of the corresponding macroblock pair of the base layer, and then finally determines the block mode of the macroblock of the virtual base layer based on the derived values. Figs. 7a and 7b schematically show the derivation of the reference indices and motion vectors of the field macroblock of the virtual base layer. The derivation operations in this case are similar to those of case I described with reference to Figs. 5a and 5b, the difference being that the motion information of a single top or bottom field macroblock is derived using the motion information of the macroblock pair of the base layer.
When the partition mode is finalized, 4x4 block units having the same derived motion vector and reference index are combined; if the resulting block mode is an allowed partition mode, the partition mode is set to the combined mode, otherwise the partition mode is set to the mode before combining.
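The following is a simplified sketch of this finalization step, under the assumption that the derived motion data are held as a 4x4 grid of (motion vector, reference index) entries per macroblock and that only the rectangular macroblock-level partitions are checked; the helper names are illustrative and not taken from the standard.

def uniform(cells):
    return all(c == cells[0] for c in cells)

def finalize_partition(grid):
    """Return the largest allowed partition implied by identical 4x4 motion data."""
    flat = [grid[y][x] for y in range(4) for x in range(4)]
    if uniform(flat):
        return "16x16"
    top = [grid[y][x] for y in range(2) for x in range(4)]
    bot = [grid[y][x] for y in range(2, 4) for x in range(4)]
    if uniform(top) and uniform(bot):
        return "16x8"
    left = [grid[y][x] for y in range(4) for x in range(2)]
    right = [grid[y][x] for y in range(4) for x in range(2, 4)]
    if uniform(left) and uniform(right):
        return "8x16"
    return "8x8"  # otherwise keep the mode before combining (8x8 with sub-partitions)

mv = ((1, 0), 0)  # (motion vector, reference index)
grid = [[mv] * 4 for _ in range(4)]
assert finalize_partition(grid) == "16x16"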
In the above-described embodiments, if both macroblocks of the corresponding macroblock pair of the base layer are intra mode, motion prediction is not performed and the motion information of the macroblock of the virtual base layer is not constructed; if only one of the two macroblocks is intra mode, motion prediction is performed as described previously.
Now inter-layer texture prediction will be described. Fig. 6d illustrates an exemplary inter-layer texture prediction method for the case of "frame MB -> field MB in a field picture". The EL encoder 20 identifies the block modes of the corresponding macroblock pair 610 of the base layer. If the two macroblocks of the macroblock pair are both intra mode or both inter mode, the EL encoder 20 constructs a single temporary field macroblock 621 from the frame macroblock pair 610. If the current macroblock 613 belongs to an even field picture, the EL encoder 20 constructs the temporary field macroblock 621 from the even rows of the corresponding macroblock pair 610. If the current macroblock 613 belongs to an odd field picture, the EL encoder 20 constructs the temporary field macroblock 621 from the odd rows of the corresponding macroblock pair 610. The construction method is similar to the method of constructing the single macroblock A' or B' in Fig. 4h.
Once the temporary field macroblock 621 is constructed, the EL encoder 20 performs intra base prediction of the current field macroblock 613 based on the texture information of the field macroblock 621 (when both macroblocks of the corresponding macroblock pair 610 are intra mode), or performs residual prediction of it (when both macroblocks of the corresponding macroblock pair 610 are inter mode).
If only one macroblock of the corresponding macroblock pair 610 is inter mode, the EL encoder 20 does not perform inter-layer texture prediction.
III. Case of field MB in an MBAFF frame -> frame MB
In this case, the macroblock of the current layer is a frame macroblock pair to be coded, and the base layer macroblock to be used for inter-layer prediction of the frame macroblocks of the current layer is a field macroblock to be coded in an MBAFF frame. The video signal components contained in a field macroblock of the base layer are identical to those contained in the co-located macroblock pair of the current layer. First, inter-layer motion prediction is described.
The EL encoder 20 uses the macroblock partition modes obtained by extending the top or bottom field macroblock of the base layer macroblock pair (stretching it to twice its size in the vertical direction) as the partition modes of the macroblock pair of the virtual base layer. Fig. 8a illustrates a specific example of this process. Although the top field macroblock is selected in the following description and the accompanying drawings, the description applies equally when the bottom field macroblock is selected.
As shown in Fig. 8a, the top field macroblock of the corresponding macroblock pair 810 of the base layer is extended to twice its size to construct two macroblocks 811 (S81), and the partition modes obtained by the extension are applied to the macroblock pair 812 of the virtual base layer (S82).
However, when the corresponding field macroblock is extended to twice its size in the vertical direction, partitions (or patterns) that are not allowed among the macroblock partition modes may be created. To prevent this, the EL encoder 20 determines the partition modes resulting from the extension according to the following rules.
1) A 4x4, 8x4 or 16x8 block of the base layer is determined, after the extension, to be the 4x8, 8x8 or 16x16 block obtained by enlarging it to twice its size in the vertical direction.
2) A 4x8, 8x8 or 16x16 block of the base layer is determined, after the extension, to be two vertically stacked (top and bottom) blocks of the same size as the original. As shown in Fig. 8a, an 8x8 block B8_0 of the base layer is determined to be two 8x8 blocks (801). The reason the 8x8 block B8_0 is not set to an 8x16 block after the extension is that the extended block adjoining it on its left or right side may not be an 8x16 partition block, in which case no macroblock partition mode would support such a configuration. A short sketch of these two rules follows.
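The sketch below restates rules 1) and 2) as a simple lookup, assuming block sizes are represented as (width, height) pairs in luma samples; the table and function names are illustrative, not normative.

EXTENSION_RULE = {
    (4, 4):   [(4, 8)],              # rule 1: stretched to twice the height
    (8, 4):   [(8, 8)],
    (16, 8):  [(16, 16)],
    (4, 8):   [(4, 8), (4, 8)],      # rule 2: split into top and bottom blocks of equal size
    (8, 8):   [(8, 8), (8, 8)],
    (16, 16): [(16, 16), (16, 16)],
}

def extend_partition(width, height):
    """Return the partition(s) assigned to a base layer block after the 2x vertical extension."""
    return EXTENSION_RULE[(width, height)]

# e.g. the 8x8 block B8_0 of the base layer becomes two stacked 8x8 blocks (Fig. 8a, 801).
assert extend_partition(8, 8) == [(8, 8), (8, 8)]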
If one macroblock of the corresponding macroblock pair 810 is coded in intra mode, the EL encoder 20 selects the top or bottom field macroblock that is in inter mode, not the intra mode one, and performs the above extension process on it to determine the partition modes of the macroblock pair 812 of the virtual base layer.
If both macroblocks of the corresponding macroblock pair 810 are intra mode, the EL encoder 20 performs only inter-layer texture prediction, without performing the partition mode determination by the above extension process or the reference index and motion vector derivation described below.
To derive the reference indices of the macroblock pair of the virtual base layer from the reference indices of the corresponding field macroblock, the EL encoder 20 determines the reference index of the corresponding 8x8 block B8_0 of the base layer as the reference index of each of the two top and bottom 8x8 blocks derived from it, as shown in Fig. 8b, and divides each determined reference index by 2 to obtain the final reference index. The reason for this division is that the number of reference pictures of a field macroblock is set based on pictures divided into even and odd fields, so the frame number needs to be halved for application to a frame sequence.
When deriving the motion vectors of the frame macroblock pair 812 of the virtual base layer, the EL encoder 20 determines the motion vector of the corresponding 4x4 block of the base layer as the motion vector of a 4x8 block of the macroblock pair 812 of the virtual base layer, as shown in Fig. 8c, and multiplies the vertical component of the determined motion vector by 2 to obtain the final motion vector. The reason for this multiplication is that the image components contained in one field macroblock correspond to the image components of two frame macroblocks, so that the size of the field image is doubled in the vertical direction.
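A sketch of the two scaling steps just described, assuming reference indices are plain integers and motion vectors are (x, y) integer pairs; the function names are illustrative only.

def derive_frame_ref_idx(field_ref_idx):
    """Field reference lists count individual fields, so the index is halved for frame coding."""
    return field_ref_idx // 2

def derive_frame_mv(field_mv):
    """One field row spans two frame rows, so the vertical component is doubled."""
    mv_x, mv_y = field_mv
    return (mv_x, mv_y * 2)

assert derive_frame_ref_idx(5) == 2
assert derive_frame_mv((3, -4)) == (3, -8)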
In the above-described embodiment, to predict the motion information of the macroblock pair of the virtual base layer, the EL encoder 20 sequentially derives the partition modes, reference indices and motion vectors of that macroblock pair based on the motion information of the corresponding field macroblock of the base layer.
In another embodiment of the present invention, when deriving the motion information of the macroblock pair of the virtual base layer to be used for inter-layer prediction of the current macroblock pair, the EL encoder 20 first obtains the reference indices and motion vectors of the macroblock pair of the virtual base layer based on the motion information of the corresponding field macroblock of the base layer, and then finally determines the block mode of each macroblock of the macroblock pair of the virtual base layer based on the obtained values, as shown in Fig. 9a. When the partition mode is finalized, 4x4 block units having the same derived motion vector and reference index are combined; if the resulting block mode is an allowed partition mode, the partition mode is set to the combined mode, otherwise the partition mode is set to the mode before combining.
The following is a more detailed description of the embodiment of Fig. 9a. As shown in the figure, an inter mode field macroblock of the base layer is selected, and the reference indices and motion vectors of the frame macroblock pair of the virtual base layer, to be used for motion prediction of the current macroblock pair, are derived from the motion vectors and reference indices of the selected macroblock. If both macroblocks are inter mode, either one of the top and bottom macroblocks is arbitrarily selected (901 or 902), and the motion vector and reference index information of the selected macroblock is used. As shown in the figure, to derive the reference indices, the value of the top 8x8 block of the selected macroblock is copied to the reference indices of the top and bottom 8x8 blocks of the top macroblock of the virtual base layer, and the value of the bottom 8x8 block of the selected macroblock is copied to the reference indices of the top and bottom 8x8 blocks of the bottom macroblock of the virtual base layer. As shown in the figure, to derive the motion vectors, the value of each 4x4 block of the selected macroblock is shared as the motion vector of the corresponding pair of vertically adjacent 4x4 blocks of the macroblock pair of the virtual base layer. In another embodiment of the present invention, unlike the embodiment shown in Fig. 9a, the motion information of both macroblocks of the corresponding macroblock pair of the base layer may be mixed and used to derive the motion vectors and reference indices of the frame macroblock pair of the virtual base layer. Fig. 9b illustrates the procedure for deriving the motion vectors and reference indices according to this embodiment. A detailed description of the copying of the reference indices and motion vectors of the sub-blocks (8x8 and 4x4 blocks) of the macroblock pair of the virtual base layer is omitted here, since it can be understood intuitively from the above description of the motion information derivation procedure and the illustration of Fig. 9b.
However, since the motion information of both macroblocks of the field macroblock pair of the base layer is used in the embodiment of Fig. 9b, if one macroblock of the field macroblock pair of the base layer is intra mode, motion information for the intra mode macroblock is derived using the motion information of the other macroblock, which is in inter mode. Specifically, the motion vectors and reference indices of the intra mode macroblock may be constructed by copying the corresponding information of the inter mode macroblock to the intra mode macroblock as shown in Fig. 4b, or by regarding the intra mode macroblock as an inter mode macroblock having zero motion vectors and a reference index of 0 as shown in Fig. 4c, or by copying the reference index of the inter mode macroblock to the intra mode macroblock and setting its motion vectors to 0 as shown in Fig. 4d; the reference index and motion vector information of the macroblock pair of the virtual base layer are then derived as shown in Fig. 9b. Once the motion vector and reference index information of the macroblock pair of the virtual base layer is derived, the block modes of the macroblock pair are determined based on the derived information as discussed previously.
On the other hand, if both macroblocks of the corresponding field macroblock pair of the base layer are intra mode, motion prediction is not performed.
Now inter-layer texture prediction will be described. Fig. 8d illustrates an exemplary inter-layer texture prediction method for the case of "field MB in MBAFF frame -> frame MB". The EL encoder 20 identifies the block modes of the corresponding field macroblock pair 810 of the base layer. If the two macroblocks of the corresponding macroblock pair 810 are both intra mode or both inter mode, the EL encoder 20 converts the corresponding field macroblock pair 810 of the base layer into a temporary frame macroblock pair 821, in order to perform intra base prediction of the current frame macroblock pair 813 (when both macroblocks 810 are intra mode) or to perform residual prediction of it in the manner described below (when both macroblocks 810 are inter mode). When both macroblocks of the corresponding macroblock pair 810 are intra mode, the macroblock pair 810 contains decoded data, and the deblocking filter is applied to the frame macroblock pair 821 as discussed previously. Fig. 8e illustrates a method for converting a field macroblock pair into a frame macroblock pair. As shown in the figure, the rows of a pair of field macroblocks A and B are alternately selected starting from the top of each macroblock (A -> B -> A -> B -> A -> ...) and then arranged from the top in the selected order to construct a pair of frame macroblocks A' and B'. Since the rows of the field macroblock pair are regrouped in this way, the top frame macroblock A' is constructed from the rows of the upper halves of the field macroblocks A and B, and the bottom frame macroblock B' is constructed from the rows of their lower halves.
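The row interleaving of Fig. 8e can be sketched as follows, again assuming each macroblock is a list of 16 rows; the names are illustrative only, not taken from the specification.

def field_pair_to_frame_pair(field_a, field_b):
    """Interleave rows of field macroblocks A and B (A -> B -> A -> B -> ...) and split the
    resulting 32 rows into a top frame macroblock A' and a bottom frame macroblock B'."""
    interleaved = []
    for r in range(16):
        interleaved.append(field_a[r])
        interleaved.append(field_b[r])
    return interleaved[:16], interleaved[16:]

# A' holds rows 0..7 of both fields, B' holds rows 8..15, as stated above.
field_a = [("A", r) for r in range(16)]
field_b = [("B", r) for r in range(16)]
a_prime, b_prime = field_pair_to_frame_pair(field_a, field_b)
assert a_prime[:4] == [("A", 0), ("B", 0), ("A", 1), ("B", 1)]
assert b_prime[:2] == [("A", 8), ("B", 8)]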
On the other hand, if only one macroblock of the corresponding field macroblock pair 810 of the base layer is inter mode, one block is selected from the base layer macroblock pair 810 according to the block mode of the current frame macroblock pair 813, and the selected block is used for inter-layer texture prediction. Alternatively, before the block mode of the current frame macroblock pair 813 is determined, inter-layer prediction may first be performed using each of the methods described below, and the block mode of the macroblock pair 813 may then be determined.
Figs. 8f and 8g illustrate examples in which one block is selected for inter-layer prediction. When the current frame macroblock pair 813 is coded in inter mode (or when inter mode prediction is performed for it), the inter mode block 810a is selected from the field macroblock pair 810 of the base layer, as shown in Fig. 8f, and the selected block is upsampled in the vertical direction to create two corresponding macroblocks 831. These two macroblocks 831 are then used for residual prediction of the current frame macroblock pair 813. When the current frame macroblock pair 813 is not coded in inter mode (or when intra mode prediction is performed for it), the intra mode block 810b is selected from the field macroblock pair 810 of the base layer, as shown in Fig. 8g, and the selected block is upsampled in the vertical direction to create two corresponding macroblocks 841. After the deblocking filter is applied to these two macroblocks 841, they are used for intra base prediction of the current frame macroblock pair 813.
The method shown in Figs. 8f and 8g, in which one block is selected and upsampled to create the macroblock pair to be used for inter-layer texture prediction, is also applicable when the layers have different picture rates. When the picture rate of the enhancement layer is higher than that of the base layer, some pictures in the picture sequence of the enhancement layer may have no temporally corresponding picture in the base layer. Inter-layer texture prediction of a frame macroblock pair contained in an enhancement layer picture having no temporally corresponding picture in the base layer can be performed using one field macroblock of the pair of spatially co-located field macroblocks in a temporally preceding picture of the base layer.
Fig. 8h illustrates an example of this method for the case where the picture rate of the enhancement layer is twice the picture rate of the base layer.
As shown in the figure, the picture rate of the enhancement layer is twice that of the base layer. Therefore, one of every two pictures of the enhancement layer, such as the picture whose picture order count (POC) is "n2", has no picture with the same picture order count (POC) in the base layer. Here, an identical POC indicates temporal coincidence.
When there is no temporally coincident picture in the base layer (for example, when the current POC is n2), the bottom field macroblock 802 contained in the pair of spatially co-located field macroblocks of the previous picture (that is, the picture whose POC is lower than the current POC by 1) is vertically upsampled to create a temporary macroblock pair 852 (S82), and inter-layer texture prediction of the current macroblock pair 815 is then performed using this temporary macroblock pair 852. When there is a temporally coincident picture in the base layer (for example, when the current POC is n1), the top field macroblock 801 contained in the pair of spatially co-located field macroblocks of that temporally coincident picture is vertically upsampled to create a temporary macroblock pair 851 (S82), and inter-layer texture prediction of the current macroblock pair 814 is then performed using this temporary macroblock pair 851. When the temporary macroblock pair 851 or 852 created by the upsampling contains a macroblock pair decoded from an intra mode macroblock, the deblocking filter is applied to the macroblock pair before it is used for inter-layer texture prediction.
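A rough sketch of the selection in Fig. 8h is given below. It assumes a caller that already knows whether a temporally coincident base layer picture exists, and it uses simple row replication as a stand-in for the vertical upsampling; both helpers and their names are illustrative only.

def vertical_upsample(field_mb):
    """Duplicate each row so one 16-row field macroblock yields a 32-row macroblock pair.
    (An encoder would normally interpolate rather than replicate; this is only a placeholder.)"""
    rows = []
    for row in field_mb:
        rows.extend([row, row])
    return rows[:16], rows[16:]

def pick_base_field(has_coincident_base_picture, top_field_mb, prev_bottom_field_mb):
    """Coincident base picture present: use its top field macroblock (801 in Fig. 8h).
    Otherwise: use the bottom field macroblock of the preceding base picture (802)."""
    return top_field_mb if has_coincident_base_picture else prev_bottom_field_mb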
In another embodiment of the present invention, when there is a temporally coincident picture in the base layer (when the current POC is n1 in the example of Fig. 8h), a frame macroblock pair may be created from the field macroblock pair according to the embodiment shown in Fig. 8d, rather than by the method shown in Fig. 8h, and then used for inter-layer texture prediction. In addition, when the current picture has no temporally coincident picture in the base layer (when the current POC is n2 in the example of Fig. 8h), inter-layer texture prediction may be performed as in Fig. 8h, or inter-layer texture prediction may not be performed for the macroblocks of the current picture.
Accordingly, an embodiment of the present invention allocates a flag 'field_base_flag' to indicate whether inter-layer texture prediction has been performed according to the method shown in Fig. 8d or according to the method shown in Fig. 8h, and incorporates this flag into the coded information. For example, the flag is set to '0' when texture prediction has been performed according to the method of Fig. 8d, and to '1' when texture prediction has been performed according to the method of Fig. 8h. The flag is defined in the sequence parameter set, the sequence parameter set in the scalable extension, the picture parameter set, the picture parameter set in the scalable extension, the slice header, the slice header in the scalable extension, the macroblock layer, or the macroblock layer in the scalable extension of the enhancement layer to be transmitted to the decoder.
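Purely as an illustration of how such a flag might be carried, the sketch below attaches it to a hypothetical slice-header container; only the flag name 'field_base_flag' and its 0/1 semantics come from the text above, while the class and method names are assumptions.

class SliceHeaderExt:
    def __init__(self):
        self.syntax = {}

    def write_flag(self, name, value):
        self.syntax[name] = value

def signal_texture_prediction_mode(header, used_fig_8h_method):
    # 0: texture prediction performed as in Fig. 8d; 1: performed as in Fig. 8h.
    header.write_flag("field_base_flag", 1 if used_fig_8h_method else 0)

hdr = SliceHeaderExt()
signal_texture_prediction_mode(hdr, used_fig_8h_method=True)
assert hdr.syntax["field_base_flag"] == 1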
IV. Case of field MB in a field picture -> frame MB
In this case, the macroblock of the current layer (EL) is a frame macroblock pair to be coded, and the base layer (BL) macroblock to be used for inter-layer prediction of the frame macroblocks of the current layer is a field macroblock to be coded in a field picture. The video signal components contained in the field macroblock of the base layer are identical to those contained in the co-located macroblock pair of the current layer. First, inter-layer motion prediction is described.
The EL encoder 20 uses the partition modes obtained by extending the macroblock in the even or odd field of the base layer (stretching it to twice its size in the vertical direction) as the partition modes of the macroblocks of the virtual base layer. Fig. 10a illustrates a specific example of this process. The procedure shown in Fig. 10a differs from the procedure of case III, in which a top or bottom field macroblock of an MBAFF frame is selected, in that the spatially co-located field macroblock 1010 of the even or odd field is naturally used; it is similar to the procedure of case III in that the co-located field macroblock 1010 is extended and the partition modes of the two macroblocks obtained by the extension are applied to the macroblock pair 1012 of the virtual base layer. When the corresponding field macroblock 1010 is extended to twice its size in the vertical direction, partitions (or patterns) that are not allowed among the macroblock partition modes may be created. To prevent this, the EL encoder 20 determines the partition modes resulting from the extension according to the same rules 1) and 2) suggested in case III.
If the corresponding macroblock is coded in intra mode, the EL encoder 20 performs only inter-layer texture prediction, without performing the partition mode determination by the above extension process or the reference index and motion vector derivation described below. That is, the EL encoder 20 does not perform inter-layer motion prediction.
The reference index and motion vector derivation procedures are also similar to those described for case III above. However, this case IV differs from case III in the following respect. In case III, since the corresponding base layer macroblocks are carried in one frame as an even and odd field macroblock pair, one of the top and bottom macroblocks is selected and applied to the derivation procedure. In this case IV, since only one macroblock in the base layer corresponds to the current macroblock pair to be coded, the motion information of the macroblock pair 1012 of the virtual base layer is derived from the motion information of the corresponding field macroblock without the macroblock selection procedure, as shown in Figs. 10b and 10c, and the derived motion information is used for inter-layer motion prediction of the current macroblock pair 1013.
Fig. 11 schematically shows the derivation of the reference indices and motion vectors of the macroblock pair of the virtual base layer according to another embodiment of the present invention. In this case, the motion information of the macroblock pair of the virtual base layer is derived from the motion information of a single macroblock of an even or odd field of the base layer, which differs from the case described above with reference to Fig. 9a. The same derivation operations as in the case of Fig. 9a are applicable to this case. However, the procedure of mixing and using the motion information of a macroblock pair as in the case shown in Fig. 9b is not applicable to this case IV, because there is no pairing of top and bottom macroblocks in the corresponding field of the base layer.
In the embodiment described with reference to Figs. 10a to 10c, to predict the motion information of the macroblock pair of the virtual base layer, the EL encoder 20 sequentially derives the partition modes, reference indices and motion vectors based on the motion information of the corresponding field macroblock of the base layer. In the other embodiment of Fig. 11, however, the EL encoder 20 first derives the reference indices and motion vectors of the macroblock pair of the virtual base layer based on the motion information of the corresponding macroblock of the base layer, and then finally determines the partition modes of the macroblock pair of the virtual base layer based on the derived values. When the partition mode is determined, 4x4 block units having the same derived motion vector and reference index are combined; if the resulting block mode is an allowed partition mode, the partition mode is set to the combined mode, otherwise the partition mode is set to the mode before combining.
When texture prediction is performed in the above-described embodiments, if the corresponding field macroblock of the base layer is intra mode, intra base prediction coding is performed for the current macroblock. If the corresponding field macroblock is inter mode and the current macroblock is coded in inter mode, inter-layer residual prediction coding is performed. Here, of course, the field macroblock used in the prediction is vertically upsampled before being used for texture prediction.
In another embodiment of the present invention, a virtual macroblock is created from a field macroblock contained in an odd or even field to construct a macroblock pair, and the motion information of the macroblock pair of the virtual base layer is then derived from the constructed macroblock pair. Figs. 12a and 12b illustrate examples of this embodiment.
In this embodiment, the reference indices and motion vectors of the corresponding even (or odd) field macroblock of the base layer are copied (1201 and 1202) to create a virtual odd (or even) field macroblock, thereby constructing a macroblock pair 1211, and the motion information of the constructed macroblock pair 1211 is mixed to derive the motion information of the macroblock pair 1212 of the virtual base layer (1203 and 1204). In an exemplary method of mixing and using the motion information, as shown in Figs. 12a and 12b, the reference index of the top 8x8 block of the corresponding top macroblock is applied to the top 8x8 block of the top macroblock of the macroblock pair 1212 of the virtual base layer, the reference index of its bottom 8x8 block is applied to the top 8x8 block of the bottom macroblock, the reference index of the top 8x8 block of the corresponding bottom macroblock is applied to the bottom 8x8 block of the top macroblock of the macroblock pair 1212 of the virtual base layer, and the reference index of its bottom 8x8 block is applied to the bottom 8x8 block of the bottom macroblock (1203). The motion vectors are used according to the reference indices (1204). A further description of this process is omitted here, since it can be understood intuitively from Figs. 12a and 12b.
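The 8x8-level reference index mixing (1203) can be sketched as follows, with each macroblock modelled, for brevity, as a dictionary holding one reference index for its top half and one for its bottom half; the representation and names are assumptions made only for illustration.

def mix_reference_indices(corr_top_mb, corr_bottom_mb):
    """Derive the 8x8 reference indices of the virtual base layer macroblock pair 1212
    from the constructed pair 1211 (corresponding top and bottom macroblocks)."""
    vbl_top = {"top": corr_top_mb["top"], "bottom": corr_bottom_mb["top"]}
    vbl_bottom = {"top": corr_top_mb["bottom"], "bottom": corr_bottom_mb["bottom"]}
    return vbl_top, vbl_bottom

vbl_top, vbl_bot = mix_reference_indices({"top": 1, "bottom": 2}, {"top": 3, "bottom": 4})
assert vbl_top == {"top": 1, "bottom": 3} and vbl_bot == {"top": 2, "bottom": 4}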
In the embodiment shown in Figs. 12a and 12b, the partition modes of the macroblock pair 1212 of the virtual base layer are determined based on the derived reference indices and motion vectors using the same method as described above.
Now inter-layer texture prediction will be described. Fig. 10b illustrates an exemplary inter-layer texture prediction method for the case of "field MB in a field picture -> frame MB". The EL encoder 20 first upsamples the corresponding field macroblock 1010 of the base layer to create two temporary macroblocks 1021. If the corresponding field macroblock 1010 is intra mode, the EL encoder 20 applies the deblocking filter to the two created temporary macroblocks 1021 and then performs intra base prediction of the current frame macroblock pair 1013 based on these two temporary macroblocks 1021. If the corresponding field macroblock 1010 is inter mode, the EL encoder 20 performs residual prediction of the current frame macroblock pair 1013 based on the two created temporary macroblocks 1021.
V. Case of field MB -> field MB
This case is subdivided into the following four cases, since field macroblocks are divided into field macroblocks contained in field pictures and field macroblocks contained in MBAFF frames.
i) Case where both the base layer and the enhancement layer use MBAFF frames
This case is shown in Fig. 13a. As shown in the figure, the motion information (partition modes, reference indices and motion vectors) of the corresponding macroblock pair of the base layer is used as the motion information of the macroblock pair of the virtual base layer by directly copying it to the macroblock pair of the virtual base layer. Here, the motion information is copied between macroblocks having the same parity. Specifically, the motion information of an even field macroblock is copied to an even field macroblock, and the motion information of an odd field macroblock is copied to an odd field macroblock, to construct the macroblocks of the virtual layer used for motion prediction of the macroblocks of the current layer.
Use the method for the inter-layer texture prediction between known frame macro block when carrying out texture prediction.
ii) Case where the base layer contains field pictures and the enhancement layer contains MBAFF frames
This case is shown in Fig. 13b. As shown in the figure, the motion information (partition modes, reference indices and motion vectors) of the corresponding field macroblock of the base layer is used as the motion information of each macroblock of the macroblock pair of the virtual base layer by directly copying it to each macroblock of the macroblock pair of the virtual base layer. Here, the same-parity copying rule is not applicable, since the motion information of a single field macroblock is used for both the top and bottom field macroblocks.
When texture prediction is performed, intra base prediction (when the corresponding block of the base layer is intra mode) or residual prediction (when the corresponding block of the base layer is inter mode) is applied between the enhancement layer and base layer macroblocks having the same (even or odd) field attribute.
iii) Case where the base layer contains MBAFF frames and the enhancement layer contains field pictures
This case is shown in Fig. 13c. As shown in the figure, the field macroblock having the same parity as the current field macroblock is selected from the base layer macroblock pair corresponding to the current field macroblock, and the motion information (partition mode, reference indices and motion vectors) of the selected field macroblock is used as the motion information of the field macroblock of the virtual base layer by directly copying it to the field macroblock of the virtual base layer.
When texture prediction is performed, intra base prediction (when the corresponding block of the base layer is intra mode) or residual prediction (when the corresponding block of the base layer is inter mode) is applied between the enhancement layer and base layer macroblocks having the same (even or odd) field attribute.
iv) Case where both the base layer and the enhancement layer contain field pictures
This case is shown in Fig. 13d. As shown in the figure, the motion information (partition mode, reference indices and motion vectors) of the corresponding field macroblock of the base layer is used as the motion information of the field macroblock of the virtual base layer by directly copying it to the field macroblock of the virtual base layer. In this case as well, the motion information is copied between macroblocks having the same parity.
Use the method for the inter-layer texture prediction between known frame macro block when carrying out texture prediction.
The above description of inter-layer prediction has been given for the case where the base layer and the enhancement layer have equal resolutions. The following description addresses, for the case where the resolution of the enhancement layer is higher than that of the base layer (that is, when SpatialScalabilityType() is greater than 0), how the picture type (progressive frame, MBAFF frame or interlaced field) of each layer and/or the type of the macroblocks in the picture is identified and how the inter-layer prediction method is applied according to the identified types. Inter-layer motion prediction is described first.
M_A). Base layer (progressive frame) -> enhancement layer (MBAFF frame)
Fig. 14a illustrates the processing method for this case. As shown in the figure, first, the motion information of all macroblocks of the corresponding frame in the base layer is copied to create a virtual frame. Upsampling is then performed. In this upsampling, interpolation is performed using the texture information of the base layer picture with an interpolation ratio that makes the resolution (that is, the picture size) of that picture equal to the resolution of the current layer. In addition, the motion information of each macroblock of the picture enlarged by the interpolation is constructed based on the motion information of each macroblock of the virtual frame. One of a variety of known methods can be used for this construction. The picture of the temporary base layer constructed in this way has the same resolution as the picture of the current (enhancement) layer. Accordingly, the above-described inter-layer motion prediction can be applied in this case.
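A coarse sketch of this pipeline is given below, under stated assumptions: the base layer frame is an array of samples, the motion information of each macroblock is held in a dictionary, and 'interpolate' and 'scale_motion' stand in for whichever known texture interpolation and motion scaling methods the encoder actually uses. All names are illustrative.

def build_temporary_base_layer(base_frame_samples, base_frame_motion, scale_x, scale_y,
                               interpolate, scale_motion):
    # 1. Virtual frame: the motion information of every macroblock is copied as-is.
    virtual_frame_motion = [mb_motion.copy() for mb_motion in base_frame_motion]
    # 2. Texture upsampling to the enhancement layer resolution.
    upsampled_samples = interpolate(base_frame_samples, scale_x, scale_y)
    # 3. Motion information of the enlarged picture, derived from the virtual frame
    #    (one of several known constructions; delegated to 'scale_motion').
    upsampled_motion = scale_motion(virtual_frame_motion, scale_x, scale_y)
    return upsampled_samples, upsampled_motion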
In this case (Fig. 14a), since the base layer contains frames and the current layer contains MBAFF frames, the macroblocks in the pictures of the base layer and the current layer are frame macroblocks and field macroblocks in MBAFF frames, respectively. Accordingly, inter-layer motion prediction is performed using the method of case I described above. However, as mentioned above, not only field macroblock pairs but also frame macroblock pairs may be contained in the same MBAFF frame. Accordingly, when the type of the current layer macroblock pair corresponding to a macroblock pair in the picture of the temporary base layer is identified as a frame macroblock type rather than a field macroblock type, the known method of motion prediction between frame macroblocks, which includes simple copying of motion information (the frame-to-frame prediction method), is used.
M_B). Base layer (progressive frame) -> enhancement layer (interlaced field)
Fig. 14b illustrates the processing method for this case. As shown in the figure, first, the motion information of all macroblocks of the corresponding frame in the base layer is copied to create a virtual frame. Upsampling is then performed. In this upsampling, interpolation is performed using the texture information of the base layer picture with an interpolation ratio that makes the resolution of the picture equal to the resolution of the current layer. In addition, the motion information of each macroblock of the picture enlarged by the interpolation is constructed based on the motion information of each macroblock of the created virtual frame.
Use the method for above-mentioned situation II and carry out inter-layer motion prediction, because each macro block of the picture of the provisional basic unit of structure is all frame macro blocks in this way, and each macro block of current layer is all the field macro blocks in a picture.
M_C). Base layer (MBAFF frame) -> enhancement layer (progressive frame)
Fig. 14c illustrates the processing method for this case. As shown in the figure, first, the corresponding MBAFF frame of the base layer is transformed into a progressive frame. The method of case III described above is applied to transform the field macroblock pairs of the MBAFF frame into a progressive frame, and the known frame-to-frame prediction method is applied to transform the frame macroblock pairs of the MBAFF frame. Of course, when the method of case III is applied in this case, the data of the virtual frame and the motion information of each of its macroblocks are created using the data obtained by the inter-layer prediction, without performing the operation of coding the difference between the predicted data and the data of the layer actually to be coded.
Once the virtual frame is obtained, upsampling is performed on the virtual frame. In this upsampling, interpolation is performed with an interpolation ratio that makes the resolution of the base layer equal to the resolution of the current layer. In addition, the motion information of each macroblock of the enlarged picture is constructed based on the motion information of each macroblock of the virtual frame using one of a variety of known methods. Here, the known frame-macroblock-to-frame-macroblock inter-layer motion prediction method is performed, since each macroblock of the picture of the temporary base layer constructed in this way is a frame macroblock and each macroblock of the current layer is also a frame macroblock.
M_D). Base layer (interlaced field) -> enhancement layer (progressive frame)
Fig. 14d illustrates one processing method for this case. In this case, the type of the picture is identical to the type of the macroblocks of the picture. As shown in the figure, first, the corresponding field of the base layer is transformed into a progressive frame. The transformed frame has the same vertical/horizontal (aspect) ratio as the picture of the current layer. The upsampling process and the method of case IV described above are applied to transform the interlaced field into a progressive frame. Of course, when the method of case IV is applied in this case, the texture data of the virtual frame and the motion information of each of its macroblocks are created using the data obtained by the inter-layer prediction, without performing the operation of coding the difference between the predicted data and the data of the layer actually to be coded.
Once the virtual frame is obtained, upsampling is performed on the virtual frame. In this upsampling, interpolation is performed so that the resolution of the virtual frame becomes equal to the resolution of the current layer. In addition, the motion information of each macroblock of the interpolated picture is constructed based on the motion information of each macroblock of the virtual frame using one of a variety of known methods. Here, the known frame-macroblock-to-frame-macroblock inter-layer motion prediction method is performed, since each macroblock of the picture of the temporary base layer constructed in this way is a frame macroblock and each macroblock of the current layer is also a frame macroblock.
Fig. 14e illustrates a processing method for the above case M_D) according to another embodiment of the present invention. As shown in the figure, this embodiment transforms the corresponding odd or even field into a progressive frame. To transform the interlaced field into a progressive frame, the upsampling method and the method of case IV described above are used, as in Fig. 14d. Once the virtual frame is obtained, motion prediction between the picture of the current layer and the picture of the temporary layer is performed by applying to the virtual frame one of a variety of known methods, namely a method of motion prediction between pictures having the same aspect ratio, to carry out prediction coding of the motion information of each macroblock of the progressive picture of the current layer.
The method shown in Fig. 14e differs from the method of Fig. 14d in that no temporary prediction signal is generated.
Fig. 14f illustrates a processing method for the above case M_D) according to another embodiment of the present invention. As shown in the figure, this embodiment copies the motion information of all macroblocks of the corresponding field of the base layer to create a virtual field. Upsampling is then performed. In this upsampling, the texture information of the base layer picture is used, and different interpolation ratios are used for vertical and horizontal interpolation so that the enlarged picture has the same size (that is, resolution) as the picture of the current layer. In addition, one of a variety of known prediction methods (for example, Extended Spatial Scalability (ESS)) can be applied to the virtual field to construct the various syntax information and motion information of the enlarged picture. The motion vectors constructed in this process are scaled according to the enlargement ratio. Once the upsampled picture of the temporary base layer is constructed, it is used to perform inter-layer motion prediction of each macroblock of the picture of the current layer, to code the motion information of each macroblock of the picture of the current layer. Here, the known frame-macroblock-to-frame-macroblock inter-layer motion prediction method is used.
Fig. 14g illustrates a processing method for the above case M_D) according to another embodiment of the present invention. As shown in the figure, this embodiment first copies the motion information of all macroblocks of the corresponding field of the base layer to create a virtual field. Afterwards, interpolation is performed using the texture information of the base layer picture with different ratios for the vertical and horizontal directions. The texture information created by this operation is used for inter-layer texture prediction. In addition, the motion information in the virtual field is used to perform inter-layer motion prediction of each macroblock in the picture of the current layer. Here, one of a variety of known methods (for example, Extended Spatial Scalability (ESS) as defined in the Joint Scalable Video Model (JSVM)) is applied to perform the motion prediction coding of the picture of the current layer.
The method shown in Fig. 14g differs from the method of Fig. 14f in that no temporary prediction signal is generated.
M_E). Base layer (MBAFF frame) -> enhancement layer (MBAFF frame)
Fig. 14h illustrates the processing method for this case. As shown in the figure, first, the corresponding MBAFF frame of the base layer is transformed into a progressive frame. To transform the MBAFF frame into a progressive frame, the method of case III described above is applied to the transformation of the field macroblock pairs of the MBAFF frame, and the frame-to-frame prediction method is applied to the transformation of the frame macroblock pairs of the MBAFF frame. Of course, when the method of case III is applied in this case, the virtual frame and the motion information of each of its macroblocks are created using the data obtained by the inter-layer prediction, without performing the operation of coding the difference between the predicted data and the data of the layer actually to be coded.
Once the virtual frame is obtained, upsampling is performed on the virtual frame. In this upsampling, interpolation is performed with an interpolation ratio that makes the resolution of the base layer equal to the resolution of the current layer. In addition, the motion information of each macroblock of the enlarged picture is constructed based on the motion information of each macroblock of the virtual frame using one of a variety of known methods. Inter-layer motion prediction is performed using the method of case I described above, since each macroblock of the picture of the temporary base layer constructed in this way is a frame macroblock and each macroblock of the current layer is a field macroblock in an MBAFF frame. However, as mentioned above, not only field macroblock pairs but also frame macroblock pairs may be contained in the same MBAFF frame. Accordingly, when the current layer macroblock pair corresponding to a macroblock pair in the picture of the temporary base layer is a frame macroblock pair rather than a field macroblock pair, the known method of motion prediction between frame macroblocks, which includes copying of motion information (the frame-to-frame prediction method), is used.
M_F). Base layer (MBAFF frame) -> enhancement layer (interlaced field)
Fig. 14i illustrates the processing method for this case. As shown in the figure, first, the corresponding MBAFF frame of the base layer is transformed into a progressive frame. To transform the MBAFF frame into a progressive frame, the method of case III described above is applied to the transformation of the field macroblock pairs of the MBAFF frame, and the frame-to-frame prediction method is applied to the transformation of the frame macroblock pairs of the MBAFF frame. Of course, here too, when the method of case III is applied in this case, the virtual frame and the motion information of each of its macroblocks are created using the data obtained by the inter-layer prediction, without performing the operation of coding the difference between the predicted data and the data of the layer actually to be coded.
Once the virtual frame is obtained, interpolation is performed on the virtual frame with an interpolation ratio that makes its resolution equal to the resolution of the current layer. In addition, the motion information of each macroblock of the enlarged picture is constructed based on the motion information of each macroblock of the virtual frame using one of a variety of known methods. Inter-layer motion prediction is performed using the method of case II described above, since each macroblock of the picture of the temporary base layer constructed in this way is a frame macroblock and each macroblock of the current layer is a field macroblock in an even or odd field.
M_G). Base layer (interlaced field) -> enhancement layer (MBAFF frame)
Fig. 14j illustrates the processing method for this case. As shown in the figure, first, the interlaced field of the base layer is transformed into a progressive frame. The upsampling method and the method of case IV described above are used to transform the interlaced field into a progressive frame. Of course, here too, when the method of case IV is applied in this case, the virtual frame and the motion information of each of its macroblocks are created using the data obtained by the inter-layer prediction, without performing the operation of coding the difference between the predicted data and the data of the layer actually to be coded.
Once the virtual frame is obtained, upsampling is performed on the virtual frame so that its resolution becomes equal to the resolution of the current layer. In addition, the motion information of each macroblock of the enlarged picture is constructed using one of a variety of known methods. Inter-layer motion prediction is performed using the method of case I described above, since each macroblock of the picture of the temporary base layer constructed in this way is a frame macroblock and each macroblock of the current layer is a field macroblock in an MBAFF frame. However, as mentioned above, not only field macroblock pairs but also frame macroblock pairs may be contained in the same MBAFF frame. Therefore, when the current layer macroblock pair corresponding to a macroblock pair in the picture of the temporary base layer is a frame macroblock pair rather than a field macroblock pair, the known method of motion prediction between frame macroblocks (the frame-to-frame prediction method), rather than the prediction method of case I described above, is used.
M_H). Base layer (interlaced field) -> enhancement layer (interlaced field)
Fig. 14k illustrates the processing method for this case. As shown in the figure, first, the motion information of all macroblocks of the corresponding field in the base layer is copied to create a virtual field, and upsampling is then performed on the virtual field. This upsampling is performed with an upsampling ratio that makes the resolution of the base layer equal to the resolution of the current layer. In addition, the motion information of each macroblock of the enlarged picture is constructed based on the motion information of each macroblock of the created virtual field using one of a variety of known methods. Inter-layer motion prediction is performed using the method of case iv) of case V described above, since each macroblock of the picture of the temporary base layer constructed in this way is a field macroblock in a field picture and each macroblock of the current layer is also a field macroblock in a field picture.
Although in the description of the embodiments of Figs. 14a to 14k the upsampling is performed on the texture information of the virtual field or frame of the temporary layer rather than on the texture information of the base layer picture, the texture information of the base layer picture can also be used for the upsampling. In addition, when deriving the motion information of the picture of the temporary layer to be used for the inter-layer motion prediction performed at a subsequent stage, the above-mentioned interpolation process using texture information in the upsampling process may be omitted if it is unnecessary.
On the other hand, although the description of texture prediction has been given for the case where the base layer and the enhancement layer have the same spatial resolution, the two layers may have different spatial resolutions as mentioned above. When the resolution of the enhancement layer is higher than the resolution of the base layer, first, an operation is performed to make the resolution of the base layer picture equal to the resolution of the enhancement layer picture, to create a base layer picture having the same resolution as the enhancement layer, and the texture prediction method corresponding to each of the above-described cases I-V is then selected to perform prediction coding based on each macroblock of this picture. The procedure for making the resolution of the base layer picture equal to the resolution of the enhancement layer picture is now described in detail.
When the two layers used for inter-layer prediction are considered, the number of combinations of picture formats (progressive and interlaced) used for coding the two layers is four, since there are two video signal scanning methods, one being progressive scanning and the other interlaced scanning. Therefore, the method of increasing the resolution of the base layer picture to perform inter-layer texture prediction is described separately for each of these four cases.
T_A). Case where the enhancement layer is progressive and the base layer is interlaced
Fig. 15a illustrates an embodiment of the method of using the base layer picture for inter-layer texture prediction in this case. As shown in the figure, the base layer picture 1501 temporally corresponding to the picture 1500 of the current (enhancement) layer contains even and odd fields that are output at different times. Therefore, the EL encoder 20 first divides the base layer picture into an even field and an odd field (S151). The intra mode macroblocks of the base layer picture 1501 have unencoded original image data (or decoded image data) for intra mode prediction, and its inter mode macroblocks have encoded residual data (or decoded residual data) for residual prediction. The same applies to base layer macroblocks or pictures in the descriptions of texture prediction hereinafter.
After the corresponding picture 1501 is divided into its field components, the EL encoder 20 performs interpolation of the separated fields 1501a and 1501b in the vertical and/or horizontal direction to create enlarged even and odd field pictures 1502a and 1502b (S152). This interpolation uses one of a variety of known methods, such as 6-tap filtering and bilinear filtering. The vertical and horizontal ratios used for increasing the resolution (that is, the size) of the picture by the interpolation are equal to the vertical and horizontal ratios of the size of the enhancement layer picture 1500 to the size of the base layer picture 1501. The vertical and horizontal ratios may be equal to each other. For example, if the resolution ratio between the enhancement layer and the base layer is 2, the separated even and odd fields 1501a and 1501b are interpolated so as to create one more pixel between each pair of pixels in the vertical and horizontal directions of each field.
Once the interpolation is completed, the enlarged even and odd fields 1502a and 1502b are combined to construct a picture 1503 (S153). In this combining, the rows of the enlarged even and odd fields 1502a and 1502b are alternately selected (1502a -> 1502b -> 1502a -> 1502b -> ...) and then arranged in the selected order to construct the combined picture 1503. Here, the block mode of each macroblock in the combined picture 1503 is determined. For example, the block mode of a macroblock of the combined picture 1503 is determined to be equal to the block mode of the macroblock of the base layer picture 1501 that contains a region having the same image components. This determination method can be applied in any of the cases of enlarged pictures described below. Since the combined picture 1503 constructed in this way has the same spatial resolution as the current picture 1500 of the enhancement layer, texture prediction of the macroblocks in the current progressive picture 1500 (for example, frame-to-frame macroblock texture prediction) is performed based on the corresponding macroblocks of the combined picture 1503 (S154).
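The steps S151-S153 can be sketched as follows, under the assumption that the base layer picture is represented as a list of rows and that 'interpolate' performs the chosen enlargement (e.g. 6-tap or bilinear filtering); both helpers below are illustrative, not normative.

def split_fields(picture):
    even_field = picture[0::2]
    odd_field = picture[1::2]
    return even_field, odd_field

def combine_fields(even_field, odd_field):
    combined = []
    for even_row, odd_row in zip(even_field, odd_field):
        combined.extend([even_row, odd_row])
    return combined

def build_prediction_picture(base_picture, interpolate, scale_x, scale_y):
    even_f, odd_f = split_fields(base_picture)        # S151
    even_up = interpolate(even_f, scale_x, scale_y)    # S152
    odd_up = interpolate(odd_f, scale_x, scale_y)
    return combine_fields(even_up, odd_up)             # S153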
Fig. 15b illustrates a method of using the base layer picture in inter-layer texture prediction according to another embodiment of the present invention. As shown in the figure, this embodiment does not separate the base layer picture on the basis of field attributes (parity), but directly performs interpolation, in the vertical and/or horizontal direction, of the base layer picture containing the even and odd fields output at different times (S155), to construct an enlarged picture whose resolution (that is, size) is the same as that of the enhancement layer picture. The enlarged picture constructed in this way is used to perform inter-layer texture prediction of the current progressive picture of the enhancement layer (S156).
Fig. 15a illustrates the procedure of separating the picture containing the even and odd fields on the basis of field attributes at the picture level and interpolating the separated fields. However, the EL encoder 20 can achieve the same result as shown in Fig. 15a by performing the procedure of Fig. 15a at the macroblock level. More specifically, when the base layer containing the even and odd fields has been coded in MBAFF mode, the vertically adjacent macroblock pair in the picture 1501 co-located with the macroblock pair in the enhancement layer picture currently subject to texture prediction coding may contain video signals of even and odd field components as in Fig. 16a or 16b. Fig. 16a illustrates a frame MB pair pattern in which even and odd field components are interleaved within each macroblock of a pair of macroblocks A and B, and Fig. 16b illustrates a field MB pair pattern in which each macroblock of a pair of macroblocks A and B contains video lines having the same field attribute.
In the case of Fig. 16a, to apply the method shown in Fig. 15a, the even rows of each of the macroblocks A and B of the pair are selected to construct an even block A', and their odd rows are selected to construct an odd block B', thereby dividing the macroblock pair, in which even and odd field components are interleaved within each macroblock, into two blocks A' and B' having even and odd field components, respectively. Each of the two blocks A' and B' separated in this way is interpolated to construct an enlarged block. Texture prediction is performed using the data of the regions of the enlarged blocks that correspond to the macroblock, coded in intra_BL (intra base) or residual_prediction mode, of the current enhancement layer picture to be subject to texture prediction coding. Although not shown in Fig. 16a, the enlarged even and odd field pictures 1502a and 1502b of Fig. 15a can be constructed by combining the individually enlarged blocks on the basis of field attribute, that is, by repeating the above operation for every macroblock pair.
In the case where, as in Figure 16b, each macroblock of the pair is constructed by partitioning on a field attribute basis, the above-described separation procedure reduces to a process of simply copying each macroblock to obtain the two separated blocks from the macroblock pair. The subsequent procedure is similar to the procedure described with reference to Figure 16a.
T_B). The case where the enhancement layer is interlaced and the base layer is progressive
Figure 17a illustrates an embodiment of a method of using a base layer picture for inter-layer texture prediction in this case. As shown in the figure, the EL encoder 20 first constructs two pictures for the current layer picture 1700 (S171). In an exemplary method of constructing the two pictures, the even rows of the corresponding base layer picture 1701 are selected to construct one picture 1701a, and its odd rows are selected to construct another picture 1701b. The EL encoder 20 then interpolates the two pictures 1701a and 1701b constructed in this way, in the vertical and/or horizontal direction, to create two upscaled pictures 1702a and 1702b (S172). This interpolation uses one of various known methods, such as the 6-tap filtering and bilinear filtering mentioned for case T_A). The ratios by which the resolution is increased are also the same as those described for case T_A).
Once the interpolation is complete, the two upscaled fields 1702a and 1702b are combined to construct a picture 1703 (S173). In this combination, rows of the two upscaled fields 1702a and 1702b are selected alternately (1702a -> 1702b -> 1702a -> 1702b -> ...) and arranged in the selected order to construct the combined picture 1703. Because the combined picture 1703 constructed in this way has the same spatial resolution as the current picture 1700 of the enhancement layer, the texture prediction (for example, frame-MB-to-frame-MB texture prediction, or the texture prediction described with reference to Figure 4g) of the macroblocks in the current interlaced picture 1700 is performed based on the corresponding macroblocks of the combined picture 1703 (S174).
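Steps S171 to S173 are the same split, upscale, and interleave operations sketched for Figure 15a, only applied to a progressive base picture. The fragment below is again a sketch that reuses the hypothetical helpers split_fields, resize_nn, and interleave and the example sizes enh_h and enh_w from the first fragment.

    # Figure 17a: split the progressive base picture 1701 by row parity (S171),
    # upscale the two pictures (S172), and interleave their rows to form the
    # combined picture 1703 (S173).
    progressive_base = np.zeros((144, 176), dtype=np.uint8)   # hypothetical picture 1701
    pic_a, pic_b = split_fields(progressive_base)
    combined_1703 = interleave(resize_nn(pic_a, enh_h // 2, enh_w),
                               resize_nn(pic_b, enh_h // 2, enh_w))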
Figure 17b illustrates a method of using a base layer picture in inter-layer texture prediction according to another embodiment of the present invention. As shown in the figure, this embodiment does not divide the base layer picture into two pictures, but directly performs the interpolation of the base layer picture in the vertical and/or horizontal direction (S175) to construct an upscaled picture whose resolution (that is, size) is the same as the resolution of the enhancement layer picture. The upscaled picture constructed in this way is used to perform the inter-layer texture prediction of the current interlaced picture of the enhancement layer (S176).
Although the description of Figure 17a is also given at the picture level, the EL encoder 20 can perform the picture separation procedure at the macroblock level as described for case T_A) above. When the single picture 1701 is regarded as vertically adjacent macroblock pairs, this macroblock-level procedure is similar to the separation and interpolation procedure shown in Figure 17a. A detailed description of this procedure is omitted here, because it can be understood intuitively from Figure 17a.
T_C). The case where both the enhancement layer and the base layer are interlaced
Figure 18 illustrates an embodiment of a method of using a base layer picture for inter-layer texture prediction in this case. In this case, as shown in the figure, the EL encoder 20 divides the base layer picture 1801 temporally corresponding to the current layer picture 1800 into even and odd fields (S181), in the same manner as in case T_A). The EL encoder 20 then interpolates the separated fields 1801a and 1801b in the vertical and/or horizontal direction to create the upscaled even and odd field pictures 1802a and 1802b (S182). The EL encoder 20 then combines the upscaled even and odd fields 1802a and 1802b to construct a picture 1803 (S183). The EL encoder 20 then performs the inter-layer texture prediction (for example, frame-MB-to-frame-MB texture prediction, or the texture prediction described with reference to Figure 4g) of the macroblocks (MBAFF-coded frame macroblock pairs) in the current interlaced picture 1800 based on the corresponding macroblocks of the combined picture 1803 (S184).
Although the two layers have the same picture format, the EL encoder 20 separates the base layer picture 1801 on a field attribute basis (S181), upscales the separated fields individually (S182), and then combines the upscaled pictures (S183). This is because, if the picture 1801 having even and odd fields were directly interpolated when the characteristics of the video signals of its even and odd fields differ greatly, the upscaled picture might have a distorted image (for example, an image with stretched edges) compared with the interlaced picture 1800 of the enhancement layer in which the even and odd fields are interleaved. Accordingly, even when both layers are interlaced, according to the present invention the EL encoder 20 obtains two fields by separating the base layer picture on a field attribute basis, upscales the two fields individually, and then combines the upscaled fields.
Of course, the method shown in Figure 18 need not always be used when the pictures of both layers are interlaced; instead, the method may be used selectively according to the video characteristics of the pictures.
Figure 18 illustrates, at the picture level, the procedure according to the present invention of separating a picture having even and odd fields on a field attribute basis and upscaling the separated fields. However, as described for case T_A) above, the EL encoder 20 can achieve the same result as shown in Figure 18 by performing the procedure shown in Figure 18 at the macroblock level. This includes the macroblock-based separation and interpolation processes described with reference to Figures 16a and 16b (specifically, dividing a frame macroblock pair into blocks of even and odd rows and upscaling the separated blocks individually) and the combination and inter-layer texture prediction processes (specifically, alternately selecting the rows of the individually upscaled blocks to construct a pair of upscaled blocks, and performing the texture prediction of the frame macroblock pair of the current layer using the constructed pair of upscaled blocks).
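The macroblock-level recombination mentioned above can be sketched as follows, reusing the hypothetical a_up and b_up blocks and the interleave helper from the earlier fragments (equal spatial resolution is assumed so that each upscaled block is 16x16).

    # T_C) at macroblock level: rows of the two individually upscaled blocks are
    # taken alternately to rebuild an upscaled block pair, which is then used to
    # predict the frame macroblock pair of the current layer.
    pair_up = interleave(a_up, b_up)                           # 32x16 block pair
    top_mb_ref, bottom_mb_ref = pair_up[:16, :], pair_up[16:, :]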
T_D). The case where both the enhancement layer and the base layer are progressive
In this case, the base layer picture is upscaled to the same size as the enhancement layer picture, and the upscaled picture is used for the inter-layer texture prediction of the current enhancement layer picture, which has the same frame format.
Although the above embodiments of texture prediction have been described for the case where the base layer and the enhancement layer have the same temporal resolution, the two layers may have different temporal resolutions, that is, different picture rates. Even when the layers have the same temporal resolution, if the pictures of the layers have different picture scan types, the pictures may contain video signals with different output times even though they are pictures of the same POC (that is, pictures that correspond to each other in time). The inter-layer texture prediction methods for these cases will now be described. In the following description, it is first assumed that the two layers have the same spatial resolution. If the two layers have different spatial resolutions, the methods described below are applied after upsampling each picture of the base layer, as described above, so that its spatial resolution equals the resolution of the enhancement layer.
A) The case where the enhancement layer comprises progressive frames, the base layer comprises MBAFF frames, and the temporal resolution of the enhancement layer is twice as high
Figure 19a illustrates the inter-layer texture prediction method for this case. As shown in the figure, each MBAFF frame of the base layer contains even and odd fields with different output times, so the EL encoder 20 divides each MBAFF frame into even and odd fields (S191). The EL encoder 20 divides the even field components (for example, the even rows) and the odd field components (for example, the odd rows) of each MBAFF frame into an even field and an odd field, respectively. After dividing each MBAFF frame into two fields in this way, the EL encoder 20 interpolates each field in the vertical direction so that it has twice the resolution (S192). This interpolation uses one of various known methods, such as 6-tap filtering, bilinear filtering, and zero padding of sample lines. Once the interpolation is complete, there is a temporally coincident picture in the base layer for each frame of the enhancement layer, so the EL encoder 20 performs known inter-layer texture prediction (for example, frame-MB-to-frame-MB prediction) on the macroblocks of each frame of the enhancement layer (S193).
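A minimal sketch of S191 and S192, reusing the hypothetical split_fields helper and NumPy import from the first fragment; simple row repetition stands in for the interpolation methods named in the text, and the sizes are arbitrary examples.

    mbaff_frame = np.zeros((288, 352), dtype=np.uint8)   # hypothetical base MBAFF frame
    even_field, odd_field = split_fields(mbaff_frame)    # S191
    even_full = np.repeat(even_field, 2, axis=0)         # S192: vertical x2 interpolation
    odd_full = np.repeat(odd_field, 2, axis=0)
    # Each full-height picture is temporally coincident with one enhancement-layer
    # frame and serves as its texture-prediction reference (S193).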
The above procedure can also be applied to inter-layer motion prediction. Here, when an MBAFF frame is divided into two fields, the EL encoder 20 copies the motion information of each macroblock of the field macroblock pair in the MBAFF frame as the motion information of the field macroblock having the same field attribute (parity), in order to use it for inter-layer motion prediction. Even when there is no temporally coincident picture in the base layer (as at t1, t3, ...), this method can be used to create a temporally coincident picture so that inter-layer motion prediction can be performed according to the above method.
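The parity-based copying of motion information can be sketched as below. The MotionInfo container and the assumption that the top macroblock of the pair carries the even-field parity are purely illustrative.

    from dataclasses import dataclass

    @dataclass
    class MotionInfo:            # hypothetical container, not a codec data structure
        mv: tuple                # motion vector (x, y)
        ref_idx: int             # reference index

    def field_motion(pair_motion, parity):
        # When the MBAFF frame is split into two fields, the motion information of
        # the field macroblock with the same parity is simply copied.
        top, bottom = pair_motion
        return top if parity == 'even' else bottom

    even_mi = field_motion((MotionInfo((1, 0), 0), MotionInfo((2, -1), 1)), 'even')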
The above method can be applied directly not only when the temporal resolution of one of the two layers is twice the resolution of the other layer, as in the example of Figure 19a, but also when it is N times (three times or more) as high. For example, when the temporal resolution is three times as high, one of the two separated fields can additionally be copied so that three fields are constructed and used, and when it is four times as high, each of the two separated fields can be copied once more so that four fields are constructed and used. Obviously, whatever the temporal resolution difference, those skilled in the art can perform inter-layer prediction simply by applying the principles of the present invention without any inventive effort. Therefore, any method for prediction between layers with different temporal resolutions that is not described in this specification naturally falls within the scope of the present invention. The same applies to the other cases described below.
If the base layer has been coded in picture adaptive frame/field (PAFF) mode rather than as MBAFF frames, the two layers may have the same temporal resolution, as in Figure 19b. Therefore, in this case, inter-layer texture prediction can be performed after the picture is directly interpolated, without the process of dividing a frame into two fields, to construct a picture having the same temporal resolution as the current layer.
B) The case where the enhancement layer comprises MBAFF frames, the base layer comprises progressive frames, and the temporal resolution of the enhancement layer is half that of the base layer
Figure 20 illustrates the inter-layer texture prediction method for this case. As shown in the figure, each MBAFF frame of the enhancement layer contains even and odd fields with different output times, so the EL encoder 20 divides each MBAFF frame into even and odd fields (S201). The EL encoder 20 divides the even field components (for example, the even rows) and the odd field components (for example, the odd rows) of each MBAFF frame into an even field and an odd field, respectively. The EL encoder 20 performs subsampling of each frame of the base layer in the vertical direction to construct pictures whose resolution is reduced by half (S202). This subsampling can use line subsampling or one of various other known downsampling methods. In the example of Figure 20, the EL encoder 20 selects the even rows of the pictures with even picture indexes (pictures t0, t2, t4, ...) to obtain pictures of half size, and selects the odd rows of the pictures with odd picture indexes (pictures t1, t3, ...) to obtain pictures of half size. The frame separation (S201) and the subsampling (S202) can also be performed in the reverse order.
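A sketch of the subsampling rule of S202 (even rows from even-indexed pictures, odd rows from odd-indexed pictures); the function name, the frame list, and the sizes are hypothetical.

    import numpy as np

    def subsample_base_frame(frame, picture_index):
        # S202: keep the even rows of even-indexed pictures (t0, t2, t4, ...) and
        # the odd rows of odd-indexed pictures (t1, t3, ...), halving the height.
        return frame[0::2, :] if picture_index % 2 == 0 else frame[1::2, :]

    base_frames = [np.zeros((288, 352), dtype=np.uint8) for _ in range(4)]  # hypothetical
    half_pictures = [subsample_base_frame(f, i) for i, f in enumerate(base_frames)]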
Once the two processes S201 and S202 are complete, for each field 2001 separated from a frame of the enhancement layer there is a picture in the base layer that is temporally coincident with the field 2001 and has the same spatial resolution as the field 2001, so the EL encoder 20 performs known inter-layer texture prediction (for example, frame-MB-to-frame-MB prediction) on the macroblocks in each field (S203).
The above procedure can also be applied to inter-layer motion prediction. Here, when a picture of reduced size is obtained from each frame of the base layer by subsampling (S202), the EL encoder 20 can derive the motion information of a corresponding macroblock from the motion information of the two macroblocks of a vertically adjacent macroblock pair according to a suitable method (for example, a method that adopts the motion information of the macroblock that is not partitioned into smaller blocks), and the derived motion information can then be used for inter-layer motion prediction.
In this case, the pictures of the enhancement layer are coded so as to be transmitted in PAFF mode, because inter-layer prediction is performed on each field picture 2001 separated from an MBAFF frame.
C) The case where the enhancement layer comprises MBAFF frames, the base layer comprises progressive frames, and the two layers have the same temporal resolution
Figure 21 illustrates the inter-layer texture prediction method for this case. As shown in the figure, each MBAFF frame of the enhancement layer contains even and odd fields with different output times, so the EL encoder 20 divides each MBAFF frame into even and odd fields (S211). The EL encoder 20 divides the even field components (for example, the even rows) and the odd field components (for example, the odd rows) of each MBAFF frame into an even field and an odd field, respectively. The EL encoder 20 performs subsampling of each frame of the base layer in the vertical direction to construct pictures whose resolution is reduced by half (S212). This subsampling can use line subsampling or one of various other known downsampling methods. The frame separation (S211) and the subsampling (S212) can also be performed in the reverse order.
The EL encoder 20 may also construct only one field picture (for example, an even field picture) from each MBAFF frame, rather than dividing the MBAFF frame into two fields. This is because the two layers have the same temporal resolution, so only one (rather than both) of the two field pictures separated from a frame has a corresponding frame in the base layer that can be used for inter-layer prediction.
Once the two processes S211 and S212 are complete, the EL encoder 20 performs known inter-layer texture prediction (for example, frame-MB-to-frame-MB prediction) only on the even (or odd) field among the fields separated from each frame of the enhancement layer, based on the corresponding subsampled picture of the base layer (S213).
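A brief sketch of this pairing, under the same illustrative assumptions as before (hypothetical sizes, even parity chosen, NumPy row slicing in place of the actual separation and subsampling operations).

    import numpy as np

    # Case C: only the even (or odd) field separated from each enhancement MBAFF
    # frame has a temporally coincident base-layer frame, so only that field is
    # predicted from the vertically subsampled base frame.
    enh_mbaff_frame = np.zeros((288, 352), dtype=np.uint8)
    base_frame_same_time = np.zeros((288, 352), dtype=np.uint8)
    enh_even_field = enh_mbaff_frame[0::2, :]          # S211 (even field only)
    base_half = base_frame_same_time[0::2, :]          # S212 (vertical subsampling)
    # Inter-layer texture prediction is then performed between enh_even_field and
    # base_half (S213).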
Likewise, in this case, inter-layer motion prediction can be performed, in the same manner as described for case B), on the separated field of the enhancement layer for which inter-layer texture prediction is performed.
Although the above description has been given with respect to the inter-layer prediction operations performed by the EL encoder 20 of Figure 2a or 2b, all the descriptions of the inter-layer prediction operations are equally applicable to an EL decoder that receives decoded information from the base layer and decodes the enhancement layer stream. In the encoding and decoding procedures, the above-described inter-layer prediction operations (including the operations for separating, upscaling, and combining the video signals in pictures or macroblocks) are performed in the same manner, but the operations after the inter-layer prediction are performed in different manners. An example of this difference is that, after performing the motion and texture prediction, the encoder codes the predicted information, or the difference information between the predicted information and the actual information, in order to transmit it to the decoder, whereas the decoder obtains the actual motion information and texture information by directly applying, to the current macroblock, the information obtained by performing the same inter-layer motion and texture prediction as performed at the encoder, or by additionally using the actually received macroblock coding information. The details and principles of the present invention described above from the encoding perspective apply directly to a decoder that decodes the received data streams of the two layers.
However, when the EL encoder separates the enhancement layer into a field sequence and performs inter-layer prediction as described with reference to Figures 20 and 21 and then transmits the enhancement layer, which has MBAFF frames, in PAFF mode, the decoder does not perform, on the received current layer, the above-described procedure of dividing MBAFF frames into field pictures.
In addition, the decoder decodes, from the received signal, the flag 'field_base_flag', which identifies whether the EL encoder 20 performed the inter-layer texture prediction between macroblocks as shown in Figure 8d or as shown in Figure 8h. Based on the decoded flag value, the decoder determines whether the prediction between macroblocks was performed as shown in Figure 8d or as shown in Figure 8h, and obtains the texture prediction information accordingly. If the flag 'field_base_flag' is not received, the EL decoder assumes that a flag with the value '0' has been received. That is, the EL decoder assumes that the texture prediction between macroblocks was performed according to the method shown in Figure 8d, and obtains the prediction information of the current macroblock pair to reconstruct the current macroblock or macroblock pair.
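A minimal sketch of this decoder-side flag handling; the dictionary-based parsing helper is purely hypothetical, and only the default-to-zero behaviour described above is illustrated.

    def get_field_base_flag(received_syntax):
        # If 'field_base_flag' is absent from the received signal, the decoder
        # behaves as if a flag with the value 0 had been received (method of
        # Fig. 8d); a value of 1 selects the method of Fig. 8h.
        return received_syntax.get('field_base_flag', 0)

    use_method_of_fig_8h = (get_field_base_flag({}) == 1)   # absent flag -> Fig. 8d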
At least one of the above-described, non-limiting embodiments of the present invention enables inter-layer prediction even when video signal sources of different formats (or modes) are used. Therefore, when a plurality of layers are coded, the data coding rate can be improved without being limited by the picture types of the video signals, such as interlaced signals, progressive signals, MBAFF-frame pictures, and field pictures. In addition, when one of the two layers is an interlaced video signal source, the image of the picture used for prediction can be constructed so as to be more similar to the original image to be predictively coded, thereby improving the data coding rate.
Although the present invention has been described with reference to preferred embodiments, it will be apparent to those skilled in the art that various improvements, modifications, replacements, and additions can be made to the present invention without departing from the scope and spirit of the invention. Therefore, the present invention is intended to cover such improvements, modifications, replacements, and additions, provided that they fall within the scope of the appended claims and their equivalents.

Claims (10)

1. A method for decoding a video signal, the method comprising the steps of:
deriving position information of a virtual macroblock corresponding to a frame macroblock of a current layer;
deriving position information of a corresponding macroblock of a base layer based on the position information of the virtual macroblock, the corresponding macroblock of the base layer being a single macroblock in a macroblock adaptive frame/field frame (MBAFF frame), and the macroblock adaptive frame/field frame (MBAFF frame) being a frame that includes a macroblock pair comprising an odd field macroblock and an even field macroblock and a macroblock pair comprising two frame macroblocks;
obtaining motion information of the corresponding macroblock based on the position information of the corresponding macroblock;
predicting motion information of the frame macroblock using the motion information of the corresponding macroblock; and
decoding the frame macroblock of the current layer using the predicted motion information.
2. The method of claim 1, wherein the motion information comprises a motion vector and a reference index.
3. The method of claim 2, wherein the predicting comprises dividing the reference index of the corresponding macroblock by 2.
4. The method of claim 1, wherein the corresponding macroblock is formed according to inter-mode coding.
5. The method of claim 1, wherein an aspect ratio of an image of the base layer and an aspect ratio of an image of an enhancement layer are different from each other.
6. An apparatus for decoding a video signal, the apparatus comprising:
a decoding unit configured to derive position information of a virtual macroblock corresponding to a frame macroblock of a current layer; derive position information of a corresponding macroblock of a base layer based on the position information of the virtual macroblock; obtain motion information of the corresponding macroblock based on the position information of the corresponding macroblock; predict motion information of the frame macroblock using the motion information of the corresponding macroblock; and decode the frame macroblock of the current layer using the predicted motion information,
wherein the corresponding macroblock of the base layer is a single macroblock in a macroblock adaptive frame/field frame (MBAFF frame), and
the macroblock adaptive frame/field frame (MBAFF frame) is a frame that includes a macroblock pair comprising an odd field macroblock and an even field macroblock and a macroblock pair comprising two frame macroblocks.
7. The apparatus of claim 6, wherein the motion information comprises a motion vector and a reference index.
8. The apparatus of claim 7, wherein the predicting comprises dividing the reference index of the corresponding macroblock by 2.
9. The apparatus of claim 6, wherein the corresponding macroblock is formed according to inter-mode coding.
10. The apparatus of claim 6, wherein an aspect ratio of an image of the base layer and an aspect ratio of an image of an enhancement layer are different from each other.
CN201210585882.3A 2006-01-09 2007-01-09 Inter-layer prediction method and device for video signal Active CN103096078B (en)

Applications Claiming Priority (29)

Application Number Priority Date Filing Date Title
US75700906P 2006-01-09 2006-01-09
US60/757,009 2006-01-09
US75823506P 2006-01-12 2006-01-12
US60/758,235 2006-01-12
US77693506P 2006-02-28 2006-02-28
US60/776,935 2006-02-28
US78339506P 2006-03-20 2006-03-20
US60/783,395 2006-03-20
US78674106P 2006-03-29 2006-03-29
US60/786,741 2006-03-29
US78749606P 2006-03-31 2006-03-31
US60/787,496 2006-03-31
US81634006P 2006-06-26 2006-06-26
US60/816,340 2006-06-26
US83060006P 2006-07-14 2006-07-14
US60/830,600 2006-07-14
KR10-2006-0111894 2006-11-13
KR10-2006-0111893 2006-11-13
KR10-2006-0111897 2006-11-13
KR10-2006-0111895 2006-11-13
KR1020060111893A KR20070075257A (en) 2006-01-12 2006-11-13 Inter-layer motion prediction method for video signal
KR1020060111895A KR20070074452A (en) 2006-01-09 2006-11-13 Inter-layer prediction method for video signal
KR1020060111894A KR20070074451A (en) 2006-01-09 2006-11-13 Method for using video signals of a baselayer for interlayer prediction
KR1020060111897A KR20070074453A (en) 2006-01-09 2006-11-13 Method for encoding and decoding video signal
KR1020070001587A KR20070075293A (en) 2006-01-12 2007-01-05 Inter-layer motion prediction method for video signal
KR10-2007-0001582 2007-01-05
KR1020070001582A KR20070095180A (en) 2006-03-20 2007-01-05 Inter-layer prediction method for video signal based on picture types
KR10-2007-0001587 2007-01-05
CN200780005672XA CN101385352B (en) 2006-01-09 2007-01-09 Inter-layer prediction method for video signal

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
CN200780005672XA Division CN101385352B (en) 2006-01-09 2007-01-09 Inter-layer prediction method for video signal

Publications (2)

Publication Number Publication Date
CN103096078A true CN103096078A (en) 2013-05-08
CN103096078B CN103096078B (en) 2015-10-21

Family

ID=43754792

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210585882.3A Active CN103096078B (en) 2006-01-09 2007-01-09 Inter-layer prediction method and device for video signal

Country Status (2)

Country Link
CN (1) CN103096078B (en)
BR (1) BRPI0706378A2 (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105637877A (en) * 2013-07-15 2016-06-01 Ge视频压缩有限责任公司 Layer ID signaling using extension mechanism
CN105659597A (en) * 2013-10-22 2016-06-08 株式会社Kt Method and device for encoding/decoding multi-layer video signal
CN106416250A (en) * 2013-12-02 2017-02-15 诺基亚技术有限公司 Video encoding and decoding
CN111373751A (en) * 2017-08-10 2020-07-03 夏普株式会社 Image filtering device, image decoding device, and image encoding device

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5731840A (en) * 1995-03-10 1998-03-24 Kabushiki Kaisha Toshiba Video coding/decoding apparatus which transmits different accuracy prediction levels
JP2003018595A (en) * 2001-07-03 2003-01-17 Matsushita Electric Ind Co Ltd Video encoding system, video decoding system, video encoding method, and video decoding method
KR100994773B1 (en) * 2004-03-29 2010-11-16 삼성전자주식회사 Method and Apparatus for generating motion vector in hierarchical motion estimation
CN100405851C (en) * 2005-11-18 2008-07-23 宁波中科集成电路设计中心有限公司 Motion vector prediction multiplex design method in multi-mode standard decoder

Cited By (30)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10869047B2 (en) 2013-07-15 2020-12-15 Ge Video Compression, Llc Layer ID signaling using extension mechanism
US11006136B2 (en) 2013-07-15 2021-05-11 Ge Video Compression, Llc Cluster-based dependency signaling
US11647209B2 (en) 2013-07-15 2023-05-09 Ge Video Compression, Llc Layer characteristic signaling in multi-layered coding
US11616964B2 (en) 2013-07-15 2023-03-28 Ge Video Compression, Llc Layer ID signaling using extension mechanism
US10142639B2 (en) 2013-07-15 2018-11-27 Ge Video Compression, Llc Cluster-based dependency signaling
US11546618B2 (en) 2013-07-15 2023-01-03 Ge Video Compression, Llc Cluster-based dependency signaling
CN105637877B (en) * 2013-07-15 2019-04-05 Ge视频压缩有限责任公司 A kind of decoder, coding/decoding method, encoder, coding method and computer-readable medium
US10349065B2 (en) 2013-07-15 2019-07-09 Ge Video Compression, Llc Network device and error handling
US10349066B2 (en) 2013-07-15 2019-07-09 Ge Video Compression, Llc Layer ID signaling using extension mechanism
US10425651B2 (en) 2013-07-15 2019-09-24 Ge Video Compression, Llc Cluster-based dependency signaling
US10523954B2 (en) 2013-07-15 2019-12-31 Ge Video Compression, Llc Low delay concept in multi-layered video coding
US11252422B2 (en) 2013-07-15 2022-02-15 Ge Video Compression, Llc Network device and error handling
US11025929B2 (en) 2013-07-15 2021-06-01 Ge Video Compression, Llc Low delay concept in multi-layered video coding
US10595027B2 (en) 2013-07-15 2020-03-17 Ge Video Compression, Llc Layer characteristic signaling in multi-layered coding
US11792415B2 (en) 2013-07-15 2023-10-17 Ge Video Compression, Llc Low delay concept in multi-layered video coding
US11012700B2 (en) 2013-07-15 2021-05-18 Ge Video Compression, Llc Layer characteristic signaling in multi-layered coding
US10609399B2 (en) 2013-07-15 2020-03-31 Ge Video Compression, Llc Cluster-based dependency signaling
US10616591B2 (en) 2013-07-15 2020-04-07 Ge Video Compression, Llc Layer ID signaling using extension mechanism
CN105637877A (en) * 2013-07-15 2016-06-01 Ge视频压缩有限责任公司 Layer ID signaling using extension mechanism
CN105659597A (en) * 2013-10-22 2016-06-08 株式会社Kt Method and device for encoding/decoding multi-layer video signal
CN105684445A (en) * 2013-10-22 2016-06-15 株式会社Kt Method and apparatus for encoding/decoding multilayer video signal
CN105659597B (en) * 2013-10-22 2020-01-07 株式会社Kt Method and apparatus for encoding/decoding multi-layer video signal
CN105684445B (en) * 2013-10-22 2020-01-03 株式会社Kt Method and apparatus for encoding/decoding multi-layer video signal
US10602168B2 (en) 2013-10-22 2020-03-24 Kt Corporation Method and apparatus for encoding/decoding multilayer video signal
US10602169B2 (en) 2013-10-22 2020-03-24 Kt Corporation Method and device for encoding/decoding multi-layer video signal
CN106416250B (en) * 2013-12-02 2020-12-04 诺基亚技术有限公司 Video encoding and decoding
US10652559B2 (en) 2013-12-02 2020-05-12 Nokia Technologies Oy Video encoding and decoding
US10230965B2 (en) 2013-12-02 2019-03-12 Nokia Technologies Oy Video encoding and decoding
CN106416250A (en) * 2013-12-02 2017-02-15 诺基亚技术有限公司 Video encoding and decoding
CN111373751A (en) * 2017-08-10 2020-07-03 夏普株式会社 Image filtering device, image decoding device, and image encoding device

Also Published As

Publication number Publication date
BRPI0706378A2 (en) 2011-03-22
CN103096078B (en) 2015-10-21

Similar Documents

Publication Publication Date Title
CN101385352B (en) Inter-layer prediction method for video signal
KR100917829B1 (en) Inter-layer prediction method for video signal
CN103096078B (en) For inter-layer prediction method and the device of vision signal
CN101895744B (en) Processing multiview video
RU2384970C1 (en) Interlayer forcasting method for video signal
MX2008008825A (en) Inter-layer prediction method for video signal

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant