CN102075766B

CN102075766B - Video coding and decoding methods and devices, and video coding and decoding system

Info

Publication number: CN102075766B
Application number: CN 200910221967
Authority: CN
Inventors: 李斌; 李厚强; 邸佩云; 谢清鹏; 胡昌启; 姚峻
Original assignee: University of Science and Technology of China USTC; Huawei Technologies Co Ltd
Current assignee: University of Science and Technology of China USTC; Huawei Technologies Co Ltd
Priority date: 2009-11-23
Filing date: 2009-11-23
Publication date: 2013-01-09
Anticipated expiration: 2029-11-23
Also published as: CN102075766A

Abstract

Embodiments of the invention relate to a video encoding method, a video encoding device, a video decoding method, a video decoding device, and a video encoding and decoding system. The video encoding method comprises: acquiring an original sequence of video image data; performing prediction, transformation, quantization and encoding on a first image having a first size in the original sequence to obtain a bitstream of the first image; performing prediction, transformation and quantization on a second image having a second size in the original sequence, according to the prediction mode of the first image, to obtain transform-domain data of the second image; and, if the second size differs from the first size, predicting the transform-domain data of the second image in the transform domain with the transform-domain data of the first image as a reference, and encoding the resulting transform-domain inter-layer prediction data of the second image to obtain a bitstream of the second image. Because the first image in the original sequence does not have to be inverse-transformed again, the methods, devices and system provided by the embodiments reduce coding complexity.

Description

Video encoding and decoding methods and devices, and video encoding and decoding system

Technical field

Embodiments of the invention relate to the field of communication technology, and in particular to video encoding and decoding methods and devices and a video encoding and decoding system.

Background art

With the rapid development of Internet technology, video applications on the Internet are increasingly widespread. To better meet the needs of different clients, the Joint Video Team (JVT) incorporated scalable video coding (SVC) into the extension of the relatively advanced video coding standard H.264/Advanced Video Coding (AVC).

SVC bitstream rewriting is a technique for converting an SVC bitstream into an AVC bitstream. The prior art has bitstream rewriting techniques for three SVC scalability scenarios: scalable bitstream rewriting for local playback, for unicast, and for broadcast/multicast. Because hierarchical B frames solve temporal scalability well, bitstream rewriting need not consider temporal scalability; because JVT has already solved quality scalability well enough for practical use, bitstream rewriting need not consider quality scalability either.

The prior art has made quality-scalable bitstream rewriting part of the SVC standard; quality-scalable bitstream rewriting performs inter-layer prediction in the transform domain by changing the encoding. For spatially scalable coding, however, bitstream rewriting and decoding require inter-layer prediction in the pixel domain, and producing a spatially scalable bitstream likewise requires inter-layer prediction in the pixel domain. Therefore, during bitstream rewriting and decoding, the entropy-decoded transform data must be inverse-transformed, the reference value rebuilt in the pixel domain after the inverse transform, and the image then reconstructed; alternatively, after the reference value is rebuilt, prediction, transformation, quantization and encoding are performed again to obtain a new bitstream.

In implementing the present invention, the inventors found that because the decoded transform data must be inverse-transformed during bitstream rewriting and decoding, the complexity of bitstream rewriting and decoding is increased.

Summary of the invention

The purpose of the embodiments of the invention is to provide video encoding and decoding methods and devices and a video encoding and decoding system that encode or decode the original sequence of video image data in the transform domain, thereby reducing the complexity of bitstream rewriting and decoding.

An embodiment of the invention provides a video encoding method, comprising:

obtaining an original sequence of video image data;

performing prediction, transformation, quantization and encoding on a first image having a first size in the original sequence to obtain a bitstream of the first image;

performing prediction, transformation and quantization on a second image having a second size in the original sequence, according to the prediction mode of the first image, to obtain transform-domain data of the second image;

if the second size differs from the first size, predicting the transform-domain data of the second image in the transform domain with the transform-domain data of the first image as a reference, and encoding the resulting transform-domain inter-layer prediction data of the second image to obtain a bitstream of the second image; wherein the bitstream of the first image and the bitstream of the second image form a spatially scalable bitstream.

An embodiment of the invention also provides a video bitstream rewriting method, comprising:

obtaining a spatially scalable bitstream produced by transform-domain inter-layer prediction coding;

while rewriting the spatially scalable bitstream, if the second size of a second image in the spatially scalable bitstream differs from the first size of a first image, reconstructing, in the transform domain, the transform-domain data of the second coding texture of the second image with the transform-domain data corresponding to the first coding texture of the first image as a reference, and performing bitstream rewriting on the transform-domain data of the second coding texture of the second image.

An embodiment of the invention also provides a video decoding method, comprising:

while decoding the spatially scalable bitstream, if the second size of a second image in the spatially scalable bitstream differs from the first size of a first image, reconstructing, in the transform domain, the transform-domain data of the second coding texture of the second image with the transform-domain data corresponding to the first coding texture of the first image as a reference;

performing an inverse transform on the transform-domain data of the second coding texture to obtain the second coding texture, and reconstructing the second image using the second coding texture.
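The decoding steps above can be sketched for a single block as follows. This is a minimal illustration under stated assumptions, not the decoder defined by the patent: the function names are hypothetical, a floating-point orthonormal DCT stands in for the codec's integer transform, entropy decoding and dequantization are omitted, and the reference coefficients are assumed to be already at the second image's block size.

```python
import math

def idct_1d(c):
    """Inverse of the orthonormal DCT-II (an orthonormal DCT-III)."""
    n = len(c)
    out = []
    for x in range(n):
        s = c[0] * math.sqrt(1 / n)
        for k in range(1, n):
            s += c[k] * math.sqrt(2 / n) * math.cos(math.pi * (2 * x + 1) * k / (2 * n))
        out.append(s)
    return out

def idct_2d(coeffs):
    """Separable 2-D inverse DCT: rows first, then columns."""
    rows = [idct_1d(r) for r in coeffs]
    cols = [idct_1d(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]

def decode_second_block(pred_data, ref_coeffs, prediction):
    """Rebuild one block of the second image (hypothetical helper)."""
    # Step 1: reconstruct the second coding texture in the transform domain
    # by adding the first image's reference coefficients back.
    coeffs = [[p + r for p, r in zip(pr, rr)] for pr, rr in zip(pred_data, ref_coeffs)]
    # Step 2: a single inverse transform yields the second coding texture.
    texture = idct_2d(coeffs)
    # Step 3: texture plus the block's own prediction gives the pixels.
    return [[t + q for t, q in zip(tr, qr)] for tr, qr in zip(texture, prediction)]

# Toy 2x2 block in which only the DC coefficients are non-zero.
pred_data = [[2.0, 0.0], [0.0, 0.0]]    # decoded inter-layer prediction data
ref_coeffs = [[4.0, 0.0], [0.0, 0.0]]   # transform-domain reference from the first image
prediction = [[10.0, 10.0], [10.0, 10.0]]
block = decode_second_block(pred_data, ref_coeffs, prediction)
```

Only one inverse transform is applied, to the second image's coefficients; the first image's data are used as they are in the transform domain.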

An embodiment of the invention also provides a video encoding device, comprising:

an acquisition module, configured to obtain an original sequence of video image data;

a first encoding module, configured to perform prediction, transformation, quantization and encoding on a first image having a first size in the original sequence to obtain a bitstream of the first image;

a first processing module, configured to perform prediction, transformation and quantization on a second image having a second size in the original sequence, according to the prediction mode of the first image, to obtain transform-domain data of the second image;

a second encoding module, configured to, if the second size differs from the first size, predict the transform-domain data of the second image in the transform domain with the transform-domain data of the first image as a reference, and encode the resulting transform-domain inter-layer prediction data of the second image to obtain a bitstream of the second image; wherein the bitstream of the first image and the bitstream of the second image form a spatially scalable bitstream.

An embodiment of the invention also provides a video bitstream rewriting device, comprising:

an acquisition module, configured to obtain a spatially scalable bitstream produced by transform-domain inter-layer prediction coding;

a bitstream rewriting module, configured to, while rewriting the spatially scalable bitstream, if the second size of a second image in the spatially scalable bitstream differs from the first size of a first image, reconstruct, in the transform domain, the transform-domain data of the second coding texture of the second image with the transform-domain data corresponding to the first coding texture of the first image as a reference, and perform bitstream rewriting on the transform-domain data of the second coding texture of the second image.

An embodiment of the invention also provides a video decoding device, comprising:

a first reconstruction module, configured to obtain a spatially scalable bitstream produced by transform-domain inter-layer prediction coding;

a second reconstruction module, configured to, while decoding the spatially scalable bitstream, if the second size of a second image in the spatially scalable bitstream differs from the first size of a first image, reconstruct, in the transform domain, the transform-domain data of the second coding texture of the second image with the transform-domain data corresponding to the first coding texture of the first image as a reference;

a third reconstruction module, configured to perform an inverse transform on the transform-domain data of the second coding texture to obtain the second coding texture, and to reconstruct the second image using the second coding texture.

An embodiment of the invention also provides a video encoding and decoding system, comprising: a video encoding device, at least one video decoding device, and at least one bitstream rewriting device.

The video encoding device is configured to: obtain an original sequence of video image data; perform prediction, transformation, quantization and encoding on a first image having a first size in the original sequence to obtain a bitstream of the first image; perform prediction, transformation and quantization on a second image having a second size in the original sequence, according to the prediction mode of the first image, to obtain transform-domain data of the second image; and, if the second size differs from the first size, predict the transform-domain data of the second image in the transform domain with the transform-domain data of the first image as a reference, and encode the resulting transform-domain inter-layer prediction data of the second image to obtain a bitstream of the second image; wherein the bitstream of the first image and the bitstream of the second image form a spatially scalable bitstream.

The bitstream rewriting device is configured to: obtain the spatially scalable bitstream produced by transform-domain inter-layer prediction coding; and, while rewriting the spatially scalable bitstream, if the second size of the second image in the spatially scalable bitstream differs from the first size of the first image, reconstruct, in the transform domain, the transform-domain data of the second coding texture of the second image with the transform-domain data corresponding to the first coding texture of the first image as a reference, and perform bitstream rewriting on the transform-domain data of the second coding texture of the second image.

The video decoding device is configured to: obtain the spatially scalable bitstream produced by transform-domain inter-layer prediction coding; while decoding the spatially scalable bitstream, if the second size of the second image in the spatially scalable bitstream differs from the first size of the first image, reconstruct, in the transform domain, the transform-domain data of the second coding texture of the second image with the transform-domain data corresponding to the first coding texture of the first image as a reference; and perform an inverse transform on the transform-domain data of the second coding texture to obtain the second coding texture, and reconstruct the second image using the second coding texture.

In the video encoding and decoding methods and devices and the video encoding and decoding system provided by the embodiments of the invention, the second image having the second size in the original sequence is predicted in the transform domain with the transform-domain data of the first image as a reference, and the predicted data of the second image are encoded to obtain the bitstream of the second image. Because the first image in the original sequence does not have to be inverse-transformed again, the complexity of encoding and decoding is reduced.

Brief description of the drawings

To illustrate the technical solutions of the embodiments of the invention or of the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below evidently show only some embodiments of the invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.

Fig. 1 is a flowchart of an embodiment of the video encoding method of the invention;

Fig. 2 is a schematic diagram of video encoding in the embodiment shown in Fig. 1;

Fig. 3 is another schematic diagram of video encoding in the embodiment shown in Fig. 1;

Fig. 4 is a flowchart of an embodiment of the video bitstream rewriting method of the invention;

Fig. 5 is a schematic diagram of bitstream rewriting in the embodiment shown in Fig. 4;

Fig. 6 is another schematic diagram of bitstream rewriting in the embodiment shown in Fig. 4;

Fig. 7 is a flowchart of an embodiment of the video decoding method of the invention;

Fig. 8 is a schematic diagram of video decoding in the embodiment shown in Fig. 7;

Fig. 9 is another schematic diagram of video decoding in the embodiment shown in Fig. 7;

Fig. 10 is a structural diagram of an embodiment of the video encoding device of the invention;

Fig. 11 is a structural diagram of an embodiment of the video bitstream rewriting device of the invention;

Fig. 12 is a structural diagram of an embodiment of the video decoding device of the invention;

Fig. 13 is a structural diagram of an embodiment of the video encoding and decoding system of the invention;

Fig. 14 is a structural diagram of a bitstream rewriting system to which the embodiments of the invention are applicable.

Detailed description of the embodiments

The technical solutions in the embodiments of the invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are evidently only some, not all, of the embodiments of the invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the scope of protection of the invention.

To make the embodiments easier to understand, the following basic notation is used. Let $f_{l,n}^{i}$ be the original pixel value of the $i$-th coding block of frame $n$ (that is, the $n$-th image) in layer $l$, where $l$, $n$ and $i$ are positive integers.

$\hat{f}_{l,n}^{i}$ is the encoder-side decoded reconstruction value of the $i$-th coding block of frame $n$ in layer $l$. $f_{l,n}^{i,pred}$ is the predicted value of the $i$-th coding block of frame $n$ in layer $l$; the prediction mode is either intra prediction or inter prediction. The mode of the layer $l-1$ bitstream is $MODE(f_{l-1,n}^{i})$, and its coding texture is $\hat{f}_{l-1,n}^{i} - f_{l-1,n}^{i,pred}$.

$U(\cdot)$ denotes the up-sampling operation, which converts the size (or resolution) of layer $l-1$ to the size of a higher layer, for example $l$, $l+1$, ..., $l+n$ ($n$ a natural number); the embodiments take conversion from layer $l-1$ to layer $l$ as the example. $U(\cdot)$ can denote up-sampling of modes as well as up-sampling of pixels. $DCT(\cdot)$ denotes applying the discrete cosine transform (DCT) to a coding block.
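The two operators can be made concrete with a small sketch. The choices here are assumptions made for illustration only: nearest-neighbour 2x interpolation stands in for $U(\cdot)$ and a floating-point orthonormal DCT-II stands in for $DCT(\cdot)$; the text does not fix the up-sampling filter, and the function names are hypothetical.

```python
import math

def dct_1d(v):
    """Orthonormal DCT-II of a list of numbers."""
    n = len(v)
    out = []
    for k in range(n):
        s = sum(v[x] * math.cos(math.pi * (2 * x + 1) * k / (2 * n)) for x in range(n))
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        out.append(scale * s)
    return out

def dct_2d(block):
    """Separable 2-D DCT: transform the rows, then the columns."""
    rows = [dct_1d(r) for r in block]
    cols = [dct_1d(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]

def upsample_2x(block):
    """Nearest-neighbour 2x up-sampling, an assumed stand-in for U()."""
    out = []
    for row in block:
        wide = [p for p in row for _ in (0, 1)]  # repeat each sample horizontally
        out.append(wide)
        out.append(list(wide))                   # repeat each row vertically
    return out

up = upsample_2x([[1, 2], [3, 4]])
flat = dct_2d([[1.0, 1.0], [1.0, 1.0]])
```

For a constant block all of the energy lands in the DC coefficient, which is why `flat` has a single non-zero entry.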

To aid understanding of the technical solutions of the embodiments, inter-layer prediction in multi-layer bitstream coding is described in detail. When inter-layer prediction is performed for layer $l$, the reference layer $l-1$ must be completely decoded and reconstructed in order to obtain more coding information from it. Given factors such as the complexity of decoding and reconstruction, completely decoding and reconstructing the reference layer $l-1$ is not always the best approach, so a choice must be made between inter-layer prediction and temporal prediction. For example, a sequence with slow motion and rich detail has stronger temporal correlation, in which case temporal prediction is the simpler choice. Multi-layer bitstream coding has three inter-layer prediction modes: inter-layer motion prediction, inter-layer residual prediction, and inter-layer intra prediction. In inter-layer motion prediction, when the corresponding coding block of the reference layer $l-1$ is inter-coded, the coding block of the enhancement layer $l$ is also inter-coded; in this case the partition information, reference frames, motion vectors and similar information of the enhancement layer $l$ are all inherited from the co-located coding block of the lower layer $l-1$. In inter-layer residual prediction, every inter-coded block may use inter-layer residual prediction. Suppose the residual of the reference layer $l-1$ is $\hat{f}_{l-1,n}^{i} - \hat{f}_{l-1,n-1}^{j}$, the current coding block is $f_{l,n}^{i}$, and the coding block found by motion search in the previous frame (or in some other reference frame) is $\hat{f}_{l,n-1}^{j}$.

The integer DCT operation on a coding block is denoted $DCT(\cdot)$.

In addition, the embodiments are described with intra prediction or inter prediction as the prediction mode, and they involve both the bitstream of the first image and the bitstream of the second image. For ease of description, the bitstream of the first image is called the layer $l-1$ bitstream and the bitstream of the second image is called the layer $l$ bitstream. If the second size of the second image differs from the first size of the first image, that is, the layer $l-1$ and layer $l$ bitstreams have different resolutions, then the coding mode of the layer $l$ bitstream is the same as that of the spatially co-located coding block of the layer $l-1$ bitstream. For example, when the layer $l$ bitstream has Common Intermediate Format (CIF) size and the layer $l-1$ bitstream has Quarter Common Intermediate Format (QCIF) size, one layer $l-1$ coding block corresponds to four layer $l$ coding blocks, and the coding mode of those four layer $l$ coding blocks is determined by the coding mode of that one layer $l-1$ coding block.
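The QCIF-to-CIF correspondence can be illustrated with a short sketch of mode inheritance. It assumes a plain 2x ratio in each dimension and represents modes as strings; the function name and the data layout are hypothetical, not taken from the source.

```python
def inherit_modes(base_modes):
    """Give each layer l-1 block's mode to the 2x2 group of layer l blocks it covers."""
    enh = []
    for row in base_modes:
        wide = [m for m in row for _ in (0, 1)]  # two layer l blocks per base block, horizontally
        enh.append(wide)
        enh.append(list(wide))                   # and two rows of them, vertically
    return enh

# One row of two layer l-1 blocks becomes two rows of four layer l blocks.
base = [["intra", "inter"]]
enh = inherit_modes(base)
```

Each layer $l-1$ entry appears exactly four times in the result, matching the one-to-four block correspondence described above.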

Fig. 1 is a flowchart of an embodiment of the video encoding method of the invention, Fig. 2 is a schematic diagram of video encoding in the embodiment shown in Fig. 1, and Fig. 3 is another schematic diagram of video encoding in the embodiment shown in Fig. 1. As shown in Fig. 1, the embodiment comprises the following steps:

Step 101: obtain an original sequence of video image data.

Step 102: perform prediction, transformation, quantization and encoding on a first image having a first size in the original sequence to obtain a bitstream of the first image.

Step 103: according to the prediction mode of the first image, perform prediction, transformation and quantization on a second image having a second size in the original sequence to obtain transform-domain data of the second image.

Step 104: if the second size differs from the first size, predict the transform-domain data of the second image in the transform domain with the transform-domain data of the first image as a reference, and encode the resulting transform-domain inter-layer prediction data of the second image to obtain a bitstream of the second image; the bitstream of the first image and the bitstream of the second image form a spatially scalable bitstream.

In the embodiments, multiple images of different sizes can be decoded from the spatially scalable bitstream; in the encoding process that produces this bitstream, a larger image is encoded with a smaller image as its reference.

In the video encoding method provided by the embodiment, the second image is predicted in the transform domain with the transform-domain data of the first image as a reference, and the predicted data of the second image are encoded to obtain the bitstream of the second image. Because the first image in the original sequence does not have to be inverse-transformed again, the complexity of encoding is reduced.

Further, in step 101, the original sequence of video image data may be video image data from a media data acquisition device, from local storage, or received via a device address. The first image and the second image in the original sequence may be obtained separately, or the first image may be derived from the second image by further processing, for example by down-sampling or scaling the second image. Any video image data will do; the source of the video image data does not limit the embodiments.

Further, in step 102, for compatibility with existing coding standards, the coding blocks of the first image having the first size in the original sequence may be single-layer coded with an existing single-layer coding technique, for example AVC coding. The choice of existing single-layer coding technique does not limit the embodiments: any implementation that uses single-layer coding and encodes the original sequence data of the image in a way that meets client needs is regarded as a method by which the bitstream of the first image can be formed. For convenience of description, this embodiment calls the bitstream of the first image the layer $l-1$ bitstream and calls the bitstream of the second image, formed by coding on the basis of the layer $l-1$ bitstream, the layer $l$ bitstream. Equally, the layer $l$ bitstream could serve as the bitstream of the first image and the layer $l+1$ bitstream as the bitstream of the second image, so "bitstream of the first image" and "bitstream of the second image" are only relative terms in the embodiments. In addition, the first image referred to in the embodiments is an image that can serve as a reference for images in the original sequence whose size differs from its own, so the first image is not limited to an image of fixed resolution.

Further, as illustrated by way of example in Fig. 2, if the prediction mode of the first image is the intra prediction mode, step 103 may specifically be as follows.

Obtain the first coding texture of the first image, up-sample the first coding texture, and transform and quantize the up-sampled data to obtain the transform-domain data of the first coding texture. For example, suppose the first coding texture obtained for layer $l-1$ (the first image) is $\hat{f}_{l-1,n}^{i} - f_{l-1,n}^{i,pred}$, where $\hat{f}_{l-1,n}^{i}$ is the encoder-side decoded reconstruction value of the $i$-th coding block of frame $n$ in layer $l-1$ and $f_{l-1,n}^{i,pred}$ is the predicted value of that block. Up-sampling the first coding texture gives $U(\hat{f}_{l-1,n}^{i} - f_{l-1,n}^{i,pred})$. The up-sampled data are DCT-transformed and then quantized and scaled in the transform domain, giving the transform-domain data of the first coding texture, $DCT(U(\hat{f}_{l-1,n}^{i} - f_{l-1,n}^{i,pred}))$.

Obtain the second coding texture of the second image, and transform and quantize it to obtain the transform-domain data of the second coding texture. For example, suppose the second coding texture (or second residual) obtained for the second image is $\hat{f}_{l,n}^{j} - f_{l,n}^{i,pred}$, where $\hat{f}_{l,n}^{j}$ is the encoder-side decoded reconstruction value of the $j$-th coding block of frame $n$ in layer $l$ and $f_{l,n}^{i,pred}$ is the predicted value of the $i$-th coding block of frame $n$ in layer $l$. DCT transformation and quantization of the second coding texture give its transform-domain data, $DCT(\hat{f}_{l,n}^{j} - f_{l,n}^{i,pred})$.

With the transform-domain data of the first coding texture as a reference, perform inter-layer prediction on the transform-domain data of the second coding texture to obtain the first transform-domain inter-layer prediction data, and entropy-code those data to form the bitstream of the second image. For example, with $DCT(U(\hat{f}_{l-1,n}^{i} - f_{l-1,n}^{i,pred}))$ as the reference, inter-layer prediction of $DCT(\hat{f}_{l,n}^{j} - f_{l,n}^{i,pred})$ gives the first transform-domain inter-layer prediction data

$$DCT(\hat{f}_{l,n}^{j} - f_{l,n}^{i,pred}) - DCT(U(\hat{f}_{l-1,n}^{i} - f_{l-1,n}^{i,pred}))$$

The first transform-domain inter-layer prediction data are then entropy-coded to form the bitstream of the second image.
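The prediction step of this variant, in which the layer $l-1$ texture is up-sampled in the pixel domain and then transformed, can be sketched numerically. This is only an illustration under assumptions: nearest-neighbour 2x up-sampling stands in for $U(\cdot)$, a floating-point orthonormal DCT replaces the integer transform, quantization, scaling and entropy coding are left out, and all function names are hypothetical.

```python
import math

def dct_1d(v):
    """Orthonormal DCT-II of a list of numbers."""
    n = len(v)
    return [(math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n))
            * sum(v[x] * math.cos(math.pi * (2 * x + 1) * k / (2 * n)) for x in range(n))
            for k in range(n)]

def dct_2d(block):
    """Separable 2-D DCT: rows first, then columns."""
    rows = [dct_1d(r) for r in block]
    cols = [dct_1d(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]

def upsample_2x(block):
    """Nearest-neighbour 2x up-sampling, an assumed stand-in for U()."""
    out = []
    for row in block:
        wide = [p for p in row for _ in (0, 1)]
        out.extend([wide, list(wide)])
    return out

def predict_pixel_upsampling(res_l, res_lm1):
    """DCT(res_l) - DCT(U(res_{l-1})); quantization is omitted in this sketch."""
    cur = dct_2d(res_l)
    ref = dct_2d(upsample_2x(res_lm1))
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(cur, ref)]

# Layer l texture equals the up-sampled layer l-1 texture except for one sample.
res_lm1 = [[1.0, 2.0], [3.0, 4.0]]
res_l = upsample_2x(res_lm1)
res_l[0][0] += 4.0
pred = predict_pixel_upsampling(res_l, res_lm1)
energy = sum(c * c for row in pred for c in row)
```

Because the DCT is linear, `pred` is simply the transform of the single-sample difference between the two layers, so only that difference remains to be entropy-coded.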

Alternatively, as illustrated by way of example in Fig. 3, if the prediction mode of the first image is the inter prediction mode, then in step 103 the first coding texture (residual) obtained for layer $l-1$ (the first image) is $\hat{f}_{l-1,n}^{i} - \hat{f}_{l-1,n-1}^{j}$. DCT transformation and quantization of the (up-sampled) first coding texture give its transform-domain data, $DCT(U(\hat{f}_{l-1,n}^{i} - \hat{f}_{l-1,n-1}^{j}))$.

The second coding texture obtained for the second image is $\hat{f}_{l,n}^{i} - \hat{f}_{l,n-1}^{j}$. DCT transformation and quantization of the second coding texture give its transform-domain data, $DCT(\hat{f}_{l,n}^{i} - \hat{f}_{l,n-1}^{j})$.

With the transform-domain data of the first coding texture, $DCT(U(\hat{f}_{l-1,n}^{i} - \hat{f}_{l-1,n-1}^{j}))$, as the reference, inter-layer prediction of the transform-domain data of the second coding texture, $DCT(\hat{f}_{l,n}^{i} - \hat{f}_{l,n-1}^{j})$, gives the first transform-domain inter-layer prediction data

$$DCT(\hat{f}_{l,n}^{i} - \hat{f}_{l,n-1}^{j}) - DCT(U(\hat{f}_{l-1,n}^{i} - \hat{f}_{l-1,n-1}^{j}))$$

The first transform-domain inter-layer prediction data are entropy-coded to form the bitstream of the second image.

Further, depending on the prediction mode of the first image, step 103 may also be carried out in the following two ways.

First, the processing when the prediction mode of the first image is the intra prediction mode:

Obtain the third coding texture of the first image, transform and quantize the third coding texture, and up-sample the transformed data to obtain the transform-domain data of the third coding texture. For example, suppose the third coding texture obtained for layer $l-1$ (the first image) is $\hat{f}_{l-1,n}^{i} - f_{l-1,n}^{i,pred}$, where $\hat{f}_{l-1,n}^{i}$ is the encoder-side decoded reconstruction value of the $i$-th coding block of frame $n$ in layer $l-1$ and $f_{l-1,n}^{i,pred}$ is the predicted value of that block. DCT transformation of the third coding texture gives $DCT(\hat{f}_{l-1,n}^{i} - f_{l-1,n}^{i,pred})$; up-sampling the DCT-transformed data gives the transform-domain data of the third coding texture, $U(DCT(\hat{f}_{l-1,n}^{i} - f_{l-1,n}^{i,pred}))$.

Obtain the fourth coding texture of the second image, and transform and quantize it to obtain the transform-domain data of the fourth coding texture. For example, suppose the fourth coding texture (or second residual) obtained for the second image is $\hat{f}_{l,n}^{j} - f_{l,n}^{i,pred}$, where $\hat{f}_{l,n}^{j}$ is the encoder-side decoded reconstruction value of the $j$-th coding block of frame $n$ in layer $l$ and $f_{l,n}^{i,pred}$ is the predicted value of the $i$-th coding block of frame $n$ in layer $l$. DCT transformation and quantization of the fourth coding texture give its transform-domain data, $DCT(\hat{f}_{l,n}^{j} - f_{l,n}^{i,pred})$.

With the transform-domain data of the third coding texture as a reference, perform inter-layer prediction on the transform-domain data of the fourth coding texture to obtain the second transform-domain inter-layer prediction data, and entropy-code those data to form the bitstream of the second image. For example, with $U(DCT(\hat{f}_{l-1,n}^{i} - f_{l-1,n}^{i,pred}))$ as the reference, inter-layer prediction of $DCT(\hat{f}_{l,n}^{j} - f_{l,n}^{i,pred})$ gives the second transform-domain inter-layer prediction data

$$DCT(\hat{f}_{l,n}^{j} - f_{l,n}^{i,pred}) - U(DCT(\hat{f}_{l-1,n}^{i} - f_{l-1,n}^{i,pred}))$$

The second transform-domain inter-layer prediction data are then entropy-coded to form the bitstream of the second image.
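In this second variant the up-sampling is applied to DCT coefficients rather than to pixels. The text does not say how $U(\cdot)$ acts on coefficients, so the sketch below makes a loud assumption: coefficients are zero-padded to twice the block size and scaled by 2, which is one common way to interpolate with an orthonormal DCT and is not claimed to be the patent's method. Quantization and entropy coding are omitted, and all names are hypothetical.

```python
import math

def dct_1d(v):
    """Orthonormal DCT-II of a list of numbers."""
    n = len(v)
    return [(math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n))
            * sum(v[x] * math.cos(math.pi * (2 * x + 1) * k / (2 * n)) for x in range(n))
            for k in range(n)]

def dct_2d(block):
    """Separable 2-D DCT: rows first, then columns."""
    rows = [dct_1d(r) for r in block]
    cols = [dct_1d(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]

def upsample_coeffs_2x(coeffs):
    """Assumed U() on coefficients of a square block: zero-pad to 2N x 2N, gain 2."""
    n = len(coeffs)
    out = [[0.0] * (2 * n) for _ in range(2 * n)]
    for r in range(n):
        for c in range(n):
            out[r][c] = 2.0 * coeffs[r][c]  # low frequencies keep their positions
    return out

def predict_coeff_upsampling(res_l, res_lm1):
    """DCT(res_l) - U(DCT(res_{l-1})); quantization is omitted in this sketch."""
    cur = dct_2d(res_l)
    ref = upsample_coeffs_2x(dct_2d(res_lm1))
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(cur, ref)]

# A flat texture is predicted exactly: every prediction coefficient vanishes.
res_lm1 = [[3.0, 3.0], [3.0, 3.0]]
res_l = [[3.0] * 4 for _ in range(4)]
pred = predict_coeff_upsampling(res_l, res_lm1)
```

The point of the design is that the layer $l-1$ coefficients are reused directly, so the reference is formed without returning to the pixel domain.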

Second, the processing when the prediction mode of the first image is the inter prediction mode:

The third coding texture (residual) obtained for layer $l-1$ (the first image) is $\hat{f}_{l-1,n}^{i} - \hat{f}_{l-1,n-1}^{j}$. DCT transformation and quantization of the third coding texture give $DCT(\hat{f}_{l-1,n}^{i} - \hat{f}_{l-1,n-1}^{j})$. Up-sampling the DCT-transformed data gives the transform-domain data of the third coding texture, $U(DCT(\hat{f}_{l-1,n}^{i} - \hat{f}_{l-1,n-1}^{j}))$.

The fourth coding texture obtained for the second image is $\hat{f}_{l,n}^{i} - \hat{f}_{l,n-1}^{j}$. DCT transformation and quantization of the fourth coding texture give its transform-domain data, $DCT(\hat{f}_{l,n}^{i} - \hat{f}_{l,n-1}^{j})$.

With the transform-domain data of the third coding texture, $U(DCT(\hat{f}_{l-1,n}^{i} - \hat{f}_{l-1,n-1}^{j}))$, as the reference, inter-layer prediction of the transform-domain data of the fourth coding texture, $DCT(\hat{f}_{l,n}^{i} - \hat{f}_{l,n-1}^{j})$, gives the second transform-domain inter-layer prediction data

$$DCT(\hat{f}_{l,n}^{i} - \hat{f}_{l,n-1}^{j}) - U(DCT(\hat{f}_{l-1,n}^{i} - \hat{f}_{l-1,n-1}^{j}))$$

The second transform-domain inter-layer prediction data are entropy-coded to form the bitstream of the second image.

Further, if the original sequence contains images of at least three sizes, then either the first image continues to serve as the reference, or the second image serves as the reference and is treated as the first image; one of the remaining images among the images of the at least three sizes is taken as the second image. If the second size of this second image in the original sequence differs from the first size, the second image is predictively coded in the transform domain with the first image as the reference, and the predictively coded data of the second image are encoded to obtain the code stream of the second image.

On the basis of the embodiments shown in Figs. 1 to 3, the method may further comprise: transmitting identification information that marks the spatially scalable code stream as using transform domain inter-layer prediction, so that a receiving device reconstructs the encoded spatially scalable code stream in the transform domain according to the identification information.

The receiving device may be a network node that can rewrite a multilayer code stream, or a multilayer code stream decoding device that can decode a multilayer code stream. These two kinds of device do not limit the embodiments of the invention: any device that can rewrite or decode a received multilayer code stream is a receiving device in the sense of the embodiments of the invention.

Alternatively, the identification information that marks the multilayer code stream as using transform domain inter-layer prediction may be added to the multilayer code stream itself, so that the receiving device performs transform domain reconstruction on the encoded multilayer code stream according to the identification information.

The identification information marks the multilayer code stream as using transform domain inter-layer prediction. Specifically, it may mark that the multilayer code stream supports spatial-layer code stream rewriting, or that the multilayer code stream uses transform domain inter-layer prediction coding across its multiple spatial layers. The identification information is either transmitted to the decoding device or network node, or agreed on by default, so that the decoding device or network node knows that the multilayer code stream supports spatial-layer code stream rewriting, or that transform domain inter-layer prediction coding has been used across the multiple spatial layers. If the identification information is transmitted to the decoding device or network node, it may be carried in the code stream, or conveyed to the network node or decoding device through a related protocol (for example, the Session Description Protocol, SDP), a message (for example, a Real-time Transport Control Protocol, RTCP, message), or a data packet.

Fig. 4 is a schematic flowchart of an embodiment of the video code stream rewriting method of the present invention, Fig. 5 is a schematic diagram of code stream rewriting in the embodiment shown in Fig. 4, and Fig. 6 is another schematic diagram of code stream rewriting in the embodiment shown in Fig. 4. As shown in Fig. 4, the embodiment of the invention comprises the following steps:

Step 401: obtain a spatially scalable code stream produced by transform domain inter-layer prediction coding.

Step 402: while rewriting the spatially scalable code stream, if the second size of the second image in the spatially scalable code stream differs from the first size of the first image, reconstruct the transform domain data of the second encoding texture of the second image in the transform domain, using the transform domain data corresponding to the first encoding texture of the first image as a reference, and perform code stream rewriting on the transform domain data of the second encoding texture of the second image.

In the video code stream rewriting method provided by the embodiment of the invention, the transform domain data of the second encoding texture of the second image are reconstructed in the transform domain, using the transform domain data corresponding to the first encoding texture of the first image as a reference, and code stream rewriting is performed on the transform domain data of the second encoding texture of the second image. Because the processing is carried out in the transform domain and the image does not need to be inverse transformed again when the code stream of the second image is rewritten, the complexity of code stream rewriting is reduced.

Further, with reference to Fig. 5 and on the basis of the embodiment shown in Fig. 4, step 402 may specifically comprise:

Entropy decoding the code stream of the first image having the first size in the spatially scalable code stream to obtain first transform coefficients, up-sampling the first transform coefficients, and then scaling them in the transform domain to obtain the transform domain data of the first encoding texture of the first image;

Entropy decoding the code stream of the second image having the second size in the spatially scalable code stream to obtain third transform domain inter-layer prediction data;

Reconstructing the transform domain data of the second encoding texture of the second image from the transform domain data of the first encoding texture and the third transform domain inter-layer prediction data;

Entropy coding the prediction mode in the spatially scalable code stream and the transform domain data of the second encoding texture to obtain the rewritten code stream.
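The Fig. 5 rewriting path can be sketched as follows, under stated assumptions: entropy decoding and entropy coding are left out (the inputs are taken to be already entropy-decoded coefficient blocks), the up-sampling and scaling step is assumed to be zero-padding with a rescale, and `upsample_coeffs` and `rewrite_block` are hypothetical names. The point of the sketch is only that the second encoding texture is recovered without any inverse transform.

```python
import numpy as np

def upsample_coeffs(coeffs, factor=2):
    """Assumed transform-domain up-sampling and scaling: zero-pad, rescale."""
    n = coeffs.shape[0]
    out = np.zeros((n * factor, n * factor))
    out[:n, :n] = coeffs * factor
    return out

def rewrite_block(base_coeffs, interlayer_pred, prediction_mode):
    """Rebuild the second-texture coefficients entirely in the transform domain.

    base_coeffs:     entropy-decoded coefficients of the first image
    interlayer_pred: entropy-decoded inter-layer prediction data of the second image
    prediction_mode: prediction mode read from the scalable code stream
    """
    reference = upsample_coeffs(base_coeffs)
    second_texture_coeffs = interlayer_pred + reference
    # A real rewriter would entropy code these two items; here they are returned as-is.
    return {"prediction_mode": prediction_mode, "coeffs": second_texture_coeffs}
```

The returned dictionary stands in for the input to the final entropy coding step that produces the rewritten code stream.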

Further, with reference to Fig. 6 and on the basis of the embodiment shown in Fig. 4, step 402 may specifically comprise:

Entropy decoding the code stream of the first image having the first size in the spatially scalable code stream to obtain first transform data, inverse transforming the first transform data, and up-sampling and then transforming and quantizing the inverse-transformed data to obtain the transform domain data of the first encoding texture of the first image;

Entropy decoding the code stream of the second image having the second size in the spatially scalable code stream to obtain second transform domain inter-layer prediction data;

Reconstructing the transform domain data of the second encoding texture of the second image from the transform domain data of the first encoding texture and the second transform domain inter-layer prediction data of the second image;

Entropy coding the transform domain data of the second encoding texture and the prediction mode of the spatially scalable code stream to obtain the rewritten code stream.
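For comparison, the Fig. 6 path forms the reference by going through the pixel domain. The sketch below is an illustration under assumptions, not the patent's implementation: it uses an orthonormal DCT, nearest-neighbour up-sampling via `np.kron`, and omits quantization; `reference_via_pixel_domain` is a hypothetical name.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix of size n x n."""
    k = np.arange(n).reshape(-1, 1)
    i = np.arange(n).reshape(1, -1)
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)
    return m

def dct2(block):
    d = dct_matrix(block.shape[0])
    return d @ block @ d.T

def idct2(coeffs):
    d = dct_matrix(coeffs.shape[0])
    return d.T @ coeffs @ d

def reference_via_pixel_domain(base_coeffs, factor=2):
    """Inverse transform, up-sample in the pixel domain, transform again."""
    residual = idct2(base_coeffs)
    upsampled = np.kron(residual, np.ones((factor, factor)))  # nearest neighbour
    return dct2(upsampled)
```

Each call here costs one inverse and one forward transform per block, which is the work the transform domain path of Fig. 5 is meant to avoid.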

Further, in the embodiments shown in Figs. 4 to 6, the rewritten code stream is a single-layer code stream or a spatially scalable code stream obtained by transform domain inter-layer prediction coding.

Fig. 7 is a schematic flowchart of an embodiment of the video decoding method of the present invention, Fig. 8 is a schematic diagram of video decoding in the embodiment shown in Fig. 7, and Fig. 9 is another schematic diagram of video decoding in the embodiment shown in Fig. 7. As shown in Fig. 7, the embodiment of the invention comprises the following steps:

Step 701: obtain a spatially scalable code stream produced by transform domain inter-layer prediction coding.

Step 702: while decoding the spatially scalable code stream, if the second size of the second image in the spatially scalable code stream differs from the first size of the first image, reconstruct the transform domain data of the second encoding texture of the second image in the transform domain, using the transform domain data corresponding to the first encoding texture of the first image as a reference.

Step 703: inverse transform the transform domain data of the second encoding texture to obtain the second encoding texture, and use the second encoding texture to reconstruct the second image.
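Steps 702 and 703 can be sketched for a single block as follows. The sketch assumes an orthonormal DCT, zero-padding as the transform domain up-sampling, and that the second image is rebuilt by adding the decoded texture (residual) to a predicted block; these choices and the names `upsample_coeffs`, `idct2` and `decode_block` are illustrative assumptions, and entropy decoding and dequantization are omitted.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix of size n x n."""
    k = np.arange(n).reshape(-1, 1)
    i = np.arange(n).reshape(1, -1)
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)
    return m

def idct2(coeffs):
    d = dct_matrix(coeffs.shape[0])
    return d.T @ coeffs @ d

def upsample_coeffs(coeffs, factor=2):
    """Assumed transform-domain up-sampling: zero-pad and rescale."""
    n = coeffs.shape[0]
    out = np.zeros((n * factor, n * factor))
    out[:n, :n] = coeffs * factor
    return out

def decode_block(base_coeffs, interlayer_pred, predicted_block):
    # Step 702: rebuild the second-texture coefficients in the transform domain.
    coeffs = interlayer_pred + upsample_coeffs(base_coeffs)
    # Step 703: a single inverse transform yields the second encoding texture.
    texture = idct2(coeffs)
    # Assumed reconstruction: prediction plus texture (residual).
    return predicted_block + texture
```

Only one inverse transform is applied per enhancement-layer block; the base-layer coefficients are never taken back to the pixel domain.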

In the video decoding method provided by the embodiment of the invention, the transform domain data of the second encoding texture of the second image are reconstructed in the transform domain, using the transform domain data corresponding to the first encoding texture of the first image as a reference, which reduces the complexity of decoding.

Further, with reference to Fig. 8, in the embodiment shown in Fig. 7, step 702 may comprise:

Entropy decoding the code stream of the first image having the first size in the spatially scalable code stream to obtain first transform coefficients, up-sampling the first transform coefficients, and then scaling them in the transform domain to obtain the transform domain data of the first encoding texture of the first image;

Entropy decoding the code stream of the second image having the second size in the spatially scalable code stream to obtain second transform domain inter-layer prediction data;

Reconstructing the transform domain data of the second encoding texture of the second image from the transform domain data of the first encoding texture and the second transform domain inter-layer prediction data.

In the above process, the transform domain data of the second encoding texture of the second image are reconstructed directly in the transform domain, using the transform domain data corresponding to the first encoding texture of the first image as a reference, so the complexity of decoding is reduced.

Further, with reference to Fig. 9, in the embodiment shown in Fig. 7, step 702 may alternatively comprise:

Entropy decoding the code stream of the first image having the first size in the spatially scalable code stream to obtain first transform coefficients, inverse transforming the first transform coefficients, and up-sampling and then transforming the inverse-transformed data to obtain the transform domain data of the first encoding texture of the first image;

Reconstructing the transform domain data of the second encoding texture of the second image from the transform domain data of the first encoding texture and the third transform domain inter-layer prediction data of the second image.

Further, on the basis of the embodiments shown in Figs. 7 to 9, the method may also comprise:

Reconstructing the transform domain data of the second encoding texture of the second image in the transform domain according to received identification information that marks the spatially scalable code stream as using transform domain inter-layer prediction.

The identification information is either transmitted to the decoding device or network node, or agreed on by default, so that the decoding device knows that the multilayer code stream supports spatial-layer code stream rewriting, or that transform domain inter-layer prediction coding has been used across the multiple spatial layers. If the identification information is transmitted to the decoding device or network node, it may be carried in the code stream, or conveyed to the network node or decoding device through a related protocol (for example, SDP), a message (for example, an RTCP message), or a data packet.

Fig. 10 is a schematic structural diagram of an embodiment of the video coding apparatus of the present invention. This embodiment can implement the flow of the method embodiment shown in Fig. 1. As shown in Fig. 10, the embodiment of the invention comprises: an acquisition module 11, a first coding module 12, a second coding module 13 and a first processing module 14.

The acquisition module 11 obtains the original sequence of video image data. The first coding module 12 performs prediction, transform, quantization and coding on the first image having the first size in the original sequence to obtain the code stream of the first image. The first processing module 14 performs prediction, transform and quantization on the second image having the second size in the original sequence according to the prediction mode of the first image, to obtain the transform domain data of the second image. If the second size of the second image in the original sequence differs from the first size, the second coding module 13 predicts the second image in the transform domain with the transform domain data of the first image as a reference, and encodes the resulting transform domain inter-layer prediction data of the second image to obtain the code stream of the second image. The code stream of the first image and the code stream of the second image together form the spatially scalable code stream.

In the video coding apparatus provided by the embodiment of the invention, the second coding module 13 predicts the second image in the transform domain with the transform domain data of the first image as a reference, and encodes the predicted data of the second image to obtain the code stream of the second image. Because the processing is carried out in the transform domain and the first image in the original sequence does not need to be inverse transformed again, the complexity of coding is reduced.

Further, on the basis of the embodiment shown in Fig. 10, the second coding module may comprise a first processing unit, a second processing unit and a first coding unit. The first processing unit obtains the first encoding texture of the first image, up-samples the first encoding texture, and transforms and quantizes the up-sampled data to obtain the transform domain data of the first encoding texture. The second processing unit obtains the second encoding texture of the second image, and transforms and quantizes the second encoding texture to obtain the transform domain data of the second encoding texture. The first coding unit performs inter-layer prediction on the transform domain data of the second encoding texture with the transform domain data of the first encoding texture as a reference to obtain first transform domain inter-layer prediction data, and entropy codes the first transform domain inter-layer prediction data to form the code stream of the second image.

Further, on the basis of the embodiment shown in Fig. 10, the second coding module may alternatively comprise a third processing unit, a fourth processing unit and a second coding unit. The third processing unit obtains the third encoding texture of the first image, transforms and quantizes the third encoding texture, and up-samples the transformed data to obtain the transform domain data of the third encoding texture. The fourth processing unit obtains the fourth encoding texture of the second image, and transforms and quantizes the fourth encoding texture to obtain the transform domain data of the fourth encoding texture of the second image. The second coding unit performs inter-layer prediction on the transform domain data of the fourth encoding texture with the transform domain data of the third encoding texture as a reference to obtain second transform domain inter-layer prediction data, and entropy codes the second transform domain inter-layer prediction data to form the code stream of the second image.

Further, on the basis of the embodiment shown in Fig. 10, the apparatus may also comprise a sending module, configured to transmit identification information that marks the spatially scalable code stream as using transform domain inter-layer prediction, so that the receiving device performs decoding, transcoding or code stream rewriting on the encoded spatially scalable code stream in the transform domain according to the identification information.

Fig. 11 is a schematic structural diagram of an embodiment of the video code stream rewriting apparatus of the present invention. This embodiment can implement the flow of the method embodiment shown in Fig. 4. As shown in Fig. 11, the embodiment of the invention comprises: an acquisition module 111 and a code stream rewriting module 112.

The acquisition module 111 obtains a spatially scalable code stream produced by transform domain inter-layer prediction coding. While rewriting the spatially scalable code stream, if the second size of the second image in the spatially scalable code stream differs from the first size of the first image, the code stream rewriting module 112 reconstructs the transform domain data of the second encoding texture of the second image in the transform domain, using the transform domain data corresponding to the first encoding texture of the first image as a reference, and performs code stream rewriting on the transform domain data of the second encoding texture of the second image.

In the video code stream rewriting apparatus provided by the embodiment of the invention, the code stream rewriting module 112 reconstructs the transform domain data of the second encoding texture of the second image in the transform domain, using the transform domain data corresponding to the first encoding texture of the first image as a reference, and performs code stream rewriting on the transform domain data of the second encoding texture of the second image. Because the processing is carried out in the transform domain and the image in the original sequence does not need to be inverse transformed again, the complexity of code stream rewriting is reduced.

Further, on the basis of the embodiment shown in Fig. 11, the code stream rewriting module may comprise a first processing unit, a second processing unit, a first reconstruction unit and a first entropy coding unit. The first processing unit is configured to entropy decode the code stream of the first image having the first size in the spatially scalable code stream to obtain first transform data, inverse transform the first transform data, and up-sample and then transform the inverse-transformed data to obtain the transform domain data of the first encoding texture of the first image. The second processing unit is configured to entropy decode the code stream of the second image having the second size in the spatially scalable code stream to obtain second transform domain inter-layer prediction data. The first reconstruction unit is configured to reconstruct the transform domain data of the second encoding texture of the second image from the transform domain data of the first encoding texture and the second transform domain inter-layer prediction data of the second image. The first entropy coding unit is configured to entropy code the transform domain data of the second encoding texture and the prediction mode of the spatially scalable code stream to obtain the rewritten code stream.

Further, on the basis of the embodiment shown in Fig. 11, the code stream rewriting module may alternatively comprise a third processing unit, a fourth processing unit, a second reconstruction unit and a second entropy coding unit. The third processing unit is configured to entropy decode the code stream of the first image having the first size in the spatially scalable code stream to obtain first transform coefficients, up-sample the first transform coefficients, and then scale them in the transform domain to obtain the transform domain data of the first encoding texture of the first image. The fourth processing unit is configured to entropy decode the code stream of the second image having the second size in the spatially scalable code stream to obtain third transform domain inter-layer prediction data. The second reconstruction unit is configured to reconstruct the transform domain data of the second encoding texture of the second image from the transform domain data of the first encoding texture and the third transform domain inter-layer prediction data. The second entropy coding unit is configured to entropy code the prediction mode of the spatially scalable code stream and the transform domain data of the second encoding texture to obtain the rewritten code stream.

Fig. 12 is a schematic structural diagram of an embodiment of the video decoding apparatus of the present invention. This embodiment can implement the flow of the method embodiment shown in Fig. 7. As shown in Fig. 12, the embodiment of the invention comprises: an acquisition module 121, a first reconstruction module 122, a second reconstruction module 123 and a third reconstruction module 124.

The acquisition module 121 obtains a spatially scalable code stream produced by transform domain inter-layer prediction coding. The first reconstruction module 122 obtains the spatially scalable code stream produced by transform domain inter-layer prediction coding. While decoding the spatially scalable code stream, if the second size of the second image in the spatially scalable code stream differs from the first size of the first image, the second reconstruction module 123 reconstructs the transform domain data of the second encoding texture of the second image in the transform domain, using the transform domain data corresponding to the first encoding texture of the first image as a reference. The third reconstruction module 124 inverse transforms the transform domain data of the second encoding texture to obtain the second encoding texture, and uses the second encoding texture to reconstruct the second image.

In the video decoding apparatus provided by the embodiment of the invention, the first reconstruction module 122 reconstructs the transform domain data of the second encoding texture of the second image in the transform domain, using the transform domain data corresponding to the first encoding texture of the first image as a reference. Because the processing is carried out in the transform domain and the first image in the original sequence does not need to be inverse transformed again, the complexity of decoding is reduced.

Further, on the basis of the embodiment shown in Fig. 12, the first reconstruction module may comprise a first processing unit, a first decoding unit and a first reconstruction unit. The first processing unit is configured to entropy decode the code stream of the first image having the first size in the spatially scalable code stream to obtain first transform coefficients, up-sample the first transform coefficients, and then scale them in the transform domain to obtain the transform domain data of the first encoding texture of the first image. The first decoding unit is configured to entropy decode the code stream of the second image having the second size in the spatially scalable code stream to obtain second transform domain inter-layer prediction data. The first reconstruction unit is configured to reconstruct the transform domain data of the second encoding texture of the second image from the transform domain data of the first encoding texture and the second transform domain inter-layer prediction data.

Further, on the basis of the embodiment shown in Fig. 12, the first reconstruction module may alternatively comprise a second processing unit, a second decoding unit and a second reconstruction unit. The second processing unit is configured to entropy decode the code stream of the first image having the first size in the spatially scalable code stream to obtain first transform coefficients, inverse transform the first transform coefficients, and up-sample and then transform the inverse-transformed data to obtain the transform domain data of the first encoding texture of the first image. The second decoding unit is configured to entropy decode the code stream of the second image having the second size in the spatially scalable code stream to obtain third transform domain inter-layer prediction data. The second reconstruction unit is configured to reconstruct the transform domain data of the second encoding texture of the second image from the transform domain data of the first encoding texture and the third transform domain inter-layer prediction data of the second image.

Fig. 13 is a schematic structural diagram of an embodiment of the video coding and decoding system of the present invention. As shown in Fig. 13, the embodiment of the invention comprises: a video coding apparatus 131, a code stream rewriting apparatus 132 and a video decoding apparatus 133.

The video coding apparatus 131 obtains the original sequence of video image data; performs prediction, transform, quantization and coding on the first image having the first size in the original sequence to obtain the code stream of the first image; and performs prediction, transform and quantization on the second image having the second size in the original sequence according to the prediction mode of the first image, to obtain the transform domain data of the second image. If the second size differs from the first size, it predicts the transform domain data of the second image in the transform domain with the transform domain data of the first image as a reference, and encodes the resulting transform domain inter-layer prediction data of the second image to obtain the code stream of the second image. The code stream of the first image and the code stream of the second image together form the spatially scalable code stream.

The code stream rewriting apparatus 132 obtains the spatially scalable code stream produced by transform domain inter-layer prediction coding. While rewriting the spatially scalable code stream, if the second size of the second image in the spatially scalable code stream differs from the first size of the first image, it reconstructs the transform domain data of the second encoding texture of the second image in the transform domain, using the transform domain data corresponding to the first encoding texture of the first image as a reference, and performs code stream rewriting on the transform domain data of the second encoding texture of the second image.

The video decoding apparatus 133 obtains the spatially scalable code stream produced by transform domain inter-layer prediction coding. While decoding the spatially scalable code stream, if the second size of the second image in the spatially scalable code stream differs from the first size of the first image, it reconstructs the transform domain data of the second encoding texture of the second image in the transform domain, using the transform domain data corresponding to the first encoding texture of the first image as a reference. It then inverse transforms the transform domain data of the second encoding texture to obtain the second encoding texture, and uses the second encoding texture to reconstruct the second image.

In the video coding and decoding system provided by the embodiment of the invention, the second image having the second size in the original sequence is predicted in the transform domain according to the prediction mode of the first image and with the transform domain data of the first image as a reference, and the predicted data of the second image are encoded to obtain the code stream of the second image. Because the processing is carried out in the transform domain and the first image in the original sequence does not need to be inverse transformed again, the complexity of coding and decoding is reduced.

Fig. 14 is a schematic structural diagram of a code stream rewriting system to which the embodiment of the invention is applicable. As shown in Fig. 14, the embodiment of the invention comprises: an encoder 141, a network node 142 and a decoder 143. The video encoder 141 may specifically be the video coding apparatus of the embodiment shown in Fig. 10; the network node 142 may specifically be the video code stream rewriting apparatus of the embodiment shown in Fig. 11; and the decoder 143 may specifically be the video decoding apparatus of the embodiment shown in Fig. 12.

The encoder 141 encodes the obtained original sequence of video image data to form a multilayer code stream. During network transmission, the multilayer code stream is handled in at least the following three ways. First, the multilayer code stream is transmitted directly to a decoder 143 that can decode multilayer code streams. Second, the multilayer code stream is transmitted to the network node 142, which can rewrite and transcode multilayer code streams; after the network node 142 rewrites or transcodes the multilayer code stream into a single-layer code stream, it transmits the single-layer code stream to the decoder 143, which decodes the received single-layer code stream. The multilayer code stream may also continue to be transmitted to a subsequent network node, which performs code stream rewriting until a single-layer code stream is obtained. Third, after the multilayer code stream passes through the network node 142, a multilayer code stream with fewer layers than the multilayer code stream input to the network node 142 is generated and transmitted to a subsequent network node, so that the subsequent network node rewrites or transcodes the multilayer code stream again, rewriting or converting it into a single-layer code stream or a multilayer code stream according to the needs of the terminal device.
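The three transmission cases can be summarized in a small decision sketch. The function name `plan_action`, the action labels, and the idea of choosing by the number of layers the receiver can accept are all assumptions for illustration; the patent only states that the outcome depends on the needs of the terminal device.

```python
def plan_action(num_layers, receiver_max_layers):
    """Pick how a network node might handle a multilayer code stream.

    num_layers:          layers in the incoming code stream (>= 1)
    receiver_max_layers: layers the next receiver can accept (>= 1), assumed known
    Returns (action, layers_sent).
    """
    if num_layers < 1 or receiver_max_layers < 1:
        raise ValueError("layer counts must be at least 1")
    if receiver_max_layers >= num_layers:
        # Case one: forward the multilayer code stream unchanged.
        return "forward", num_layers
    if receiver_max_layers == 1:
        # Case two: rewrite or transcode into a single-layer code stream.
        return "rewrite_to_single_layer", 1
    # Case three: produce a multilayer code stream with fewer layers.
    return "reduce_layers", receiver_max_layers
```

For example, a four-layer stream sent towards a receiver that accepts two layers falls under the third case, and a later node may rewrite it again.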

In the code stream rewriting system to which the embodiments of the invention are applicable, the second image having the second size in the original sequence is predicted in the transform domain with the transform domain data of the first image as a reference, and the data of the second image after predictive coding are encoded to obtain the code stream of the second image. Because the first image in the original sequence does not need to be inverse-transformed again in the transform domain, the complexity of encoding and decoding is reduced.
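The transform-domain inter-layer prediction described above can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes an orthonormal floating-point DCT, nearest-neighbour 2x up-sampling and a single uniform quantization step, none of which are fixed by the text, and it treats an encoding texture as a plain 2-D list of numbers. All function names are hypothetical.

```python
import math


def dct_1d(x):
    """Orthonormal DCT-II of a list of numbers."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n)) for i in range(n))
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        out.append(scale * s)
    return out


def dct_2d(block):
    """Separable 2-D DCT: transform rows, then columns."""
    rows = [dct_1d(r) for r in block]
    cols = [dct_1d(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]


def quantize(coeffs, step):
    """Uniform quantization to integer levels (assumed quantizer)."""
    return [[round(c / step) for c in row] for row in coeffs]


def upsample_2x(block):
    """Nearest-neighbour 2x up-sampling (an assumption; the filter is not specified here)."""
    out = []
    for row in block:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def predict_enhancement(base_texture, enh_texture, step):
    """Encoder side: up-sample the base texture, transform and quantize both
    textures, and keep only the transform-domain difference for entropy coding."""
    ref = quantize(dct_2d(upsample_2x(base_texture)), step)
    enh = quantize(dct_2d(enh_texture), step)
    residual = [[e - r for e, r in zip(er, rr)] for er, rr in zip(enh, ref)]
    return ref, enh, residual


def reconstruct_enhancement(ref, residual):
    """Decoder or rewriter side: add the residual to the reference levels.
    No inverse transform is needed to recover the enhancement-layer levels."""
    return [[r + d for r, d in zip(rr, dr)] for rr, dr in zip(ref, residual)]
```

Because the residual is formed between integer quantized levels, adding it back to the reference levels recovers the enhancement-layer levels exactly, which is why a rewriter working this way can stay in the transform domain.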

Table 1 compares the peak signal-to-noise ratio (PSNR) of the embodiment of the invention with that of prior-art SVC coding whose code stream cannot be rewritten. Table 2 compares the PSNR of the embodiment of the invention with that of the prior art. Table 3 compares the rewritable coding scheme with simulcast coding. Table 4 compares rewriting a four-layer SVC code stream into a single-layer AVC code stream with directly coding a single-layer AVC code stream. Here deltaQP denotes the difference in quantization parameter (QP) between the embodiment of the invention and SVC, and results are given for differences of 2, 3 and 6.

Table 1: PSNR gain of the embodiment of the invention compared with SVC

Sequence	deltaQP=2	deltaQP=3	deltaQP=6
Bus	-0.753684	-0.531987	-0.25914
Mobile	-0.849045	-0.558285	-0.229195

Table 2: PSNR gain of the embodiment of the invention compared with a simulcast code stream

Sequence	deltaQP=2	deltaQP=3	deltaQP=6
Bus	0.71315	0.68764	0.53639
Mobile	0.48129	0.45957	0.3193

Table 3: Comparison of the rewritable coding scheme with simulcast coding

Table 4: Comparison of the rewritten code stream of the embodiment of the invention with a separately encoded single-layer AVC code stream

Sequence	deltaQP=2	deltaQP=3	deltaQP=6
Bus	4.53543	4.49629	3.86001
Mobile	3.57424	3.49251	3.05672
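For reference, PSNR is conventionally computed from the mean squared error between an original and a reconstructed signal. The sketch below shows that conventional definition only; it assumes 8-bit samples (peak value 255) and flat lists of sample values, and it is not claimed to reproduce the exact measurement procedure behind the tables. The function name `psnr` is illustrative.

```python
import math


def psnr(reference, test, peak=255):
    """PSNR in dB between two equally long sequences of sample values.

    peak=255 assumes 8-bit samples; that is an assumption, not a value from the text.
    """
    if len(reference) != len(test) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((a - b) ** 2 for a, b in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10 * math.log10(peak * peak / mse)
```

A PSNR gain of the kind tabulated above would then be the difference between two such values measured at a comparable bit rate.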

The technical solution for rewriting a multilayer code stream provided by the above embodiments of the invention supports code stream rewriting both for code streams with multiple quality layers and for code streams with multiple spatial layers. The rate-distortion (RD) performance of the multilayer code stream generated by the embodiments of the invention is equivalent to that of a multilayer code stream generated with the prior art. Moreover, in the technical solution provided by the embodiments of the invention, after the receiving end receives the multilayer code stream, the part of the code stream running from the first image serving as the base layer code stream (or a lower-layer code stream) up to any number of second images serving as enhancement layer code streams can be rewritten quickly and conveniently into a new multilayer code stream or a single-layer code stream. The decoding result of the rewritten single-layer or multilayer code stream is the same as that of the multilayer code stream before rewriting.

The DCT used in the above embodiments of the invention may also be replaced by another transform that implements a discrete transform-domain transform, for example a wavelet transform. The embodiments are described using the DCT only as an example, which does not limit the transform-domain processing in the embodiments of the invention; any other transform-domain approach that can implement the technical solution described in the embodiments may be used.
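To illustrate that the prediction step does not depend on the DCT specifically, the sketch below passes the transform in as a parameter and uses a one-level Haar wavelet transform as the example. It is a simplified 1-D illustration with assumed names (`haar_1d`, `transform_domain_residual`, `rebuild_levels`) and an assumed uniform quantization step, not a description of the patented processing.

```python
import math


def haar_1d(x):
    """One-level orthonormal Haar transform of an even-length list."""
    s = math.sqrt(2)
    averages = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    details = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return averages + details


def transform_domain_residual(reference, target, transform, step):
    """Quantize both signals in the chosen transform domain and return the
    reference levels together with the residual (target minus reference)."""
    ref = [round(c / step) for c in transform(reference)]
    tgt = [round(c / step) for c in transform(target)]
    return ref, [t - r for t, r in zip(tgt, ref)]


def rebuild_levels(ref, residual):
    """Recover the target levels from the reference levels and the residual."""
    return [r + d for r, d in zip(ref, residual)]
```

Any other callable that maps a list of samples to a list of coefficients could be supplied as `transform` without changing the prediction and reconstruction steps.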

A person of ordinary skill in the art will appreciate that all or part of the steps of the above method embodiments may be carried out by hardware under the direction of program instructions. The program may be stored in a computer-readable storage medium and, when executed, performs the steps of the above method embodiments. The storage medium includes any medium that can store program code, such as a ROM, a RAM, a magnetic disk or an optical disc.

Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the invention, not to limit them. Although the invention has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the invention.

Claims

1. A video coding method, characterized by comprising:

acquiring an original sequence of video image data;

if the second size is different from the first size, predicting the transform domain data of the second image in the transform domain with the transform domain data of the first image as a reference, and encoding the transform domain inter-layer prediction data of the second image obtained by the prediction, to obtain a code stream of the second image; wherein the code stream of the first image and the code stream of the second image form a spatial scalable code stream;

wherein the predicting of the transform domain data of the second image in the transform domain with the transform domain data of the first image as a reference, and the encoding of the transform domain inter-layer prediction data of the second image obtained by the prediction to obtain the code stream of the second image, comprise:

acquiring a first encoding texture of the first image, up-sampling the first encoding texture, and transforming and quantizing the up-sampled data to obtain transform domain data of the first encoding texture;

acquiring a second encoding texture of the second image, and transforming and quantizing the second encoding texture to obtain transform domain data of the second encoding texture;

performing inter-layer prediction on the transform domain data of the second encoding texture with the transform domain data of the first encoding texture as a reference to obtain first transform domain inter-layer prediction data, and entropy-coding the first transform domain inter-layer prediction data to form the code stream of the second image.

2. The method according to claim 1, characterized in that the predicting of the transform domain data of the second image in the transform domain with the transform domain data of the first image as a reference, and the encoding of the transform domain inter-layer prediction data of the second image obtained by the prediction to obtain the code stream of the second image, comprise:

acquiring a third encoding texture of the first image, transforming and quantizing the third encoding texture, and up-sampling the transformed data to obtain transform domain data of the third encoding texture;

acquiring a fourth encoding texture of the second image, and transforming and quantizing the fourth encoding texture to obtain transform domain data of the fourth encoding texture of the second image;

performing inter-layer prediction on the transform domain data of the fourth encoding texture with the transform domain data of the third encoding texture as a reference to obtain second transform domain inter-layer prediction data, and entropy-coding the second transform domain inter-layer prediction data to form the code stream of the second image.

3. The method according to any one of claims 1 to 2, characterized by further comprising:

sending identification information used to mark the spatial scalable code stream as using transform domain inter-layer prediction, so that a receiving device performs, according to the identification information, decoding processing, transcoding processing or code stream rewriting processing in the transform domain on the spatial scalable code stream formed by the coding.

4. The method according to any one of claims 1 to 2, characterized by further comprising:

if images of at least three sizes exist in the original sequence, taking the first image as a reference, or taking the second image as a reference and as the first image, and taking one of the remaining images among the images of the at least three sizes as the second image; and, if the second size of the second image in the original sequence is different from the first size, performing predictive coding on the second image in the transform domain with the first image as a reference according to the prediction mode of the first image, and encoding the data of the second image after the predictive coding to obtain the code stream of the second image.

5. A video code stream rewriting method, characterized by comprising:

in the process of rewriting the spatial scalable code stream, if the second size of the second image in the spatial scalable code stream is different from the first size of the first image, reconstructing, in the transform domain and with the transform domain data corresponding to the first encoding texture of the first image as a reference, the transform domain data of the second encoding texture of the second image, and performing code stream rewriting on the transform domain data of the second encoding texture of the second image;

wherein the reconstructing, in the transform domain and with the transform domain data corresponding to the first encoding texture of the first image as a reference, of the transform domain data of the second encoding texture of the second image, and the performing of code stream rewriting on the transform domain data of the second encoding texture of the second image, comprise:

entropy-decoding the code stream of the first image having the first size in the spatial scalable code stream to obtain first transform data, inverse-transforming the first transform data, and up-sampling and transforming the inverse-transformed data to obtain the transform domain data of the first encoding texture of the first image;

6. The method according to claim 5, characterized in that the reconstructing, in the transform domain and with the transform domain data corresponding to the first encoding texture of the first image as a reference, of the transform domain data of the second encoding texture of the second image, and the performing of code stream rewriting on the transform domain data of the second encoding texture of the second image, comprise:

7. The method according to any one of claims 5 to 6, characterized in that the rewritten code stream is a single-layer code stream or a spatial scalable code stream obtained by transform domain inter-layer prediction coding.

8. A video decoding method, characterized by comprising:

in the process of decoding the spatial scalable code stream, if the second size of the second image in the spatial scalable code stream is different from the first size of the first image, reconstructing, in the transform domain and with the transform domain data corresponding to the first encoding texture of the first image as a reference, the transform domain data of the second encoding texture of the second image;

inverse-transforming the transform domain data of the second encoding texture to obtain the second encoding texture, and reconstructing the second image using the second encoding texture;

wherein the reconstructing, in the transform domain and with the transform domain data corresponding to the first encoding texture of the first image as a reference, of the transform domain data of the second encoding texture of the second image comprises:

9. The method according to claim 8, characterized in that the reconstructing, in the transform domain and with the transform domain data corresponding to the first encoding texture of the first image as a reference, of the transform domain data of the second encoding texture of the second image comprises:

10. The method according to any one of claims 8 to 9, characterized by further comprising:

reconstructing the transform domain data of the second encoding texture of the second image in the transform domain according to received identification information used to mark the spatial scalable code stream as using transform domain inter-layer prediction.

11. A video coding apparatus, characterized by comprising:

an acquisition module, configured to acquire an original sequence of video image data;

a second coding module, configured to, if the second size is different from the first size, predict the transform domain data of the second image in the transform domain with the transform domain data of the first image as a reference, and encode the transform domain inter-layer prediction data of the second image obtained by the prediction, to obtain a code stream of the second image; wherein the code stream of the first image and the code stream of the second image form a spatial scalable code stream;

wherein the second coding module comprises:

a first processing unit, configured to acquire a first encoding texture of the first image, up-sample the first encoding texture, and transform and quantize the up-sampled data to obtain transform domain data of the first encoding texture;

a second processing unit, configured to acquire a second encoding texture of the second image, and transform and quantize the second encoding texture to obtain transform domain data of the second encoding texture;

a first coding unit, configured to perform inter-layer prediction on the transform domain data of the second encoding texture with the transform domain data of the first encoding texture as a reference to obtain first transform domain inter-layer prediction data, and entropy-code the first transform domain inter-layer prediction data to form the code stream of the second image.

12. The apparatus according to claim 11, characterized in that the second coding module comprises:

a third processing unit, configured to acquire a third encoding texture of the first image, transform and quantize the third encoding texture, and up-sample the transformed data to obtain transform domain data of the third encoding texture;

a fourth processing unit, configured to acquire a fourth encoding texture of the second image, and transform and quantize the fourth encoding texture to obtain transform domain data of the fourth encoding texture of the second image;

a second coding unit, configured to perform inter-layer prediction on the transform domain data of the fourth encoding texture with the transform domain data of the third encoding texture as a reference to obtain second transform domain inter-layer prediction data, and entropy-code the second transform domain inter-layer prediction data to form the code stream of the second image.

13. The apparatus according to any one of claims 11 to 12, characterized by further comprising:

a sending module, configured to send identification information used to mark the spatial scalable code stream as using transform domain inter-layer prediction, so that a receiving device performs, according to the identification information, decoding processing, transcoding processing or code stream rewriting processing in the transform domain on the spatial scalable code stream formed by the coding.

14. A video code stream rewriting device, characterized by comprising:

a code stream rewriting module, configured to, in the process of rewriting the spatial scalable code stream, if the second size of the second image in the spatial scalable code stream is different from the first size of the first image, reconstruct, in the transform domain and with the transform domain data corresponding to the first encoding texture of the first image as a reference, the transform domain data of the second encoding texture of the second image, and perform code stream rewriting on the transform domain data of the second encoding texture of the second image;

wherein the code stream rewriting module comprises:

a first processing unit, configured to entropy-decode the code stream of the first image having the first size in the spatial scalable code stream to obtain first transform data, inverse-transform the first transform data, and up-sample and transform the inverse-transformed data to obtain the transform domain data of the first encoding texture of the first image;

a second processing unit, configured to entropy-decode the code stream of the second image having the second size in the spatial scalable code stream to obtain second transform domain inter-layer prediction data;

a first reconstruction unit, configured to reconstruct the transform domain data of the second encoding texture of the second image according to the transform domain data of the first encoding texture and the second transform domain inter-layer prediction data of the second image;

a first entropy coding unit, configured to entropy-code the transform domain data of the second encoding texture and the prediction mode in the spatial scalable code stream to obtain the rewritten code stream.

15. The device according to claim 14, characterized in that the code stream rewriting module comprises:

a third processing unit, configured to entropy-decode the code stream of the first image having the first size in the spatial scalable code stream to obtain a first transform coefficient, up-sample the first transform coefficient, and then perform scaling processing in the transform domain to obtain the transform domain data of the first encoding texture of the first image;

a fourth processing unit, configured to entropy-decode the code stream of the second image having the second size in the spatial scalable code stream to obtain third transform domain inter-layer prediction data;

a second reconstruction unit, configured to reconstruct the transform domain data of the second encoding texture of the second image according to the transform domain data of the first encoding texture and the third transform domain inter-layer prediction data;

a second entropy coding unit, configured to entropy-code the prediction mode in the spatial scalable code stream and the transform domain data of the second encoding texture to obtain the rewritten code stream.

16. A video decoder, characterized by comprising:

a third reconstruction module, configured to inverse-transform the transform domain data of the second encoding texture to obtain the second encoding texture, and to reconstruct the second image using the second encoding texture;

wherein the first reconstruction module comprises:

a first processing unit, configured to entropy-decode the code stream of the first image having the first size in the spatial scalable code stream to obtain a first transform coefficient, up-sample the first transform coefficient, and then perform scaling processing in the transform domain to obtain the transform domain data of the first encoding texture of the first image;

a first decoding unit, configured to entropy-decode the code stream of the second image having the second size in the spatial scalable code stream to obtain second transform domain inter-layer prediction data;

a first reconstruction unit, configured to reconstruct the transform domain data of the second encoding texture of the second image according to the transform domain data of the first encoding texture and the second transform domain inter-layer prediction data.

17. The video decoder according to claim 16, characterized in that the first reconstruction module comprises:

a second processing unit, configured to entropy-decode the code stream of the first image having the first size in the spatial scalable code stream to obtain a first transform coefficient, inverse-transform the first transform coefficient, and up-sample and transform the inverse-transformed data to obtain the transform domain data of the first encoding texture of the first image;

a second decoding unit, configured to entropy-decode the code stream of the second image having the second size in the spatial scalable code stream to obtain third transform domain inter-layer prediction data;

a second reconstruction unit, configured to reconstruct the transform domain data of the second encoding texture of the second image according to the transform domain data of the first encoding texture and the third transform domain inter-layer prediction data of the second image.

18. A video coding and decoding system, characterized by comprising: a video coding apparatus, at least one video decoder, and at least one code stream rewriting device,