CN104396252B - Multi-view video decoding method using a reference picture set for multi-view video prediction, and device therefor - Google Patents

Publication number
CN104396252B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201380033884.4A
Other languages
Chinese (zh)
Other versions
CN104396252A (en)
Inventor
崔秉斗
朴正辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Samsung Electronics Co Ltd
Original Assignee
Samsung Electronics Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Samsung Electronics Co Ltd
Publication of CN104396252A
Application granted
Publication of CN104396252B

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/103 Selection of coding mode or of prediction mode
    • H04N19/105 Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
    • H04N19/109 Selection of coding mode or of prediction mode among a plurality of temporal predictive coding modes
    • H04N19/30 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51 Motion estimation or motion compensation
    • H04N19/58 Motion compensation with long-term prediction, i.e. the reference frame for a current frame not being the temporally closest one
    • H04N19/597 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding
    • H04N19/60 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
    • H04N19/61 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding in combination with predictive coding

Abstract

Provided is a method of encoding and decoding a multi-view video by performing inter prediction and inter-view prediction on each picture of the multi-view video according to viewpoints. A prediction encoding method for a multi-view video includes: determining, from among same-view reconstructed pictures reconstructed before a current picture, a reference picture set including at least one short-term reconstructed picture whose reproduction order differs from that of the current picture and a reference picture set including at least one long-term reconstructed picture, and determining, from among different-view reconstructed pictures reconstructed before the current picture, a reference picture set including at least one short-term reconstructed picture whose reproduction order is the same as that of the current picture; determining at least one of a first reference list and a second reference list, wherein the first reference list includes, from among the determined reference picture sets, at least one reconstructed picture whose reproduction order is earlier than that of the current picture and at least one reconstructed picture that has the same reproduction order as the current picture and a view identifier (VID) smaller than the VID of the current picture, and the second reference list includes at least one reconstructed picture that has the same viewpoint as the current picture and a reproduction order later than that of the current picture and at least one reconstructed picture that has the same reproduction order as the current picture and a VID larger than the VID of the current picture; determining at least one reference picture and a reference block for a current block of the current picture by using the determined at least one reference list; and performing at least one of inter prediction and inter-view prediction on the current block by using the reference block.

Description

Multi-view video encoding and decoding methods using a reference picture set for multi-view video prediction, and devices therefor
Technical Field
The present invention relates to multi-view video encoding and decoding.
Background Art
As hardware capable of reproducing and storing high-resolution or high-quality video content is developed and distributed, the need for a video codec that effectively encodes or decodes such content is increasing. In a conventional video codec, a video is encoded by a limited coding method based on macroblocks of a predetermined size.
Image data in the spatial domain is transformed into frequency-domain coefficients via frequency transformation. For fast computation of the frequency transformation, a video codec splits an image into blocks of a predetermined size, performs a discrete cosine transform (DCT) on each block, and encodes the frequency coefficients in block units. Frequency-domain coefficients are more easily compressed than spatial-domain image data. In particular, because spatial-domain pixel values are expressed as prediction errors through the inter prediction or intra prediction of a video codec, a large amount of data may be transformed to 0 when the frequency transformation is performed on the prediction errors. A video codec reduces the amount of data by replacing continuously and repeatedly generated data with small-sized data.
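The effect described above, where a transform of the prediction error leaves mostly zeros, can be illustrated with a minimal sketch. The function names (`dct_1d`, `dct_2d`, `quantize`), the 4x4 block size, and the quantization step are assumptions chosen for illustration; they are not taken from the patent, and real codecs use integer approximations of the DCT.

```python
import math

def dct_1d(samples):
    """Orthonormal DCT-II of a sequence of numbers."""
    n = len(samples)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        total = sum(s * math.cos(math.pi * (i + 0.5) * k / n)
                    for i, s in enumerate(samples))
        out.append(scale * total)
    return out

def dct_2d(block):
    """Separable 2-D DCT: transform every row, then every column."""
    rows = [dct_1d(row) for row in block]
    cols = [dct_1d(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]

def quantize(coeffs, step):
    """Uniform quantization (hypothetical step size, illustration only)."""
    return [[round(c / step) for c in row] for row in coeffs]

# A flat prediction error: every sample of the block is off by 2.
residual = [[2, 2, 2, 2] for _ in range(4)]
levels = quantize(dct_2d(residual), step=4)
nonzero = sum(1 for row in levels for v in row if v != 0)
print(levels[0][0], nonzero)
```

For this flat residual, all of the energy lands in the single DC coefficient, so 15 of the 16 quantized values are zero and compress cheaply.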
Demand for video captured from multiple viewpoints is growing, but the amount of video data increases with the number of viewpoints, which can be a problem. Efforts to encode multi-view video efficiently are therefore ongoing.
Summary of the Invention
Technical Problem
The present invention provides a method of encoding and decoding a multi-view video by performing inter prediction and inter-view prediction on each image of the multi-view video according to viewpoints.
Technical Solution
According to an aspect of the present invention, there is provided a prediction encoding method of encoding a multi-view video, the method including: determining, from among same-view reconstructed pictures reconstructed before a current picture, a reference picture set including at least one short-term reconstructed picture whose reproduction order differs from that of the current picture and a reference picture set including at least one long-term reconstructed picture, and determining, from among different-view reconstructed pictures reconstructed before the current picture, a reference picture set including at least one short-term reconstructed picture whose reproduction order is the same as that of the current picture; determining at least one of a first reference list and a second reference list, wherein the first reference list includes, from among the determined reference picture sets, at least one reconstructed picture whose reproduction order is earlier than that of the current picture and at least one reconstructed picture that has the same reproduction order as the current picture and a view identifier (VID) smaller than the VID of the current picture, and the second reference list includes at least one reconstructed picture that has the same viewpoint as the current picture and a reproduction order later than that of the current picture and at least one reconstructed picture that has the same reproduction order as the current picture and a VID larger than the VID of the current picture; determining at least one reference picture and a reference block for a current block of the current picture by using the determined at least one reference list; and performing at least one of inter prediction and inter-view prediction on the current block by using the reference block.
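The selection rules in the method above can be sketched as follows. This is only an illustration under stated assumptions: the `Picture` record, the function names, and the nearest-first ordering inside each list are hypothetical and are not specified by the text above. Long-term pictures are collected into their own set but, for brevity, are not placed in either list.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Picture:
    order: int               # reproduction order
    vid: int                 # view identifier (VID)
    long_term: bool = False  # True for a long-term reconstructed picture

def reference_picture_sets(current, reconstructed):
    """Split previously reconstructed pictures into the three sets."""
    short_same_view = [p for p in reconstructed
                       if p.vid == current.vid and p.order != current.order
                       and not p.long_term]
    long_same_view = [p for p in reconstructed
                      if p.vid == current.vid and p.long_term]
    inter_view = [p for p in reconstructed
                  if p.vid != current.vid and p.order == current.order]
    return short_same_view, long_same_view, inter_view

def reference_lists(current, reconstructed):
    """Build the first (L0) and second (L1) reference lists."""
    short, _long, inter_view = reference_picture_sets(current, reconstructed)
    # L0: earlier reproduction order, then same order with a smaller VID.
    l0 = sorted((p for p in short if p.order < current.order),
                key=lambda p: current.order - p.order)  # assumed ordering
    l0 += sorted((p for p in inter_view if p.vid < current.vid),
                 key=lambda p: current.vid - p.vid)     # assumed ordering
    # L1: later reproduction order, then same order with a larger VID.
    l1 = sorted((p for p in short if p.order > current.order),
                key=lambda p: p.order - current.order)  # assumed ordering
    l1 += sorted((p for p in inter_view if p.vid > current.vid),
                 key=lambda p: p.vid - current.vid)     # assumed ordering
    return l0, l1

current = Picture(order=4, vid=1)
reconstructed = [Picture(2, 1), Picture(3, 1), Picture(6, 1), Picture(4, 0),
                 Picture(4, 2), Picture(0, 1, long_term=True), Picture(3, 0)]
l0, l1 = reference_lists(current, reconstructed)
```

In this example a picture such as `Picture(3, 0)`, which differs from the current picture in both viewpoint and reproduction order, falls into none of the sets and therefore into neither list.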
Advantageous Effects
The prediction encoding apparatus and the prediction decoding apparatus for multi-view video according to the present invention may construct a reference picture list for inter prediction and inter-view prediction of the multi-view video. A single reference picture list may include both reference pictures for inter prediction and reference pictures for inter-view prediction.
Information about a reference picture set that explicitly reflects the state of the decoded picture set may be transmitted and received through a sequence parameter set and a slice header. Because a reference list is determined according to the reference picture set, a reference picture may be determined from a reference list that takes the state of the current decoded picture set into account, and the reference picture may be used for inter prediction/motion compensation and inter-view prediction/disparity compensation.
Brief Description of the Drawings
FIG. 1A is a block diagram of a multi-view video prediction encoding apparatus according to an embodiment;
FIG. 1B is a flowchart of a multi-view video prediction encoding method according to an embodiment;
FIG. 2A is a block diagram of a multi-view video prediction decoding apparatus according to an embodiment;
FIG. 2B is a flowchart of a multi-view video prediction decoding method according to an embodiment;
FIG. 3 shows reference objects for inter prediction and inter-view prediction of a current picture, according to an embodiment;
FIG. 4 shows a reference list configured based on the reference objects of FIG. 3, according to an embodiment;
FIGS. 5A and 5B are diagrams for explaining a process of modifying an L0 list, according to an embodiment;
FIG. 6A shows the syntax of a sequence parameter set according to an embodiment;
FIG. 6B shows the syntax of a picture parameter set according to an embodiment;
FIG. 7 shows the syntax of a slice header according to an embodiment;
FIG. 8A shows a parameter set of a reference picture set for inter-view prediction, according to an embodiment;
FIG. 8B shows the syntax of parameters for modifying a reference list, according to an embodiment;
FIG. 9 shows a combined reference list according to an embodiment;
FIGS. 10 and 11 are diagrams for explaining a process of modifying a combined reference list, according to an embodiment;
FIG. 12 is a block diagram of a multi-view video encoding apparatus including a multi-view video prediction encoding apparatus, according to an embodiment;
FIG. 13 is a block diagram of a multi-view video decoding apparatus including a multi-view video prediction decoding apparatus, according to an embodiment;
FIG. 14 is a block diagram of a video encoding apparatus based on coding units according to a tree structure, according to an embodiment of the present invention;
FIG. 15 is a block diagram of a video decoding apparatus based on coding units according to a tree structure, according to an embodiment of the present invention;
FIG. 16 is a diagram for describing the concept of coding units according to an embodiment of the present invention;
FIG. 17 is a block diagram of an image encoder based on coding units, according to an embodiment of the present invention;
FIG. 18 is a block diagram of an image decoder based on coding units, according to an embodiment of the present invention;
FIG. 19 is a diagram showing deeper coding units according to depths, and partitions, according to an embodiment of the present invention;
FIG. 20 is a diagram for describing a relationship between a coding unit and transformation units, according to an embodiment of the present invention;
FIG. 21 is a diagram for describing encoding information of coding units corresponding to a coded depth, according to an embodiment of the present invention;
FIG. 22 is a diagram of deeper coding units according to depths, according to an embodiment of the present invention;
FIGS. 23 to 25 are diagrams for describing a relationship between coding units, prediction units, and transformation units, according to an embodiment of the present invention;
FIG. 26 is a diagram for describing a relationship between a coding unit, a prediction unit, and a transformation unit, according to the encoding mode information of Table 1;
FIG. 27 is a diagram of the physical structure of a disc in which a program is stored, according to an embodiment of the present invention;
FIG. 28 is a diagram of a disc drive for recording and reading a program by using the disc;
FIG. 29 is a diagram of the overall structure of a content supply system for providing a content distribution service;
FIGS. 30 and 31 are diagrams of the internal structure and the external structure, respectively, of a mobile phone to which a video encoding method and a video decoding method are applied, according to an embodiment of the present invention;
FIG. 32 is a diagram of a digital broadcasting system using a communication system, according to an embodiment of the present invention;
FIG. 33 is a diagram showing the network structure of a cloud computing system using a video encoding apparatus and a video decoding apparatus, according to an embodiment of the present invention.
Best Mode
According to an aspect of the present invention, there is provided a prediction encoding method of encoding a multi-view video, the method including: determining, from among same-view reconstructed pictures reconstructed before a current picture, a reference picture set including at least one short-term reconstructed picture whose reproduction order differs from that of the current picture and a reference picture set including at least one long-term reconstructed picture, and determining, from among different-view reconstructed pictures reconstructed before the current picture, a reference picture set including at least one short-term reconstructed picture whose reproduction order is the same as that of the current picture; determining at least one of a first reference list and a second reference list, wherein the first reference list includes, from among the determined reference picture sets, at least one reconstructed picture whose reproduction order is earlier than that of the current picture and at least one reconstructed picture that has the same reproduction order as the current picture and a view identifier (VID) smaller than the VID of the current picture, and the second reference list includes at least one reconstructed picture that has the same viewpoint as the current picture and a reproduction order later than that of the current picture and at least one reconstructed picture that has the same reproduction order as the current picture and a VID larger than the VID of the current picture; determining at least one reference picture and a reference block for a current block of the current picture by using the determined at least one reference list; and performing at least one of inter prediction and inter-view prediction on the current block by using the reference block.
The determining of the reference picture sets may include: determining, from among the same-view reconstructed pictures reconstructed before the current picture, a reference picture set including at least one non-reference reconstructed picture whose reproduction order differs from that of the current picture; and determining, from among the different-view reconstructed pictures reconstructed before the current picture, a reference picture set including at least one non-reference reconstructed picture whose reproduction order is the same as that of the current picture.
According to another aspect of the present invention, there is provided a prediction decoding method of decoding a multi-view video, the method including: determining, from among same-view reconstructed pictures reconstructed before a current picture, a reference picture set including at least one short-term reconstructed picture whose reproduction order differs from that of the current picture and a reference picture set including at least one long-term reconstructed picture, and determining, from among different-view reconstructed pictures reconstructed before the current picture, a reference picture set including at least one short-term reconstructed picture whose reproduction order is the same as that of the current picture; determining at least one of a first reference list and a second reference list, wherein the first reference list includes, from among the determined reference picture sets, at least one reconstructed picture whose reproduction order is earlier than that of the current picture and at least one reconstructed picture that has the same reproduction order as the current picture and a VID smaller than the VID of the current picture, and the second reference list includes at least one reconstructed picture that has the same viewpoint as the current picture and a reproduction order later than that of the current picture and at least one reconstructed picture that has the same reproduction order as the current picture and a VID larger than the VID of the current picture; determining at least one reference picture and a reference block for a current block of the current picture by using the determined at least one reference list; and performing at least one of motion compensation and disparity compensation on the current block by using the reference block.
The determining of the reference picture sets may include: determining, from among the same-view reconstructed pictures reconstructed before the current picture, a reference picture set including at least one non-reference reconstructed picture whose reproduction order differs from that of the current picture; and determining, from among the different-view reconstructed pictures reconstructed before the current picture, a reference picture set including at least one non-reference reconstructed picture whose reproduction order is the same as that of the current picture.
According to another aspect of the present invention, there is provided a prediction encoding apparatus for encoding a multi-view video, the apparatus including: a reference picture set determiner configured to determine, from among same-view reconstructed pictures reconstructed before a current picture, a reference picture set including at least one short-term reconstructed picture whose reproduction order differs from that of the current picture and a reference picture set including at least one long-term reconstructed picture, and to determine, from among different-view reconstructed pictures reconstructed before the current picture, a reference picture set including at least one short-term reconstructed picture whose reproduction order is the same as that of the current picture; a reference list determiner configured to determine at least one of a first reference list and a second reference list, wherein the first reference list includes, from among the determined reference picture sets, at least one reconstructed picture whose reproduction order is earlier than that of the current picture and at least one reconstructed picture that has the same reproduction order as the current picture and a VID smaller than the VID of the current picture, and the second reference list includes at least one reconstructed picture that has the same viewpoint as the current picture and a reproduction order later than that of the current picture and at least one reconstructed picture that has the same reproduction order as the current picture and a VID larger than the VID of the current picture; and a predictor configured to determine at least one reference picture and a reference block for a current block of the current picture by using the determined at least one reference list, and to perform at least one of inter prediction and inter-view prediction on the current block by using the reference block.
According to another aspect of the present invention, there is provided a prediction decoding apparatus for decoding a multi-view video, the apparatus including: a reference picture set determiner configured to determine, from among same-view reconstructed pictures reconstructed before a current picture, a reference picture set including at least one short-term reconstructed picture whose reproduction order differs from that of the current picture and a reference picture set including at least one long-term reconstructed picture, and to determine, from among different-view reconstructed pictures reconstructed before the current picture, a reference picture set including at least one short-term reconstructed picture whose reproduction order is the same as that of the current picture; a reference list determiner configured to determine at least one of a first reference list and a second reference list, wherein the first reference list includes, from among the determined reference picture sets, at least one reconstructed picture whose reproduction order is earlier than that of the current picture and at least one reconstructed picture that has the same reproduction order as the current picture and a VID smaller than the VID of the current picture, and the second reference list includes at least one reconstructed picture that has the same viewpoint as the current picture and a reproduction order later than that of the current picture and at least one reconstructed picture that has the same reproduction order as the current picture and a VID larger than the VID of the current picture; and a compensator configured to determine at least one reference picture and a reference block for a current block of the current picture by using the determined at least one reference list, and to perform at least one of motion compensation and disparity compensation on the current block by using the reference block.
According to another aspect of an embodiment, there is provided a computer-readable recording medium having recorded thereon a program for executing the prediction encoding method. According to another aspect of the present invention, there is provided a computer-readable recording medium having recorded thereon a program for executing the prediction decoding method.
Embodiment
Hereinafter, multi-view point video predictive coding equipment according to the embodiment will be described with reference to Figure 1A to Figure 11, multiple views regard Frequency predictive coding method, multi-view point video prediction decoding equipment and multi-view point video prediction decoding method.Reference will also be given to Figure 12 and Figure 13 descriptions include the multi-view video encoding apparatus of multi-view point video predictive coding equipment and predict including multi-view point video to solve The multi-view point video decoding device of decoding apparatus.Figure 14 to Figure 26 descriptions be reference will also be given to based on the more of the coding unit with tree construction Viewpoint video encoding device, multi-view point video decoding device, multi-view point video encoding method and multi-view point video coding/decoding method.Most Afterwards, multi-view point video encoding method according to the embodiment, multi-view point video coding/decoding method, video will be described with reference to Figure 27 to Figure 33 The various embodiments that coding method and video encoding/decoding method may be used on.Hereinafter, " image " can refer to video static image or Moving image, or also refer to video in itself.
Multi-view point video predictive coding equipment according to the embodiment, multi-view point video prediction will be described with reference to Figure 1A to Figure 11 Coding method, multi-view point video prediction decoding equipment and multi-view point video prediction decoding method.
Figure 1A is the block diagram of multi-view point video predictive coding equipment 10 according to the embodiment
It is true that multi-view point video predictive coding equipment 10 according to the embodiment includes reference picture collection determiner 12, reference listing Determine device 14 and fallout predictor 16.
Multi-view point video predictive coding equipment 10 according to the embodiment carries out basic visual point image and additional viewpoint image Coding.For example, central viewpoint picture, left view dot image and right visual point image are encoded, wherein, central viewpoint picture can be encoded For basic visual point image, left view dot image can be encoded as the first additional viewpoint image, and right visual point image can be encoded as second Additional viewpoint image.According to viewpoint, can be output by the data encoded to image to produce as single bit stream.
When the quantity of additional viewpoint is at least three, basic visual point image, the first additional viewpoint of the first additional viewpoint Image, the second additional viewpoint the second additional viewpoint image ..., the K additional viewpoints image of K additional viewpoints can be compiled Code.Therefore, the coding result of basic visual point image can be used as basic multi-view bitstream to export, the first additional viewpoint image, second Additional viewpoint image ... and the coding result of K additional viewpoint images can be respectively as the first additional viewpoint bit stream, Two additional viewpoint bit streams ... and the output of K additional viewpoints bit stream.
For example, multi-view point video predictive coding equipment 10 can encode basic visual point image includes coding symbol to export Number and sampling point (sample) base layer stream.Multi-view point video predictive coding equipment 10 can also be by referring to by basic viewpoint The coded identification and sampling point that image is encoded and produced, encode additional viewpoint image to export extra play bit stream.
Multi-view point video predictive coding equipment 10 according to the embodiment performs coding according to the block of each image of video. Block can have square shape, rectangular shape or any geometry, and be not limited to the data cell with preliminary dimension.Root Block according to embodiment can be maximum coding unit among the coding unit according to tree construction, coding unit, predicting unit or Converter unit.The Video coding based on the coding unit according to tree construction and decoding are described later with reference to Figure 14 to Figure 26 Method.
The multi-view video prediction encoding apparatus 10 according to the embodiment may perform inter prediction, in which images of the same view refer to one another. Inter prediction produces a reference index indicating the reference picture for the current picture, a motion vector indicating motion information between the current picture and the reference picture, and residual data that is the difference component between the current picture and the reference picture.
In addition, the multi-view video prediction encoding apparatus 10 according to the embodiment may perform inter-view prediction, which predicts a current-view image by referring to images of different views. Inter-view prediction produces a reference index indicating the reference picture for the current picture of the current view, a disparity between the current picture and the different-view reference picture, and residual data that is the difference component between the current picture and the different-view reference picture.
The multi-view video prediction encoding apparatus 10 according to the embodiment may perform, on an image of the current view, at least one operation selected from inter prediction between same-view images and inter-view prediction between different-view images. Inter prediction and inter-view prediction may be performed on the basis of a data unit such as a coding unit, a prediction unit, or a transformation unit.
Hereinafter, for convenience of description, the operation of the multi-view video prediction encoding apparatus 10 according to the embodiment is described with respect to prediction of the images of one view. However, the operation of the multi-view video prediction encoding apparatus 10 is not performed only on the images of one view; it also applies to the images of different views.
Reconstructed pictures that may be referenced for prediction of other same-view images may be stored, for each view, in a decoded picture buffer (DPB). The reconstructed pictures stored in the DPB for the current picture may be used, in part or in whole, to determine the reference list for inter prediction and/or inter-view prediction of the current picture.
The multi-view video prediction encoding apparatus 10 according to the embodiment may refer to an image, among the same-view images, that was reconstructed before the current picture in order to perform inter prediction on the current picture. A number indicating reproduction order, that is, a picture order count (POC), may be assigned to each image. Although a POC less than the POC of the current picture is assigned to an image, if that image is reconstructed before the current picture, inter prediction may be performed on the current picture by referring to the reconstructed picture.
The multi-view video prediction encoding apparatus 10 according to the embodiment may produce, by performing inter prediction, a motion vector indicating the positional difference between corresponding blocks of different images.
The multi-view video prediction encoding apparatus 10 according to the embodiment may refer to an image, among the different-view images having the same reproduction order, that was reconstructed before the current picture in order to perform inter-view prediction on the current picture. A view identifier (VID) for identifying each view may be assigned to each view. For example, the farther a view lies to the left of the current view, the smaller its VID, and the farther a view lies to the right of the current view, the larger its VID. Inter-view prediction may be performed on the current picture by referring to a previously reconstructed picture among the different-view images whose reproduction order is the same as that of the current picture of the current view.
The multi-view video prediction encoding apparatus 10 according to the embodiment produces disparity information between multiple view images through inter-view prediction. The multi-view video prediction encoding apparatus 10 may produce a depth map indicating an inter-view disparity vector or an inter-view depth as the disparity information between different-view images that correspond to the same scene (that is, the same reproduction order).
Figure 1B is a flowchart of a multi-view video prediction encoding method according to an embodiment. The operation of the elements of the multi-view video prediction encoding apparatus 10 according to the embodiment will now be described with reference to Figure 1B.
In operation 11, the reference picture set determiner 12 according to the embodiment may determine a reference picture set, which is a set of reconstructed pictures that the current picture may refer to among the reconstructed pictures stored in the DPB that were reconstructed before the current picture. The pictures in this set serve as candidate images.
A reference picture set according to an embodiment may include three subsets, according to the state of the reconstructed pictures stored in the DPB, each containing same-view reconstructed pictures that are reference objects for inter prediction. The first subset may be a short-term reference picture set including at least one reconstructed picture that is a short-term reference picture among the same-view reconstructed pictures stored in the DPB. The second subset may be a long-term reference picture set including at least one same-view reconstructed picture that is a long-term reference picture. The third subset may be an unused reference picture set including same-view reconstructed pictures that are not used as reference pictures.
The short-term reference picture set for inter prediction according to an embodiment may include preceding reconstructed pictures, whose reproduction order is earlier than that of the current picture, and following reconstructed pictures, whose reproduction order is later than that of the current picture. Accordingly, the short-term reference picture set may be divided into four subsets: preceding reconstructed pictures that can be referenced, preceding reconstructed pictures that cannot be referenced, following reconstructed pictures that can be referenced, and following reconstructed pictures that cannot be referenced.
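The four-way division described above can be sketched as follows. This is a minimal illustration only, not the claimed implementation; the `Picture` type, the `referable` flag, and the subset names are hypothetical and are introduced purely for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Picture:
    poc: int         # picture order count, i.e. reproduction order
    referable: bool  # hypothetical flag: may the current picture reference it?


def split_short_term(current_poc, pictures):
    """Divide same-view short-term pictures into the four subsets."""
    subsets = {"before_ref": [], "before_unref": [], "after_ref": [], "after_unref": []}
    for pic in pictures:
        if pic.poc == current_poc:
            continue  # same reproduction order is handled by inter-view prediction
        side = "before" if pic.poc < current_poc else "after"
        kind = "ref" if pic.referable else "unref"
        subsets[f"{side}_{kind}"].append(pic)
    return subsets


subsets = split_short_term(
    8, [Picture(4, True), Picture(6, False), Picture(12, True), Picture(16, False)]
)
```

Here the current picture has POC 8, so the pictures with POC 4 and 6 fall into the preceding subsets and those with POC 12 and 16 into the following subsets.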
A reference picture for inter prediction of the current picture must be an image decoded before the current picture. Reference pictures for inter prediction according to an embodiment may be classified into short-term reference pictures and long-term reference pictures. The DPB stores reconstructed pictures produced by performing motion compensation on previous images. A previously produced reconstructed picture may be used as a reference picture for inter prediction of a different image. Accordingly, at least one short-term reference picture or at least one long-term reference picture for inter prediction of the current picture may be selected from among the reconstructed pictures stored in the DPB. A short-term reference picture may be an image decoded immediately before, or recently relative to, the current picture in decoding order. A long-term reference picture may be an image that was decoded long before the current picture, was selected as a reference picture for inter prediction of other images, and has therefore remained stored in the DPB.
Among the reconstructed pictures stored in the DPB, short-term reference pictures and long-term reference pictures may be classified separately and then selected. A long-term reference picture is an image that may be referenced for inter prediction of multiple images and may therefore be stored in the DPB for a long time.
Because the short-term reference pictures needed for each image change as inter prediction is performed on the current picture and then on subsequent images, the short-term reference pictures in the DPB may be updated. Accordingly, when a new short-term reference picture is stored in the DPB, previously stored short-term reference pictures may be deleted sequentially, starting with the one that has been stored the longest.
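The deletion order described above behaves like a first-in, first-out window. The sketch below assumes a fixed capacity for short-term reference pictures; the class name `ShortTermWindow` and the `capacity` parameter are hypothetical, since the description does not state how the limit is set.

```python
from collections import deque


class ShortTermWindow:
    """Hypothetical FIFO store for short-term reference pictures in a DPB."""

    def __init__(self, capacity):
        self.capacity = capacity  # assumed maximum number of short-term pictures
        self.pictures = deque()   # oldest stored picture sits at the left end

    def store(self, poc):
        """Store a new picture and return the POCs deleted to make room."""
        evicted = []
        self.pictures.append(poc)
        while len(self.pictures) > self.capacity:
            evicted.append(self.pictures.popleft())  # longest-stored goes first
        return evicted


window = ShortTermWindow(capacity=2)
first = window.store(0)
second = window.store(1)
third = window.store(2)  # the picture with POC 0 has been stored the longest
```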
As a long-term reference index indicating a long-term reference picture according to an embodiment, the least significant bit (LSB) information of the picture order count (POC) information of the long-term reference picture may be determined. According to an embodiment, the POC information of a long-term reference picture may be divided into most significant bit (MSB) information and LSB information, and only the LSB information may be used as the long-term reference index indicating the long-term reference picture.
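The split of POC information into MSB and LSB parts can be illustrated with simple bit operations. The number of LSB bits is not given in this passage, so `lsb_bits` below is an assumed parameter, and the function names are hypothetical. The sketch also assumes non-negative POC values.

```python
def split_poc(poc, lsb_bits):
    """Return (msb_part, lsb_part) of a non-negative POC value."""
    mask = (1 << lsb_bits) - 1
    return poc >> lsb_bits, poc & mask


def join_poc(msb_part, lsb_part, lsb_bits):
    """Rebuild the full POC from its two parts."""
    return (msb_part << lsb_bits) | lsb_part


msb_part, lsb_part = split_poc(37, lsb_bits=4)  # 37 == 2 * 16 + 5
long_term_reference_index = lsb_part            # only the LSB part is used as the index
```

Using only the LSB part keeps the index short, at the cost that two pictures whose POCs differ by a multiple of `2 ** lsb_bits` share the same index.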
A reference picture set according to an embodiment may also include, according to the state of the reconstructed pictures stored in the DPB, a short-term reference picture set including different-view reconstructed pictures that are reference objects for inter-view prediction. The short-term reference picture set for inter-view prediction according to an embodiment may include, among the different-view reconstructed pictures stored in the DPB, the reconstructed pictures assigned the same reproduction order as the current picture.
The multi-view video prediction encoding apparatus 10 according to the embodiment may perform inter-view prediction as well as inter prediction on the current picture. The DPB according to an embodiment may also store different-view reconstructed pictures whose reproduction order is the same as that of the current picture.
Accordingly, among the reconstructed pictures stored in the DPB that have the same reproduction order as the current picture and a view different from that of the current picture, the short-term reference pictures for inter-view prediction may include negative (-) view reconstructed pictures, whose VID is less than the VID of the current picture, and positive (+) view reconstructed pictures, whose VID is greater than the VID of the current picture. The short-term reference picture set may therefore be divided into a subset of negative (-) view reconstructed pictures that can be referenced and a subset of positive (+) view reconstructed pictures that can be referenced.
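The negative and positive view subsets can be sketched as a filter over (VID, POC) pairs. The function name is hypothetical, and ordering each subset from the nearest view outward is an assumption made for the example; the description does not specify an order within a subset.

```python
def split_inter_view(current_vid, current_poc, pictures):
    """Split (vid, poc) pairs into negative-view and positive-view subsets."""
    same_time = [
        (vid, poc) for vid, poc in pictures
        if poc == current_poc and vid != current_vid
    ]
    # Assumed ordering: nearest view first within each subset.
    negative = sorted((p for p in same_time if p[0] < current_vid),
                      key=lambda p: current_vid - p[0])
    positive = sorted((p for p in same_time if p[0] > current_vid),
                      key=lambda p: p[0] - current_vid)
    return negative, positive


dpb = [(0, 8), (1, 8), (3, 8), (4, 8), (1, 4), (2, 4)]
negative, positive = split_inter_view(current_vid=2, current_poc=8, pictures=dpb)
```

Pictures with a different reproduction order, such as `(1, 4)`, are excluded because they are not candidates for inter-view prediction of the current picture.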
The reference picture set determiner 12 according to the embodiment may determine, among the same-view reconstructed pictures that were reconstructed before the current picture and are stored in the DPB, a reference picture set including at least one short-term reconstructed picture whose reproduction order differs from that of the current picture and a reference picture set including at least one long-term reconstructed picture. The reference picture set determiner 12 may also determine, among the different-view reconstructed pictures that were reconstructed before the current picture and are stored in the DPB, a reference picture set including at least one short-term reconstructed picture whose reproduction order is the same as that of the current picture.
The reference picture set determiner 12 according to the embodiment may determine, among the same-view reconstructed pictures that were reconstructed before the current picture and are stored in the DPB, a reference picture set including at least one non-reference reconstructed picture whose reproduction order differs from that of the current picture.
The reference picture set determiner 12 according to the embodiment may also determine, among the different-view reconstructed pictures that were reconstructed before the current picture, a reference picture set including at least one different-view non-reference reconstructed picture whose reproduction order is the same as that of the current picture.
The reference picture set determiner 12 according to the embodiment may determine whether one of the reference picture sets determined for the current picture is used in the current slice.
If it is determined that one of the reference picture sets determined for the current picture is used, the reference picture set determiner 12 according to the embodiment may select an index from among the reference picture sets.
If it is determined that none of the reference picture sets determined for the current picture is used, the reference picture set determiner 12 according to the embodiment may directly determine a reference picture set for the current slice.
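The two branches above, selecting one of the picture-level sets by index or determining a set directly for the slice, can be sketched as follows. The function and parameter names are hypothetical and do not correspond to any syntax element named in this description.

```python
def select_reference_picture_set(candidate_sets, use_candidate, index=None, explicit_set=None):
    """Pick a picture-level set by index, or fall back to a slice-specific set."""
    if use_candidate:
        return candidate_sets[index]
    if explicit_set is None:
        raise ValueError("a slice-specific reference picture set is required")
    return explicit_set


picture_level_sets = [["poc4", "poc6"], ["poc6", "vid1"]]
by_index = select_reference_picture_set(picture_level_sets, True, index=1)
direct = select_reference_picture_set(picture_level_sets, False, explicit_set=["poc2"])
```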
As the reference picture set for inter prediction, a first number of pictures having a VID less than the VID of the current view and a second number of pictures having a VID greater than the VID of the current view may be determined. The differences between the VIDs of the pictures having a VID less than the VID of the current view, and the differences between the VIDs of the pictures having a VID greater than the VID of the current view, may also be determined.
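The counts and VID differences mentioned above can be illustrated as follows. The passage does not say against which value each difference is taken; this sketch assumes each difference is measured from the VID of the current view, and the function name and dictionary keys are hypothetical.

```python
def describe_view_references(current_vid, reference_vids):
    """Count lower-VID and higher-VID references and list their VID differences."""
    lower = sorted((v for v in reference_vids if v < current_vid), reverse=True)
    higher = sorted(v for v in reference_vids if v > current_vid)
    return {
        "first_number": len(lower),    # pictures with VID below the current view
        "second_number": len(higher),  # pictures with VID above the current view
        # Assumption: differences are taken relative to the current VID.
        "lower_differences": [current_vid - v for v in lower],
        "higher_differences": [v - current_vid for v in higher],
    }


summary = describe_view_references(current_vid=2, reference_vids=[0, 1, 4])
```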
In operation 13, the reference list determiner 14 according to the embodiment may determine a reference list in which candidate images to be used as reference pictures are recorded, in order to determine a reference picture for at least one prediction selected from inter prediction and inter-view prediction of the current picture. The reference list for the current picture may record information about the order of the reference pictures that the current picture refers to among the reconstructed pictures stored in the DPB.
The reference list determiner 14 according to the embodiment may produce one reference list or two reference lists according to the prediction mode of the image. When the current picture is a P slice type image capable of forward prediction or a B slice type image capable of bi-prediction, the reference list determiner 14 may produce an L0 list as a first reference list.
In operation 13, the reference list determiner 14 according to the embodiment may determine an L0 list that includes at least one reconstructed picture, among the same-view images of the current picture, whose reproduction order is earlier than that of the current picture, and at least one reconstructed picture whose reproduction order is the same as that of the current picture and whose VID is less than the VID of the current picture.
The reference list determiner 14 according to the embodiment may produce the L0 list from the reconstructed pictures of the reference picture sets in the following order: within the same-view short-term reference picture set, the subset of preceding reconstructed pictures that can be referenced and then the subset of following reconstructed pictures that can be referenced; within the different-view short-term reference picture set, the subset of negative (-) view reconstructed pictures that can be referenced and then the subset of positive (+) view reconstructed pictures; and finally the long-term reference picture set.
In operation 13, when the current picture is a B slice type image, the reference list determiner 14 according to the embodiment may also produce an L1 list as a second reference list. The reference list determiner 14 may determine an L1 list that includes at least one reconstructed picture whose reproduction order is later than that of the current picture, and at least one reconstructed picture whose reproduction order is the same as that of the current picture and whose VID is greater than the VID of the current picture.
The reference list determiner 14 according to the embodiment may produce the L1 list from the reconstructed pictures of the reference picture sets in the following order: within the same-view short-term reference picture set, the subset of following reconstructed pictures that can be referenced and then the subset of preceding reconstructed pictures that can be referenced; within the different-view short-term reference picture set, the subset of positive (+) view reconstructed pictures that can be referenced and then the subset of negative (-) view reconstructed pictures; and finally the long-term reference picture set.
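The subset orderings for the two lists can be sketched as plain list concatenation: the L0 list places preceding and negative-view pictures first, and the L1 list places following and positive-view pictures first. The function name, parameter names, and string labels below are hypothetical and stand in for reconstructed pictures.

```python
def build_reference_lists(st_before, st_after, iv_negative, iv_positive, long_term):
    """Return (L0, L1), each ordered by increasing reference index."""
    l0 = [*st_before, *st_after, *iv_negative, *iv_positive, *long_term]
    l1 = [*st_after, *st_before, *iv_positive, *iv_negative, *long_term]
    return l0, l1


l0, l1 = build_reference_lists(
    st_before=["poc6", "poc4"],  # same view, earlier reproduction order
    st_after=["poc10"],          # same view, later reproduction order
    iv_negative=["vid1"],        # same reproduction order, smaller VID
    iv_positive=["vid3"],        # same reproduction order, larger VID
    long_term=["lt0"],
)
```

Placing the more likely references first gives them the smaller reference indices in each list.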
However, among same-view images, the L0 list preferentially includes reconstructed pictures whose reproduction order is earlier than that of the current picture, and may not include reconstructed pictures whose reproduction order is later than that of the current picture. Likewise, among different-view images having the same reproduction order, the L0 list preferentially includes reconstructed pictures whose VID is less than the VID of the current picture, and may include reconstructed pictures whose VID is greater than the VID of the current picture.
Similarly, among same-view images, the L1 list preferentially includes reconstructed pictures whose reproduction order is later than that of the current picture, and may not include reconstructed pictures whose reproduction order is earlier than that of the current picture. Likewise, among different-view images having the same reproduction order, the L1 list preferentially includes reconstructed pictures whose VID is greater than the VID of the current picture, and may include reconstructed pictures whose VID is less than the VID of the current picture.
Accordingly, the reference list determiner 14 according to the embodiment may determine at least one list selected from the L0 list and the L1 list as the reference list for at least one prediction selected from inter prediction and inter-view prediction of the current picture.
In operation 15, the predictor 16 according to the embodiment may determine at least one reference picture and a reference block for the current block of the current picture by using the at least one reference list determined by the reference list determiner 14. In operation 17, the predictor 16 may perform at least one prediction selected from inter prediction and inter-view prediction of the current picture by using the reference block determined in operation 15.
The reference list determiner 14 according to the embodiment may determine whether the reference order of the reference indices of the at least one reference list determined for the current picture can be selectively changed.
When the reference order can be selectively changed in the current picture, the reference list determiner 14 may change the reference order of the reference indices of at least one reference list for the current slice belonging to the current picture.
For the current picture, the reference list determiner 14 according to the embodiment may determine a first basic quantity of reconstructed pictures in the first reference list whose reproduction order is earlier than that of the current picture, and a second basic quantity of reconstructed pictures whose VID is less than the VID of the current picture. The reference list determiner 14 may likewise determine, for the current picture, a third basic quantity of reconstructed pictures in the second reference list whose reproduction order is later than that of the current picture, and a fourth basic quantity of reconstructed pictures whose VID is greater than the VID of the current picture.
The reference list determiner 14 according to the embodiment may determine whether at least one selected from the first and second basic quantities of the first reference list and the third and fourth basic quantities of the second reference list, which are set for the current picture, is to be individually replaced.
When the basic quantities of the reconstructed pictures belonging to each reference list can be individually replaced, the reference list determiner 14 according to the embodiment may determine at least one selected from the number of reconstructed pictures of the first reference list and the number of reconstructed pictures of the second reference list that is applied independently to the current slice.
That is, when the basic quantities of the reconstructed pictures belonging to each reference list can be replaced independently in the current slice, the reference list determiner 14 according to the embodiment may replace the number of reconstructed pictures in the first reference list that have the same view as the current picture and a reproduction order earlier than that of the current picture with a first effective quantity applied independently to the current slice, in place of the first basic quantity generally applied to the current picture.
Likewise, the reference list determiner 14 according to the embodiment may replace the number of reconstructed pictures in the first reference list that have a VID less than the VID of the current picture with a second effective quantity applied independently to the current slice, in place of the second basic quantity generally applied to the current picture.
Likewise, the reference list determiner 14 according to the embodiment may replace the number of reconstructed pictures in the second reference list that have the same view as the current picture and a reproduction order later than that of the current picture with a third effective quantity applied independently to the current slice, in place of the third basic quantity generally applied to the current picture.
Likewise, the reference list determiner 14 according to the embodiment may replace the number of reconstructed pictures in the second reference list that have a VID greater than the VID of the current picture with a fourth effective quantity applied independently to the current slice, in place of the fourth basic quantity generally applied to the current picture.
The maximum number of reference indices belonging to the first reference list according to an embodiment may be the sum of the first basic quantity of reconstructed pictures in the first reference list whose reproduction order is earlier than that of the current picture and the second basic quantity of reconstructed pictures whose VID is less than the VID of the current picture.
The maximum number of reference indices belonging to the second reference list according to an embodiment may be the sum of the third basic quantity of reconstructed pictures in the second reference list that follow the current picture in reproduction order and the fourth basic quantity of reconstructed pictures whose VID is greater than the VID of the current picture.
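The relationship between picture-level quantities, slice-level replacements, and the maximum number of reference indices can be sketched as follows. The `ListCounts` type and both function names are hypothetical; a value of `None` stands for "no slice-level replacement", which is an assumption of this example rather than a signalling rule taken from the description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ListCounts:
    temporal: int    # earlier pictures for L0, later pictures for L1
    inter_view: int  # lower-VID pictures for L0, higher-VID pictures for L1


def counts_for_slice(picture_level, slice_temporal=None, slice_inter_view=None):
    """Apply slice-level quantities where given, else keep picture-level ones."""
    return ListCounts(
        temporal=picture_level.temporal if slice_temporal is None else slice_temporal,
        inter_view=picture_level.inter_view if slice_inter_view is None else slice_inter_view,
    )


def max_reference_indices(counts):
    """Maximum number of reference indices is the sum of the two quantities."""
    return counts.temporal + counts.inter_view


l0_picture_level = ListCounts(temporal=2, inter_view=1)
l0_slice = counts_for_slice(l0_picture_level, slice_temporal=1)
```

In this example the picture-level L0 list allows three reference indices, while the slice that replaces only the temporal quantity allows two.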
The predictor 16 according to the embodiment may determine the reference picture for prediction of the current picture by comparing the current picture with the reconstructed pictures stored in the DPB, in reference order, based on at least one list selected from the L0 list and the L1 list determined by the reference list determiner 14. The predictor 16 may determine the reference block by detecting, in the reference picture, the block most similar to the current block.
The predictor 16 according to the embodiment may determine a reference index indicating the previously determined reference picture, and may determine the positional difference between the current block and the reference block as a motion vector or a disparity vector. The difference between the current block and the reference block at each pixel may be determined as the residual data.
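A search for the most similar block, together with the resulting vector and residual data, can be sketched with a full search over a small window. The similarity measure is not named in this description; the sum of absolute differences (SAD) used here is an assumption, as are the function names and the square block shape. The same search yields a motion vector when the reference picture is a same-view picture and a disparity vector when it is a different-view picture.

```python
def extract_block(picture, top, left, size):
    """Cut a size x size block out of a picture given as a list of rows."""
    return [row[left:left + size] for row in picture[top:top + size]]


def sad(block_a, block_b):
    """Sum of absolute differences, an assumed similarity measure."""
    return sum(abs(a - b) for ra, rb in zip(block_a, block_b) for a, b in zip(ra, rb))


def find_reference_block(current_block, ref_picture, top, left, search_range):
    """Return ((dx, dy), residual) for the best match near (top, left).

    Assumes the block at (top, left) itself fits inside ref_picture.
    """
    size = len(current_block)
    height, width = len(ref_picture), len(ref_picture[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + size > height or x + size > width:
                continue  # candidate would fall outside the reference picture
            candidate = extract_block(ref_picture, y, x, size)
            cost = sad(current_block, candidate)
            if best is None or cost < best[0]:
                best = (cost, (dx, dy), candidate)
    _, vector, ref_block = best
    residual = [[c - r for c, r in zip(cr, rr)] for cr, rr in zip(current_block, ref_block)]
    return vector, residual


reference = [[4 * r + c for c in range(4)] for r in range(4)]
current = [[5, 6], [9, 10]]  # identical to the reference block at row 1, column 1
vector, residual = find_reference_block(current, reference, top=0, left=0, search_range=2)
```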
When the predictor 16 according to the embodiment performs inter prediction, the predictor 16 may determine a reference picture and a reference block from at least one reconstructed picture selected from the reconstructed pictures in the first reference list whose reproduction order is earlier than that of the current picture and the reconstructed pictures in the second reference list whose reproduction order is later than that of the current picture. As result data of inter prediction, first residual data between the current block and the reference block determined by performing inter prediction on the current block, a first motion vector indicating the reference block, and a first reference index indicating the reference picture may be produced.
When the predictor 16 according to the embodiment performs inter-view prediction, the predictor 16 may determine a reference picture and a reference block from at least one reconstructed picture selected from the reconstructed pictures in the first reference list whose VID is less than the VID of the current picture and the reconstructed pictures in the second reference list whose VID is greater than the VID of the current picture. As result data of inter-view prediction, second residual data between the current block and the reference block determined by performing inter-view prediction on the current block, a second disparity vector indicating the reference block, and a second reference index indicating the reference picture may be produced.
The multi-view video prediction encoding apparatus 10 according to the embodiment may include a central processor (not shown) that controls the reference list determiner 14 and the predictor 16 overall. Alternatively, the reference list determiner 14 and the predictor 16 may each be operated by their own processors (not shown), and these processors may operate in cooperation with one another so that the multi-view video prediction encoding apparatus 10 operates as a whole. Alternatively, the reference list determiner 14 and the predictor 16 may be controlled by an external processor (not shown) of the multi-view video prediction encoding apparatus 10.
The multi-view video prediction encoding apparatus 10 according to the embodiment may include one or more data storage units (not shown) that store the input and output data of the reference list determiner 14 and the predictor 16. The multi-view video prediction encoding apparatus 10 may include a memory controller (not shown) that controls data input to and data output from the data storage units.
A multi-view video decoding apparatus and a multi-view video decoding method for reconstructing a multi-view video bitstream that was prediction-encoded according to the embodiments described above with reference to Figures 1A and 1B will now be described with reference to Figures 2A and 2B.
Figure 2A is a block diagram of a multi-view video prediction decoding apparatus 20 according to an embodiment, and Figure 2B is a flowchart of a multi-view video prediction decoding method according to an embodiment.
The multi-view video prediction decoding apparatus 20 according to the embodiment includes a reference list determiner 24 and a compensator 26.
The multi-view video prediction decoding apparatus 20 according to the embodiment may receive bitstreams in which the images of each view are encoded. The bitstream in which the encoded data of the additional-view images is recorded may be received separately from the bitstream in which the encoded data of the base-view images is recorded.
For example, the multi-view video prediction decoding apparatus 20 may reconstruct the base-view images by decoding the base layer bitstream. The multi-view video prediction decoding apparatus 20 may also selectively decode the additional layer bitstream. The additional-view images may be reconstructed by decoding the additional layer bitstream with reference to the encoded symbols and samples reconstructed from the base layer bitstream. Because the additional layer bitstream is decoded selectively, only the video of a desired view may be reconstructed from the multi-view video.
For example, the multi-view video prediction decoding apparatus 20 according to the embodiment may decode the base-view bitstream to reconstruct the center-view images, decode the first additional-view bitstream to reconstruct the left-view images, and decode the second additional-view bitstream to reconstruct the right-view images.
When the number of additional views is at least three, the first additional-view images of the first additional view may be reconstructed from the first additional-view bitstream, the second additional-view images of the second additional view may be reconstructed from the second additional-view bitstream, and the K-th additional-view images of the K-th additional view may be reconstructed from the K-th additional-view bitstream.
The multi-view video prediction decoding apparatus 20 according to the embodiment performs decoding on each block of each image of a video. A block according to an embodiment may be a maximum coding unit, a coding unit, a prediction unit, or a transformation unit among coding units having a tree structure.
The multi-view video prediction decoding apparatus 20 according to the embodiment may receive motion vectors produced via inter prediction and disparity information produced via inter-view prediction, together with the bitstream including the encoded data of the images of each view.
The multi-view video prediction decoding apparatus 20 according to the embodiment may reconstruct images by performing motion compensation, in which images predicted via inter prediction within the same view refer to one another. Motion compensation is an operation that reconstructs the current image by combining a reference picture, determined by using the motion vector of the current image, with the residual data of the current image.
The multi-view video prediction decoding apparatus 20 according to the embodiment may perform disparity compensation by referring to different-view images in order to reconstruct additional-view images predicted via inter-view prediction. Disparity compensation is an operation that reconstructs the current image by combining a different-view reference picture, determined by using the disparity information of the current image, with the residual data of the current image. The multi-view video prediction decoding apparatus 20 may perform disparity compensation to reconstruct a current-view image that was predicted by referring to different-view images.
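Motion compensation and disparity compensation share the same arithmetic: the block located by the vector in the reference picture is added to the residual data. The sketch below is illustrative only; the function name and the (dx, dy) vector convention are assumptions, and real decoders also handle sub-pixel vectors and clipping, which are omitted here.

```python
def compensate(ref_picture, top, left, vector, residual):
    """Reconstruct a block as the displaced reference block plus the residual."""
    dx, dy = vector
    size = len(residual)
    return [
        [ref_picture[top + dy + r][left + dx + c] + residual[r][c] for c in range(size)]
        for r in range(size)
    ]


reference = [[4 * r + c for c in range(4)] for r in range(4)]
# Same-view reference with a motion vector, or different-view reference with a
# disparity vector: the reconstruction step is identical in this sketch.
reconstructed = compensate(reference, top=0, left=0, vector=(1, 1),
                           residual=[[1, 0], [0, -1]])
```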
Reconstruction by inter motion compensation and inter-view disparity compensation may be performed on the basis of a coding unit or a prediction unit.
The compensator 26 according to the embodiment may reconstruct current-view images via inter-view prediction that refers to different-view images reconstructed from different-view bitstreams and via inter prediction that refers to same-view images, so as to decode the bitstream of each view.
Compensator 26 according to the embodiment can be via with reference to reproduction order and current picture among different points of view reconstruction picture Reproduction order identical reconstruction picture viewpoint between parallax compensation rebuild current viewpoint picture.According to circumstances, can be via ginseng Parallax compensation is examined between the viewpoint of two or more different points of view images to rebuild current viewpoint picture.Reference listing determiner 24 It can determine that reference listing, drawn so that compensator 26 can determine that for the motion compensation of current picture or the exact references of parallax compensation Face.
Hereinafter, now with reference to Fig. 2 B come describe to determine reference listing for inter prediction and interview prediction and At least one method selected from inter prediction and interview prediction is finally performed by using reference listing.
In operation 21, the reference picture set determiner 22 according to an embodiment may determine a reference picture set. The reference picture set is a set of reconstructed pictures, among the pictures stored in the DPB that were reconstructed prior to the current picture, that are candidate images to which the current picture may refer.
The multi-view video prediction encoding apparatus 10 according to an embodiment may perform inter-view prediction and inter prediction on the current picture. The DPB according to an embodiment may store different-view reconstructed pictures having the same reproduction order as the current picture.
The reference picture set according to an embodiment may include a short-term reference picture set for inter prediction. The short-term reference picture set for inter prediction may include reconstructed pictures, among the same-view reconstructed pictures stored in the DPB, that are assigned a reproduction order different from that of the current picture.
Therefore, the short-term reference picture set for inter prediction may include preceding reconstructed pictures and following reconstructed pictures among the reconstructed pictures stored in the DPB that have the same view as the current picture and a reproduction order different from that of the current picture. A preceding reconstructed picture has a reproduction order earlier than that of the current picture, and a following reconstructed picture has a reproduction order later than that of the current picture. The short-term reference picture set may thus be divided into a subset of preceding reconstructed pictures that can be referred to, a subset of preceding reconstructed pictures that cannot be referred to, a subset of following reconstructed pictures that can be referred to, and a subset of following reconstructed pictures that cannot be referred to.
The reference picture set according to an embodiment may include a short-term reference picture set for inter-view prediction. The short-term reference picture set for inter-view prediction may include reconstructed pictures, among the different-view reconstructed pictures stored in the DPB, that are assigned the same reproduction order as the current picture.
Therefore, the short-term reference picture set for inter-view prediction may include negative (-) view reconstructed pictures and positive (+) view reconstructed pictures among the different-view reconstructed pictures stored in the DPB that have the same reproduction order as the current picture. A negative (-) view reconstructed picture has a VID smaller than the VID of the current picture, and a positive (+) view reconstructed picture has a VID greater than the VID of the current picture. The short-term reference picture set may thus be divided into a subset of negative (-) view reconstructed pictures that can be referred to and a subset of positive (+) view reconstructed pictures that can be referred to.
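The grouping described above can be illustrated with a minimal Python sketch. The `Picture` type, the function name `split_short_term_sets` and the dictionary keys are hypothetical names introduced only for illustration, and the further split into pictures that can or cannot be referred to is omitted because the text does not say how that is signaled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Picture:
    vid: int  # view identifier (VID)
    poc: int  # reproduction order (POC)


def split_short_term_sets(dpb, current):
    """Group DPB pictures into the four candidate groups described above.

    Hypothetical helper: same-view pictures are split by reproduction order,
    different-view pictures with the same reproduction order are split by VID.
    """
    same_view = [p for p in dpb if p.vid == current.vid and p.poc != current.poc]
    other_view = [p for p in dpb if p.vid != current.vid and p.poc == current.poc]
    return {
        "preceding": [p for p in same_view if p.poc < current.poc],
        "following": [p for p in same_view if p.poc > current.poc],
        "negative_view": [p for p in other_view if p.vid < current.vid],
        "positive_view": [p for p in other_view if p.vid > current.vid],
    }
```

Pictures that share neither the view nor the reproduction order of the current picture fall into none of the groups, which matches the two candidate conditions stated above.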
The reference picture set determiner 22 according to an embodiment may determine, among the same-view reconstructed pictures stored in the DPB that were reconstructed prior to the current picture, a reference picture set including at least one short-term reconstructed picture having a reproduction order different from that of the current picture and a reference picture set including at least one long-term reconstructed picture. The reference picture set determiner 22 may also determine, among the different-view reconstructed pictures stored in the DPB that were reconstructed prior to the current picture, a reference picture set including at least one short-term reconstructed picture having the same reproduction order as the current picture.
The reference picture set determiner 22 according to an embodiment may determine, among the same-view reconstructed pictures that were reconstructed prior to the current picture, a reference picture set including at least one non-reference reconstructed picture having a reproduction order different from that of the current picture.
The reference picture set determiner 22 according to an embodiment may further determine, among the different-view reconstructed pictures that were reconstructed prior to the current picture, a reference picture set including at least one non-reference different-view reconstructed picture having the same reproduction order as the current picture.
The reference picture set determiner 22 according to an embodiment may determine whether one of the reference picture sets determined for the current picture is used in the current slice.
If one of the reference picture sets determined for the current picture is used, the reference picture set determiner 22 according to an embodiment may select the index of that reference picture set.
If none of the reference picture sets determined for the current picture is used, the reference picture set determiner 22 according to an embodiment may directly determine a reference picture set for the current slice.
For the reference picture set for inter-view prediction, a first number of pictures having a VID smaller than the VID of the current view and a second number of pictures having a VID greater than the VID of the current view may be determined. The VID differences for the pictures having a VID smaller than the VID of the current view and the VID differences for the pictures having a VID greater than the VID of the current view may also be determined.
In operation 23, the reference list determiner 24 according to an embodiment may determine an L0 list for forward prediction and bi-prediction and an L1 list for bi-prediction.
The reference list determiner 24 according to an embodiment may determine a first reference list, that is, the L0 list, which includes at least one reconstructed picture, among the same-view images of the current picture, having a reproduction order earlier than that of the current picture, and at least one reconstructed picture having the same reproduction order as the current picture and a VID smaller than the VID of the current picture.
The reference list determiner 24 according to an embodiment may determine a second reference list, that is, the L1 list, which includes at least one reconstructed picture having a reproduction order later than that of the current picture, and at least one reconstructed picture having the same reproduction order as the current picture and a VID greater than the VID of the current picture.
However, among the same-view images, when the L0 list has preferentially included the reconstructed pictures having a reproduction order earlier than that of the current picture and valid reference indices remain, the L0 list may also include reconstructed pictures having a reproduction order later than that of the current picture. Likewise, among the different-view images having the same reproduction order, when the L0 list has preferentially included the reconstructed pictures having a VID smaller than the VID of the current picture and valid reference indices remain, the L0 list may also include reconstructed pictures having a VID greater than the VID of the current picture.
Similarly, among the same-view images, when the L1 list has preferentially included the reconstructed pictures having a reproduction order later than that of the current picture and valid reference indices remain, the L1 list may also include reconstructed pictures having a reproduction order earlier than that of the current picture. Likewise, among the different-view images having the same reproduction order, when the L1 list has preferentially included the reconstructed pictures having a VID greater than the VID of the current picture and valid reference indices remain, the L1 list may also include reconstructed pictures having a VID smaller than the VID of the current picture.
The multi-view video prediction decoding apparatus 20 may store reconstructed pictures in the DPB. A reference list records information about the order in which the reconstructed pictures stored in the DPB are referred to for motion compensation or disparity compensation of the current picture.
In operation 25, the compensator 26 according to an embodiment may determine at least one reference picture and a reference block for a current block of the current picture by using the at least one reference list determined by the reference list determiner 24.
In operation 27, the compensator 26 according to an embodiment may perform, for the current bitstream, at least one of disparity compensation that refers to a different-view reconstructed picture having the same reproduction order and motion compensation that refers to a same-view reconstructed picture, to reconstruct the current block of the current picture.
More specifically, the multi-view video prediction decoding apparatus 20 according to an embodiment may parse the current bitstream to obtain a reference index, disparity information, and residual data for inter-view prediction. The compensator 26 according to an embodiment may determine a reference picture among the different-view images by using the reference index, and may determine a reference block in the reference picture by using the disparity information. The reference block is compensated with the residual data, thereby reconstructing the current picture.
The compensator 26 according to an embodiment may also perform motion compensation that refers to a same-view reconstructed picture, to reconstruct the current picture.
In more detail, the multi-view video prediction decoding apparatus 20 may parse the bitstream to obtain a reference index, a motion vector, and residual data for motion compensation of the current picture. The compensator 26 may determine a reference picture among the same-view reconstructed pictures by using the reference index, determine a reference block in the reference picture by using the motion vector, and compensate the reference block with the residual data, thereby reconstructing the current picture.
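The compensation step described above, a reference block located by a vector and then combined with residual data, can be sketched as follows. This is a simplified illustration and not the decoding process of any standard: it assumes integer-sample vectors, pictures stored as nested lists of sample values, and no interpolation, clipping, or boundary handling. The function name `compensate_block` is hypothetical.

```python
def compensate_block(ref_picture, x, y, vector, residual):
    """Rebuild a block as (reference block) + (residual).

    ref_picture: 2-D list of samples from the chosen reference picture.
    (x, y):      top-left position of the current block.
    vector:      (dx, dy) motion vector (same view) or disparity vector
                 (different view), assumed here to be in whole samples.
    residual:    2-D list with the block's residual data.
    """
    dx, dy = vector
    height, width = len(residual), len(residual[0])
    return [
        [ref_picture[y + dy + r][x + dx + c] + residual[r][c] for c in range(width)]
        for r in range(height)
    ]
```

The same routine serves both cases in this sketch because the two operations differ only in which reference picture is selected and in how the vector is interpreted.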
The reference list determiner 24 according to an embodiment may determine whether to selectively modify the reference order of the reference indices of at least one reference list in the current picture. Basically, a reference index of a reference list indicates the reference order of the reconstructed picture corresponding to that reference index, so if the reference index is modified, the reference order of the corresponding reconstructed picture is modified.
When the reference order of a reference list can be selectively modified in the current picture, the reference list determiner 24 according to an embodiment may modify the reference order of the reference indices of at least one reference list for the current slice belonging to the current picture.
The reference list determiner 24 according to an embodiment may also determine, for the current picture, a first basic number of reconstructed pictures in the first reference list having a reproduction order earlier than that of the current picture, a second basic number of reconstructed pictures in the first reference list having a VID smaller than the VID of the current picture, a third basic number of reconstructed pictures in the second reference list having a reproduction order later than that of the current picture, and a fourth basic number of reconstructed pictures in the second reference list having a VID greater than the VID of the current picture.
The reference list determiner 24 according to an embodiment may individually replace, in the current slice, at least one of the first and second basic numbers of the first reference list and the third and fourth basic numbers of the second reference list that were set for the current picture.
That is, if the basic numbers of reconstructed pictures belonging to each reference list are individually replaced in the current slice, the reference list determiner 24 according to an embodiment may set the number of reconstructed pictures in the first reference list that have the same view as the current picture and a reproduction order earlier than that of the current picture to a first valid number applied only to the current slice, in place of the first basic number commonly applied to the current picture.
Likewise, the reference list determiner 24 according to an embodiment may set the number of reconstructed pictures in the first reference list that have the same reproduction order as the current picture and a VID smaller than the VID of the current picture to a second valid number applied only to the current slice, in place of the second basic number commonly applied to the current picture.
Likewise, the reference list determiner 24 according to an embodiment may set the number of reconstructed pictures in the second reference list that have the same view as the current picture and a reproduction order later than that of the current picture to a third valid number applied only to the current slice, in place of the third basic number commonly applied to the current picture.
Likewise, the reference list determiner 24 according to an embodiment may set the number of reconstructed pictures in the second reference list that have the same reproduction order as the current picture and a VID greater than the VID of the current picture to a fourth valid number applied only to the current slice, in place of the fourth basic number commonly applied to the current picture.
The maximum number of reference indices belonging to the first reference list according to an embodiment may be the sum of the first basic number of reconstructed pictures in the first reference list having a reproduction order earlier than that of the current picture and the second basic number of reconstructed pictures in the first reference list having a VID smaller than the VID of the current picture.
The maximum number of reference indices belonging to the second reference list according to an embodiment may be the sum of the third basic number of reconstructed pictures in the second reference list having a reproduction order later than that of the current picture and the fourth basic number of reconstructed pictures in the second reference list having a VID greater than the VID of the current picture.
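The relationship between the picture-level basic numbers, the optional slice-level replacements, and the maximum number of reference indices can be sketched in Python as follows. The dictionary keys and function names are hypothetical and chosen only for this illustration; the sketch also assumes that a slice may replace any subset of the four numbers.

```python
def numbers_for_slice(basic_numbers, slice_numbers=None):
    """Return the four picture counts in force for one slice.

    basic_numbers: counts commonly applied to the current picture.
    slice_numbers: optional counts that apply only to the current slice and
                   replace the matching basic numbers (hypothetical layout).
    """
    numbers = dict(basic_numbers)
    if slice_numbers:
        numbers.update(slice_numbers)
    return numbers


def max_reference_indices(numbers):
    """Maximum reference index count per list: inter count + inter-view count."""
    l0 = numbers["l0_inter"] + numbers["l0_interview"]
    l1 = numbers["l1_inter"] + numbers["l1_interview"]
    return l0, l1


basic = {"l0_inter": 2, "l0_interview": 2, "l1_inter": 1, "l1_interview": 1}
per_slice = numbers_for_slice(basic, {"l0_inter": 1})
```

In this example the slice replaces only the first number, so the first list shrinks by one entry while the second list keeps its picture-level size.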
The compensator 26 may perform at least one of motion compensation and disparity compensation by using the determined reference list. A reference list according to an embodiment may record information about both the reconstructed pictures for inter prediction and the reconstructed pictures for inter-view prediction. Therefore, the compensator 26 may perform at least one of motion compensation and disparity compensation by using a single reference list.
The multi-view video prediction decoding apparatus 20 according to an embodiment may receive a reference index and residual data for a current block of the current picture, and may receive a motion vector or a disparity vector. Whether the received vector is a motion vector or a disparity vector may be determined according to whether the received reference index indicates, in the reference list, a same-view reconstructed picture or a different-view picture having the same POC.
Therefore, the compensator 26 according to an embodiment may determine the reference picture indicated by the reference index in the reference list. If the determined reference picture is a same-view reconstructed picture, the compensator 26 may determine the reference block indicated by the motion vector in that reconstructed picture and compensate the reference block with the residual data, thereby reconstructing the current block.
Likewise, the compensator 26 according to an embodiment may determine the reference picture indicated by the reference index in the reference list. If the determined reference picture is a different-view reconstructed picture, the compensator 26 may determine the reference block indicated by the disparity vector in that reconstructed picture and compensate the reference block with the residual data, thereby reconstructing the current block.
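The decision described above, interpreting the received vector according to the kind of picture the reference index points to, can be sketched as follows. Pictures are represented as hypothetical `(vid, poc)` tuples and the function name `classify_vector` is an illustrative choice, not a name from the specification.

```python
def classify_vector(ref_list, ref_idx, cur_vid, cur_poc):
    """Decide how to interpret the vector received with a reference index.

    ref_list entries are (vid, poc) tuples (an assumed representation).
    A same-view entry means the vector is a motion vector; a different-view
    entry with the same reproduction order means it is a disparity vector.
    """
    vid, poc = ref_list[ref_idx]
    if vid == cur_vid and poc != cur_poc:
        return "motion"
    if vid != cur_vid and poc == cur_poc:
        return "disparity"
    raise ValueError("entry is neither a same-view nor a same-POC reference")


# Example list: three same-view entries followed by three different-view entries.
l0 = [(5, 17), (5, 16), (5, 19), (3, 18), (1, 18), (7, 18)]
kind_first = classify_vector(l0, 0, cur_vid=5, cur_poc=18)
kind_fourth = classify_vector(l0, 3, cur_vid=5, cur_poc=18)
```

Because one list holds both kinds of entries, no separate flag is needed in this sketch to tell the two vector types apart.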
A second-view image may be reconstructed from a second-view bitstream in the same way that the current-view picture is reconstructed from the current-view bitstream as described above.
The multi-view video prediction decoding apparatus 20 according to an embodiment may include a central processor (not shown) that controls the reference list determiner 24 and the compensator 26 overall. Alternatively, the reference list determiner 24 and the compensator 26 may be operated by their own processors (not shown), and these processors may operate in coordination with one another so that the multi-view video prediction decoding apparatus 20 operates as a whole. Alternatively, the reference list determiner 24 and the compensator 26 may be controlled by an external processor (not shown) of the multi-view video prediction decoding apparatus 20.
The multi-view video prediction decoding apparatus 20 according to an embodiment may include one or more data storage units (not shown) for storing the input and output data of the reference list determiner 24 and the compensator 26. The multi-view video prediction decoding apparatus 20 may also include a memory controller (not shown) for controlling data input to and output from the data storage units.
The operation of the present invention is now described in detail with reference to Fig. 3, Fig. 4, Fig. 5A and Fig. 5B, using an example in which a reference list contains three reconstructed pictures for inter prediction and three reconstructed pictures for inter-view prediction.
Fig. 3 shows reference objects for inter prediction and inter-view prediction of a current picture 31 according to an embodiment.
For example, images 30 of four views are encoded, and a reference list for inter prediction of the current picture 31 having view VID 5 and reproduction order POC 18 is determined. It is assumed that the current picture 31 may refer to three images 32, 33 and 34, among the images having VID 5, that were reconstructed prior to the current picture 31, for inter prediction. It is also assumed that the current picture 31 may refer to three images 35, 36 and 37, among the different-view images having POC 18, that were reconstructed prior to the current picture 31, for inter-view prediction.
Fig. 4 shows reference lists configured from the reference objects of Fig. 3 according to an embodiment.
Continuing from Fig. 3, the reconstructed pictures 32, 33, 34, 35, 36 and 37 to which the current picture 31 may refer for prediction may be stored in a DPB 40 for the current picture 31.
In a basic L0 list 41 according to an embodiment, a preferential reference order may be assigned to the reconstructed pictures used for forward prediction among those used for inter prediction, and to reconstructed pictures closer to the current picture 31. A preferential reference order may be assigned to the reconstructed pictures for inter prediction rather than to the reconstructed pictures for inter-view prediction. Among the reconstructed pictures for inter-view prediction, a preferential reference order may be assigned according to views to the reconstructed pictures having a VID smaller than the VID of the current view rather than to the reconstructed pictures having a VID greater than the VID of the current view.
As in the basic L0 list 41, in a basic L1 list 45 according to an embodiment a preferential reference order may be assigned to reconstructed pictures closer to the current picture 31, and to the reconstructed pictures for inter prediction rather than to the reconstructed pictures for inter-view prediction. However, in the basic L1 list 45, a preferential reference order may be assigned to the reconstructed pictures used for backward prediction among those used for inter prediction. Among the reconstructed pictures for inter-view prediction, a preferential reference order may be assigned according to views to the reconstructed pictures having a VID greater than the VID of the current view rather than to the reconstructed pictures having a VID smaller than the VID of the current view.
For convenience, an image having VID A and POC B is referred to as a VID A/POC B image.
Therefore, the reconstructed pictures belonging to the basic L0 list 41 may be, in reference order, the same-view VID 5/POC 17 picture 32, VID 5/POC 16 picture 33 and VID 5/POC 19 picture 34, followed by the VID 3/POC 18 picture 35, VID 1/POC 18 picture 36 and VID 7/POC 18 picture 37.
The reconstructed pictures belonging to the basic L1 list 45 may be, in reference order, the VID 5/POC 19 picture 34, VID 5/POC 17 picture 32 and VID 5/POC 16 picture 33, followed by the VID 7/POC 18 picture 37, VID 3/POC 18 picture 35 and VID 1/POC 18 picture 36.
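The ordering rules for the basic L0 and L1 lists can be checked against the Fig. 3 example with a short Python sketch. Pictures are written as hypothetical `(vid, poc)` tuples and `build_basic_lists` is an illustrative name; "closer" is taken here to mean a smaller POC difference for same-view pictures and a smaller VID difference for different-view pictures, which is an assumption consistent with the example.

```python
def build_basic_lists(dpb, cur_vid, cur_poc):
    """Order DPB pictures into basic L0 and L1 lists.

    Same-view pictures come before different-view pictures in both lists.
    L0 prefers earlier POC and smaller VID; L1 prefers later POC and larger
    VID. Within each group, closer pictures come first.
    """
    same = [p for p in dpb if p[0] == cur_vid and p[1] != cur_poc]
    other = [p for p in dpb if p[0] != cur_vid and p[1] == cur_poc]

    before = sorted((p for p in same if p[1] < cur_poc), key=lambda p: cur_poc - p[1])
    after = sorted((p for p in same if p[1] > cur_poc), key=lambda p: p[1] - cur_poc)
    lower = sorted((p for p in other if p[0] < cur_vid), key=lambda p: cur_vid - p[0])
    higher = sorted((p for p in other if p[0] > cur_vid), key=lambda p: p[0] - cur_vid)

    l0 = before + after + lower + higher
    l1 = after + before + higher + lower
    return l0, l1


# DPB 40 of the example: pictures 32, 33, 34, 35, 36 and 37 as (VID, POC).
dpb = [(5, 17), (5, 16), (5, 19), (3, 18), (1, 18), (7, 18)]
l0, l1 = build_basic_lists(dpb, cur_vid=5, cur_poc=18)
```

For this DPB the sketch yields the L0 order 32, 33, 34, 35, 36, 37 and the L1 order 34, 32, 33, 37, 35, 36, which agrees with the two lists given above.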
Fig. 5A and Fig. 5B are diagrams for explaining a process of modifying the L0 list according to an embodiment.
In a reference index table 50 according to an embodiment, a reference index Idx 51 is a number that indicates the basic order of the reconstructed pictures in a reference list. A reference index basically indicates the reference order in the reference list. Therefore, the reconstructed pictures corresponding to the reference indices may be referred to in the order of the reference indices.
However, the reference order indicated by the reference indices may be selectively modified in the current slice. In the current picture, a modification index List_entry_l0 55 of the reference index table 50 according to an embodiment may be defined to selectively modify the reference indices set in the basic L0 list 41.
In this example, Idx 0, 1, 2, 3, 4 and 5 of the basic L0 list 41 indicate, in order, the VID 5/POC 17 picture 32, VID 5/POC 16 picture 33, VID 5/POC 19 picture 34, VID 3/POC 18 picture 35, VID 1/POC 18 picture 36 and VID 7/POC 18 picture 37.
That is, since the reference order is modified to Idx 0, 3, 1, 2, 4 and 5 according to the modification index List_entry_l0 55, the reconstructed pictures belonging to the modified L0 list 59 become, in reference order, the VID 5/POC 17 picture 32, VID 3/POC 18 picture 35, VID 5/POC 16 picture 33, VID 5/POC 19 picture 34, VID 1/POC 18 picture 36 and VID 7/POC 18 picture 37.
Therefore, in the basic L0 list 41, the picture referred to next after the VID 5/POC 17 picture 32, which is the reconstructed picture referred to first for inter prediction, is the VID 5/POC 16 picture 33 for inter prediction, and the different-view reconstructed pictures for inter-view prediction are referred to only after all same-view reconstructed pictures for inter prediction have been referred to.
However, in the modified L0 list 59 obtained according to the modification index List_entry_l0 55, the picture referred to next after the same-view VID 5/POC 17 picture 32, which is referred to first for inter prediction, is the different-view VID 3/POC 18 picture 35, which is the first picture that can be referred to for inter-view prediction.
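The reordering in this example amounts to indexing the basic list with the modification entries. A minimal Python sketch, in which pictures are represented by their reference numerals and `modify_list` is a hypothetical name, is:

```python
def modify_list(basic_list, list_entry):
    """Reorder a basic reference list.

    list_entry[i] is the index, in the basic list, of the picture that
    takes position i in the modified list.
    """
    return [basic_list[idx] for idx in list_entry]


basic_l0 = [32, 33, 34, 35, 36, 37]   # basic L0 list 41, by reference numeral
list_entry_l0 = [0, 3, 1, 2, 4, 5]    # modification index List_entry_l0 55
modified_l0 = modify_list(basic_l0, list_entry_l0)
```

With these entries, picture 35 moves to the second position, so an inter-view reference becomes available at a smaller reference index than in the basic list.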
Although the example above describes three same-view reconstructed pictures 32, 33 and 34 and three different-view reconstructed pictures 35, 36 and 37 belonging to one L0 list 41 for prediction encoding of the current picture, the number of reconstructed pictures belonging to a reference list according to an embodiment is not limited thereto.
Fig. 6A shows the syntax of a sequence parameter set 65 according to an embodiment.
The multi-view video prediction encoding apparatus 10 according to an embodiment may generate the sequence parameter set 65, which includes information about default settings commonly applied in the current picture sequence. Specifically, regarding the reference lists for inter-view prediction, information about the reference picture sets that can be used by each picture belonging to the current sequence may be included in the sequence parameter set 65.
The sequence parameter set 65 according to an embodiment includes a parameter "num_interview_ref_pic_sets" 66, which indicates the number of reference picture sets allowed for each picture belonging to the current sequence. For example, if "num_interview_ref_pic_sets" 66 is set to a number from 0 to 64, that many reference picture sets may be set for each picture.
The sequence parameter set 65 according to an embodiment is associated with "interview_ref_pic_set(i)" 67, a parameter set of a reference picture set that defines the detailed parameters of each reference picture set. The parameter set "interview_ref_pic_set(i)" 67 is described in more detail with reference to Fig. 8A. Fig. 8A shows "interview_ref_pic_set(i)" 67 as the parameter set of a reference picture set for inter-view prediction according to an embodiment.
The parameter set "interview_ref_pic_set(i)" 67 of a reference picture set according to an embodiment may include information "num_negative_interview_pics" 93 and information "num_positive_interview_pics" 94. The information "num_negative_interview_pics" 93 indicates the number of different-view reconstructed pictures located in the direction of decreasing VID from the view of the current picture (the negative (-) view direction), and the information "num_positive_interview_pics" 94 indicates the number of different-view reconstructed pictures located in the direction of increasing VID from the view of the current picture (the positive (+) view direction).
The parameter set "interview_ref_pic_set(i)" 67 according to an embodiment may also include "delta_view_idx_s0_minus1[i]" 95 and "delta_view_idx_s1_minus1[i]" 96. For each reconstructed image located in the negative (-) view direction, "delta_view_idx_s0_minus1[i]" 95 indicates by how much the VID of the corresponding view is smaller than the VID of the current view. For each reconstructed image located in the positive (+) view direction, "delta_view_idx_s1_minus1[i]" 96 indicates by how much the VID of the corresponding view is greater than the VID of the current view.
Referring back to the sequence parameter set 65 of Fig. 6A, the multi-view video prediction decoding apparatus 20 according to an embodiment may parse "num_interview_ref_pic_sets" 66 from the sequence parameter set 65 to read the number of reference picture sets that can be used by the pictures belonging to the current sequence.
The multi-view video prediction decoding apparatus 20 according to an embodiment may parse "interview_ref_pic_set(i)" 67 for each reference picture set, to read the number of different-view reconstructed images located in the negative (-) view direction from "num_negative_interview_pics" 93 and the number of different-view reconstructed images located in the positive (+) view direction from "num_positive_interview_pics" 94.
The multi-view video prediction decoding apparatus 20 according to an embodiment may parse the parameter set "interview_ref_pic_set(i)" 67 to read, from "delta_view_idx_s0_minus1[i]" 95, the difference between the VID of the current view and the VID of the view of each reconstructed image located in the negative (-) view direction, and may thereby read the VIDs of the reconstructed images located in the negative (-) view direction.
Similarly, the multi-view video prediction decoding apparatus 20 according to an embodiment may parse the parameter set "interview_ref_pic_set(i)" 67 to read, from "delta_view_idx_s1_minus1[i]" 96, the difference between the VID of the current view and the VID of the view of each reconstructed image located in the positive (+) view direction, and may thereby read the VIDs of the reconstructed images located in the positive (+) view direction.
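How the VIDs of the inter-view reference pictures might be recovered from the parsed differences can be sketched as follows. Two points are assumptions made for this sketch and are not stated in the text: each parsed value is taken relative to the VID of the current view (not relative to the previous entry), and the "_minus1" suffix is read in the usual way, so that the actual difference is the parsed value plus one. The function name `derive_interview_vids` is hypothetical.

```python
def derive_interview_vids(cur_vid, delta_s0_minus1, delta_s1_minus1):
    """Recover VIDs of inter-view reference pictures from parsed deltas.

    Assumptions (not confirmed by the text): every delta is measured from
    the current view's VID, and the coded value is the difference minus one.
    """
    negative = [cur_vid - (d + 1) for d in delta_s0_minus1]  # smaller VIDs
    positive = [cur_vid + (d + 1) for d in delta_s1_minus1]  # larger VIDs
    return negative, positive


# Hypothetical parsed values for a current view with VID 5.
negative_vids, positive_vids = derive_interview_vids(5, [1, 3], [1])
```

With these hypothetical values, the negative (-) direction yields VIDs 3 and 1 and the positive (+) direction yields VID 7, which corresponds to the views of pictures 35, 36 and 37 in the Fig. 3 example. The list lengths play the role of "num_negative_interview_pics" 93 and "num_positive_interview_pics" 94.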
Fig. 6B shows the syntax of a picture parameter set 60 according to an embodiment.
The multi-view video prediction encoding apparatus 10 according to an embodiment may generate the picture parameter set 60, which includes information about default settings that can be commonly applied in the current picture. Specifically, regarding the reference lists, information about the basic numbers of reconstructed pictures belonging to the L0 list and the L1 list may be included in the picture parameter set 60, where each prediction block in the current picture uses those reconstructed pictures to perform at least one of inter prediction and inter-view prediction.
For example, " num_ref_idx_l0_default_active_minus1 " 61 indicates having prior to working as in L0 lists The same viewpoint of the reproduction order of the reproduction order of preceding picture effectively rebuilds the quantum of picture.“num_interview_ Having in the instruction L0 lists of ref_idx_l0_default_active_minus1 " 62 is identical with the reproduction order of current picture Reproduction order and VID be less than current view point VID effective quantum for rebuilding picture.“num_ref_idx_l1_ Default_active_minus1 " 63 indicates the reproduction order with the reproduction order for being later than current picture in L1 lists Same viewpoint effectively rebuilds the quantum of picture.“num_interview_ref_idx_l1_default_active_ Minus1 " 64 indicates in L1 lists there is the reproduction order and VID identical with the reproduction order of current picture to be more than currently Effective quantum for rebuilding picture of the VID of viewpoint.
Multi-view point video prediction decoding equipment 20 according to the embodiment can be from the bitstream extraction parameter sets 60 of reception. Multi-view point video prediction decoding equipment 20 according to the embodiment can parse " num_ref_idx_l0_ from parameter sets 60 Default_active_minus1 " 61, to read the reproduction order with the reproduction order prior to current picture in L0 lists Same viewpoint effectively rebuild the quantum of picture.Multi-view point video prediction decoding equipment 20 according to the embodiment can parse " num_interview_ref_idx_l0_default_active_minus1 " 62, is drawn with reading in L0 lists to have with current The identical reproduction order of the reproduction order in face and VID are less than effective quantum for rebuilding picture of the VID of current view point.Root " num_ref_idx_l1_default_active_ can be parsed according to the multi-view point video prediction decoding equipment 20 of embodiment Minus1 " 63, it is effective to read the same viewpoint of the reproduction order with the reproduction order for being later than current picture in L1 lists Rebuild the quantum of picture.Multi-view point video prediction decoding equipment 20 according to the embodiment can parse " num_interview_ Ref_idx_l1_default_active_minus1 " 64, to read the reproduction order having with current picture in L1 lists Identical reproduction order and VID are more than effective quantum for rebuilding picture of the VID of current view point.
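The four default counts can be pictured as a small record, sketched below in Python. The class and method names are hypothetical, and the sketch assumes that each "minus1" element codes the count minus one, as the suffix suggests; the embodiment does not spell out this arithmetic.

```python
from dataclasses import dataclass


@dataclass
class ParameterSetDefaults:
    """Syntax elements 61 to 64 as parsed from the parameter sets 60."""
    num_ref_idx_l0_default_active_minus1: int
    num_interview_ref_idx_l0_default_active_minus1: int
    num_ref_idx_l1_default_active_minus1: int
    num_interview_ref_idx_l1_default_active_minus1: int

    def l0_counts(self):
        # Assumption: a "minus1" element stores (count - 1).
        return (self.num_ref_idx_l0_default_active_minus1 + 1,
                self.num_interview_ref_idx_l0_default_active_minus1 + 1)

    def l1_counts(self):
        return (self.num_ref_idx_l1_default_active_minus1 + 1,
                self.num_interview_ref_idx_l1_default_active_minus1 + 1)


# Hypothetical parsed values.
defaults = ParameterSetDefaults(1, 0, 2, 0)
print(defaults.l0_counts(), defaults.l1_counts())
```

Each tuple holds the same-viewpoint count first and the interview count second.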
Fig. 7 shows the grammer of slice header 70 according to the embodiment.
The multi-view point video predictive coding equipment 10 according to the embodiment can produce the slice header 70, which includes information on settings commonly applied to the current slice.
Slice header 70 according to the embodiment may include the parameter 90 of the short term reference picture collection for inter prediction, be used for The parameter 92 of the parameter 91 of the reference picture collection of interview prediction and long term reference picture collection for inter prediction.
The parameter 90 of the short term reference picture set for inter prediction according to the embodiment may include the information "short_term_ref_pic_set_sps_flag" 87, which indicates whether the short term reference picture set determined in the sequence parameter set 65 is used.
If the short term reference picture set determined in the sequence parameter set 65 is not used in the current slice, the parameter 90 may include the parameter set "short_term_ref_pic_set(num_short_term_ref_pic_sets)" 80, which defines the short term reference picture set for the current slice.
Meanwhile, if the short term reference picture set determined in the sequence parameter set 65 is used in the current slice, the parameter 90 may include "short_term_ref_pic_set_idx" 88, the index of the short term reference picture set, among those determined in the sequence parameter set 65, that is to be used for the current slice.
The parameter 91 of the reference picture set for interview prediction may include the information "interview_ref_pic_set_sps_flag" 97, which indicates whether the reference picture set for interview prediction determined in the sequence parameter set 65 is used.
If the reference picture set for interview prediction determined in the sequence parameter set 65 is not used in the current slice, the parameter 91 may include the parameter set "interview_ref_pic_set(num_interview_ref_pic_sets)" 67, which defines the reference picture set for interview prediction of the current slice.
If the reference picture set for interview prediction determined in the sequence parameter set 65 is used in the current slice, the parameter 91 may include "interview_ref_pic_set_idx" 98, the index of the reference picture set for interview prediction, among those determined in the sequence parameter set 65, that is to be used for the current slice.
The parameter 92 of the long term reference picture set for inter prediction may include the number of long term reference pictures, "num_long_term_pics" 89, and information on the POC of the long term reference pictures.
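The choice between a reference picture set signalled in the sequence parameter set 65 and one defined in the slice header 70 follows the same pattern for the short term set and the interview set. A minimal Python sketch of that selection is given below; the function name, the argument names and the list-of-deltas representation of a set are hypothetical.

```python
def select_ref_pic_set(sps_flag, sps_sets, set_idx=None, slice_set=None):
    """Pick the reference picture set that applies to the current slice.

    sps_flag  -- e.g. short_term_ref_pic_set_sps_flag 87 or
                 interview_ref_pic_set_sps_flag 97
    sps_sets  -- candidate sets determined in the sequence parameter set 65
    set_idx   -- e.g. short_term_ref_pic_set_idx 88 (read when sps_flag is set)
    slice_set -- set defined explicitly in the slice header (read otherwise)
    """
    if sps_flag:
        return sps_sets[set_idx]
    return slice_set


# Hypothetical sets, each written as a list of POC or VID differences.
sps_sets = [[-1, -2], [-1, 1]]
print(select_ref_pic_set(1, sps_sets, set_idx=1))
print(select_ref_pic_set(0, sps_sets, slice_set=[-3]))
```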
Regarding the reference lists, information that is selectively changed in the current slice (rather than the basic settings of the current picture) can be included in the slice header 70.
When the current slice is of the P slice type of the forward prediction mode or the B slice type of the bi-prediction mode, the slice header 70 may include "num_ref_idx_active_override_flag" 71. "num_ref_idx_active_override_flag" 71 indicates whether, in the current slice, another value replaces at least one selected from the basic numbers of reconstructed pictures determined in the current picture parameter set 60, namely "num_ref_idx_l0_default_active_minus1" 61, "num_interview_ref_idx_l0_default_active_minus1" 62, "num_ref_idx_l1_default_active_minus1" 63 and "num_interview_ref_idx_l1_default_active_minus1" 64.
If "num_ref_idx_active_override_flag" 71 indicates that other values can replace the basic numbers of reconstructed pictures in the current slice, "num_ref_idx_l0_active_minus1" 72 can be included in the slice header 70, wherein "num_ref_idx_l0_active_minus1" 72 indicates the number of effective reconstructed pictures for inter prediction in the L0 list. If the coding and decoding of 3D video is allowed for the current NAL unit, "num_interview_ref_idx_l0_active_minus1" 73 can be included in the slice header 70, wherein "num_interview_ref_idx_l0_active_minus1" 73 indicates the number of effective reconstructed pictures for interview prediction in the L0 list.
If "num_ref_idx_active_override_flag" 71 indicates that other values can replace the basic numbers of reconstructed pictures in the current slice and the current slice is of the B slice type, "num_ref_idx_l1_active_minus1" 74 can be included in the slice header 70, wherein "num_ref_idx_l1_active_minus1" 74 indicates the number of effective reconstructed pictures for inter prediction in the L1 list. If the coding and decoding of 3D video is allowed for the current NAL unit, "num_interview_ref_idx_l1_active_minus1" 75 can be included in the slice header 70, wherein "num_interview_ref_idx_l1_active_minus1" 75 indicates the number of effective reconstructed pictures for interview prediction in the L1 list.
The multi-view point video prediction decoding equipment 20 according to the embodiment can extract the slice header 70 from the received bitstream. When the current slice is of the P slice type of the forward prediction mode or the B slice type of the bi-prediction mode, the multi-view point video prediction decoding equipment 20 can parse "num_ref_idx_active_override_flag" 71 from the slice header 70 to read whether another value replaces the basic numbers of reconstructed pictures of the reference lists set for the current picture.
The multi-view point video prediction decoding equipment 20 according to the embodiment can parse the parameter 90 of the short term reference picture set in the slice header 70 according to the embodiment, and read from "short_term_ref_pic_set_sps_flag" 87 whether the short term reference picture set determined in the sequence parameter set 65 is used.
If the short term reference picture set determined in the sequence parameter set 65 is not used in the current slice, the parameters defining the short term reference picture set for the current slice can be extracted from "short_term_ref_pic_set(num_short_term_ref_pic_sets)" 80.
Meanwhile, if the short term reference picture set determined in the sequence parameter set 65 is used in the current slice, the index of the short term reference picture set, among those determined in the sequence parameter set 65, that is used by the current slice can be read from "short_term_ref_pic_set_idx" 88.
The multi-view point video prediction decoding equipment 20 according to the embodiment can parse the parameter 91 of the reference picture set for interview prediction in the slice header 70 according to the embodiment, and read from "interview_ref_pic_set_sps_flag" 97 whether the reference picture set for interview prediction determined in the sequence parameter set 65 is used.
If the reference picture set for interview prediction determined in the sequence parameter set 65 is not used in the current slice, the parameter set defining the reference picture set for interview prediction of the current slice can be extracted from "interview_ref_pic_set(num_interview_ref_pic_sets)" 67.
Meanwhile, if the reference picture set for interview prediction determined in the sequence parameter set 65 is used in the current slice, the index of the reference picture set for interview prediction, among those determined in the sequence parameter set 65, that is to be used in the current slice can be read from "interview_ref_pic_set_idx" 98.
From the parameter 92 of the long term reference picture set according to the embodiment, the number of long term reference pictures, "num_long_term_pics" 89, and information on the POC of the long term reference pictures can be read.
If another value replaces the basic numbers of reconstructed pictures of the reference lists, the multi-view point video prediction decoding equipment 20 can parse "num_ref_idx_l0_active_minus1" 72 from the slice header 70 to read the number of effective reconstructed pictures for inter prediction in the L0 list. If the coding and decoding of 3D video is allowed for the current NAL unit, the multi-view point video prediction decoding equipment 20 can parse "num_interview_ref_idx_l0_active_minus1" 73 from the slice header 70 to read the number of effective reconstructed pictures for interview prediction in the L0 list.
If "num_ref_idx_active_override_flag" 71 indicates that other values replace the basic numbers of reconstructed pictures in the current slice and the current slice is of the B slice type, the multi-view point video prediction decoding equipment 20 can parse "num_ref_idx_l1_active_minus1" 74 from the slice header 70 to read the number of effective reconstructed pictures for inter prediction in the L1 list. If the coding and decoding of 3D video is allowed for the current NAL unit, the multi-view point video prediction decoding equipment 20 can parse "num_interview_ref_idx_l1_active_minus1" 75 from the slice header 70 to read the number of effective reconstructed pictures for interview prediction in the L1 list.
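The override behaviour described for elements 71 to 75 can be sketched in Python as follows. The dictionary keys, the function name and the "+ 1" arithmetic are assumptions for illustration; element 72 is called "num_ref_idx_l0_active_minus1" here by analogy with element 74.

```python
def active_counts(slice_type, override_flag, defaults, overrides, allow_3d):
    """Return the active picture counts for the current slice.

    defaults  -- counts derived from the parameter sets 60, keyed by
                 'l0', 'iv_l0', 'l1', 'iv_l1' (hypothetical keys)
    overrides -- slice header elements 72 to 75, keyed by syntax name
    allow_3d  -- whether 3D video coding is allowed for the current NAL unit
    """
    counts = dict(defaults)
    if slice_type not in ("P", "B") or not override_flag:
        return counts
    # Assumption: each "minus1" element stores (count - 1).
    counts["l0"] = overrides["num_ref_idx_l0_active_minus1"] + 1
    if allow_3d:
        counts["iv_l0"] = overrides["num_interview_ref_idx_l0_active_minus1"] + 1
    if slice_type == "B":
        counts["l1"] = overrides["num_ref_idx_l1_active_minus1"] + 1
        if allow_3d:
            counts["iv_l1"] = overrides["num_interview_ref_idx_l1_active_minus1"] + 1
    return counts


defaults = {"l0": 2, "iv_l0": 1, "l1": 2, "iv_l1": 1}
overrides = {
    "num_ref_idx_l0_active_minus1": 0,
    "num_interview_ref_idx_l0_active_minus1": 1,
    "num_ref_idx_l1_active_minus1": 2,
    "num_interview_ref_idx_l1_active_minus1": 0,
}
print(active_counts("B", True, defaults, overrides, allow_3d=True))
```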
The multi-view point video predictive coding equipment 10 and the multi-view point video prediction decoding equipment 20 can determine (76) whether the previously determined reference order in a reference list is changed in the current slice. If the reference order of the reference list can be changed in the current slice and the coding and decoding of 3D video is allowed for the current NAL unit, the reference list modification parameter set 77 can be called.
Fig. 8 B show the grammer of the parameter according to the embodiment for being used to change reference listing.
When the current slice is of the P slice type of the forward prediction mode or the B slice type of the bi-prediction mode, the reference list modification parameter set 77 according to the embodiment may include "ref_pic_list_modification_flag_l0" 81, which indicates whether the reference order of the reconstructed pictures belonging to the L0 reference list is selectively changed.
When the reference order is selectively changed in the current slice, the reference indices ("list_entry_l0" 82) of the L0 list for forward prediction or bi-prediction of the current slice can be selectively modified. Here, the maximum number 83 of reference indices belonging to the L0 list can be the sum of the basic number 61 of at least one reconstructed picture in the L0 list whose reproduction order precedes that of the current picture and the basic number 62 of at least one reconstructed picture whose VID is less than the VID of the current picture. Therefore, new reference indices can be matched with the reference indices ("list_entry_l0" 82) of the L0 list, as many as the maximum number 83 of reference indices belonging to the L0 list, so that the reference order of the reconstructed picture corresponding to each reference index can be selectively modified in the current slice.
When the current slice is of the B slice type of the bi-prediction mode, the reference list modification parameter set 77 according to the embodiment may include "ref_pic_list_modification_flag_l1" 84, which indicates whether the reference order of the reconstructed pictures belonging to the L1 reference list is selectively changed.
When the reference order is selectively changed in the current slice, the reference indices ("list_entry_l1" 85) of the L1 list for bi-prediction of the current slice can be selectively modified. Here, the maximum number 86 of reference indices belonging to the L1 list can be the sum of the basic number 63 of at least one reconstructed picture in the L1 list whose reproduction order is later than that of the current picture and the basic number 64 of at least one reconstructed picture whose VID is greater than the VID of the current picture. Therefore, new reference indices can be matched with the reference indices ("list_entry_l1" 85) of the L1 list, as many as the maximum number 86 of reference indices belonging to the L1 list, so that the reference order of the reconstructed picture corresponding to each reference index can be selectively modified in the current slice.
Therefore, the multi-view point video prediction decoding equipment 20 according to the embodiment can parse "ref_pic_list_modification_flag_l0" 81 or "ref_pic_list_modification_flag_l1" 84 from the reference list modification parameter set 77 to read whether the reference order is selectively modified in the L0 list or the L1 list. If it is determined that the reference order of the L0 list is selectively modified, the reference order of the L0 list can be changed according to "list_entry_l0" 82, the reference indices selectively modified in the current slice, wherein the number of changes equals the maximum number 83 of reference indices belonging to the L0 list. Likewise, if it is determined that the reference order of the L1 list is selectively modified, the reference order of the L1 list can be changed according to "list_entry_l1" 85, the reference indices selectively modified in the current slice, wherein the number of changes equals the maximum number 86 of reference indices belonging to the L1 list.
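The reordering driven by "list_entry_l0" 82 and "list_entry_l1" 85 can be sketched as a simple index remapping. In the Python sketch below, the function name is hypothetical, and the interpretation that entry i names the position in the initial list of the picture receiving new reference index i is an assumption borrowed from common reference list modification schemes, not a detail stated in the embodiment.

```python
def modify_ref_list(initial_list, modification_flag, list_entry, max_entries):
    """Reorder a reference list for the current slice.

    Assumption: list_entry[i] is the index, in the initial list, of the
    picture that receives new reference index i.
    """
    if not modification_flag:
        return list(initial_list[:max_entries])
    return [initial_list[list_entry[i]] for i in range(max_entries)]


# Hypothetical L0 list written as picture labels.
initial_l0 = ["POC17", "POC16", "VID3"]
print(modify_ref_list(initial_l0, True, [2, 0, 1], max_entries=3))
```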
Multi-view point video predictive coding equipment 10 and multi-view point video prediction decoding equipment 20 according to the embodiment can pass through Inter prediction or interview prediction are performed using the new reference listing combined with existing reference listing.
Fig. 9 shows the reference listing of combination according to the embodiment.
That is, a new basic LC list 90 can be produced by combining the reconstructed pictures belonging to the basic L0 list 41 and the reconstructed pictures belonging to the basic L1 reference list 45.
The reference order of the reconstructed pictures belonging to the basic LC list 90 according to another embodiment can be determined by alternately referring, in zigzag order, to the reconstructed pictures belonging to the basic L0 list 41 and the basic L1 reference list 45.
For example, starting from the VID 5/POC 17 picture 32, which is the first reconstructed picture of the basic L0 list 41, the basic LC list 90 can be determined according to the following reference order: the VID 5/POC 17 picture 32 of the basic L0 list 41, the VID 5/POC 19 picture 34 of the basic L1 list 45, the VID 5/POC 16 picture 33 of the basic L0 list 41, the VID 5/POC 17 picture 32 of the basic L1 list 45, the VID 5/POC 19 picture 34 of the basic L0 list 41, the VID 5/POC 16 picture 33 of the basic L1 list 45, the VID 3/POC 18 picture 35 of the basic L0 list 41, and the VID 7/POC 18 picture 37 of the basic L1 list 45.
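The zigzag combination can be sketched in a few lines of Python. Pictures are written as (VID, POC) tuples; the contents of the two basic lists below are inferred from the example ordering and are assumptions, as is the function name.

```python
from itertools import zip_longest


def build_basic_lc(l0, l1):
    """Interleave L0 and L1 in zigzag order, starting with the first L0 entry."""
    lc = []
    for pic0, pic1 in zip_longest(l0, l1):
        if pic0 is not None:
            lc.append(pic0)
        if pic1 is not None:
            lc.append(pic1)
    return lc


# Assumed basic lists, as (VID, POC) tuples.
basic_l0 = [(5, 17), (5, 16), (5, 19), (3, 18)]
basic_l1 = [(5, 19), (5, 17), (5, 16), (7, 18)]
for pic in build_basic_lc(basic_l0, basic_l1):
    print(pic)
```

The sketch keeps pictures that appear in both lists, as the example does, rather than removing duplicates.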
Figure 10 and Figure 11 is the diagram for explaining the processing of the reference listing according to the embodiment for changing combination.
The multi-view point video predictive coding equipment 10 and the multi-view point video prediction decoding equipment 20 according to another embodiment can selectively change, in the current slice, the combined reference list, that is, the reference order of the reconstructed pictures belonging to the basic LC list 90. In this case, the basic LC list 90 is not used in the current slice. Instead, the multi-view point video predictive coding equipment 10 and the multi-view point video prediction decoding equipment 20 can use the reconstructed pictures of the basic L0 list 41 and the basic L1 list 45 by referring to the reference index table 50 of Figure 10, thereby producing the modification LC list 111.
In the reference index table 50 according to the embodiment, the reference index Idx 101 is a number indicating the reference order of a reconstructed picture in the modification LC list 111. "pic_from_list_0_flag" 103 may indicate whether each reconstructed picture belonging to the modification LC list 111 belongs to the basic L0 list 41 or to the basic L1 list 45. "ref_idx_list_curr" 105 may indicate the reference index of the corresponding reconstructed picture in the basic L0 list 41 or the basic L1 list 45.
Therefore, according to "pic_from_list_0_flag" 103 and "ref_idx_list_curr" 105, the reconstructed pictures belonging to the modification LC list 111 can be determined, in reference order, as the VID 5/POC 17 picture 32 of the basic L0 list 41, the VID 5/POC 19 picture 34 of the basic L1 list 45, the VID 3/POC 18 picture 35 of the basic L0 list 41, the VID 7/POC 18 picture 37 of the basic L1 list 45, the VID 5/POC 16 picture 33 of the basic L0 list 41 and the VID 5/POC 19 picture 34 of the basic L0 list 41.
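The lookup through the reference index table can be sketched as below. Each table row pairs a "pic_from_list_0_flag" value with a "ref_idx_list_curr" value; the list contents and the concrete index values are assumptions chosen to reproduce the ordering given in the text, and the function name is hypothetical.

```python
def build_modified_lc(basic_l0, basic_l1, table):
    """Build the modified LC list from (pic_from_list_0_flag, ref_idx_list_curr) rows."""
    modified = []
    for pic_from_list_0_flag, ref_idx_list_curr in table:
        source = basic_l0 if pic_from_list_0_flag else basic_l1
        modified.append(source[ref_idx_list_curr])
    return modified


# Assumed basic lists, as (VID, POC) tuples.
basic_l0 = [(5, 17), (5, 16), (5, 19), (3, 18)]
basic_l1 = [(5, 19), (5, 17), (5, 16), (7, 18)]
# Assumed table rows; the row position is the reference index Idx.
table = [(1, 0), (0, 0), (1, 3), (0, 3), (1, 1), (1, 2)]
for pic in build_modified_lc(basic_l0, basic_l1, table):
    print(pic)
```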
Therefore, when the multi-view point video predictive coding equipment 10 and the multi-view point video prediction decoding equipment 20 according to another embodiment use the LC list, if the coding and decoding of 3D video is allowed for the current NAL unit, the reference list combination parameter set "ref_pic_list_3D_combination" and the reference list modification parameter set 77 "ref_pic_list_3D_modification" may be recorded in the slice header for the current slice. The reference list combination parameter set "ref_pic_list_3D_combination" according to another embodiment may include parameters for determining the combined reference list.
The multi-view point video prediction decoding equipment 20 according to another embodiment can parse the reference list combination parameter set "ref_pic_list_3D_combination" from the slice header, and read from it whether the L0 list and the L1 list are combined and the LC list is used. If it is determined that the LC list is used, the maximum number of reference indices belonging to the LC list can be read. The reconstructed pictures of the L0 list and the L1 list basically belong to the LC list in zigzag order, so once the maximum number of reference indices belonging to the LC list is determined, the reconstructed pictures belonging to the LC list and their reference order can be determined.
The multi-view point video prediction decoding equipment 20 according to another embodiment can read from the slice header whether the reference order of the reconstructed pictures belonging to the LC list is changed. If the reference order of the reconstructed pictures belonging to the LC list is changed, the current reference order can be reset in the modification LC list for each reference index, as many times as the maximum number of reference indices belonging to the LC list read from the slice header.
Figure 12 is the multi-view video encoding apparatus 121 according to the embodiment for including multi-view point video predictive coding equipment 10 Block diagram.
The multi-view video encoding apparatus 121 according to the embodiment includes a DPB 42, the multi-view point video predictive coding equipment 10, a transform quantizer 46 and an entropy coder 48.
The DPB 42 according to the embodiment stores previously reconstructed pictures of the same viewpoint as the current picture, and previously reconstructed pictures that have the same POC number as the current picture but a viewpoint different from that of the current picture. The reference pictures for inter prediction and interview prediction can be determined among the reconstructed pictures stored in the DPB 42. The multi-view video encoding apparatus 121 can perform the operations of the multi-view point video predictive coding equipment 10 according to the embodiment described above with reference to Figure 1A, Figure 1B and Fig. 3 to Fig. 8.
The multi-view point video predictive coding equipment 10 according to the embodiment can determine the L0 list for the current picture, which is of the P slice type or the B slice type, wherein the L0 list includes at least one reconstructed picture of the same viewpoint that is assigned a POC preceding that of the current picture and at least one reconstructed picture that is assigned the same POC as the current picture and has a VID less than the VID of the current picture. The multi-view point video predictive coding equipment 10 can determine the L1 list for the current picture of the B slice type, wherein the L1 list includes at least one reconstructed picture of the same viewpoint that is assigned a POC following that of the current picture and at least one reconstructed picture that is assigned the same POC as the current picture and has a VID greater than the VID of the current picture.
Therefore, multi-view point video predictive coding equipment 10 can be determined by using the reconstruction picture being stored in DPB 42 For the inter prediction of multi-view point video and the L0 lists of interview prediction and L1 lists.According to circumstances, arranged in L0 lists and L1 The reference sequence of reconstruction picture defined in table can be selectively modified.
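As an illustration of how such lists might be assembled from the DPB, the Python sketch below splits the stored pictures by POC and VID relative to the current picture. The `Picture` type, the function name and the ordering inside each group (nearest POC first, then nearest VID first) are assumptions; the embodiment only states which pictures each list includes.

```python
from collections import namedtuple

Picture = namedtuple("Picture", ["vid", "poc"])


def build_l0_l1(dpb, cur):
    """Split DPB pictures into L0 and L1 candidates for the current picture."""
    same_view = [p for p in dpb if p.vid == cur.vid]
    same_poc = [p for p in dpb if p.poc == cur.poc and p.vid != cur.vid]
    # L0: earlier POC in the same viewpoint, then smaller VID at the same POC.
    l0 = sorted((p for p in same_view if p.poc < cur.poc),
                key=lambda p: cur.poc - p.poc)
    l0 += sorted((p for p in same_poc if p.vid < cur.vid),
                 key=lambda p: cur.vid - p.vid)
    # L1: later POC in the same viewpoint, then larger VID at the same POC.
    l1 = sorted((p for p in same_view if p.poc > cur.poc),
                key=lambda p: p.poc - cur.poc)
    l1 += sorted((p for p in same_poc if p.vid > cur.vid),
                 key=lambda p: p.vid - cur.vid)
    return l0, l1


dpb = [Picture(5, 16), Picture(5, 17), Picture(5, 19),
       Picture(3, 18), Picture(7, 18)]
l0, l1 = build_l0_l1(dpb, Picture(5, 18))
print(l0)
print(l1)
```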
The multi-view point video predictive coding equipment 10 can determine the reference picture of the current picture by referring to the L0 list, or to the L0 list and the L1 list, and determine a reference block in the reference picture to perform at least one selected from inter prediction and interview prediction.
Multi-view point video predictive coding equipment 10 according to the embodiment can be by using the reconstruction picture being stored in DPB 42 Face configures reference listing, is performed inter prediction by using the reference picture that is selected from reference listing to present image and is regarded Predicted between point, and produce residual error data.
The transform quantizer 46 according to the embodiment can perform transformation and quantization on the residual error data produced by the multi-view point video predictive coding equipment 10, and produce quantized transform coefficients. The entropy coder 48 according to the embodiment can perform entropy coding on the quantized transform coefficients and on symbols including motion vectors and reference indices.
The multi-view video encoding apparatus 121 according to the embodiment can perform inter prediction or interview prediction on the images of the video for each block, produce the quantized transform coefficients of each block by performing transformation and quantization on the residual error data of each block produced by inter prediction or interview prediction, and output a bitstream by performing entropy coding on the quantized transform coefficients, thereby coding the video.
Multi-view video encoding apparatus 121 can be current to perform by reference to the previous reconstruction picture being stored in DPB 42 The motion compensation of picture or parallax compensation, and produce the reconstruction picture of current picture.The reconstruction picture of current picture can by with Act on the inter prediction of different images or the reference picture of interview prediction.Therefore, multi-view video encoding apparatus 121 may be used also The operation of multi-view point video prediction decoding equipment 20 is performed, wherein, multi-view point video prediction decoding equipment 20, which performs, is used for interframe Prediction or motion compensation or the parallax compensation of interview prediction.
Multi-view video encoding apparatus 121 according to the embodiment can with the interior video coding processing device that is embedded in or External video coding processing device interacts, to export video coding result, so as to perform including infra-frame prediction, inter prediction, conversion With the video encoding operations of quantization.When multi-view video encoding apparatus 121 according to the embodiment includes interior video coded treatment Device and multi-view video encoding apparatus 121 or the central processing unit (CPU) or figure for controlling multi-view video encoding apparatus 121 When shape processing unit (GPU) includes Video coding processing module, video encoding operations can be performed.
Figure 13 is the multi-view point video decoding device 131 according to the embodiment for including multi-view point video prediction decoding equipment 20 Block diagram.
Multi-view point video decoding device 131 according to the embodiment may include receiver 52, inverse quantization inverse converter 54, DPB 56th, multi-view point video prediction decoding equipment 20 and loop filter 59.
Receiver 52 according to the embodiment can receive bit stream, and entropy decoding is performed to the bit stream of reception, and to coding View data is parsed.
Inverse quantization inverse converter 54 can perform inverse quantization and inverse transformation to the coded image data parsed by receiver 52, and And reconstructive residual error data.
Receiver 52 according to the embodiment can parse motion vector and/or difference vector from bit stream.It is according to the embodiment DPB 56 can store the previous of the reference picture of the motion compensation that can be used as another image or parallax compensation and rebuild picture.According to reality Reference columns can be configured by using the reconstruction picture being stored in DPB 56 by applying the multi-view point video prediction decoding equipment 20 of example Table, and by using reference listing come perform motion compensation using motion vector and residual error data or using difference vector and The parallax compensation of residual error data.
The multi-view point video prediction decoding equipment 20 according to the embodiment can perform the same operations as those of the multi-view point video prediction decoding equipment 20 described with reference to Fig. 2A and Fig. 2B.
The multi-view point video prediction decoding equipment 20 according to the embodiment can determine the L0 list for the current picture, which is of the P slice type or the B slice type, wherein the L0 list includes at least one reconstructed picture of the same viewpoint that is assigned a POC preceding that of the current picture and at least one reconstructed picture that is assigned the same POC as the current picture and has a VID less than the VID of the current picture.
The multi-view point video prediction decoding equipment 20 can determine the L1 list for the current picture of the B slice type, wherein the L1 list includes at least one reconstructed picture of the same viewpoint that is assigned a POC following that of the current picture and at least one reconstructed picture that is assigned the same POC as the current picture and has a VID greater than the VID of the current picture.
Therefore, multi-view point video prediction decoding equipment 20 can determine that pre- between inter prediction and viewpoint for multi-view point video The L0 lists and L1 lists of survey.According to circumstances, the reference sequence of the reconstruction picture defined in L0 lists and L1 lists can be chosen Change to selecting property.
The multi-view point video prediction decoding equipment 20 can determine the reference picture of the current picture by referring to the L0 list, or to the L0 list and the L1 list, and determine a reference block in the reference picture to perform at least one selected from inter prediction and interview prediction.
The multi-view point video decoding device 131 according to the embodiment can decode the blocks of each image of the video according to viewpoint and rebuild the video. The receiver 52 can parse the coded data and the motion vector or parallax information of each block. The inverse quantization inverse converter 54 can perform inverse quantization and inverse transformation on the coded data of each block and rebuild the residual error data of each block. The multi-view point video prediction decoding equipment 20 can determine, among the reference pictures, the reference block indicated by the motion vector or difference vector of each block, and combine the reference block with the residual error data, thereby producing a reconstructed block.
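The final step, combining the reference block with the residual error data, can be sketched in Python as below. The sketch assumes integer-sample vectors, 8-bit samples and a reference block that lies fully inside the reference picture; these simplifications and the function name are assumptions for illustration. The same routine serves for motion compensation (reference picture from another POC) and parallax compensation (reference picture from another viewpoint).

```python
def reconstruct_block(ref_picture, x, y, vector, residual):
    """Fetch the block displaced by `vector` from (x, y) and add the residual.

    ref_picture -- 2D list of samples indexed as [row][column]
    vector      -- (dx, dy) motion vector or difference vector, integer samples
    residual    -- 2D list with the block's reconstructed residual error data
    """
    dx, dy = vector
    block = []
    for r, residual_row in enumerate(residual):
        row = []
        for c, res in enumerate(residual_row):
            predicted = ref_picture[y + dy + r][x + dx + c]
            # Clip to the assumed 8-bit sample range.
            row.append(max(0, min(255, predicted + res)))
        block.append(row)
    return block


ref = [[10, 20, 30],
       [40, 50, 60],
       [70, 80, 90]]
print(reconstruct_block(ref, 0, 0, (1, 1), [[1, -1], [200, 0]]))
```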
The loop filter 59 performs deblocking filtering and sample adaptive offset (SAO) filtering on the reconstructed picture output by the multi-view point video prediction decoding equipment 20. The loop filter 59 can perform deblocking filtering and SAO filtering for each block and output the final reconstructed picture. The output image of the loop filter 59 can be stored in the DPB 56 and used as a reference picture for the motion compensation of a next image.
Multi-view point video decoding device 131 according to the embodiment can with the interior video decoding processor that is embedded in or External video decoding processor interacts, to export video decoded result, so as to perform including inverse quantization, inverse transformation, infra-frame prediction With the video decoding operation of motion compensation.When multi-view point video decoding device 131 according to the embodiment is decoded including interior video The CPU of processor and multi-view point video decoding device 131 or control multi-view point video decoding device 131 is decoded including video During processing module, video decoding operation can be performed.
The multi-view point video predictive coding equipment 10, the multi-view point video prediction decoding equipment 20, the multi-view video encoding apparatus 121 and the multi-view point video decoding device 131 described above with reference to Figure 1A to Figure 13 can configure a reference list for inter prediction and interview prediction of multi-view point video. A reference picture for inter prediction and a reference picture for interview prediction can be included in one reference list.
Information on a reference picture set that explicitly reflects the state of the decoded picture set can be sent and received through the sequence parameter set and the slice header. Because a reference list is determined according to the reference picture set, a reference picture can be determined from a reference list that takes the state of the decoded picture set into account, and used for inter prediction/motion compensation and interview prediction/parallax compensation.
The prediction decoding equipment for multi-view point video can be produced including the reference for inter prediction according to the present invention At least one reference listing of image and reference picture for interview prediction.Prediction decoding equipment for multi-view point video The reference picture of present image can be determined by reference to a reference listing, reference block is determined among reference picture, and Perform at least one selected from motion compensation and parallax compensation.
The multi-view video prediction encoding apparatus 10, the multi-view video prediction decoding apparatus 20, the multi-view video encoding apparatus 121, and the multi-view video decoding apparatus 131 according to embodiments may split blocks of video data into coding units having a tree structure, and prediction units may be used for inter prediction of the coding units as described above. Hereinafter, a video encoding method, a video encoding apparatus, a video decoding method, and a video decoding apparatus based on coding units and transformation units having a tree structure will be described with reference to Figures 16 to 18.
The multi-view video prediction encoding apparatus 10, the multi-view video prediction decoding apparatus 20, the multi-view video encoding apparatus 121, and the multi-view video decoding apparatus 131 according to embodiments may split blocks of video data into coding units having a tree structure, and coding units, prediction units, and transformation units may be used for inter-view prediction or inter prediction of the coding units. Hereinafter, a video encoding method, a video encoding apparatus, a video decoding method, and a video decoding apparatus based on coding units and transformation units having a tree structure will be described with reference to Figures 14 to 26.
In principle, during encoding and decoding of a multi-view video, the encoding and decoding processes for base view images and the encoding and decoding processes for additional view images are performed separately. In other words, when inter-view prediction is performed on a multi-view video, the encoding and decoding results of the single-view videos may be cross-referenced, but a separate encoding and decoding process is performed for each single-view video.
Accordingly, because the video encoding and decoding processes based on coding units having a tree structure, described below with reference to Figures 14 to 26, are processes for a single-view video, only inter prediction and motion compensation are performed. However, as described above with reference to Figures 1A to 13, inter-view prediction and inter-view disparity compensation are performed on base view images and additional view images in order to encode and decode a multi-view video.
Accordingly, in order for the multi-view video prediction encoding apparatus 10 and the multi-view video encoding apparatus 121 according to embodiments to prediction-encode a multi-view video based on coding units having a tree structure, each apparatus may include as many video encoding apparatuses 100 of Figure 14 as the number of views, so that video encoding is performed per single-view video and each video encoding apparatus 100 is controlled to encode the single-view video assigned to it. In addition, the video encoding apparatus 100 that encodes a given single-view video may perform inter-view prediction by using the encoding results of the video encoding apparatuses 100 that encode the videos of other views. Each of the multi-view video prediction encoding apparatus 10 and the multi-view video encoding apparatus 121 may thus generate, per view, a bitstream that includes the encoding results for that view.
Similarly, in order for the multi-view video prediction decoding apparatus 20 and the multi-view video decoding apparatus 131 according to embodiments to prediction-decode a multi-view video based on coding units having a tree structure, each apparatus may include as many video decoding apparatuses 200 of Figure 15 as the number of views of the multi-view video, so that video decoding is performed per view on the received base view image stream and the received additional view image streams and each video decoding apparatus 200 is controlled to decode the single-view video assigned to it. In addition, the video decoding apparatus 200 that decodes a given single-view video may perform inter-view prediction by using the decoding results of the video decoding apparatuses 200 that decode the videos of other views. Each of the multi-view video prediction decoding apparatus 20 and the multi-view video decoding apparatus 131 may thus generate, per view, a bitstream that includes the decoding results for that view.
Figure 14 is a block diagram of a video encoding apparatus 100 based on coding units according to a tree structure, according to an embodiment of the present invention.
The video encoding apparatus 100 according to an embodiment, which performs video prediction based on coding units according to a tree structure, includes a maximum coding unit splitter 110, a coding unit determiner 120, and an output unit 130. Hereinafter, for convenience of description, the video encoding apparatus 100 that performs video prediction based on coding units according to a tree structure is referred to as the "video encoding apparatus 100".
The maximum coding unit splitter 110 may split a current picture based on a maximum coding unit, which is a coding unit of the maximum size for the current picture of an image. If the current picture is larger than the maximum coding unit, the image data of the current picture may be split into at least one maximum coding unit. The maximum coding unit according to an embodiment of the present invention may be a data unit of size 32×32, 64×64, 128×128, 256×256, and so on, where the data unit is a square whose width and height are each a power of 2. The image data may be output to the coding unit determiner 120 per maximum coding unit.
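The splitting of a picture into maximum coding units can be illustrated with a minimal sketch. The function names below are hypothetical, and the clipping of units at the right and bottom picture borders is an assumption of this sketch; the text above does not specify how partial units at the picture edge are handled.

```python
def is_power_of_two(value):
    """Return True when value is a positive power of two."""
    return value > 0 and (value & (value - 1)) == 0


def split_into_max_coding_units(width, height, max_cu_size):
    """List (x, y, w, h) regions covering a picture in raster order.

    Assumption of this sketch: units that cross the picture border
    are clipped to the picture area.
    """
    if not is_power_of_two(max_cu_size):
        raise ValueError("maximum coding unit size must be a power of 2")
    units = []
    for y in range(0, height, max_cu_size):
        for x in range(0, width, max_cu_size):
            w = min(max_cu_size, width - x)
            h = min(max_cu_size, height - y)
            units.append((x, y, w, h))
    return units
```

For a 1920×1080 picture and 64×64 maximum coding units, this yields 30 columns and 17 rows of units, with the last row clipped to a height of 56 samples.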
A coding unit according to an embodiment of the present invention may be characterized by a maximum size and a depth. The depth denotes the number of times the coding unit is spatially split from the maximum coding unit, and as the depth deepens, deeper coding units according to depths may be split from the maximum coding unit down to a minimum coding unit. The depth of the maximum coding unit is the uppermost depth, and the depth of the minimum coding unit is the lowermost depth. Because the size of the coding unit corresponding to each depth decreases as the depth of the maximum coding unit deepens, a coding unit corresponding to an upper depth may include a plurality of coding units corresponding to lower depths.
As described above, the image data of the current picture is split into maximum coding units according to the maximum size of the coding unit, and each maximum coding unit may include deeper coding units that are split according to depths. Because the maximum coding unit according to an embodiment of the present invention is split according to depths, the spatial-domain image data included in the maximum coding unit may be hierarchically classified according to depths.
A maximum depth and a maximum size of a coding unit, which limit the total number of times the height and width of the maximum coding unit are hierarchically split, may be predetermined.
The coding unit determiner 120 encodes at least one split region obtained by splitting a region of the maximum coding unit according to depths, and determines, for each such split region, the depth at which the finally encoded image data is to be output. In other words, the coding unit determiner 120 determines a coded depth by encoding the image data in the deeper coding units according to depths, for each maximum coding unit of the current picture, and selecting the depth that has the least encoding error. The determined coded depth and the image data encoded according to the determined coded depth are output to the output unit 130.
The image data in the maximum coding unit is encoded based on the deeper coding units corresponding to at least one depth equal to or less than the maximum depth, and the results of encoding the image data based on each of the deeper coding units are compared. After the encoding errors of the deeper coding units are compared, the depth that has the least encoding error may be selected. At least one coded depth may be selected for each maximum coding unit.
The maximum coding unit is split as coding units are hierarchically split according to depths, and the number of coding units increases accordingly. Also, even if coding units correspond to the same depth within one maximum coding unit, whether to split each of those coding units to a lower depth is determined by measuring the encoding error of the image data of each coding unit separately. Accordingly, even when image data is included in one maximum coding unit, the encoding error may differ by region within that maximum coding unit, and thus the coded depth may differ by region in the image data. Therefore, one or more coded depths may be determined in one maximum coding unit, and the image data of the maximum coding unit may be divided according to coding units of at least one coded depth.
Accordingly, the coding unit determiner 120 according to an embodiment may determine coding units having a tree structure that are included in the maximum coding unit. The "coding units having a tree structure" according to an embodiment of the present invention include, among all deeper coding units included in the maximum coding unit, the coding units corresponding to the depth determined to be the coded depth. A coding unit of a coded depth may be hierarchically determined according to depths within the same region of the maximum coding unit, and may be independently determined in different regions. Likewise, the coded depth in a current region may be determined independently of the coded depth in another region.
A maximum depth according to an embodiment of the present invention is an index related to the number of splits from the maximum coding unit to the minimum coding unit. A first maximum depth according to an embodiment of the present invention may denote the total number of splits from the maximum coding unit to the minimum coding unit. A second maximum depth according to an embodiment of the present invention may denote the total number of depth levels from the maximum coding unit to the minimum coding unit. For example, when the depth of the maximum coding unit is 0, the depth of a coding unit obtained by splitting the maximum coding unit once may be set to 1, and the depth of a coding unit obtained by splitting the maximum coding unit twice may be set to 2. Here, if the minimum coding unit is a coding unit obtained by splitting the maximum coding unit four times, five depth levels exist, namely depths 0, 1, 2, 3, and 4, and thus the first maximum depth may be set to 4 and the second maximum depth may be set to 5.
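The relationship between coding unit sizes and the two maximum depths can be checked with a short sketch. The function names are hypothetical, and the sketch assumes that each split halves both the height and the width, so that the ratio between the maximum and minimum coding unit sizes is a power of 2.

```python
def coding_unit_size(max_cu_size, depth):
    """Side length of a square coding unit at the given depth."""
    return max_cu_size >> depth


def maximum_depths(max_cu_size, min_cu_size):
    """Return (first_maximum_depth, second_maximum_depth).

    first_maximum_depth  = total number of splits
    second_maximum_depth = total number of depth levels
    """
    ratio = max_cu_size // min_cu_size
    if max_cu_size % min_cu_size or ratio < 1 or ratio & (ratio - 1):
        raise ValueError("size ratio must be a power of 2")
    first = ratio.bit_length() - 1  # log2(ratio)
    return first, first + 1
```

With a 64×64 maximum coding unit and a 4×4 minimum coding unit, the sizes per depth are 64, 32, 16, 8, and 4, which matches the example of four splits and five depth levels.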
Prediction encoding and transformation may be performed per maximum coding unit. Within each maximum coding unit, prediction encoding and transformation are also performed based on the deeper coding units according to depths equal to or less than the maximum depth.
Because the number of deeper coding units increases each time the maximum coding unit is split according to depths, encoding that includes prediction encoding and transformation is performed on all of the deeper coding units generated as the depth deepens. For convenience of description, prediction encoding and transformation will now be described based on a coding unit of a current depth in a maximum coding unit.
The video encoding apparatus 100 according to an embodiment may variously select the size or shape of a data unit for encoding the image data. To encode the image data, operations such as prediction encoding, transformation, and entropy encoding are performed; the same data unit may be used for all of these operations, or a different data unit may be used for each operation.
For example, the video encoding apparatus 100 may select not only a coding unit for encoding the image data but also a data unit different from the coding unit, in order to perform prediction encoding on the image data in the coding unit.
To perform prediction encoding in the maximum coding unit, prediction encoding may be performed based on a coding unit corresponding to a coded depth, that is, a coding unit that is no longer split into coding units corresponding to a lower depth. Hereinafter, the coding unit that is no longer split and becomes the basis unit for prediction encoding is referred to as a "prediction unit". A partition obtained by splitting the prediction unit may include the prediction unit itself or a data unit obtained by splitting at least one of the height and the width of the prediction unit. A partition is a data unit into which the prediction unit of a coding unit is split, and the prediction unit may be a partition having the same size as the coding unit.
For example, when a coding unit of 2N×2N (where N is a positive integer) is no longer split and becomes a prediction unit of 2N×2N, the size of a partition may be 2N×2N, 2N×N, N×2N, or N×N. Examples of partition types include symmetric partitions obtained by symmetrically splitting the height or width of the prediction unit, partitions obtained by asymmetrically splitting the height or width of the prediction unit (for example, in a ratio of 1:n or n:1), partitions obtained by geometrically splitting the prediction unit, and partitions having arbitrary shapes.
The prediction mode of a prediction unit may be at least one of an intra mode, an inter mode, and a skip mode. For example, the intra mode or the inter mode may be performed on a partition of 2N×2N, 2N×N, N×2N, or N×N, whereas the skip mode may be performed only on a partition of 2N×2N. Encoding may be performed independently on each prediction unit in a coding unit, so that the prediction mode that has the least encoding error is selected.
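The symmetric partition sizes and the mode restriction described above can be summarized in a small sketch. The function names and the string labels are hypothetical, and sizes are written as (width, height) by assumption; asymmetric, geometric, and arbitrary-shape partitions are omitted.

```python
def symmetric_partitions(n):
    """Map each symmetric partition type of a 2Nx2N prediction unit
    to the list of (width, height) partitions it produces."""
    if n <= 0:
        raise ValueError("N must be a positive integer")
    two_n = 2 * n
    return {
        "2Nx2N": [(two_n, two_n)],
        "2NxN": [(two_n, n)] * 2,
        "Nx2N": [(n, two_n)] * 2,
        "NxN": [(n, n)] * 4,
    }


def allowed_prediction_modes(partition_type):
    """Intra and inter apply to every symmetric type; skip only to 2Nx2N."""
    if partition_type not in ("2Nx2N", "2NxN", "Nx2N", "NxN"):
        raise ValueError("unknown partition type: " + partition_type)
    modes = ["intra", "inter"]
    if partition_type == "2Nx2N":
        modes.append("skip")
    return modes
```

Whatever the partition type, the partitions together cover the full 2N×2N prediction unit, which gives a simple consistency check on the table.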
The video encoding apparatus 100 according to an embodiment may also transform the image data in a coding unit based not only on the coding unit used to encode the image data but also on a data unit different from the coding unit. To transform the coding unit, the transformation may be performed based on a data unit whose size is smaller than or equal to that of the coding unit. For example, the data unit for the transformation may include a data unit for the intra mode and a data unit for the inter mode.
In a manner similar to the coding units according to a tree structure in the present embodiment, a transformation unit in a coding unit may be recursively split into smaller regions. Accordingly, the residual data in the coding unit may be divided according to transformation units having a tree structure, according to transformation depths.
According to an embodiment, a transformation depth, which indicates the number of times the height and width of the coding unit are split to reach the transformation unit, may also be set for the transformation unit. For example, in a current coding unit of 2N×2N, the transformation depth may be 0 when the size of the transformation unit is 2N×2N, 1 when the size of the transformation unit is N×N, and 2 when the size of the transformation unit is N/2×N/2. In other words, transformation units having a tree structure may also be set according to transformation depths.
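The transformation depth example can be expressed as a one-line computation. The sketch below uses a hypothetical function name and assumes square coding units and transformation units whose side lengths differ by a power of 2.

```python
def transformation_depth(cu_size, tu_size):
    """Number of times the coding unit side is halved to reach the
    transformation unit side (both are side lengths of squares)."""
    if tu_size <= 0 or tu_size > cu_size or cu_size % tu_size:
        raise ValueError("transformation unit must evenly fit the coding unit")
    ratio = cu_size // tu_size
    if ratio & (ratio - 1):
        raise ValueError("size ratio must be a power of 2")
    return ratio.bit_length() - 1  # log2(ratio)
```

For a 32×32 coding unit (2N = 32), transformation units of 32×32, 16×16, and 8×8 give transformation depths 0, 1, and 2.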
Encoding information for coding units corresponding to a coded depth requires not only information about the coded depth but also information related to prediction encoding and transformation. Accordingly, the coding unit determiner 120 determines not only the coded depth that has the least encoding error but also the partition type in a prediction unit, the prediction mode per prediction unit, and the size of the transformation unit for the transformation.
Coding units according to a tree structure in a maximum coding unit, and methods of determining a prediction unit/partition and a transformation unit, according to embodiments of the present invention, are described in detail later with reference to Figures 16 to 26.
The coding unit determiner 120 may measure the encoding error of the deeper coding units according to depths by using rate-distortion optimization based on Lagrangian multipliers.
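As an illustration of rate-distortion optimization based on a Lagrangian multiplier, the sketch below uses the commonly used cost form J = D + λ·R. That formula, the function names, and the candidate values are assumptions of this sketch; the text above does not state the exact cost function, and a real encoder compares costs region by region rather than once per maximum coding unit.

```python
def rd_cost(distortion, rate_bits, lagrange_multiplier):
    """Lagrangian rate-distortion cost J = D + lambda * R (assumed form)."""
    return distortion + lagrange_multiplier * rate_bits


def select_depth(candidates, lagrange_multiplier):
    """Pick the depth with the smallest cost.

    candidates maps depth -> (distortion, rate_bits); the numbers are
    assumed to come from trial encodings at each depth.
    """
    return min(
        candidates,
        key=lambda depth: rd_cost(*candidates[depth], lagrange_multiplier),
    )


# Hypothetical measurements for one region at depths 0, 1, and 2.
example = {0: (1000.0, 50), 1: (400.0, 120), 2: (350.0, 300)}
```

With these hypothetical measurements, a larger multiplier favors the cheaper, shallower depth, while a multiplier of zero selects purely on distortion.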
The output unit 130 outputs, in a bitstream, the image data of the maximum coding unit, which is encoded based on the at least one coded depth determined by the coding unit determiner 120, and information about the encoding mode according to the coded depth.
The encoded image data may be obtained by encoding the residual data of an image.
The information about the encoding mode according to the coded depth may include information about the coded depth, the partition type in the prediction unit, the prediction mode, and the size of the transformation unit.
The information about the coded depth may be defined by using split information according to depths, which indicates whether encoding is performed on coding units of a lower depth instead of the current depth. If the current depth of the current coding unit is the coded depth, the image data in the current coding unit is encoded and output, and thus the split information may be defined so that the current coding unit is not split to a lower depth. Alternatively, if the current depth of the current coding unit is not the coded depth, encoding is performed on the coding units of the lower depth, and thus the split information may be defined so that the current coding unit is split to obtain the coding units of the lower depth.
If the current depth is not the coded depth, encoding is performed on the coding units of the lower depth into which the current coding unit is split. Because at least one coding unit of the lower depth exists in one coding unit of the current depth, encoding is repeated on each coding unit of the lower depth, and thus encoding may be performed recursively on coding units of the same depth.
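The recursive decision between encoding at the current depth and splitting to the lower depth can be sketched as follows. The function names, the node layout, and the toy cost function are hypothetical; `cost_fn` stands in for whatever encoding-error measurement the encoder uses, and the split is assumed to produce four equally sized square coding units.

```python
def build_coding_tree(x, y, size, depth, max_depth, cost_fn):
    """Return (cost, node) for the region, splitting when that is cheaper.

    node["split"] plays the role of the split information: 0 means the
    current depth is the coded depth, 1 means the unit is split.
    """
    unsplit_cost = cost_fn(x, y, size)
    node = {"x": x, "y": y, "size": size, "depth": depth,
            "split": 0, "children": []}
    if depth == max_depth:
        return unsplit_cost, node  # minimum coding unit: cannot split
    half = size // 2
    split_cost = 0
    children = []
    for dy in (0, half):
        for dx in (0, half):
            cost, child = build_coding_tree(
                x + dx, y + dy, half, depth + 1, max_depth, cost_fn)
            split_cost += cost
            children.append(child)
    if split_cost < unsplit_cost:
        node["split"] = 1
        node["children"] = children
        return split_cost, node
    return unsplit_cost, node


def toy_cost(x, y, size):
    """Toy error measure: a unit covering sample (5, 5) costs more the
    larger it is; every other unit costs 1."""
    covers_detail = x <= 5 < x + size and y <= 5 < y + size
    return 1 + (size if covers_detail else 0)


total_cost, tree = build_coding_tree(0, 0, 32, 0, 2, toy_cost)
```

In this toy run, only the quadrant that contains the detailed sample is split down to the deepest level, which mirrors how the coded depth can differ by region within one maximum coding unit.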
Because coding units having a tree structure are determined for one maximum coding unit, and information about at least one encoding mode is determined for each coding unit of a coded depth, information about at least one encoding mode may be determined for one maximum coding unit. Also, because the image data is hierarchically split according to depths, the coded depth of the image data of the maximum coding unit may differ by location, and thus information about the coded depth and the encoding mode may be set for the image data.
Accordingly, the output unit 130 according to an embodiment may assign encoding information about a corresponding coded depth and encoding mode to at least one of the coding unit, the prediction unit, and a minimum unit included in the maximum coding unit.
The minimum unit according to an embodiment of the present invention is a square data unit obtained by splitting the minimum coding unit of the lowermost depth into four. Alternatively, the minimum unit according to an embodiment may be the largest square data unit that can be included in all of the coding units, prediction units, partition units, and transformation units included in the maximum coding unit.
For example, the encoding information output by the output unit 130 may be classified into encoding information according to deeper coding units and encoding information according to prediction units. The encoding information according to deeper coding units may include information about the prediction mode and about the partition size. The encoding information according to prediction units may include information about an estimated direction of the inter mode, a reference image index of the inter mode, a motion vector, a chroma component of the intra mode, and an interpolation method of the intra mode.
Information about the maximum size of the coding unit, defined per picture, slice, or GOP, and information about the maximum depth may be inserted into a header of a bitstream, a sequence parameter set, or a picture parameter set.
Information about the maximum size of the transformation unit permitted for the current video, and information about the minimum size of the transformation unit, may also be output via a header of a bitstream, a sequence parameter set, or a picture parameter set. The output unit 130 may encode and output reference information related to prediction, prediction information, and slice type information.
In the video encoding apparatus 100, a deeper coding unit may be a coding unit obtained by dividing the height or width of a coding unit of an upper depth, which is one layer above, by two. In other words, when the size of the coding unit of the current depth is 2N×2N, the size of the coding unit of the lower depth is N×N. Also, a coding unit of the current depth of size 2N×2N may include at most four coding units of the lower depth.
Accordingly, the video encoding apparatus 100 may form coding units having a tree structure by determining, for each maximum coding unit, coding units having an optimum shape and an optimum size, based on the size of the maximum coding unit and the maximum depth determined in consideration of the characteristics of the current picture. Also, because encoding may be performed on each maximum coding unit by using any of various prediction modes and transformations, an optimum encoding mode may be determined in consideration of the characteristics of coding units of various image sizes.
Thus, if an image having a high resolution or a large amount of data is encoded in conventional macroblocks, the number of macroblocks per picture increases excessively. The amount of compressed information generated for each macroblock increases accordingly, so it becomes difficult to transmit the compressed information and data compression efficiency decreases. By using the video encoding apparatus 100, however, image compression efficiency may be increased, because the coding unit is adjusted based on the characteristics of the image while the maximum size of the coding unit is increased in consideration of the size of the image.
The multi-view video prediction encoding apparatus 10 described above with reference to Figure 1A may include as many video encoding apparatuses 100 as the number of views, in order to encode single-view images per view of a multi-view video.
When the video encoding apparatus 100 encodes a single-view image, the coding unit determiner 120 may determine, for each maximum coding unit, prediction units for inter prediction according to the coding units having a tree structure, and may perform inter prediction per prediction unit.
Specifically, the coding unit determiner 120 may perform inter prediction that refers to reconstructed pictures of the same view and inter-view prediction that refers to reconstructed pictures of different views. For a current picture of a P slice type or a B slice type, the coding unit determiner 120 according to an embodiment may determine an L0 list that includes at least one reconstructed picture of the same view that is assigned a POC preceding that of the current picture, and at least one reconstructed picture that is assigned the same POC as the current picture and a VID smaller than the VID of the current picture. For a current picture of a B slice type, the coding unit determiner 120 may determine an L1 list that includes at least one reconstructed picture of the same view that is assigned a POC following that of the current picture, and at least one reconstructed picture that is assigned the same POC as the current picture and a VID greater than the VID of the current picture.
Accordingly, the coding unit determiner 120 may determine the L0 list and the L1 list for inter prediction and inter-view prediction of a multi-view video by using the reconstructed pictures stored in the DPB. Where appropriate, the reference order of the reconstructed pictures defined in the L0 list and the L1 list may be selectively modified in a predetermined slice.
The coding unit determiner 120 according to an embodiment may determine a reference picture of the current picture by referring to the L0 list, or to both the L0 list and the L1 list, and may determine a reference block of the reference picture in order to perform at least one of inter prediction and inter-view prediction.
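The membership rules for the L0 and L1 lists can be sketched as a filter over the reconstructed pictures held in the DPB. The class and function names are hypothetical. The ordering within each list (same-view pictures first, then same-POC pictures of other views, each sorted nearest first) is an assumption of this sketch; the text above specifies only which pictures each list contains.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Picture:
    poc: int  # picture order count (reproduction order)
    vid: int  # view identifier


def build_reference_lists(current, dpb, slice_type):
    """Return (l0, l1) for a current picture of slice type 'P' or 'B'."""
    if slice_type not in ("P", "B"):
        raise ValueError("reference lists apply to P or B slice types")
    same_view = [p for p in dpb if p.vid == current.vid]
    same_poc = [p for p in dpb if p.poc == current.poc]

    # L0: earlier POC in the same view, then same POC with a smaller VID.
    l0 = sorted((p for p in same_view if p.poc < current.poc),
                key=lambda p: current.poc - p.poc)
    l0 += sorted((p for p in same_poc if p.vid < current.vid),
                 key=lambda p: current.vid - p.vid)

    l1 = []
    if slice_type == "B":
        # L1: later POC in the same view, then same POC with a larger VID.
        l1 = sorted((p for p in same_view if p.poc > current.poc),
                    key=lambda p: p.poc - current.poc)
        l1 += sorted((p for p in same_poc if p.vid > current.vid),
                     key=lambda p: p.vid - current.vid)
    return l0, l1
```

A picture with a different POC and a different VID from the current picture, such as Picture(poc=3, vid=0) for a current Picture(poc=4, vid=1), enters neither list under these rules.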
Figure 15 is a block diagram of a video decoding apparatus 200 based on coding units according to a tree structure, according to an embodiment of the present invention.
The video decoding apparatus 200 according to an embodiment, which performs video prediction based on coding units having a tree structure, includes a receiver 210, an image data and encoding information extractor 220, and an image data decoder 230. Hereinafter, for convenience of description, the video decoding apparatus 200 that performs video prediction based on coding units according to a tree structure is referred to as the "video decoding apparatus 200".
Definitions of the various terms used for decoding operations of the video decoding apparatus 200, such as coding unit, depth, prediction unit, transformation unit, and information about various encoding modes, are identical to those described with reference to Figure 14 and the video encoding apparatus 100.
The receiver 210 receives and parses a bitstream of an encoded video. The image data and encoding information extractor 220 extracts encoded image data for each coding unit from the parsed bitstream, where the coding units have a tree structure according to each maximum coding unit, and outputs the extracted image data to the image data decoder 230. The image data and encoding information extractor 220 may extract information about the maximum size of the coding unit of the current picture from a header about the current picture, a sequence parameter set, or a picture parameter set.
Also, the image data and encoding information extractor 220 extracts, from the parsed bitstream, information about the coded depth and the encoding mode of the coding units having a tree structure, per maximum coding unit. The extracted information about the coded depth and the encoding mode is output to the image data decoder 230. In other words, the image data in the bitstream is split into maximum coding units so that the image data decoder 230 decodes the image data per maximum coding unit.
The information about the coded depth and the encoding mode according to the maximum coding unit may be set for information about at least one coding unit corresponding to the coded depth, and the information about the encoding mode may include information about the partition type of the corresponding coding unit corresponding to the coded depth, the prediction mode, and the size of the transformation unit. Also, split information according to depths may be extracted as the information about the coded depth.
The information about the coded depth and the encoding mode according to each maximum coding unit, extracted by the image data and encoding information extractor 220, is information about a coded depth and an encoding mode determined to generate the least encoding error when an encoder, such as the video encoding apparatus 100, repeatedly performs encoding on each deeper coding unit according to depths, per maximum coding unit. Accordingly, the video decoding apparatus 200 may restore an image by decoding the image data according to the coded depth and the encoding mode that generate the least encoding error.
Because encoding information about the coded depth and the encoding mode according to an embodiment may be assigned to a predetermined data unit among a corresponding coding unit, prediction unit, and minimum unit, the image data and encoding information extractor 220 may extract the information about the coded depth and the encoding mode per predetermined data unit. If the information about the coded depth and the encoding mode of a corresponding maximum coding unit is recorded per predetermined data unit, the predetermined data units to which the same information about the coded depth and the encoding mode is assigned may be inferred to be data units included in the same maximum coding unit.
The image data decoder 230 restores the current picture by decoding the image data in each maximum coding unit based on the information about the coded depth and the encoding mode according to the maximum coding units. In other words, the image data decoder 230 may decode the encoded image data based on the extracted information about the partition type, the prediction mode, and the transformation unit for each coding unit among the coding units having a tree structure included in each maximum coding unit. The decoding process may include prediction, which comprises intra prediction and motion compensation, and inverse transformation.
The image data decoder 230 may perform intra prediction or motion compensation according to the partition and the prediction mode of each coding unit, based on the information about the partition type and the prediction mode of the prediction unit of the coding unit according to coded depths.
Also, to perform inverse transformation per maximum coding unit, the image data decoder 230 may read the transformation unit information according to a tree structure for each coding unit, and perform inverse transformation based on the transformation units of each coding unit. Through the inverse transformation, the pixel values of the spatial domain of the coding unit may be restored.
The image data decoder 230 may determine the coded depth of a current maximum coding unit by using the split information according to depths. If the split information indicates that the image data is no longer split at the current depth, the current depth is the coded depth. Accordingly, the image data decoder 230 may decode the encoded data in the current maximum coding unit by using the information about the partition type of the prediction unit, the prediction mode, and the size of the transformation unit for each coding unit corresponding to the coded depth.
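How a decoder can recover the coding units of coded depth from split information can be sketched as a depth-first walk over split flags. The function name and the flat flag list are hypothetical, and two details are assumptions of this sketch: flags arrive in depth-first order, and no flag is read for a coding unit that is already at the maximum depth.

```python
def parse_coding_units(split_flags, max_cu_size, max_depth):
    """Return the leaf coding units as (x, y, size, depth) tuples.

    A flag of 1 splits the unit into four units of the lower depth;
    a flag of 0 marks the current depth as the coded depth.
    """
    flags = iter(split_flags)

    def walk(x, y, size, depth):
        if depth < max_depth and next(flags) == 1:
            half = size // 2
            leaves = []
            for dy in (0, half):
                for dx in (0, half):
                    leaves.extend(walk(x + dx, y + dy, half, depth + 1))
            return leaves
        return [(x, y, size, depth)]

    return walk(0, 0, max_cu_size, 0)
```

For example, the flags 1, 1, 0, 0, 0 applied to a 32×32 maximum coding unit with a maximum depth of 2 split the unit once, split its first quadrant again into four 8×8 units, and leave the remaining three 16×16 quadrants unsplit.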
In other words, data units containing encoding information that includes the same split information may be gathered by observing the encoding information set assigned to the predetermined data unit among the coding unit, the prediction unit, and the minimum unit, and the gathered data units may be regarded as one data unit to be decoded by the image data decoder 230 in the same encoding mode. The current coding unit may then be decoded by obtaining the information about the encoding mode for each coding unit determined in this way.
The multi-view video prediction encoding apparatus 10 and the multi-view video encoding apparatus 121 described above with reference to Figures 1A and 12 may include as many image data decoders 230 as the number of views, in order to generate reference pictures for inter prediction and inter-view prediction per view of a multi-view video.
Also, the multi-view video prediction decoding apparatus 20 and the multi-view video decoding apparatus 131 described above with reference to Figures 2A and 13 may include as many video decoding apparatuses 200 as the number of views, in order to restore images per view by decoding the received bitstreams.
When a bitstream of a predetermined view video among the multi-view video is received, the image data decoder 230 of the video decoding apparatus 200 may split the samples of the image extracted from the bitstream by the image data and encoding information extractor 220 into coding units having a tree structure. The image data decoder 230 may restore the image by performing motion compensation per prediction unit for inter prediction, on the coding units having a tree structure obtained by splitting the samples of the image.
Specifically, the image data decoder 230 may perform inter prediction that refers to a reconstructed picture of the same viewpoint and inter-view prediction that refers to a reconstructed picture of a different viewpoint. For a current picture of P slice type or B slice type, the image data decoder 230 according to the embodiment may determine an L0 list that includes at least one reconstructed picture of the same viewpoint that is assigned a POC preceding the POC of the current picture, and at least one reconstructed picture that is assigned the same POC as the current picture and a VID smaller than the VID of the current picture. For a current picture of B slice type, the image data decoder 230 may determine an L1 list that includes at least one reconstructed picture of the same viewpoint that is assigned a POC following the POC of the current picture, and at least one reconstructed picture that is assigned the same POC as the current picture and a VID greater than the VID of the current picture.
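The membership rules for the L0 list and the L1 list can be sketched as follows. This is an illustration only: the `Picture` type, the function name `build_reference_lists`, and the ordering inside each list (temporally nearest first, then nearest viewpoint first) are assumptions and are not specified by this passage.

```python
from typing import List, NamedTuple, Tuple


class Picture(NamedTuple):
    poc: int  # reproduction order (picture order count)
    vid: int  # viewpoint identifier


def build_reference_lists(current: Picture,
                          dpb: List[Picture]) -> Tuple[List[Picture], List[Picture]]:
    """Build the L0 and L1 lists from reconstructed pictures held in the DPB."""
    same_view = [p for p in dpb if p.vid == current.vid]
    same_poc = [p for p in dpb if p.poc == current.poc]

    # L0: same viewpoint with an earlier POC, then same POC with a smaller VID.
    l0 = sorted((p for p in same_view if p.poc < current.poc),
                key=lambda p: current.poc - p.poc)
    l0 += sorted((p for p in same_poc if p.vid < current.vid),
                 key=lambda p: current.vid - p.vid)

    # L1: same viewpoint with a later POC, then same POC with a greater VID.
    l1 = sorted((p for p in same_view if p.poc > current.poc),
                key=lambda p: p.poc - current.poc)
    l1 += sorted((p for p in same_poc if p.vid > current.vid),
                 key=lambda p: p.vid - current.vid)
    return l0, l1


current = Picture(poc=4, vid=1)
dpb = [Picture(2, 1), Picture(6, 1), Picture(4, 0), Picture(4, 2), Picture(3, 0)]
l0, l1 = build_reference_lists(current, dpb)
print(l0)
print(l1)
```

In this example the picture with POC 3 and VID 0 appears in neither list, because it matches the current picture in neither viewpoint nor reproduction order.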
Accordingly, the image data decoder 230 may determine the L0 list and the L1 list for inter prediction and inter-view prediction of the multi-view video by using the reconstructed pictures stored in the DPB. Depending on circumstances, the reference order of the reconstructed pictures defined in the L0 list and the L1 list may be selectively modified in a predetermined slice.
The image data decoder 230 according to the embodiment may determine the reference picture of the current picture by referring to the L0 list, or to both the L0 list and the L1 list. The image data decoder 230 may determine the reference prediction unit in the reference picture by using the motion vector or disparity vector parsed by the image data and coding information extractor 220. The current prediction unit may be reconstructed by compensating the reference prediction unit with the residual data via at least one selected from motion compensation and disparity compensation.
Accordingly, the video decoding apparatus 200 may obtain information about the at least one coding unit that produced the minimum coding error when encoding was recursively performed for each maximum coding unit, and may use that information to decode the current picture. In other words, the coding units having a tree structure that were determined to be the optimum coding units in each maximum coding unit may be decoded.
Accordingly, even if the image data has a high resolution and a large amount of data, the image data may be efficiently decoded and restored by using the size of the coding unit and the coding mode, which are adaptively determined according to the features of the image data by using the information about the optimum coding mode received from the encoder.
Figure 16 is the diagram for describing the concept of coding unit according to an embodiment of the invention.
The size of a coding unit may be expressed as width × height, and may be 64 × 64, 32 × 32, 16 × 16, or 8 × 8. A 64 × 64 coding unit may be divided into partitions of 64 × 64, 64 × 32, 32 × 64, or 32 × 32; a 32 × 32 coding unit into partitions of 32 × 32, 32 × 16, 16 × 32, or 16 × 16; a 16 × 16 coding unit into partitions of 16 × 16, 16 × 8, 8 × 16, or 8 × 8; and an 8 × 8 coding unit into partitions of 8 × 8, 8 × 4, 4 × 8, or 4 × 4.
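The partition sizes listed above follow one pattern: the full size, the size with halved height, the size with halved width, and the size with both halved. A short sketch, in which the function name `symmetric_partitions` is an assumption for illustration:

```python
def symmetric_partitions(size):
    """Return the (width, height) partition sizes of a size x size coding unit."""
    half = size // 2
    return [(size, size), (size, half), (half, size), (half, half)]


for cu_size in (64, 32, 16, 8):
    print(cu_size, symmetric_partitions(cu_size))
```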
In video data 310, the resolution is 1920 × 1080, the maximum size of the coding unit is 64, and the maximum depth is 2. In video data 320, the resolution is 1920 × 1080, the maximum size of the coding unit is 64, and the maximum depth is 3. In video data 330, the resolution is 352 × 288, the maximum size of the coding unit is 16, and the maximum depth is 1. The maximum depth shown in Figure 16 denotes the total number of divisions from the maximum coding unit to the minimum coding unit.
If the resolution is high or the amount of data is large, the maximum size of the coding unit may be large, so as to not only increase coding efficiency but also accurately reflect the features of the image. Accordingly, the maximum size of the coding unit of video data 310 and 320, which have a higher resolution than video data 330, may be 64.
Since the maximum depth of video data 310 is 2, the depth is deepened by two layers by dividing the maximum coding unit twice, so the coding units 315 of video data 310 may include a maximum coding unit with a long-axis size of 64 and coding units with long-axis sizes of 32 and 16. Since the maximum depth of video data 330 is 1, the depth is deepened by one layer by dividing the maximum coding unit once, so the coding units 335 of video data 330 may include a maximum coding unit with a long-axis size of 16 and coding units with a long-axis size of 8.
Since the maximum depth of video data 320 is 3, the depth is deepened by three layers by dividing the maximum coding unit three times, so the coding units 325 of video data 320 may include a maximum coding unit with a long-axis size of 64 and coding units with long-axis sizes of 32, 16, and 8. As the depth deepens, details may be expressed more precisely.
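The long-axis sizes quoted for video data 310, 320, and 330 follow from halving the maximum size once per depth. A minimal sketch, assuming a hypothetical helper named `coding_unit_sizes`:

```python
def coding_unit_sizes(max_size, max_depth):
    """Long-axis sizes of the coding units from depth 0 down to max_depth.

    Each division halves the long-axis size, so depth d has size max_size / 2**d.
    """
    return [max_size >> depth for depth in range(max_depth + 1)]


print(coding_unit_sizes(64, 2))  # video data 310
print(coding_unit_sizes(64, 3))  # video data 320
print(coding_unit_sizes(16, 1))  # video data 330
```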
Figure 17 is the block diagram of the image encoder 400 according to an embodiment of the invention based on coding unit.
The image encoder 400 according to the embodiment performs the operations of the coding unit determiner 120 of the video encoder 100 to encode image data. In other words, the intra predictor 410 performs intra prediction on coding units in intra mode in the current frame 405, and the motion estimator 420 and the motion compensator 425 respectively perform inter estimation and motion compensation on coding units in inter mode in the current frame 405 by using the current frame 405 and the reference frame 495.
The data output from the intra predictor 410, the motion estimator 420, and the motion compensator 425 is output as quantized transform coefficients through the transformer 430 and the quantizer 440. The quantized transform coefficients are restored to data in the spatial domain through the inverse quantizer 460 and the inverse transformer 470, and the restored spatial-domain data is output as the reference frame 495 after being post-processed by the deblocking unit 480 and the offset adjusting unit 490. The quantized transform coefficients may be output as the bitstream 455 through the entropy encoder 450.
In order for the image encoder 400 to be applied in the video encoder 100 according to the embodiment, all elements of the image encoder 400 (that is, the intra predictor 410, the motion estimator 420, the motion compensator 425, the transformer 430, the quantizer 440, the entropy encoder 450, the inverse quantizer 460, the inverse transformer 470, the deblocking unit 480, and the offset adjusting unit 490) perform operations based on each coding unit among the coding units having a tree structure, while considering the maximum depth of each maximum coding unit.
Specifically, the intra predictor 410, the motion estimator 420, and the motion compensator 425 determine the partitions and the prediction mode of each coding unit among the coding units having a tree structure while considering the maximum size and the maximum depth of the current maximum coding unit, and the transformer 430 determines the size of the transform unit in each coding unit among the coding units having a tree structure.
The motion estimator 420 may perform inter prediction that refers to an image of the same viewpoint according to prediction units, and estimate inter-frame motion. The motion estimator 420 may also perform inter-view prediction that refers to an image of a different viewpoint having the same reproduction order according to prediction units, and estimate inter-view disparity.
The motion compensator 425 may perform motion compensation that refers to an image of the same viewpoint according to prediction units, and reconstruct the prediction units. The motion compensator 425 may also perform disparity compensation that refers to an image of a different viewpoint having the same reproduction order according to prediction units, and reconstruct the prediction units.
The method by which the motion estimator 420 and the motion compensator 425 determine the reference lists is the same as that described above with reference to Figure 1A to Figure 11.
Figure 18 is the block diagram of the image decoder 500 according to an embodiment of the invention based on coding unit.
The parser 510 parses, from the bitstream 505, the coded image data to be decoded and the coding information required for decoding. The coded image data is output as inverse-quantized data through the entropy decoder 520 and the inverse quantizer 530, and the inverse-quantized data is restored to image data in the spatial domain through the inverse transformer 540.
For the image data in the spatial domain, the intra predictor 550 performs intra prediction on coding units in intra mode, and the motion compensator 560 performs motion compensation on coding units in inter mode by using the reference frame 585.
The image data in the spatial domain that has passed through the intra predictor 550 and the motion compensator 560 may be output as the restored frame 595 after being post-processed by the deblocking unit 570 and the offset adjuster 580. In addition, the image data post-processed by the deblocking unit 570 and the offset adjuster 580 may be output as the reference frame 585.
In order to decode the image data in the image data decoder 230 of the video decoding apparatus 200, the image decoder 500 may perform the operations that are performed after the operation of the parser 510.
In order for the image decoder 500 to be applied in the video decoding apparatus 200 according to the embodiment, all elements of the image decoder 500 (that is, the parser 510, the entropy decoder 520, the inverse quantizer 530, the inverse transformer 540, the intra predictor 550, the motion compensator 560, the deblocking unit 570, and the offset adjuster 580) perform operations based on the coding units having a tree structure for each maximum coding unit.
Specifically, the intra predictor 550 and the motion compensator 560 perform operations based on the partitions and the prediction mode of each coding unit having a tree structure, and the inverse transformer 540 performs operations based on the size of the transform unit of each coding unit.
The motion compensator 560 may perform motion compensation that refers to an image of the same viewpoint according to prediction units, and reconstruct the prediction units. The motion compensator 560 may also perform disparity compensation that refers to an image of a different viewpoint having the same reproduction order according to prediction units, and reconstruct the prediction units. The method by which the motion compensator 560 determines the reference lists is the same as that described above with reference to Figure 1A to Figure 11.
Figure 19 is the diagram for showing the deeper coding unit according to an embodiment of the invention according to depth and subregion.
The video encoder 100 according to the embodiment and the video decoding apparatus 200 according to the embodiment use hierarchical coding units so as to consider the features of an image. The maximum height, maximum width, and maximum depth of the coding units may be adaptively determined according to the features of the image, or may be set differently by a user. The sizes of the deeper coding units according to depth may be determined according to the predetermined maximum size of the coding unit.
According to an embodiment of the invention, in the hierarchy 600 of coding units, the maximum height and the maximum width of the coding units are each 64, and the maximum depth is 4. In this case, the maximum depth denotes the total number of times the coding unit is divided from the maximum coding unit to the minimum coding unit. Since the depth deepens along the vertical axis of the hierarchy 600, the height and the width of each deeper coding unit are divided. In addition, the prediction units and partitions, which are the basis for prediction coding of each deeper coding unit, are shown along the horizontal axis of the hierarchy 600.
In other words, in the hierarchy 600, the coding unit 610 is the maximum coding unit, with a depth of 0 and a size (that is, height by width) of 64 × 64. As the depth deepens along the vertical axis, there are a coding unit 620 with a size of 32 × 32 and a depth of 1, a coding unit 630 with a size of 16 × 16 and a depth of 2, and a coding unit 640 with a size of 8 × 8 and a depth of 3. The coding unit 640 with a size of 8 × 8 and a depth of 3 is the minimum coding unit.
The prediction units and partitions of the coding units are arranged along the horizontal axis according to each depth. In other words, if the coding unit 610 with a size of 64 × 64 and a depth of 0 is a prediction unit, the prediction unit may be divided into partitions included in the coding unit 610, namely a partition 610 with a size of 64 × 64, partitions 612 with a size of 64 × 32, partitions 614 with a size of 32 × 64, or partitions 616 with a size of 32 × 32.
Similarly, the prediction unit of the coding unit 620 with a size of 32 × 32 and a depth of 1 may be divided into partitions included in the coding unit 620, namely a partition 620 with a size of 32 × 32, partitions 622 with a size of 32 × 16, partitions 624 with a size of 16 × 32, and partitions 626 with a size of 16 × 16.
Similarly, the prediction unit of the coding unit 630 with a size of 16 × 16 and a depth of 2 may be divided into partitions included in the coding unit 630, namely a partition 630 with a size of 16 × 16, partitions 632 with a size of 16 × 8, partitions 634 with a size of 8 × 16, and partitions 636 with a size of 8 × 8.
Similarly, the prediction unit of the coding unit 640 with a size of 8 × 8 and a depth of 3 may be divided into partitions included in the coding unit 640, namely a partition with a size of 8 × 8, partitions 642 with a size of 8 × 4, partitions 644 with a size of 4 × 8, and partitions 646 with a size of 4 × 4.
In order to determine the at least one coding depth of the coding units forming the maximum coding unit 610, the coding unit determiner 120 of the video encoder 100 according to the embodiment performs coding on the coding units corresponding to each depth included in the maximum coding unit 610.
As the depth deepens, the number of deeper coding units according to depth that cover data of the same range and the same size increases. For example, four coding units corresponding to depth 2 are needed to cover the data included in one coding unit corresponding to depth 1. Accordingly, in order to compare the results of coding the same data according to depth, the coding unit corresponding to depth 1 and the four coding units corresponding to depth 2 are each encoded.
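Because each division splits a coding unit into four, the count generalizes to a power of four. The helper below is a sketch with an assumed name, `units_needed`:

```python
def units_needed(shallow_depth, deep_depth):
    """Number of coding units at deep_depth covering one unit at shallow_depth."""
    if deep_depth < shallow_depth:
        raise ValueError("deep_depth must not be smaller than shallow_depth")
    # Every additional depth level quarters the area of a coding unit.
    return 4 ** (deep_depth - shallow_depth)


print(units_needed(1, 2))  # four depth-2 units cover one depth-1 unit
print(units_needed(0, 3))
```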
In order to perform coding for the current depth among the depths, the minimum coding error may be selected for the current depth by performing coding on each prediction unit in the coding units corresponding to the current depth, along the horizontal axis of the hierarchy 600. Alternatively, as the depth deepens along the vertical axis of the hierarchy 600, the minimum coding error may be searched for by performing coding for each depth and comparing the minimum coding errors according to depth. The depth and the partition with the minimum coding error in the coding unit 610 may be selected as the coding depth and the partition type of the coding unit 610.
Figure 20 is for describing the pass according to an embodiment of the invention between coding unit 710 and converter unit 720 The diagram of system.
The video encoder 100 according to the embodiment or the video decoding apparatus 200 according to the embodiment encodes or decodes an image, for each maximum coding unit, according to coding units with sizes smaller than or equal to the maximum coding unit. The size of the transform unit used for transformation during coding may be selected based on a data unit that is not larger than the corresponding coding unit.
For example, in the video encoder 100 according to the embodiment or the video decoding apparatus 200 according to the embodiment, if the size of the coding unit 710 is 64 × 64, transformation may be performed by using the transform unit 720 with a size of 32 × 32.
In addition, the data of the coding unit 710 with a size of 64 × 64 may be encoded by performing transformation on each of the transform units with sizes of 32 × 32, 16 × 16, 8 × 8, and 4 × 4, which are smaller than 64 × 64, and then the transform unit with the minimum coding error may be selected.
Figure 21 is the coding information for describing coding unit corresponding with coding depth according to an embodiment of the invention Diagram.
The output unit 130 of the video encoder 100 according to the embodiment may encode the information 800 about the partition type, the information 810 about the prediction mode, and the information 820 about the transform unit size for each coding unit corresponding to the coding depth, and may transmit the information 800, the information 810, and the information 820 as the information about the coding mode.
The information 800 indicates the shape of the partition obtained by dividing the prediction unit of the current coding unit, where the partition is the data unit for prediction coding of the current coding unit. For example, the current coding unit CU_0 with a size of 2N × 2N may be divided into any one of the following partitions: a partition 802 with a size of 2N × 2N, a partition 804 with a size of 2N × N, a partition 806 with a size of N × 2N, and a partition 808 with a size of N × N. Here, the information 800 about the partition type is set to indicate one of the partition 804 with a size of 2N × N, the partition 806 with a size of N × 2N, and the partition 808 with a size of N × N.
The information 810 indicates the prediction mode of each partition. For example, the information 810 may indicate the mode of prediction coding performed on the partition indicated by the information 800, that is, the intra mode 812, the inter mode 814, or the skip mode 816.
The information 820 indicates the transform unit on which transformation of the current coding unit is based. For example, the transform unit may be the first intra transform unit 822, the second intra transform unit 824, the first inter transform unit 826, or the second inter transform unit 828.
The image data and coding information extractor 220 of the video decoding apparatus 200 may extract and use the information 800, 810, and 820 for decoding, according to each deeper coding unit.
Figure 22 is the diagram of the deeper coding unit according to an embodiment of the invention according to depth.
Division information may be used to indicate a change of depth. The division information indicates whether the coding unit of the current depth is divided into coding units of a lower depth.
The prediction unit 910 for prediction coding of the coding unit 900 with a depth of 0 and a size of 2N_0 × 2N_0 may include partitions of the following partition types: a partition type 912 with a size of 2N_0 × 2N_0, a partition type 914 with a size of 2N_0 × N_0, a partition type 916 with a size of N_0 × 2N_0, and a partition type 918 with a size of N_0 × N_0. Figure 22 illustrates only the partition types 912 to 918 obtained by symmetrically dividing the prediction unit 910, but the partition types are not limited thereto, and the partitions of the prediction unit 910 may include asymmetric partitions, partitions with a predetermined shape, and partitions with a geometric shape.
According to each partition type, prediction coding is repeatedly performed on one partition with a size of 2N_0 × 2N_0, two partitions with a size of 2N_0 × N_0, two partitions with a size of N_0 × 2N_0, and four partitions with a size of N_0 × N_0. Prediction coding in intra mode and inter mode may be performed on the partitions with sizes of 2N_0 × 2N_0, N_0 × 2N_0, 2N_0 × N_0, and N_0 × N_0. Prediction coding in skip mode may be performed only on the partition with a size of 2N_0 × 2N_0.
If the coding error is smallest in one of the partition types 912 to 916, the prediction unit 910 may not be divided into a lower depth.
If the coding error is smallest in the partition type 918, the depth is changed from 0 to 1 to divide the partition type 918 in operation 920, and coding is repeatedly performed on the coding units 930 with a depth of 1 and a size of N_0 × N_0 to search for the minimum coding error.
The prediction unit 940 for prediction coding of the coding unit 930 with a depth of 1 and a size of 2N_1 × 2N_1 (= N_0 × N_0) may include partitions of the following partition types: a partition type 942 with a size of 2N_1 × 2N_1, a partition type 944 with a size of 2N_1 × N_1, a partition type 946 with a size of N_1 × 2N_1, and a partition type 948 with a size of N_1 × N_1.
If the coding error is smallest in the partition type 948, the depth is changed from 1 to 2 to divide the partition type 948 in operation 950, and coding is repeatedly performed on the coding units 960 with a depth of 2 and a size of N_2 × N_2 to search for the minimum coding error.
When the maximum depth is d, the division operation according to each depth may be performed until the depth becomes d-1, and the division information may be encoded for depths of 0 to d-2. In other words, when coding is performed until the depth is d-1 after the coding unit corresponding to the depth of d-2 is divided in operation 970, the prediction unit 990 for prediction coding of the coding unit 980 with a depth of d-1 and a size of 2N_(d-1) × 2N_(d-1) may include partitions of the following partition types: a partition type 992 with a size of 2N_(d-1) × 2N_(d-1), a partition type 994 with a size of 2N_(d-1) × N_(d-1), a partition type 996 with a size of N_(d-1) × 2N_(d-1), and a partition type 998 with a size of N_(d-1) × N_(d-1).
Prediction coding may be repeatedly performed on one partition with a size of 2N_(d-1) × 2N_(d-1), two partitions with a size of 2N_(d-1) × N_(d-1), two partitions with a size of N_(d-1) × 2N_(d-1), and four partitions with a size of N_(d-1) × N_(d-1) from among the partition types 992 to 998, to search for the partition type with the minimum coding error.
Even when the partition type 998 has the minimum coding error, since the maximum depth is d, the coding unit CU_(d-1) with a depth of d-1 is no longer divided into a lower depth; the coding depth of the coding units forming the current maximum coding unit 900 is determined to be d-1, and the partition type of the current maximum coding unit 900 may be determined to be N_(d-1) × N_(d-1). In addition, since the maximum depth is d and the minimum coding unit 980 with the lowest depth of d-1 is no longer divided into a lower depth, division information is not set for the minimum coding unit 980.
The data unit 999 may be the "minimum unit" for the current maximum coding unit. The minimum unit according to an embodiment of the invention may be a square data unit obtained by dividing the minimum coding unit 980 into 4. By repeatedly performing coding, the video encoder 100 may determine the coding depth by comparing the coding errors according to the depths of the coding unit 900 and selecting the depth with the minimum coding error, and may set the corresponding partition type and prediction mode as the coding mode of the coding depth.
In this way, the minimum coding errors according to depth are compared across all of the depths 1 to d, and the depth with the minimum coding error may be determined as the coding depth. The coding depth, the partition type of the prediction unit, and the prediction mode may be encoded and transmitted as the information about the coding mode. In addition, since the coding unit is divided from the depth of 0 down to the coding depth, only the division information of the coding depth is set to 0, and the division information of the depths other than the coding depth is set to 1.
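The selection of a coding depth and the resulting division information can be sketched as below. This is a simplification under stated assumptions: it takes one already-computed coding error per depth for a single region, whereas the embodiment compares errors recursively per coding unit, and the name `choose_coding_depth` is hypothetical.

```python
def choose_coding_depth(errors_by_depth):
    """Pick the depth with the minimum coding error and derive division information.

    errors_by_depth[d] is an assumed, precomputed coding error at depth d.
    Returns (coding_depth, division_info), where division_info is 1 for every
    depth above the coding depth and 0 at the coding depth itself.
    """
    if not errors_by_depth:
        raise ValueError("at least one depth is required")
    depth = min(range(len(errors_by_depth)), key=lambda d: errors_by_depth[d])
    division_info = [1] * depth + [0]
    return depth, division_info


print(choose_coding_depth([9.0, 4.5, 6.0]))
```

The sketch always emits a trailing 0; it does not model the case described above in which no division information is set for the minimum coding unit.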
The image data and coding information extractor 220 of the video decoding apparatus 200 according to the embodiment may extract and use the information about the coding depth and the prediction unit of the coding unit 900 to decode the partition 912. The video decoding apparatus 200 according to the embodiment may determine the depth at which the division information is 0 to be the coding depth by using the division information according to depth, and may use the information about the coding mode of the corresponding depth for decoding.
Figure 23 to Figure 25 is to be used to describe coding unit 1010, predicting unit 1060 and change according to an embodiment of the invention Change the diagram of the relation between unit 1070.
The coding units 1010 are the coding units having a tree structure in a maximum coding unit that correspond to the coding depths determined by the video encoder 100. The prediction units 1060 are the partitions of the prediction units of each of the coding units 1010, and the transform units 1070 are the transform units of each of the coding units 1010.
When the depth of the maximum coding unit in the coding units 1010 is 0, the depths of the coding units 1012 and 1054 are 1, the depths of the coding units 1014, 1016, 1018, 1028, 1050, and 1052 are 2, the depths of the coding units 1020, 1022, 1024, 1026, 1030, 1032, and 1048 are 3, and the depths of the coding units 1040, 1042, 1044, and 1046 are 4.
In the prediction units 1060, some coding units 1014, 1016, 1022, 1032, 1048, 1050, 1052, and 1054 are obtained by dividing the coding units in the coding units 1010. In other words, the partition types in the coding units 1014, 1022, 1050, and 1054 have a size of 2N × N, the partition types in the coding units 1016, 1048, and 1052 have a size of N × 2N, and the partition type of the coding unit 1032 has a size of N × N. The prediction units and partitions of the coding units 1010 are smaller than or equal to each coding unit.
In the transform units 1070, transformation or inverse transformation is performed on the image data of the coding unit 1052 in a data unit that is smaller than the coding unit 1052. In addition, the coding units 1014, 1016, 1022, 1032, 1048, 1050, and 1052 in the transform units 1070 differ in size and shape from the coding units 1014, 1016, 1022, 1032, 1048, 1050, and 1052 in the prediction units 1060. In other words, the video encoder 100 and the video decoding apparatus 200 according to the embodiment may perform intra prediction, motion estimation, motion compensation, transformation, and inverse transformation independently on data units in the same coding unit.
Accordingly, coding is recursively performed on each of the coding units having a hierarchical structure in each region of the maximum coding unit to determine the optimum coding unit, so that coding units having a recursive tree structure may be obtained. The coding information may include the division information about the coding unit, the information about the partition type, the information about the prediction mode, and the information about the size of the transform unit. Table 1 shows the coding information that may be set by the video encoder 100 and the video decoding apparatus 200 according to the embodiment.
[table 1]
The output unit 130 of the video encoder 100 according to the embodiment may output the coding information about the coding units having a tree structure, and the image data and coding information extractor 220 of the video decoding apparatus 200 according to the embodiment may extract the coding information about the coding units having a tree structure from the received bitstream.
The division information indicates whether the current coding unit is divided into coding units of a lower depth. If the division information of the current depth d is 0, the depth at which the current coding unit is no longer divided into a lower depth is the coding depth, so the information about the partition type, the prediction mode, and the size of the transform unit may be defined for the coding depth. If the current coding unit is further divided according to the division information, coding is independently performed on the four divided coding units of the lower depth.
The prediction mode may be one of the intra mode, the inter mode, and the skip mode. The intra mode and the inter mode may be defined for all partition types, and the skip mode is defined only for the partition type with a size of 2N × 2N.
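The constraint on the skip mode can be written as a small lookup. The function name `allowed_prediction_modes` and the string labels are assumptions for illustration only.

```python
def allowed_prediction_modes(partition_type):
    """Prediction modes that may be defined for a given partition type label."""
    modes = ["intra", "inter"]  # defined for every partition type
    if partition_type == "2Nx2N":
        modes.append("skip")  # skip mode is defined only for 2N x 2N
    return modes


print(allowed_prediction_modes("2Nx2N"))
print(allowed_prediction_modes("NxN"))
```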
The information about the partition type may indicate the symmetric partition types with sizes of 2N × 2N, 2N × N, N × 2N, and N × N, which are obtained by symmetrically dividing the height or the width of the prediction unit, and the asymmetric partition types with sizes of 2N × nU, 2N × nD, nL × 2N, and nR × 2N, which are obtained by asymmetrically dividing the height or the width of the prediction unit. The asymmetric partition types with sizes of 2N × nU and 2N × nD may be obtained by dividing the height of the prediction unit in ratios of 1:3 and 3:1, respectively, and the asymmetric partition types with sizes of nL × 2N and nR × 2N may be obtained by dividing the width of the prediction unit in ratios of 1:3 and 3:1, respectively.
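The 1:3 and 3:1 ratios give concrete partition dimensions once the size 2N is fixed. The following sketch uses an assumed helper, `asymmetric_partitions`, and reports each partition as (width, height) in the order top to bottom or left to right.

```python
def asymmetric_partitions(size):
    """Asymmetric partitions of a prediction unit whose size is 2N x 2N.

    size is 2N. Each entry lists the two partitions as (width, height).
    """
    quarter = size // 4       # the "1" share of a 1:3 or 3:1 division
    rest = size - quarter     # the "3" share
    return {
        "2NxnU": [(size, quarter), (size, rest)],   # height divided 1:3
        "2NxnD": [(size, rest), (size, quarter)],   # height divided 3:1
        "nLx2N": [(quarter, size), (rest, size)],   # width divided 1:3
        "nRx2N": [(rest, size), (quarter, size)],   # width divided 3:1
    }


for name, parts in asymmetric_partitions(32).items():
    print(name, parts)
```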
The size of the transform unit may be set to two types in the intra mode and two types in the inter mode. In other words, if the division information of the transform unit is 0, the size of the transform unit may be 2N × 2N, which is the size of the current coding unit. If the division information of the transform unit is 1, the transform units may be obtained by dividing the current coding unit. In addition, if the partition type of the current coding unit with a size of 2N × 2N is a symmetric partition type, the size of the transform unit may be N × N, and if the partition type of the current coding unit is an asymmetric partition type, the size of the transform unit may be N/2 × N/2.
The coding information about the coding units having a tree structure according to the embodiment may include at least one selected from the coding unit corresponding to the coding depth, the prediction unit, and the minimum unit. The coding unit corresponding to the coding depth may include at least one selected from a prediction unit and a minimum unit that contain the same coding information.
Accordingly, whether adjacent data units are included in the same coding unit corresponding to the coding depth is determined by comparing the coding information of the adjacent data units. In addition, the coding unit corresponding to a coding depth is determined by using the coding information of a data unit, and thus the distribution of the coding depths in the maximum coding unit may be determined.
Accordingly, if the current coding unit is predicted based on the coding information of adjacent data units, the coding information of the data units in the deeper coding units adjacent to the current coding unit may be directly referred to and used.
Alternatively, if the current coding unit is predicted based on the coding information of adjacent data units, the data units adjacent to the current coding unit are searched for by using the coding information of the data units, and the adjacent coding units found in this way may be referred to in order to predict the current coding unit.
Figure 26 is for describing between the coding unit of the coding mode information according to table 1, predicting unit and converter unit Relation diagram.
Maximum coding unit 1300 includes the coding unit 1302,1304,1306,1312,1314,1316 of coding depth With 1318.Here, since coding unit 1318 is the coding unit of coding depth, division information can be configured to 0.Can The one kind that will be arranged on size for the information of the divisional type of the coding unit 1318 of 2N × 2N in following divisional type:Ruler It is very little be 2N × 2N divisional type 1322, the divisional type 1324 that size is 2N × N, the divisional type that size is N × 2N 1326th, size is N × N divisional type 1328, the divisional type 1332 that size is 2N × nU, the subregion that size is 2N × nD Class1 334, the divisional type 1336 that size is nL × 2N and divisional type 1338 that size is nR × 2N.
The division information (TU dimension mark) of a converter unit is a kind of transformation index. The size of the converter unit corresponding to the transformation index may change according to the predicting unit type or divisional type of the coding unit.
For example, when the divisional type is set to be symmetrical (that is, divisional type 1322, 1324, 1326 or 1328), converter unit 1342 of size 2N × 2N is set if the TU dimension mark of the converter unit is 0, and converter unit 1344 of size N × N is set if the TU dimension mark is 1.
When the divisional type is set to be asymmetric (that is, divisional type 1332, 1334, 1336 or 1338), converter unit 1352 of size 2N × 2N is set if the TU dimension mark is 0, and converter unit 1354 of size N/2 × N/2 is set if the TU dimension mark is 1.
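The mapping from divisional type and TU dimension mark to converter unit size described for Figure 26 can be sketched as follows. This is only an illustrative sketch: the function name, the string labels for the divisional types, and the restriction to marks 0 and 1 are assumptions made for the example and are not part of the described embodiment.

```python
# Hypothetical labels for the divisional types 1322-1328 (symmetrical)
# and 1332-1338 (asymmetric); the names are illustrative only.
SYMMETRIC_TYPES = {"2Nx2N", "2NxN", "Nx2N", "NxN"}
ASYMMETRIC_TYPES = {"2NxnU", "2NxnD", "nLx2N", "nRx2N"}


def converter_unit_side(cu_side: int, divisional_type: str, tu_mark: int) -> int:
    """Side length of the square converter unit for a 2N x 2N coding unit.

    cu_side is the coding unit side 2N, so N equals cu_side // 2.
    """
    if divisional_type not in SYMMETRIC_TYPES | ASYMMETRIC_TYPES:
        raise ValueError(f"unknown divisional type: {divisional_type}")
    if tu_mark == 0:
        return cu_side              # 2N x 2N for both kinds of divisional type
    if tu_mark == 1:
        if divisional_type in SYMMETRIC_TYPES:
            return cu_side // 2     # N x N
        return cu_side // 4         # N/2 x N/2
    raise ValueError("this sketch only covers TU dimension marks 0 and 1")
```

For a 32 × 32 coding unit (N = 16), a symmetrical divisional type with mark 1 gives a 16 × 16 converter unit, while an asymmetric divisional type with mark 1 gives an 8 × 8 converter unit.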
With reference to Figure 20, the TU dimension mark is a mark with a value of 0 or 1, but the TU dimension mark is not limited to 1 bit, and the converter unit may be hierarchically divided with a tree structure while the TU dimension mark increases from 0. The division information (TU dimension mark) of a converter unit may be an example of a transformation index.
In this case, according to an embodiment of the invention, the size of the converter unit actually used may be expressed by using the TU dimension mark of the converter unit together with the maximum size and minimum size of the converter unit. The video encoder 100 may encode maximum converter unit size information, minimum converter unit size information, and a maximum TU dimension mark. The result of encoding the maximum converter unit size information, the minimum converter unit size information, and the maximum TU dimension mark may be inserted into an SPS. The video decoding apparatus 200 may decode the video by using the maximum converter unit size information, the minimum converter unit size information, and the maximum TU dimension mark.
For example, (a) if the size of the current coding unit is 64 × 64 and the maximum converter unit size is 32 × 32, then (a-1) the size of the converter unit may be 32 × 32 when the TU dimension mark is 0, (a-2) the size of the converter unit may be 16 × 16 when the TU dimension mark is 1, and (a-3) the size of the converter unit may be 8 × 8 when the TU dimension mark is 2.
As another example, (b) if the size of the current coding unit is 32 × 32 and the minimum converter unit size is 32 × 32, then (b-1) the size of the converter unit may be 32 × 32 when the TU dimension mark is 0. Here, since the size of the converter unit cannot be less than 32 × 32, the TU dimension mark cannot be set to a value other than 0.
As another example, (c) if the size of the current coding unit is 64 × 64 and the maximum TU dimension mark is 1, the TU dimension mark may be 0 or 1. Here, the TU dimension mark cannot be set to a value other than 0 or 1.
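Examples (a) to (c) can be reproduced with a short sketch that lists every permitted TU dimension mark together with the resulting converter unit size. The function name and parameter names are hypothetical. The sketch also assumes that the size at mark 0 is the smaller of the coding unit size and the maximum converter unit size, and that each increment of the mark halves the side length; values not stated in the examples (such as the minimum size of 4 in example (a)) are chosen only for illustration.

```python
def converter_unit_sizes(cu_size, max_tu_size, min_tu_size, max_tu_mark):
    """Map each permitted TU dimension mark to a converter unit side length."""
    # Assumption of this sketch: the size at mark 0 cannot exceed either limit.
    root = min(cu_size, max_tu_size)
    sizes = {}
    for mark in range(max_tu_mark + 1):
        size = root >> mark          # halve the side once per mark increment
        if size < min_tu_size:
            break                    # smaller marks only; size floor reached
        sizes[mark] = size
    return sizes


# Example (a): 64 x 64 coding unit, 32 x 32 maximum converter unit.
example_a = converter_unit_sizes(64, max_tu_size=32, min_tu_size=4, max_tu_mark=2)
# Example (b): 32 x 32 coding unit, 32 x 32 minimum converter unit.
example_b = converter_unit_sizes(32, max_tu_size=32, min_tu_size=32, max_tu_mark=2)
# Example (c): 64 x 64 coding unit, maximum TU dimension mark of 1.
example_c = converter_unit_sizes(64, max_tu_size=32, min_tu_size=4, max_tu_mark=1)
```

In example (b) only mark 0 survives because the next size would fall below the minimum, and in example (c) the loop stops at the maximum mark, matching the constraints stated above.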
Therefore, if the maximum TU dimension mark is defined as "MaxTransformSizeIndex", the minimum converter unit size as "MinTransformSize", and the converter unit size when the TU dimension mark is 0 as "RootTuSize", then the current minimum converter unit size "CurrMinTuSize" that can be determined in the current coding unit may be defined by equation (1):
CurrMinTuSize = max(MinTransformSize, RootTuSize / (2^MaxTransformSizeIndex)) ... (1)
Compared with the current minimum converter unit size "CurrMinTuSize" that can be determined in the current coding unit, the converter unit size "RootTuSize" when the TU dimension mark is 0 may indicate the maximum converter unit size that can be selected in the system. In equation (1), "RootTuSize / (2^MaxTransformSizeIndex)" indicates the converter unit size obtained when the converter unit size "RootTuSize", for a TU dimension mark of 0, is divided the number of times corresponding to the maximum TU dimension mark, and "MinTransformSize" indicates the minimum transform size. Therefore, consistent with the max operation in equation (1), the larger value among "RootTuSize / (2^MaxTransformSizeIndex)" and "MinTransformSize" may be the current minimum converter unit size "CurrMinTuSize" that can be determined in the current coding unit.
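Equation (1) can be evaluated directly, as in the following minimal sketch. The function name and the snake_case parameter names are illustrative stand-ins for the quantities named in the text, and the sketch assumes that all sizes are powers of two, so that division by 2^MaxTransformSizeIndex can be written as a right shift.

```python
def curr_min_tu_size(min_transform_size, root_tu_size, max_transform_size_index):
    """Evaluate equation (1): max(MinTransformSize, RootTuSize / 2^index)."""
    # Right shift by the index equals division by 2**index for powers of two.
    return max(min_transform_size, root_tu_size >> max_transform_size_index)
```

For instance, with RootTuSize = 32, MaxTransformSizeIndex = 2 and MinTransformSize = 4, the divided size is 8 and the result is 8. With MinTransformSize = 32 the floor dominates and the result is 32.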
According to an embodiment of the invention, the maximum converter unit size "RootTuSize" may change according to the type of prediction mode.
For example, if the current prediction mode is an inter-frame mode, "RootTuSize" may be determined by using equation (2) below. In equation (2), "MaxTransformSize" indicates the maximum converter unit size, and "PUSize" indicates the current predicting unit size:
RootTuSize = min(MaxTransformSize, PUSize) ... (2)
That is, if the current prediction mode is an inter-frame mode, the converter unit size "RootTuSize" when the TU dimension mark is 0 may be the smaller value of the maximum converter unit size and the current predicting unit size.
If the prediction mode of the current partition unit is an intra-frame mode, "RootTuSize" may be determined by using equation (3) below. In equation (3), "PartitionSize" indicates the size of the current partition unit:
RootTuSize = min(MaxTransformSize, PartitionSize) ... (3)
That is, if the current prediction mode is an intra-frame mode, the converter unit size "RootTuSize" when the TU dimension mark is 0 may be the smaller value of the maximum converter unit size and the size of the current partition unit.
However, the current maximum converter unit size "RootTuSize" that changes according to the type of prediction mode in a partition unit is only an example, and the invention is not limited thereto.
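Equations (2) and (3) differ only in which size bounds "RootTuSize" from above. The following sketch makes the mode dependence explicit; the function name, the "inter" and "intra" mode strings, and the keyword arguments are assumptions made for the example rather than names from the embodiment.

```python
def root_tu_size(prediction_mode, max_transform_size, *, pu_size=None, partition_size=None):
    """Converter unit size at TU dimension mark 0, per equations (2) and (3)."""
    if prediction_mode == "inter":
        if pu_size is None:
            raise ValueError("pu_size is required for the inter-frame mode")
        return min(max_transform_size, pu_size)            # equation (2)
    if prediction_mode == "intra":
        if partition_size is None:
            raise ValueError("partition_size is required for the intra-frame mode")
        return min(max_transform_size, partition_size)     # equation (3)
    raise ValueError(f"unknown prediction mode: {prediction_mode}")
```

As a usage example, a 64 × 64 predicting unit with a maximum converter unit size of 32 yields 32 in the inter-frame mode, and a 16 × 16 partition unit with the same maximum yields 16 in the intra-frame mode.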
According to the video coding method based on coding units with a tree structure described above with reference to Figure 14 to Figure 26, the image data of the spatial domain is encoded for each coding unit of the tree structure. According to the video decoding method based on coding units with a tree structure, decoding is performed for each maximum coding unit to recover the image data of the spatial domain. Therefore, pictures and video (that is, a picture sequence) can be recovered. The recovered video may be reproduced by a reproduction equipment, stored in a storage medium, or sent over a network.
Embodiments of the invention can be written as computer programs and can be implemented in general-purpose digital computers that execute the programs using a computer-readable recording medium. Examples of the computer-readable recording medium include magnetic storage media (for example, ROM, floppy disk, hard disk, etc.) and optical recording media (for example, CD-ROM or DVD).
For ease of explanation, the video coding methods according to the multi-view point video prediction method, the multi-view point video prediction restoration method, or the multi-view point video encoding method described above with reference to Figure 1A to Figure 20 will be collectively referred to as the "video coding method according to the present invention". In addition, the video decoding methods according to the multi-view point video prediction restoration method or the multi-view point video decoding method described above with reference to Figure 1A to Figure 20 will be referred to as the "video decoding method according to the present invention".
In addition, the video encoders described with reference to Figure 1A to Figure 26 (including the multi-view point video predictive coding equipment 10, the multi-view video encoding apparatus 121, the video encoder 100, or the image encoder 400) will be referred to as the "video encoder according to the present invention". In addition, the video decoding apparatuses described with reference to Figure 1A to Figure 26 (including the multi-view point video prediction decoding equipment 20, the multi-view point video decoding device 131, the video decoding apparatus 200, or the image decoder 500) will be referred to as the "video decoding apparatus according to the present invention".
A computer-readable recording medium storing a program according to an embodiment of the invention (for example, a disk 26000) will now be described in detail.
Figure 27 shows the physical structure of the disk 26000 storing a program according to an embodiment of the invention. The disk 26000, as a storage medium, may be a hard disk drive, a compact disc read-only memory (CD-ROM) disk, a Blu-ray disc, or a digital versatile disc (DVD). The disk 26000 includes multiple concentric tracks Tr, each of which is divided along the circumferential direction of the disk 26000 into a specific number of sectors Se. In a specific region of the disk 26000, a program that performs the quantization parameter determination method, the video coding method, and the video decoding method described above may be allocated and stored.
A computer system implemented using a storage medium that stores a program for performing the video coding method and video decoding method described above will now be described with reference to Figure 28.
Figure 28 is a diagram of a disk drive 26800 that records and reads a program by using the disk 26000. The computer system 27000 may store, in the disk 26000 via the disk drive 26800, a program that performs at least one selected from the video coding method and the video decoding method according to an embodiment of the invention. In order to run the program stored in the disk 26000 in the computer system 27000, the program may be read from the disk 26000 and sent to the computer system 27000 by using the disk drive 26800.
The program that performs at least one selected from the video coding method and the video decoding method according to an embodiment of the invention may be stored not only in the disk 26000 shown in Figure 27 or Figure 28, but also in a storage card, a ROM cassette, or a solid state drive (SSD).
A system to which the video coding method and video decoding method described above are applied will be described below.
Figure 29 is a diagram of the overall structure of a contents providing system 11000 that provides a content distribution service. The service area of the communication system is divided into cells of a predetermined size, and wireless base stations 11700, 11800, 11900 and 12000 are installed in these cells, respectively.
Contents providing system 11000 includes multiple self-contained units.For example, such as computer 12100, personal digital assistant (PDA) 12200, multiple self-contained units of video camera 12300 and mobile phone 12500 are via ISP 11200th, communication network 11400 and wireless base station 11700,11800,11900 and 12000 are connected to internet 11100.
However, the contents providing system 11000 is not limited to what is shown in Figure 29, and devices may be selectively connected to the contents providing system 11000. The multiple self-contained units may be connected directly to the communication network 11400 without passing through the wireless base stations 11700, 11800, 11900 and 12000.
The video camera 12300 is an imaging device capable of capturing video images, for example, a digital video camera. The mobile phone 12500 may use at least one communication method from among various protocols (for example, Personal Digital Communications (PDC), code division multiple access (CDMA), wideband code division multiple access (W-CDMA), Global System for Mobile Communications (GSM), and Personal Handyphone System (PHS)).
The video camera 12300 may be connected to a streaming server 11300 via the wireless base station 11900 and the communication network 11400. The streaming server 11300 allows content received from a user via the video camera 12300 to be streamed via real-time broadcast. The content received from the video camera 12300 may be encoded by the video camera 12300 or by the streaming server 11300. Video data captured by the video camera 12300 may be sent to the streaming server 11300 via the computer 12100.
Video data captured by a camera 12600 may also be sent to the streaming server 11300 via the computer 12100. Like a digital camera, the camera 12600 is an imaging device that can capture both still images and video images. The video data captured by the camera 12600 may be encoded by the camera 12600 or by the computer 12100. Software that performs encoding and decoding of video may be stored in a computer-readable recording medium that can be accessed by the computer 12100 (for example, a CD-ROM disk, a floppy disk, a hard disk drive, an SSD, or a storage card).
If video data is captured by a camera built into the mobile phone 12500, the video data may be received from the mobile phone 12500.
The video data may also be encoded by a large-scale integrated circuit (LSI) system installed in the video camera 12300, the mobile phone 12500, or the camera 12600.
The contents providing system 11000 may encode content data recorded by a user using the video camera 12300, the camera 12600, the mobile phone 12500, or another imaging device (for example, content recorded during a concert), and may send the encoded content data to the streaming server 11300. The streaming server 11300 may send the encoded content data, in the form of streaming content, to other clients that request the content data.
A client is a device capable of decoding the encoded content data, for example, the computer 12100, the PDA 12200, the video camera 12300, or the mobile phone 12500. Therefore, the contents providing system 11000 allows the client to receive and reproduce the encoded content data. In addition, the contents providing system 11000 allows the client to receive the encoded content data in real time and to decode and reproduce it, thereby enabling personal broadcasting.
The encoding and decoding operations of the multiple self-contained units included in the contents providing system 11000 may be similar to the encoding and decoding operations of the video encoder and video decoding apparatus according to an embodiment of the invention.
The mobile phone 12500 included in the contents providing system 11000 according to an embodiment of the invention will now be described in more detail with reference to Figure 30 and Figure 31.
Figure 30 shows according to an embodiment of the invention using method for video coding and the mobile phone of video encoding/decoding method 12500 external structure.Mobile phone 12500 can be smart phone, and the function of the smart phone is unrestricted, and described A large amount of functions of smart phone can be changed or extend.
The mobile phone 12500 includes an internal antenna 12510 that can exchange radio frequency (RF) signals with the wireless base station 12000 of Figure 21, and includes a display screen 12520 (for example, a liquid crystal display (LCD) or organic light-emitting diode (OLED) screen) for displaying images captured by the camera 12530 or images received via the antenna 12510 and decoded. The mobile phone 12500 includes an operation panel 12540 including control buttons and a touch panel. If the display screen 12520 is a touch screen, the operation panel 12540 further includes a touch-sensing panel of the display screen 12520. The mobile phone 12500 includes a loudspeaker 12580 or another type of sound output unit for outputting voice and sound, and a microphone 12550 or another type of sound input unit for inputting voice and sound. The mobile phone 12500 further includes the camera 12530, such as a charge-coupled device (CCD) camera, for capturing video and still images. The mobile phone 12500 may also include: a storage medium 12570 for storing encoded/decoded data, for example, video or still images captured by the camera 12530, received via e-mail, or obtained in various other ways; and a slot 12560 via which the storage medium 12570 is loaded into the mobile phone 12500. The storage medium 12570 may be a flash memory, for example, a secure digital (SD) card or an electrically erasable programmable read-only memory (EEPROM) included in a plastic case.
Figure 31 shows the internal structure of the mobile phone 12500 according to an embodiment of the invention. In order to systematically control the components of the mobile phone 12500, including the display screen 12520 and the operation panel 12540, a power supply circuit 12700, an operation input controller 12640, an image coding unit 12720, a camera interface 12630, an LCD controller 12620, an image decoding unit 12690, a multiplexer/demultiplexer 12680, a recording/reading unit 12670, a modulation/demodulation unit 12660, and a Sound Processor Unit 12650 are connected to a central controller 12710 via a synchronous bus 12730.
If a user operates a power button to change from a "power-off" state to a "power-on" state, the power supply circuit 12700 supplies power from a battery pack to all components of the mobile phone 12500, thereby setting the mobile phone 12500 to an operation mode.
Central controller 12710 includes CPU, ROM and RAM.
While the mobile phone 12500 sends communication data to the outside, a digital signal is produced by the mobile phone 12500 under the control of the central controller 12710. For example, the Sound Processor Unit 12650 may produce a digital audio signal, the image coding unit 12720 may produce a digital image signal, and the text data of a message may be generated via the operation panel 12540 and the operation input controller 12640. When the digital signal is sent to the modulation/demodulation unit 12660 under the control of the central controller 12710, the modulation/demodulation unit 12660 modulates the frequency band of the digital signal, and the telecommunication circuit 12610 performs digital-to-analog conversion (DAC) and frequency conversion on the band-modulated digital audio signal. The transmission signal output from the telecommunication circuit 12610 may be sent to a voice communication base station or the wireless base station 12000 via the antenna 12510.
For example, when mobile phone 12500 is in call mode, under the control of central controller 12710, via Mike The voice signal that wind 12550 obtains is transformed into digital audio signal by Sound Processor Unit 12650.Digital audio signal can be through Transmission signal is transformed into by modulation/demodulation unit 12660 and telecommunication circuit 12610, and can be sent via antenna 12510.
When a text message (for example, an e-mail) is sent in a data communication mode, the text data of the text message is input via the operation panel 12540 and is sent to the central controller 12710 via the operation input controller 12640. Under the control of the central controller 12710, the text data is transformed into a transmission signal via the modulation/demodulation unit 12660 and the telecommunication circuit 12610, and is sent to the wireless base station 12000 via the antenna 12510.
In order to send view data in a data communication mode, the view data captured by camera 12530 is via camera Interface 12630 is provided to image coding unit 12720.The view data captured can be controlled via camera interface 12630 and LCD Device 12620 processed is displayed directly on display screen 12520.
The structure of the image coding unit 12720 may correspond to the structure of the video encoder 100 described above. The image coding unit 12720 may transform the image data received from the camera 12530 into compressed and encoded image data according to the video coding method described above, and may then output the encoded image data to the multiplexer/demultiplexer 12680. During a recording operation of the camera 12530, a voice signal obtained by the microphone 12550 of the mobile phone 12500 may be transformed into digital audio data via the Sound Processor Unit 12650, and the digital audio data may be sent to the multiplexer/demultiplexer 12680.
The multiplexer/demultiplexer 12680 multiplexes the encoded image data received from the image coding unit 12720 together with the voice data received from the Sound Processor Unit 12650. The result of multiplexing the data may be transformed into a transmission signal via the modulation/demodulation unit 12660 and the telecommunication circuit 12610, and may then be sent via the antenna 12510.
When the mobile phone 12500 receives communication data from the outside, frequency recovery and analog-to-digital conversion (ADC) may be performed on the signal received via the antenna 12510 to convert the signal into a digital signal. The modulation/demodulation unit 12660 modulates the frequency band of the digital signal. According to the type of the digital signal, the band-modulated digital signal is sent to the video decoding unit 12690, the Sound Processor Unit 12650, or the LCD controller 12620.
In the talk mode, mobile phone 12500 is amplified the signal received via antenna 12510, and passes through Frequency conversion and ADC are performed to amplified signal to obtain digital audio signal.Under the control of central controller 12710, The digital audio signal received is transformed into simulated sound via modulation/demodulation unit 12660 and Sound Processor Unit 12650 Signal, and analoging sound signal is output via loudspeaker 12580.
When data of a video file accessed on an internet site is received in a data communication mode, the signal received from the wireless base station 12000 via the antenna 12510 is output as multiplexed data via the modulation/demodulation unit 12660, and the multiplexed data is sent to the multiplexer/demultiplexer 12680.
In order to decode the multiplexed data received via the antenna 12510, the multiplexer/demultiplexer 12680 demultiplexes the multiplexed data into an encoded video data stream and an encoded voice data stream. Via the synchronous bus 12730, the encoded video data stream and the encoded voice data stream are provided to the video decoding unit 12690 and the Sound Processor Unit 12650, respectively.
The structure of image decoding unit 12690 can be corresponding to the structure of video decoding apparatus 200 described above.Image solution Code unit 12690 can be according to the video decoding side as used in video decoding apparatus 200 described above or image decoder 500 Method, decodes the video data after coding come the video data after being recovered, and will be extensive via LCD controller 12620 Video data after multiple is supplied to display screen 12520.
Therefore, the data of the video file accessed on the internet site may be displayed on the display screen 12520. Meanwhile, the Sound Processor Unit 12650 may transform the voice data into an analog sound signal and supply the analog sound signal to the loudspeaker 12580. Therefore, the audio data included in the video file accessed on the internet site may also be reproduced via the loudspeaker 12580.
The mobile phone 12500 or another type of communication terminal may be a transceiver terminal including both a video encoder and a video decoding apparatus according to an embodiment of the invention, may be a transceiver terminal including only the video encoder, or may be a transceiver terminal including only the video decoding apparatus.
Communication system according to the present invention is not limited to the communication system described above by reference to Figure 30.For example, Figure 32 shows root According to the digit broadcasting system using communication system of the embodiment of the present invention.The digit broadcasting system of Figure 32 can be by using basis The video encoder and video decoding apparatus of the embodiment of the present invention come receive via satellite or ground network transmission numeral Broadcast.
Specifically, a broadcasting station 12890 sends a video data stream to a telecommunication satellite or broadcasting satellite 12900 by using radio waves. The broadcasting satellite 12900 sends a broadcast signal, and the broadcast signal is sent to a satellite broadcast receiver via a household antenna 12860. In each house, the encoded video stream may be decoded and reproduced by a TV receiver 12810, a set-top box 12870, or another device.
When a video decoding apparatus according to an embodiment of the invention is implemented in a reproduction equipment 12830, the reproduction equipment 12830 may parse and decode an encoded video stream recorded on a storage medium 12820 (such as a disk or storage card) to recover a digital signal. Therefore, the recovered video signal may be reproduced, for example, on a monitor 12840.
A video decoding apparatus according to an embodiment of the invention may be installed in the set-top box 12870 connected to the antenna 12860 for satellite/terrestrial broadcast or to a cable antenna 12850 for receiving cable television (TV) broadcasts. Data output from the set-top box 12870 may also be reproduced on a TV monitor 12880.
As another example, a video decoding apparatus according to an embodiment of the invention may be installed in the TV receiver 12810 instead of the set-top box 12870.
Automobile 12920 with appropriate antenna 12910 can receive the letter from satellite 12900 or the transmission of wireless base station 11700 Number.Decoded video can be reproduced on the display screen of the auto-navigation system 12930 in automobile 12920.
A video signal may be encoded by a video encoder according to an embodiment of the invention and may then be stored in a storage medium. Specifically, an image signal may be stored in a DVD disc 12960 by a DVD recorder, or may be stored in a hard disk by a hard disk recorder 12950. As another example, the video signal may be stored in an SD card 12970. If the hard disk recorder 12950 includes a video decoding apparatus according to an embodiment of the invention, a video signal recorded on the DVD disc 12960, the SD card 12970, or another storage medium may be reproduced on the TV monitor 12880.
The auto-navigation system 12930 may not include the camera 12530, the camera interface 12630, and the image coding unit 12720 of Figure 31. For example, the computer 12100 and the TV receiver 12810 may likewise not include the camera 12530, the camera interface 12630, and the image coding unit 12720 of Figure 31.
Figure 33 is to show the cloud computing according to an embodiment of the invention using video encoder and video decoding apparatus The diagram of the network structure of system.
Cloud computing system may include cloud computing server 14000, customer data base (DB) 14100, multiple computing resources 14200 and user terminal.
In response to a request from a user terminal, the cloud computing system provides an on-demand outsourcing service of the multiple computing resources 14200 via a data communication network (for example, the internet). Under a cloud computing environment, a service provider provides users with desired services by using virtualization technology to combine computing resources at data centers located at physically different locations. A service user does not need to install computing resources (for example, applications, memory, an operating system (OS), and security) in his or her own terminal in order to use them, but may select and use desired services from among services in a virtual space produced through the virtualization technology, at a desired point in time.
A user terminal of a specified service user is connected to the cloud computing server 14000 via a data communication network including the internet and a mobile communication network. Cloud computing services, in particular a video reproduction service, may be provided from the cloud computing server 14000 to the user terminal. The user terminal may be any of various types of electronic devices that can be connected to the internet, for example, a desktop PC 14300, a smart TV 14400, a smart phone 14500, a notebook computer 14600, a portable media player (PMP) 14700, a tablet PC 14800, and the like.
Cloud computing server 14000, which can be combined, is distributed in multiple computing resources 14200 in cloud network, and to user terminal The result of combination is provided.The multiple computing resource 14200 may include various data services, and may include to upload from user terminal Data.As described above, cloud computing server 14000 can be by being distributed in the different areas according to virtual technology combination Video database to provide desired service to user terminal.
User information about users who have subscribed to the cloud computing service is stored in the user DB 14100. The user information may include log-in information, addresses, names, and personal credit information of the users. The user information may also include indexes of videos. Here, the indexes may include a list of videos that have already been reproduced, a list of videos that are being reproduced, the breakpoint of a video that was being reproduced, and the like.
Information about a video stored in the user DB 14100 may be shared between user devices. For example, when a video service is provided to the notebook computer 14600 in response to a request from the notebook computer 14600, the reproduction history of the video service is stored in the user DB 14100. When a request to reproduce this video service is received from the smart phone 14500, the cloud computing server 14000 searches for and reproduces this video service based on the user DB 14100. When the smart phone 14500 receives a video data stream from the cloud computing server 14000, the processing of reproducing video by decoding the video data stream is similar to the operation of the mobile phone 12500 described above with reference to Figure 30.
The cloud computing server 14000 may refer to the reproduction history of a desired video service stored in the user DB 14100. For example, the cloud computing server 14000 receives, from a user terminal, a request to reproduce a video stored in the user DB 14100. If this video was being reproduced before, the method of streaming this video performed by the cloud computing server 14000 may differ according to the request from the user terminal (that is, according to whether the video is to be reproduced from its starting point or from its breakpoint). For example, if the user terminal requests reproduction of the video from its starting point, the cloud computing server 14000 sends streaming data of the video beginning with the first frame of the video to the user terminal. If the user terminal requests reproduction of the video from its breakpoint, the cloud computing server 14000 sends streaming data of the video beginning with the frame corresponding to the breakpoint to the user terminal.
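The choice between streaming from the starting point and streaming from the breakpoint can be sketched as below. The `UserRecord` class, its `breakpoints` field, and the function name are hypothetical stand-ins for whatever the user DB 14100 actually stores; the sketch also assumes that a breakpoint is kept as a frame index and that a video with no recorded breakpoint starts from frame 0.

```python
from dataclasses import dataclass, field


@dataclass
class UserRecord:
    """Hypothetical per-user entry: video id -> frame index of the breakpoint."""
    breakpoints: dict = field(default_factory=dict)


def first_frame_to_stream(record: UserRecord, video_id: str, resume: bool) -> int:
    """Return the index of the first frame the server would send."""
    if resume:
        # Fall back to the first frame when no breakpoint was recorded.
        return record.breakpoints.get(video_id, 0)
    return 0  # the request asks for reproduction from the starting point


record = UserRecord(breakpoints={"concert": 1800})
```

With this record, a request to resume "concert" starts at frame 1800, while a request to reproduce it from the starting point starts at frame 0.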
In the case, user terminal may include the video decoding apparatus described as described above with Figure 1A to Figure 26.As separately One example, user terminal may include the video encoder described as described above with Figure 1A to Figure 26.Optionally, user terminal It may include as described above with both Figure 1A to Figure 26 video decoding apparatus described and video encoder.
Various applications of the video coding method, video decoding method, video encoder, and video decoding apparatus according to the embodiments of the invention described above with reference to Figure 1A to Figure 26 have been described above with reference to Figure 27 to Figure 33. However, methods of storing the video coding method and video decoding method in a storage medium, or methods of implementing the video encoder and video decoding apparatus in a device, according to various embodiments of the invention, are not limited to the embodiments described above with reference to Figure 27 to Figure 33.
Although the invention has been specifically shown and described with reference to embodiments thereof, those of ordinary skill in the art will understand that various changes in form and detail may be made therein without departing from the spirit and scope of the invention as defined by the appended claims. The embodiments should be considered in a descriptive sense only and not for purposes of limitation. Therefore, the scope of the invention is defined not by the detailed description of the invention but by the appended claims, and all differences within that scope are to be construed as being included in the invention.

Claims (12)

1. A prediction encoding method for encoding a multiview video, the prediction encoding method comprising:
determining, from among same-view reconstructed pictures reconstructed before a current picture, a reference picture set including at least one short-term reconstructed picture having a reproduction order different from the reproduction order of the current picture and a reference picture set including at least one long-term reconstructed picture, and determining, from among different-view reconstructed pictures reconstructed before the current picture, a reference picture set including at least one short-term reconstructed picture having the same reproduction order as the current picture;
determining at least one reference list from between a first reference list and a second reference list, wherein the first reference list includes, from among the determined plurality of reference picture sets, at least one reconstructed picture having the same view as the current picture and a reproduction order earlier than the reproduction order of the current picture, and at least one reconstructed picture having the same reproduction order as the current picture and a view identifier (VID) smaller than the VID of the current picture, and the second reference list includes at least one reconstructed picture having the same view as the current picture and a reproduction order later than the reproduction order of the current picture, and at least one reconstructed picture having the same reproduction order as the current picture and a VID greater than the VID of the current picture;
determining at least one reference picture and a reference block for a current block of the current picture by using the determined at least one reference list; and
performing, on the current block by using the reference block, at least one prediction selected from inter prediction and inter-view prediction,
wherein the determining of the reference picture set comprises:
determining whether one of the plurality of reference picture sets determined for the current picture is used in a current slice;
if it is determined that one of the plurality of reference picture sets determined for the current picture is used, selecting an index from among the plurality of reference picture sets; and
if it is determined that one of the plurality of reference picture sets determined for the current picture is not used, determining a reference picture set for the current slice.
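As an illustrative, non-limiting sketch of how the first and second reference lists recited in claim 1 could be populated, the following Python fragment groups reconstructed pictures by reproduction order and VID. The names `Picture` and `build_reference_lists` are hypothetical, and the nearest-first ordering inside each list is an assumption made for the example; the claim states only which pictures each list includes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Picture:
    poc: int  # reproduction (output) order of the picture
    vid: int  # view identifier (VID) of the picture


def build_reference_lists(current, reconstructed):
    """Return (first_list, second_list) for `current`.

    `reconstructed` holds pictures reconstructed before the current picture.
    Ordering within each group (nearest first) is an assumption of this sketch.
    """
    # Same view, different reproduction order: candidates for inter prediction.
    same_view = [p for p in reconstructed
                 if p.vid == current.vid and p.poc != current.poc]
    # Different view, same reproduction order: candidates for inter-view prediction.
    same_time = [p for p in reconstructed
                 if p.poc == current.poc and p.vid != current.vid]

    earlier = sorted((p for p in same_view if p.poc < current.poc),
                     key=lambda p: current.poc - p.poc)
    later = sorted((p for p in same_view if p.poc > current.poc),
                   key=lambda p: p.poc - current.poc)
    smaller_vid = sorted((p for p in same_time if p.vid < current.vid),
                         key=lambda p: current.vid - p.vid)
    larger_vid = sorted((p for p in same_time if p.vid > current.vid),
                        key=lambda p: p.vid - current.vid)

    first_list = earlier + smaller_vid   # earlier pictures, then smaller VIDs
    second_list = later + larger_vid     # later pictures, then larger VIDs
    return first_list, second_list
```

A picture that differs from the current picture in both view and reproduction order falls into neither list in this sketch, which matches the membership conditions stated in the claim.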
2. The prediction encoding method of claim 1, wherein the determining of the reference picture set comprises:
determining, from among the same-view reconstructed pictures reconstructed before the current picture, a reference picture set including at least one non-reference reconstructed picture having a reproduction order different from the reproduction order of the current picture; and
determining, from among the different-view reconstructed pictures reconstructed before the current picture, a reference picture set including at least one non-reference reconstructed picture having the same reproduction order as the current picture.
3. The prediction encoding method of claim 1, wherein the determining of the reference picture set for the current slice comprises:
determining a first number of pictures having a VID smaller than the VID of the current view and a second number of pictures having a VID greater than the VID of the current view; and
determining a difference between the VIDs of the pictures having a VID smaller than the VID of the current view and a difference between the VIDs of the pictures having a VID greater than the VID of the current view.
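The counts and VID differences recited in claim 3 can be pictured with a short Python sketch. The function names, the dictionary keys, and in particular the convention that each difference is taken against the previously listed VID (starting from the current view's VID) are assumptions made for illustration; the claim does not fix how the differences are computed or signaled.

```python
def describe_interview_rps(current_vid, ref_vids):
    """Summarize inter-view reference VIDs as counts plus VID differences.

    Assumption: `ref_vids` are distinct, and each difference is measured from
    the previously listed VID, starting at the current view's VID.
    """
    smaller = sorted((v for v in ref_vids if v < current_vid), reverse=True)
    larger = sorted(v for v in ref_vids if v > current_vid)

    def deltas(vids):
        out, prev = [], current_vid
        for v in vids:
            out.append(abs(v - prev))
            prev = v
        return out

    return {
        "num_smaller": len(smaller),   # first number in the claim
        "num_larger": len(larger),     # second number in the claim
        "delta_smaller": deltas(smaller),
        "delta_larger": deltas(larger),
    }


def recover_interview_vids(current_vid, params):
    """Rebuild the reference VIDs from the counts and differences above."""
    vids, v = [], current_vid
    for d in params["delta_smaller"][:params["num_smaller"]]:
        v -= d
        vids.append(v)
    v = current_vid
    for d in params["delta_larger"][:params["num_larger"]]:
        v += d
        vids.append(v)
    return vids
```

Under these assumptions the two functions are inverses of each other, so a decoder could recover the same set of inter-view reference VIDs from the signaled values.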
4. The prediction encoding method of claim 1, wherein the determining of the at least one reference list comprises:
determining whether a reference order of reference indices of the determined at least one reference list is selectively changed in the current slice; and
when it is determined that the reference order is selectively changed in the current slice, selectively changing the reference order of the reference indices of the at least one reference list for the current slice belonging to the current picture.
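A minimal sketch of the selective reordering in claim 4 follows, assuming the change is expressed as a per-slice flag plus a list of positions into the initial reference list. Both the representation and the name `apply_reference_order` are hypothetical.

```python
def apply_reference_order(initial_list, change_flag, new_order=None):
    """Return the reference list to use for the current slice.

    `change_flag` models the per-slice decision of whether the reference order
    is changed; `new_order` lists positions into `initial_list` (an assumption).
    """
    if not change_flag:
        # Reference order is kept: reference index i points at initial_list[i].
        return list(initial_list)
    if new_order is None:
        raise ValueError("new_order is required when change_flag is set")
    # Reference index i now points at initial_list[new_order[i]].
    return [initial_list[i] for i in new_order]
```

For example, `apply_reference_order(["a", "b", "c"], True, [2, 0, 1])` yields a list in which reference index 0 refers to the picture that was originally third.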
5. The prediction encoding method of claim 1, wherein:
when inter prediction is performed, a reference picture and a reference block are determined from at least one reconstructed picture selected from among the at least one reconstructed picture that is included in the first reference list and has a reproduction order earlier than the reproduction order of the current picture and the at least one reconstructed picture that is included in the second reference list and has a reproduction order later than the reproduction order of the current picture; inter prediction is performed on the current block by using the determined reference block; and first residual data of the current block generated by performing the inter prediction, a first motion vector indicating the determined reference block, and a first reference index indicating the determined reference picture are determined; and
when inter-view prediction is performed, a reference picture and a reference block are determined from at least one reconstructed picture selected from among the at least one reconstructed picture that is included in the first reference list and has a VID smaller than the VID of the current picture and the at least one reconstructed picture that is included in the second reference list and has a VID greater than the VID of the current picture; inter-view prediction is performed on the current block by using the determined reference block; and second residual data of the current block generated by performing the inter-view prediction, a second disparity vector indicating the determined reference block, and a second reference index indicating the determined reference picture are determined.
6. A prediction decoding method for decoding a multiview video, the prediction decoding method comprising:
determining, from among same-view reconstructed pictures reconstructed before a current picture, a reference picture set including at least one short-term reconstructed picture having a reproduction order different from the reproduction order of the current picture and a reference picture set including at least one long-term reconstructed picture, and determining, from among different-view reconstructed pictures reconstructed before the current picture, a reference picture set including at least one short-term reconstructed picture having the same reproduction order as the current picture;
determining at least one reference list from between a first reference list and a second reference list, wherein the first reference list includes, from among the determined plurality of reference picture sets, at least one reconstructed picture having the same view as the current picture and a reproduction order earlier than the reproduction order of the current picture, and at least one reconstructed picture having the same reproduction order as the current picture and a VID smaller than the VID of the current picture, and the second reference list includes at least one reconstructed picture having the same view as the current picture and a reproduction order later than the reproduction order of the current picture, and at least one reconstructed picture having the same reproduction order as the current picture and a VID greater than the VID of the current picture;
determining at least one reference picture and a reference block for a current block of the current picture by using the determined at least one reference list; and
performing, on the current block by using the reference block, at least one compensation selected from motion compensation and disparity compensation,
wherein the determining of the reference picture set comprises:
determining whether one of the plurality of reference picture sets determined for the current picture is used in a current slice;
if it is determined that one of the plurality of reference picture sets determined for the current picture is used, selecting an index from among the plurality of reference picture sets; and
if it is determined that one of the plurality of reference picture sets determined for the current picture is not used, determining a reference picture set for the current slice.
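The slice-level choice at the end of claim 6 (reuse one of the reference picture sets determined for the current picture by index, or determine a set specifically for the current slice) can be sketched as below. The function and parameter names are hypothetical, and how the flag, index, and slice-specific set would be carried in a bitstream is not modeled.

```python
def select_slice_rps(picture_rps_sets, use_picture_rps,
                     rps_index=None, slice_rps=None):
    """Pick the reference picture set used by the current slice.

    picture_rps_sets: candidate sets already determined for the current picture.
    use_picture_rps:  True if the slice reuses one of those candidates.
    rps_index:        index of the reused candidate (needed when reusing).
    slice_rps:        set determined for this slice (needed otherwise).
    """
    if use_picture_rps:
        if rps_index is None or not 0 <= rps_index < len(picture_rps_sets):
            raise ValueError("a valid rps_index is required")
        return picture_rps_sets[rps_index]
    if slice_rps is None:
        raise ValueError("slice_rps is required when no candidate is reused")
    return slice_rps
```

Reusing a candidate by index keeps the per-slice overhead to a single index, while the fallback branch covers slices whose reference pictures match none of the candidates.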
7. The prediction decoding method of claim 6, wherein the determining of the reference picture set comprises:
determining, from among the same-view reconstructed pictures reconstructed before the current picture, a reference picture set including at least one non-reference reconstructed picture having a reproduction order different from the reproduction order of the current picture; and
determining, from among the different-view reconstructed pictures reconstructed before the current picture, a reference picture set including at least one non-reference reconstructed picture having the same reproduction order as the current picture.
8. The prediction decoding method of claim 6, wherein the determining of the reference picture set for the current slice comprises:
determining a first number of pictures having a VID smaller than the VID of the current view and a second number of pictures having a VID greater than the VID of the current view; and
determining a difference between the VIDs of the pictures having a VID smaller than the VID of the current view and a difference between the VIDs of the pictures having a VID greater than the VID of the current view.
9. The prediction decoding method of claim 6, wherein the determining of the at least one reference list comprises:
determining whether a reference order of reference indices of the determined at least one reference list is selectively changed in the current slice; and
when it is determined that the reference order is selectively changed in the current slice, selectively changing the reference order of the reference indices of the at least one reference list for the current slice belonging to the current picture.
10. The prediction decoding method of claim 6, wherein the performing of the at least one compensation selected from motion compensation and disparity compensation comprises:
receiving a reference index, residual data, and a motion vector or a disparity vector of the current block of the current picture;
determining a reference picture indicated by the reference index from at least one reconstructed picture selected from among the at least one reconstructed picture that is included in the first reference list and has a reproduction order earlier than the reproduction order of the current picture and the at least one reconstructed picture that is included in the second reference list and has a reproduction order later than the reproduction order of the current picture;
determining, from the determined reference picture, a reference block indicated by the motion vector or the disparity vector of the current block; and
compensating the determined reference block with the residual data.
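The final two steps of claim 10 (locating the reference block through the motion or disparity vector, then compensating it with the residual data) can be sketched as follows. The sketch assumes integer-sample vectors, 8-bit samples stored as nested lists, and a reference block that lies entirely inside the reference picture; the name `compensate_block` is hypothetical, and sub-sample interpolation is not modeled.

```python
def compensate_block(ref_picture, x, y, vector, residual):
    """Reconstruct a block from a displaced reference block plus residual.

    ref_picture: 2-D list of samples of the reference picture.
    (x, y):      top-left position of the current block.
    vector:      (dx, dy) motion or disparity vector in whole samples.
    residual:    2-D list of residual values, same size as the block.
    """
    dx, dy = vector
    height, width = len(residual), len(residual[0])
    block = []
    for r in range(height):
        row = []
        for c in range(width):
            # Sample of the reference block indicated by the vector.
            predicted = ref_picture[y + dy + r][x + dx + c]
            # Add the residual and clip to the assumed 8-bit sample range.
            row.append(max(0, min(255, predicted + residual[r][c])))
        block.append(row)
    return block
```

The same routine serves motion compensation and disparity compensation in this sketch; the only difference is whether the reference picture comes from the same view at another reproduction order or from another view at the same reproduction order.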
11. A prediction encoding apparatus for encoding a multiview video, the prediction encoding apparatus comprising:
a reference picture set determiner configured to determine, from among same-view reconstructed pictures reconstructed before a current picture, a reference picture set including at least one short-term reconstructed picture having a reproduction order different from the reproduction order of the current picture and a reference picture set including at least one long-term reconstructed picture, and to determine, from among different-view reconstructed pictures reconstructed before the current picture, a reference picture set including at least one short-term reconstructed picture having the same reproduction order as the current picture;
a reference list determiner configured to determine at least one reference list from between a first reference list and a second reference list, wherein the first reference list includes, from among the determined plurality of reference picture sets, at least one reconstructed picture having the same view as the current picture and a reproduction order earlier than the reproduction order of the current picture, and at least one reconstructed picture having the same reproduction order as the current picture and a VID smaller than the VID of the current picture, and the second reference list includes at least one reconstructed picture having the same view as the current picture and a reproduction order later than the reproduction order of the current picture, and at least one reconstructed picture having the same reproduction order as the current picture and a VID greater than the VID of the current picture; and
a predictor configured to determine at least one reference picture and a reference block for a current block of the current picture by using the determined at least one reference list, and to perform, on the current block by using the reference block, at least one prediction selected from inter prediction and inter-view prediction,
wherein the reference picture set determiner is configured to:
determine whether one of the plurality of reference picture sets determined for the current picture is used in a current slice;
if it is determined that one of the plurality of reference picture sets determined for the current picture is used, select an index from among the plurality of reference picture sets; and
if it is determined that one of the plurality of reference picture sets determined for the current picture is not used, determine a reference picture set for the current slice.
12. A prediction decoding apparatus for decoding a multiview video, the prediction decoding apparatus comprising:
a reference picture set determiner configured to determine, from among same-view reconstructed pictures reconstructed before a current picture, a reference picture set including at least one short-term reconstructed picture having a reproduction order different from the reproduction order of the current picture and a reference picture set including at least one long-term reconstructed picture, and to determine, from among different-view reconstructed pictures reconstructed before the current picture, a reference picture set including at least one short-term reconstructed picture having the same reproduction order as the current picture;
a reference list determiner configured to determine at least one reference list from between a first reference list and a second reference list, wherein the first reference list includes, from among the determined plurality of reference picture sets, at least one reconstructed picture having the same view as the current picture and a reproduction order earlier than the reproduction order of the current picture, and at least one reconstructed picture having the same reproduction order as the current picture and a VID smaller than the VID of the current picture, and the second reference list includes at least one reconstructed picture having the same view as the current picture and a reproduction order later than the reproduction order of the current picture, and at least one reconstructed picture having the same reproduction order as the current picture and a VID greater than the VID of the current picture; and
a compensator configured to determine at least one reference picture and a reference block for a current block of the current picture by using the determined at least one reference list, and to perform, on the current block by using the reference block, at least one compensation selected from motion compensation and disparity compensation,
wherein the reference picture set determiner is configured to:
determine whether one of the plurality of reference picture sets determined for the current picture is used in a current slice;
if it is determined that one of the plurality of reference picture sets determined for the current picture is used, select an index from among the plurality of reference picture sets; and
if it is determined that one of the plurality of reference picture sets determined for the current picture is not used, determine a reference picture set for the current slice.
CN201380033884.4A 2012-04-25 2013-04-25 Multiview video decoding method using reference picture set for multiview video prediction, and device therefor Expired - Fee Related CN104396252B (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US201261638101P 2012-04-25 2012-04-25
US61/638,101 2012-04-25
PCT/KR2013/003579 WO2013162311A1 (en) 2012-04-25 2013-04-25 Multiview video encoding method using reference picture set for multiview video prediction and device therefor, and multiview video decoding method using reference picture set for multiview video prediction and device therefor

Publications (2)

Publication Number Publication Date
CN104396252A CN104396252A (en) 2015-03-04
CN104396252B true CN104396252B (en) 2018-05-04

Family

ID=49483525

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201380033884.4A Expired - Fee Related CN104396252B (en) 2012-04-25 2013-04-25 Multiview video decoding method using reference picture set for multiview video prediction, and device therefor

Country Status (5)

Country Link
US (1) US20150124877A1 (en)
EP (1) EP2843946A4 (en)
KR (1) KR102106536B1 (en)
CN (1) CN104396252B (en)
WO (1) WO2013162311A1 (en)

Families Citing this family (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20130119379A (en) * 2012-04-23 2013-10-31 삼성전자주식회사 Method and apparatus for multiview video encoding using reference picture list for multiview video prediction, method and apparatus for multiview video decoding using reference picture list for multiview video prediction
US9357195B2 (en) * 2012-08-16 2016-05-31 Qualcomm Incorporated Inter-view predicted motion vector for 3D video
WO2014075236A1 (en) * 2012-11-14 2014-05-22 Mediatek Singapore Pte. Ltd. Methods for residual prediction with pseudo residues in 3d video coding
WO2015102439A1 (en) 2014-01-03 2015-07-09 삼성전자 주식회사 Method and apparatus for managing buffer for encoding and decoding multi-layer video
US10715833B2 (en) * 2014-05-28 2020-07-14 Apple Inc. Adaptive syntax grouping and compression in video data using a default value and an exception value
US10368084B2 (en) * 2014-11-27 2019-07-30 Kt Corporation Video signal processing method and device
WO2016085231A1 (en) 2014-11-27 2016-06-02 주식회사 케이티 Video signal processing method and device
US20170006219A1 (en) 2015-06-30 2017-01-05 Gopro, Inc. Image stitching in a multi-camera array
US9992502B2 (en) 2016-01-29 2018-06-05 Gopro, Inc. Apparatus and methods for video compression using multi-resolution scalable coding
US10291910B2 (en) 2016-02-12 2019-05-14 Gopro, Inc. Systems and methods for spatially adaptive video encoding
US10484621B2 (en) 2016-02-29 2019-11-19 Gopro, Inc. Systems and methods for compressing video content
US10645362B2 (en) 2016-04-11 2020-05-05 Gopro, Inc. Systems, methods and apparatus for compressing video content
US10163030B2 (en) 2016-05-20 2018-12-25 Gopro, Inc. On-camera image processing based on image activity data
US10462466B2 (en) 2016-06-20 2019-10-29 Gopro, Inc. Systems and methods for spatially selective video coding
US10553029B1 (en) 2016-09-30 2020-02-04 Amazon Technologies, Inc. Using reference-only decoding of non-viewed sections of a projected video
US10609356B1 (en) * 2017-01-23 2020-03-31 Amazon Technologies, Inc. Using a temporal enhancement layer to encode and decode stereoscopic video content
US10198862B2 (en) 2017-01-23 2019-02-05 Gopro, Inc. Methods and apparatus for providing rotated spherical viewpoints
MX2021001743A (en) * 2018-08-17 2021-06-23 Huawei Tech Co Ltd Reference picture management in video coding.
EP3854099A4 (en) * 2018-09-21 2022-06-29 Sharp Kabushiki Kaisha Systems and methods for signaling reference pictures in video coding
EP3629584A1 (en) * 2018-09-25 2020-04-01 Koninklijke Philips N.V. Apparatus and method for generating and rendering a video stream
CN113826390B (en) * 2019-05-16 2024-03-08 字节跳动有限公司 Intra-frame block copy for screen content codec

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20080022063A (en) * 2006-09-05 2008-03-10 엘지전자 주식회사 Method for decoding a video signal and apparatus for implementing the same
KR20090006132A (en) * 2006-04-04 2009-01-14 퀄컴 인코포레이티드 Apparatus and method of enhanced frame interpolation in video compression
CN101455084A (en) * 2006-03-30 2009-06-10 Lg电子株式会社 A method and apparatus for decoding/encoding a video signal
CN101653001A (en) * 2006-10-13 2010-02-17 汤姆逊许可公司 Reference picture list management syntax for multiple view video coding
KR20120027194A (en) * 2009-04-21 2012-03-21 엘지전자 주식회사 Method and apparatus for processing multi-view video signal

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2008023967A1 (en) * 2006-08-25 2008-02-28 Lg Electronics Inc A method and apparatus for decoding/encoding a video signal
US8948256B2 (en) * 2006-10-13 2015-02-03 Thomson Licensing Reference picture list management syntax for multiple view video coding
KR100941608B1 (en) * 2006-10-17 2010-02-11 경희대학교 산학협력단 Method for encoding and decoding a multi-view video and apparatus therefor
KR100902353B1 (en) * 2007-11-16 2009-06-12 광주과학기술원 Device and Method for estimating death map, Method for making intermediate view and Encoding multi-view using the same
IN2014KN00990A (en) * 2011-11-11 2015-10-09 Fraunhofer Ges Forschung
US9258559B2 (en) * 2011-12-20 2016-02-09 Qualcomm Incorporated Reference picture list construction for multi-view and three-dimensional video coding
ES2629744T3 (en) * 2012-01-17 2017-08-14 Telefonaktiebolaget Lm Ericsson (Publ) Management of reference image lists
KR20130119379A (en) * 2012-04-23 2013-10-31 삼성전자주식회사 Method and apparatus for multiview video encoding using reference picture list for multiview video prediction, method and apparatus for multiview video decoding using reference picture list for multiview video prediction

Also Published As

Publication number Publication date
CN104396252A (en) 2015-03-04
EP2843946A1 (en) 2015-03-04
KR102106536B1 (en) 2020-05-06
US20150124877A1 (en) 2015-05-07
WO2013162311A1 (en) 2013-10-31
KR20130120423A (en) 2013-11-04
EP2843946A4 (en) 2016-01-20

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20180504
