CN107005710A - Multi-view image coding/decoding method and device - Google Patents
- Publication number: CN107005710A
- Application number: CN201580065653.0A
- Authority
- CN
- China
- Prior art keywords
- frame
- current block
- coding
- sample
- mode
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
All within H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals:
- H04N19/103—Selection of coding mode or of prediction mode
- H04N19/105—Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
- H04N19/11—Selection of coding mode or of prediction mode among a plurality of spatial predictive coding modes
- H04N19/114—Adapting the group of pictures [GOP] structure, e.g. number of B-frames between two anchor frames
- H04N19/132—Sampling, masking or truncation of coding units, e.g. adaptive resampling, frame skipping, frame interpolation or high-frequency transform coefficient masking
- H04N19/136—Incoming video signal characteristics or properties
- H04N19/159—Prediction type, e.g. intra-frame, inter-frame or bidirectional frame prediction
- H04N19/176—Adaptive coding characterised by the coding unit, the unit being an image region, the region being a block, e.g. a macroblock
- H04N19/182—Adaptive coding characterised by the coding unit, the unit being a pixel
- H04N19/184—Adaptive coding characterised by the coding unit, the unit being bits, e.g. of the compressed video stream
- H04N19/40—Video transcoding, i.e. partial or full decoding of a coded input stream followed by re-encoding of the decoded output stream
- H04N19/597—Predictive coding specially adapted for multi-view video sequence encoding
- H04N19/70—Characterised by syntax aspects related to video coding, e.g. related to compression standards
Abstract
Provided is a method of decoding a multi-view image, the method including: obtaining an intra skip flag from a bitstream, the intra skip flag indicating whether a current block included in a depth image of the multi-view image is to be reconstructed based on an intra skip mode; when the intra skip flag indicates that the current block is to be reconstructed based on the intra skip mode, obtaining intra skip prediction mode information from the bitstream, the intra skip prediction mode information indicating an intra prediction mode, among a plurality of intra prediction modes, to be used for the current block; determining predicted values of samples included in the current block according to the intra prediction method indicated by the intra skip prediction mode information; and reconstructing the current block by determining reconstructed values of the samples based on the predicted values of the samples.
Description
Technical field
The present invention relates to video encoding and decoding methods based on prediction of multi-view images and, more particularly, to multi-view video encoding and decoding methods that encode and decode an image without using residual data.
Background art
As hardware for reproducing and storing high-resolution or high-quality video content is developed and supplied, the need for a video codec capable of effectively encoding or decoding high-resolution or high-quality video content is increasing. According to a conventional video codec, video is encoded according to a limited encoding method based on macroblocks of a predetermined size.
Image data in the spatial domain is transformed into coefficients in the frequency domain via frequency transformation. According to a video codec, an image is split into blocks of a predetermined size, a discrete cosine transform (DCT) is performed on each block, and frequency coefficients are encoded in units of blocks, so that the frequency transformation can be computed quickly. Compared with image data in the spatial domain, coefficients in the frequency domain are easy to compress. In particular, because image pixel values in the spatial domain are expressed via the video codec as prediction errors of inter prediction or intra prediction, a large amount of data may be transformed into zeros when the frequency transformation is performed on the prediction errors. By replacing continuously and repeatedly generated data with small-sized data, the amount of data of an image can be reduced.
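The spatial-to-frequency transformation described above can be illustrated with a one-dimensional type-II DCT, shown here unnormalized for simplicity (a sketch, not the codec's actual transform): a flat prediction-error signal concentrates all of its energy in the first coefficient, leaving the remaining coefficients at zero, which is what makes the transformed data easy to compress.

```python
import math

def dct_1d(samples):
    """Unnormalized type-II DCT of a list of spatial-domain samples."""
    n = len(samples)
    return [sum(s * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                for i, s in enumerate(samples))
            for k in range(n)]
```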
A multi-view video codec encodes and decodes a base-view image and one or more dependent-view images. The base-view image and the one or more dependent-view images each include a texture picture and a depth picture corresponding to a viewpoint. The amount of data of the multi-view image may be reduced in such a way as to remove the redundancy between the base-view image and the one or more dependent-view images and the redundancy between the texture pictures and the depth pictures.
A depth image is an image representing the distance between a viewpoint and an object, and is used to synthesize texture images for certain viewpoints. Unlike a texture image, a depth image is not directly viewed by humans, and thus humans have difficulty recognizing distortion of a depth image.
In general, an encoding method with high coding efficiency has a high distortion rate. Therefore, an encoding method with a distortion rate recognizable by humans is generally not used, even if the method has high coding efficiency. However, because humans have difficulty recognizing distortion of a depth image, an encoding method with high coding efficiency may be used to encode the multi-view image in order to improve overall coding efficiency. Accordingly, an encoding method whose coding efficiency is higher than that of the method of encoding texture images may be proposed as the method of encoding depth images.
Detailed description of the invention
Technical problem
Texture images and depth images for a plurality of viewpoints are needed to obtain a three-dimensional (3D) image. Accordingly, the amount of data needed to store, transmit, and reproduce a 3D image is larger than the amount of data needed to store, transmit, and reproduce a two-dimensional (2D) image. Therefore, various encoding methods need to be developed to reduce the amount of data of a 3D image.
Technical solution
In one embodiment, a method of decoding a multi-view image includes: obtaining an intra skip flag from a bitstream, the intra skip flag indicating whether a current block included in a depth image of the multi-view image is to be reconstructed based on an intra skip mode; when the intra skip flag indicates that the current block is to be reconstructed based on the intra skip mode, obtaining intra skip prediction mode information from the bitstream, the intra skip prediction mode information indicating an intra prediction mode, among a plurality of intra prediction modes, to be used for the current block; determining predicted values of samples included in the current block according to the intra prediction method indicated by the intra skip prediction mode information; and reconstructing the current block by determining reconstructed values of the samples based on the predicted values of the samples.
In one embodiment, the method may further include obtaining an intra skip enable flag, the intra skip enable flag indicating whether the intra skip mode is usable for a parent image data unit including the depth image. The obtaining of the intra skip flag may include obtaining the intra skip flag when the intra skip enable flag indicates that the intra skip mode is usable for the parent image data unit.
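The flag-gating order described above can be sketched as follows. The bitstream is mocked as a list of already-entropy-decoded symbols, and the function and dictionary names are illustrative assumptions, not syntax element names from the patent: the intra skip flag is read only when the enable flag of the parent image data unit permits the intra skip mode, and the prediction mode information is read only when the intra skip flag is set.

```python
def parse_intra_skip_syntax(symbols, intra_skip_enabled):
    """Return the intra-skip syntax elements present for one block."""
    stream = iter(symbols)
    if not intra_skip_enabled:
        # The flag is absent from the bitstream; the block cannot use intra skip.
        return {"intra_skip_flag": 0}
    parsed = {"intra_skip_flag": next(stream)}
    if parsed["intra_skip_flag"]:
        # Only now is the prediction mode information present in the stream.
        parsed["intra_skip_mode"] = next(stream)
    return parsed
```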
In one embodiment, the intra skip prediction mode information may indicate a horizontal mode, a vertical mode, a horizontal single mode, or a vertical single mode.
In one embodiment, the determining of the predicted values of the samples may include: when the intra skip prediction mode information indicates the horizontal mode, determining the predicted value of a sample included in the current block to be equal to the value of the sample, among samples adjacent to the left side of the current block, located in the same row as the sample included in the current block; when the intra skip prediction mode information indicates the vertical mode, determining the predicted value of a sample included in the current block to be equal to the value of the sample, among samples adjacent to the upper side of the current block, located in the same column as the sample included in the current block; when the intra skip prediction mode information indicates the horizontal single mode, determining the predicted values of the samples included in the current block to be equal to the value of a sample located at a predetermined position among the samples adjacent to the left side of the current block; and when the intra skip prediction mode information indicates the vertical single mode, determining the predicted values of the samples included in the current block to be equal to the value of a sample located at a predetermined position among the samples adjacent to the upper side of the current block.
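The four prediction rules above can be sketched as follows, under two assumptions the text leaves open: neighboring reconstructed samples are given as 1-D lists, and the "predetermined position" used by the single modes is taken here to be the middle neighbor.

```python
def predict_block(mode, left, above, size):
    """Return a size x size block of predicted values, indexed [y][x].

    left[y]  -- reconstructed neighbor to the left of row y
    above[x] -- reconstructed neighbor above column x
    """
    if mode == "horizontal":
        # Every sample in row y copies the left neighbor of row y.
        return [[left[y]] * size for y in range(size)]
    if mode == "vertical":
        # Every sample in column x copies the neighbor above column x.
        return [list(above[:size]) for _ in range(size)]
    if mode == "horizontal_single":
        # The whole block copies one left neighbor (assumed: the middle one).
        return [[left[size // 2]] * size for _ in range(size)]
    if mode == "vertical_single":
        # The whole block copies one above neighbor (assumed: the middle one).
        return [[above[size // 2]] * size for _ in range(size)]
    raise ValueError("unknown intra skip mode: " + mode)
```

Because no residual data is transmitted under the intra skip mode, these predicted values also serve directly as the reconstructed values.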
In one embodiment, an apparatus for decoding a multi-view image includes an intra skip flag obtainer, an intra skip prediction mode information obtainer, a predicted value determiner, and a reconstructor, wherein: the intra skip flag obtainer is configured to obtain an intra skip flag from a bitstream, the intra skip flag indicating whether a current block included in a depth image of the multi-view image is to be reconstructed based on an intra skip mode; the intra skip prediction mode information obtainer is configured to obtain intra skip prediction mode information from the bitstream when the intra skip flag indicates that the current block is to be reconstructed according to the intra skip mode, the intra skip prediction mode information indicating an intra prediction mode, among a plurality of intra prediction modes, to be used for the current block; the predicted value determiner is configured to determine predicted values of samples included in the current block according to the intra prediction method indicated by the intra skip prediction mode information; and the reconstructor is configured to determine reconstructed values of the samples based on the predicted values of the samples.
In one embodiment, a method of encoding a multi-view image includes: determining a method of encoding a current block included in a depth image of the multi-view image; generating, based on the determined encoding method, an intra skip flag indicating whether the current block is encoded according to an intra skip mode; when the current block is encoded according to the intra skip mode, generating intra skip prediction mode information based on the determined encoding method, the intra skip prediction mode information indicating an intra prediction mode, among a plurality of intra prediction modes, used to predict the current block; and transmitting a bitstream including the intra skip flag and the intra skip prediction mode information. The transmitting of the bitstream may include transmitting a bitstream that does not include residual data of the current block.
In one embodiment, the method may further include determining an intra skip enable flag indicating whether the intra skip mode is usable for a parent image data unit including the depth image. The generating of the intra skip flag may include generating the intra skip flag when the intra skip enable flag indicates that the intra skip mode is usable for the parent image data unit.
In one embodiment, the intra skip prediction mode information may indicate a horizontal mode, a vertical mode, a horizontal single mode, or a vertical single mode.
In one embodiment, the horizontal mode may be an intra prediction mode in which the predicted value of a sample included in the current block is determined to be equal to the value of the sample, among samples adjacent to the left side of the current block, located in the same row as the sample included in the current block; the vertical mode may be an intra prediction mode in which the predicted value of a sample included in the current block is determined to be equal to the value of the sample, among samples adjacent to the upper side of the current block, located in the same column as the sample included in the current block; the horizontal single mode may be an intra prediction mode in which the predicted values of the samples included in the current block are determined to be equal to the value of a sample located at a predetermined position among the samples adjacent to the left side of the current block; and the vertical single mode may be an intra prediction mode in which the predicted values of the samples included in the current block are determined to be equal to the value of a sample located at a predetermined position among the samples adjacent to the upper side of the current block.
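How an encoder chooses among these four modes is not pinned down by the text above. One natural sketch, an assumption rather than the patent's stated method, is to pick the mode whose prediction best matches the original block under a sum-of-absolute-differences criterion, since no residual will be transmitted to correct the prediction. The "predetermined position" of the single modes is again assumed to be the middle neighbor.

```python
def predict(mode, left, above, n):
    """Predicted n x n block for one of the four intra-skip modes."""
    if mode == "horizontal":
        return [[left[y]] * n for y in range(n)]
    if mode == "vertical":
        return [list(above[:n]) for _ in range(n)]
    if mode == "horizontal_single":
        return [[left[n // 2]] * n for _ in range(n)]
    return [[above[n // 2]] * n for _ in range(n)]  # vertical_single

def select_intra_skip_mode(block, left, above):
    """Return (mode, sad): the mode minimizing the sum of absolute differences."""
    n = len(block)
    def sad(pred):
        return sum(abs(p - b)
                   for prow, brow in zip(pred, block)
                   for p, b in zip(prow, brow))
    modes = ("horizontal", "vertical", "horizontal_single", "vertical_single")
    return min(((m, sad(predict(m, left, above, n))) for m in modes),
               key=lambda t: t[1])
```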
In one embodiment, an apparatus for encoding a multi-view image includes an encoding method determiner, an intra skip flag generator, an intra skip prediction mode information generator, and an encoding information transmitter, wherein: the encoding method determiner is configured to determine a method of encoding a current block included in a depth image of the multi-view image; the intra skip flag generator is configured to generate, based on the determined encoding method, an intra skip flag indicating whether the current block is encoded according to an intra skip mode; the intra skip prediction mode information generator is configured to generate intra skip prediction mode information based on the determined encoding method when the current block is encoded according to the intra skip mode, the intra skip prediction mode information indicating an intra prediction mode, among a plurality of intra prediction modes, used to predict the current block; and the encoding information transmitter is configured to transmit a bitstream including the intra skip flag and the intra skip prediction mode information. The transmitted bitstream includes a bitstream that does not include residual data of the current block.
Also provided is a non-transitory computer-readable recording medium having recorded thereon a program for performing the method of decoding a multi-view image and the method of encoding a multi-view image.
Beneficial effects of the present invention
In the intra skip mode, most of the encoding information of the depth image is skipped. Therefore, the efficiency of encoding the depth image can be improved by using the intra skip mode.
Brief description of the drawings
Figure 1A is a block diagram of a video decoding apparatus according to an embodiment.
Figure 1B is a flowchart of a video decoding method according to an embodiment.
Figure 2A is a block diagram of a video encoding apparatus according to an embodiment.
Figure 2B is a flowchart of a video encoding method according to an embodiment.
Figure 3A is a diagram illustrating a horizontal-mode prediction method according to an embodiment.
Figure 3B is a diagram illustrating a vertical-mode prediction method according to an embodiment.
Figure 4A is a diagram illustrating a horizontal single-mode prediction method according to an embodiment.
Figure 4B is a diagram illustrating a vertical single-mode prediction method according to an embodiment.
Figure 5 is a block diagram of a video decoding apparatus according to another embodiment.
Figure 6A illustrates a three-dimensional (3D) image extension syntax structure of a sequence parameter set according to an embodiment.
Figure 6B illustrates a coding unit syntax structure according to an embodiment.
Figure 6C illustrates a coding unit extension syntax structure according to an embodiment.
Figure 7 illustrates a prediction structure of a multi-view image according to an embodiment.
Figure 8A is a block diagram of a video encoding apparatus based on coding units of a tree structure according to an embodiment.
Figure 8B is a block diagram of a video decoding apparatus based on coding units of a tree structure according to an embodiment.
Figure 9 illustrates the concept of coding units according to an embodiment.
Figure 10A is a block diagram of an image encoder based on coding units according to an embodiment.
Figure 10B is a block diagram of an image decoder based on coding units according to an embodiment.
Figure 11 illustrates deeper coding units according to depths, and partitions, according to an embodiment.
Figure 12 illustrates a relationship between a coding unit and transformation units according to an embodiment.
Figure 13 illustrates a plurality of pieces of encoding information according to depths, according to an embodiment.
Figure 14 illustrates deeper coding units according to depths, according to an embodiment.
Figures 15, 16, and 17 illustrate relationships between coding units, prediction units, and transformation units according to an embodiment.
Figure 18 illustrates a relationship between a coding unit, a prediction unit, and a transformation unit according to the encoding mode information of Table 1.
Figure 19 illustrates a physical structure of a disc in which a program is stored according to an embodiment.
Figure 20 illustrates a disc drive for recording and reading a program by using the disc.
Figure 21 illustrates an overall structure of a content supply system for providing a content distribution service.
Figures 22 and 23 illustrate an external structure and an internal structure of a mobile phone to which a video encoding method and a video decoding method are applied, according to an embodiment.
Figure 24 illustrates a digital broadcasting system employing a communication system according to an embodiment.
Figure 25 illustrates a network structure of a cloud computing system using a video encoding apparatus and a video decoding apparatus according to an embodiment.
Best mode
In one embodiment, a method of decoding a multi-view image includes: obtaining an intra skip flag from a bitstream, the intra skip flag indicating whether a current block included in a depth image of the multi-view image is to be reconstructed based on an intra skip mode; when the intra skip flag indicates that the current block is to be reconstructed based on the intra skip mode, obtaining intra skip prediction mode information from the bitstream, the intra skip prediction mode information indicating an intra prediction mode, among a plurality of intra prediction modes, to be used for the current block; determining predicted values of samples included in the current block according to the intra prediction method indicated by the intra skip prediction mode information; and reconstructing the current block by determining reconstructed values of the samples based on the predicted values of the samples.
In one embodiment, a method of encoding a multi-view image includes: determining a method of encoding a current block included in a depth image of the multi-view image; generating, based on the determined encoding method, an intra skip flag indicating whether the current block is encoded according to an intra skip mode; when the current block is encoded according to the intra skip mode, generating intra skip prediction mode information based on the determined encoding method, the intra skip prediction mode information indicating an intra prediction mode, among a plurality of intra prediction modes, used to predict the current block; and transmitting a bitstream including the intra skip flag and the intra skip prediction mode information. The transmitting of the bitstream may include transmitting a bitstream that does not include residual data of the current block.
Embodiments of the present invention
Hereinafter, in the various embodiments described in this specification, the term 'image' may refer not only to a still image but also to a moving picture such as a video. In addition, the term 'picture' described in this specification refers to a still image to be encoded or decoded.
Hereinafter, the term 'sample' refers to data that is allocated to a sampling position of an image and is to be processed. For example, pixels in an image of the spatial domain may be samples.
In the present specification, a technique of implementing a three-dimensional (3D) image by processing depth information related to a plurality of viewpoints, i.e., a multi-view plus depth (MVD) image, is described. Accordingly, terms for describing the processing of multi-view depth information are described below.
The term 'base-view image' refers to a view image that is encoded/decoded independently of images of different viewpoints.
The term 'dependent-view image' refers to a view image that is encoded/decoded dependently on images of different viewpoints. Accordingly, a dependent-view image may be encoded dependently on an independent-view image or on a different dependent-view image.
The term 'texture picture' or 'texture map' refers to an image including color information of an object related to a current viewpoint.
The term 'depth picture' or 'depth map' refers to an image including information related to the distance from the current viewpoint to the surface of an object.
Hereinafter, the term 'skip mode' should be understood as an encoding/decoding method that encodes/decodes a coding unit by skipping a large amount of encoding information and generating/obtaining only some of the encoding information. Accordingly, when a coding unit is decoded using the skip mode, encoding information such as transformation unit split information and residual information for the coding unit may be skipped.
In one embodiment, in an inter skip mode (that is, an inter mode to which the skip mode is applied), all decoding information needed to perform decoding using the inter mode, except merge information, may be skipped.
Hereinafter, the term 'intra mode' should be understood as a mode that uses an intra prediction method but does not apply the skip mode. The term 'intra skip mode' should be understood as a mode that uses an intra prediction method and applies the skip mode. Various embodiments of the intra skip mode will be described in detail below with reference to Figures 1 to 6.
A multi-view image encoding/decoding method for realizing a 3D image by using the above concepts is described below.
To realize a 3D image, texture images and depth images for a plurality of viewpoints are needed. For example, when a 3D image is realized based on three viewpoints, three texture images and three depth images are needed. Therefore, compared with a 2D image, a large amount of data is needed to store, transmit, and reproduce a 3D image. Accordingly, various encoding methods need to be developed to reduce the amount of data of a 3D image.
Unlike a texture image, a depth image is not directly visible to humans, and humans therefore have difficulty perceiving distortion of a depth image. Accordingly, even if an encoding method with a high distortion rate is applied to a depth image, an encoding method with a high coding rate and a high distortion rate may be applied to the depth image as long as the quality of the synthesized texture image is maintained. The intra skip mode, which has high coding efficiency and a high distortion rate, is described below as an example of an encoding/decoding method applicable to depth images.
Depending on the coding information to be skipped, the intra skip mode may be implemented in various embodiments. Therefore, the configuration of the intra skip mode is not limited to the embodiments described below.
Hereinafter, the coordinates (x, y) of a sample are determined relative to the sample located at the upper-left vertex of a block. More specifically, the coordinates of the sample at the upper-left vertex of the block are determined to be (0, 0). The x value of the coordinates increases in the rightward direction, and the y value increases in the downward direction. For example, the coordinates of the sample 508 located at the lower-right vertex of the reference block 500 of FIG. 5A are (7, 7).
FIG. 1A is a block diagram of a video decoding apparatus 100 according to an embodiment. Specifically, FIG. 1A is a block diagram of a decoding apparatus using the intra skip mode according to an embodiment.
The video decoding apparatus 100 may include an intra skip flag obtainer 110, an intra skip prediction mode information obtainer 120, a predicted value determiner 130, and a reconstructor 140. Although FIG. 1A shows the intra skip flag obtainer 110, the intra skip prediction mode information obtainer 120, the predicted value determiner 130, and the reconstructor 140 as separate elements, in some embodiments they may be integrated into one element. Alternatively, the functions of the intra skip flag obtainer 110, the intra skip prediction mode information obtainer 120, the predicted value determiner 130, and the reconstructor 140 may be performed by two or more of these elements. For example, the functions of the intra skip flag obtainer 110 and the intra skip prediction mode information obtainer 120 may be performed by a coding information obtainer, and the functions of the predicted value determiner 130 and the reconstructor 140 may be performed by a decoder.
Although FIG. 1A shows the intra skip flag obtainer 110, the intra skip prediction mode information obtainer 120, the predicted value determiner 130, and the reconstructor 140 as elements included in one apparatus, the devices performing their respective functions need not be physically adjacent to one another. Therefore, in one embodiment, the intra skip flag obtainer 110, the intra skip prediction mode information obtainer 120, the predicted value determiner 130, and the reconstructor 140 may be distributed.
In one embodiment, the intra skip flag obtainer 110, the intra skip prediction mode information obtainer 120, the predicted value determiner 130, and the reconstructor 140 of FIG. 1A may be implemented as one processor. In another embodiment, they may be implemented as a plurality of processors.
The video decoding apparatus 100 may include a storage unit (not shown) for storing data generated by the intra skip flag obtainer 110, the intra skip prediction mode information obtainer 120, the predicted value determiner 130, and the reconstructor 140. The intra skip flag obtainer 110, the intra skip prediction mode information obtainer 120, the predicted value determiner 130, and the reconstructor 140 may extract data from the storage unit and use the data.
The video decoding apparatus 100 of FIG. 1A is not limited to a physical apparatus. For example, some of the functions of the video decoding apparatus 100 may be implemented as software rather than hardware.
The intra skip flag obtainer 110 obtains an intra skip flag for a current block from a bitstream.
The current block refers to the block currently to be decoded. The current block may be a square coding unit or prediction unit.
The intra skip flag is a flag indicating whether the current block is predicted based on the intra skip mode. The intra skip flag may be obtained only when the current block is included in a depth image. In one embodiment, the intra skip flag may indicate '0' or '1'. When the intra skip flag indicates '0', the intra skip mode is not applied to the current block. Conversely, when the intra skip flag indicates '1', the intra skip mode is applied to the current block. When the intra skip flag is not included in the bitstream, the intra skip flag obtainer 110 determines the intra skip flag to be '0'. Therefore, when the intra skip flag is not included in the bitstream, the intra skip mode is not applied to the current block.
When an intra skip enable flag corresponding to the current block indicates that the intra skip mode can be applied to the current block, the intra skip flag obtainer 110 may obtain the intra skip flag. The intra skip enable flag indicates whether the intra skip mode can be applied to the current block. Therefore, after the intra skip enable flag corresponding to the current block is obtained, the intra skip flag is obtained for each block when the intra skip enable flag indicates that the intra skip mode can be applied to the blocks.
An intra skip enable flag obtainer (not shown) may obtain the intra skip enable flag from the bitstream. The intra skip enable flag may be included in one of the following: the header of the slice segment in which the current block is included, the picture parameter set of the depth image that is an upper-layer element of the slice segment, the sequence parameter set of the sequence that is an upper-layer element of the depth image, or the video parameter set of all videos that is an upper-layer element of the sequence. For example, when the intra skip enable flag is defined in units of sequences, the intra skip enable flag obtainer obtains, from the bitstream, the intra skip enable flag included in the sequence parameter set.
The intra skip enable flag may be applied to all blocks included in the element corresponding to the intra skip enable flag. For example, an intra skip enable flag included in a sequence parameter set may be applied to all blocks included in the sequence defined by that sequence parameter set.
The intra skip enable flag may indicate '0' or '1'. When the intra skip enable flag indicates '0', the intra skip flag is not obtained for the blocks to which the intra skip enable flag applies. When the intra skip enable flag indicates '1', the intra skip flag is obtained for the blocks to which the intra skip enable flag applies. When the intra skip enable flag is not included in the bitstream, the intra skip enable flag obtainer may determine the intra skip enable flag to be '0'. Therefore, when the intra skip enable flag is not included in the bitstream, the intra skip flag is not obtained for the blocks to which the intra skip enable flag applies.
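The gating described above (an enable flag at a higher syntax level controls whether the per-block intra skip flag is read, and any absent flag defaults to '0') can be sketched as follows. This is an illustrative simplification in Python, not the actual syntax parsing of any codec; the function and field names are hypothetical.

```python
def read_intra_skip_flag(bitstream, sps, block):
    """Return the intra skip flag for one block, following the gating
    described above: the flag is read only for a block of a depth image
    whose intra skip enable flag is '1'; any absent flag defaults to '0'."""
    # An enable flag absent from the bitstream is treated as '0'.
    enable = sps.get("intra_skip_enable_flag", 0)
    if not block["in_depth_image"] or enable == 0:
        return 0  # flag not read; intra skip mode not applied
    return bitstream.pop(0)  # read one bit for this block

# Hypothetical usage: enable flag present in the SPS, two coded bits.
bits = [1, 0]
sps = {"intra_skip_enable_flag": 1}
blk = {"in_depth_image": True}
print(read_intra_skip_flag(bits, sps, blk))  # -> 1
print(read_intra_skip_flag(bits, {}, blk))   # -> 0 (enable flag absent)
```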
In one embodiment, the intra skip flag obtainer 110 may obtain the intra skip flag when it is determined that the inter skip mode is not applied to the current block. That is, the intra skip flag may be obtained after an inter skip flag indicating whether the inter skip mode is applied is obtained. In another embodiment, the intra skip flag may be obtained first, and the inter skip flag may then be obtained when it is determined from the intra skip flag that the intra skip mode is not applied to the current block.
When it is determined that the current block is to be decoded in the intra skip mode, the intra skip prediction mode information obtainer 120 obtains intra skip prediction mode information about the current block from the bitstream.
An intra prediction mode should be understood as a prediction method that determines the predicted values of the samples included in a coding unit or prediction unit by using samples adjacent to the coding unit or prediction unit. Examples of intra prediction modes include a DC mode, a planar mode, directional prediction modes (such as a vertical mode), an intra-contour mode, a wedge mode, a single mode, and the like.
The vertical mode, horizontal mode, vertical single mode, and horizontal single mode among the intra prediction modes are described in detail below with reference to FIGS. 3A to 4B.
The intra skip prediction mode information indicates the intra prediction mode to be used for the current block when the current block is decoded in the intra skip mode. The intra skip prediction mode information may indicate one intra prediction mode among intra prediction mode candidates.
In one embodiment, the intra skip prediction mode information may indicate the vertical mode or the horizontal mode as the intra prediction mode. In another embodiment, the intra skip prediction mode information may indicate the vertical mode, the horizontal mode, the vertical single mode, or the horizontal single mode as the intra prediction mode. In yet another embodiment, the intra skip prediction mode information may indicate one intra prediction mode among the intra prediction modes used in the intra mode.
In one embodiment, a value in the range of 0 to N-1 may be assigned to the intra skip prediction mode information, where 'N' denotes the number of intra prediction mode candidates. When the number of intra prediction mode candidates is '4', a value in the range of 0 to 3 may be assigned to the intra skip prediction mode information.
Specifically, in one embodiment, when the intra skip prediction mode information indicates the vertical mode or the horizontal mode as the intra prediction mode, '0' or '1' is assigned to the intra skip prediction mode information. In this embodiment, when the intra skip prediction mode information indicates '0', the vertical mode may be used as the intra prediction mode of the current block. Conversely, when the intra skip prediction mode information indicates '1', the horizontal mode may be used as the intra prediction mode of the current block.
In another embodiment, when the intra skip prediction mode information indicates an intra prediction mode among the vertical mode, the horizontal mode, the vertical single mode, and the horizontal single mode, a value in the range of 0 to 3 is assigned to the intra skip prediction mode information. In this embodiment, when the intra skip prediction mode information is '0', the vertical mode may be used as the intra prediction mode of the current block; when it is '1', the horizontal mode may be used; when it is '2', the vertical single mode may be used; and when it is '3', the horizontal single mode may be used as the intra prediction mode of the current block.
In yet another embodiment, when the intra skip prediction mode information indicates one of all the intra prediction modes used in the intra mode, the intra skip prediction mode information may indicate the intra prediction mode to be used for the current block according to substantially the same method as the method of determining the intra prediction mode in the intra mode. For example, when, in the intra mode, a most probable mode (MPM) is determined from among the plurality of intra prediction modes and the intra prediction mode is determined in consideration of the MPM, then, in the intra skip mode, an MPM may likewise be determined from among the plurality of intra prediction modes and the intra skip prediction mode may be determined in consideration of the MPM.
As the number of intra prediction mode candidates available in the intra skip mode increases, the number of bits allocated to the intra skip prediction mode information increases. For example, when the intra prediction mode candidates include the vertical mode, the horizontal mode, the vertical single mode, and the horizontal single mode, the number of intra prediction mode candidates is '4'. Therefore, when a fixed length code is used, the intra skip prediction mode information may be expressed with two bits, and when a unary code is used, the intra skip prediction mode information may be expressed with one to three bits. As another example, if the intra prediction mode candidates include 32 intra prediction modes, five bits may be used to express the intra skip prediction mode information when a fixed length code is used. Therefore, when the intra prediction mode candidates include too many intra modes, the efficiency of the intra skip mode may decrease.
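The bit counts in this example can be checked directly: a fixed length code for N candidates spends ceil(log2(N)) bits per symbol, while a truncated unary code spends between 1 and N-1 bits. A small sketch, for illustration only:

```python
import math

def flc_bits(num_candidates):
    # Fixed length code: every symbol uses ceil(log2(N)) bits.
    return math.ceil(math.log2(num_candidates))

def truncated_unary(symbol, num_candidates):
    # Truncated unary: symbol k -> k ones followed by a terminating
    # zero, except the last symbol, which drops the terminating zero.
    if symbol == num_candidates - 1:
        return "1" * symbol
    return "1" * symbol + "0"

# Four candidates (vertical, horizontal, vertical single, horizontal single):
print(flc_bits(4))                                      # -> 2 bits per symbol
print([len(truncated_unary(s, 4)) for s in range(4)])   # -> [1, 2, 3, 3]
# Thirty-two candidates:
print(flc_bits(32))                                     # -> 5 bits per symbol
```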
However, as the number of intra prediction mode candidates increases, the applicability of the intra skip mode increases. For example, when the intra prediction mode candidates include only the vertical mode and the horizontal mode, the intra skip mode can be applied only when the current block can be predicted according to the vertical mode or the horizontal mode. However, when the intra prediction mode candidates include the vertical mode, the horizontal mode, the vertical single mode, and the horizontal single mode, the intra skip mode can also be applied when the current block is predicted according to the vertical single mode or the horizontal single mode.
Therefore, the intra prediction mode candidates available in the intra skip mode may be determined in consideration of an appropriate number of candidates and their applicability in the intra skip mode.
When the intra skip flag indicates that the current block can be predicted in the intra skip mode, the intra skip prediction mode information obtainer 120 may obtain the intra skip prediction mode information from the bitstream. Conversely, when the intra skip flag indicates that the current block is not predicted in the intra skip mode, the intra skip prediction mode information obtainer 120 does not obtain the intra skip prediction mode information.
In one embodiment, when the intra skip flag indicates that the current block is not predicted in the intra skip mode, information about the method of predicting the current block may be obtained after the intra skip flag. For example, information indicating whether the current block is reconstructed in the intra mode or the inter mode may be obtained.
The predicted value determiner 130 determines the predicted values of the samples included in the current block according to the intra prediction mode indicated by the intra skip prediction mode information.
In the intra skip mode, no coding information other than the intra skip prediction mode information is obtained. Therefore, the current block is predicted by using the intra prediction mode indicated by the intra skip prediction mode information.
Based on the predicted values of the samples, the reconstructor 140 reconstructs the current block by determining the reconstructed values of the samples.
In the intra mode, the predicted values of the samples included in the current block are determined, and the reconstructed values of the samples are determined by using the residual data together with the predicted values of the samples. In the intra skip mode, however, the residual data is skipped, and the predicted value of a sample is therefore identical to its reconstructed value.
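The difference between the two reconstruction rules can be stated in one line each. A schematic per-sample comparison, with made-up values:

```python
def reconstruct_intra(pred, residual):
    # Intra mode: reconstructed value = predicted value + residual.
    return [p + r for p, r in zip(pred, residual)]

def reconstruct_intra_skip(pred):
    # Intra skip mode: the residual is skipped, so the reconstructed
    # value equals the predicted value.
    return list(pred)

pred = [100, 102, 101, 99]
residual = [3, -2, 0, 1]
print(reconstruct_intra(pred, residual))  # -> [103, 100, 101, 100]
print(reconstruct_intra_skip(pred))       # -> [100, 102, 101, 99]
```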
In one embodiment, the reconstructor 140 may determine the reconstructed value of a sample by filtering the predicted value of the sample. For example, the reconstructor 140 may determine the reconstructed value of a sample by applying an in-loop filter (such as a deblocking filter or a sample adaptive offset) to the predicted value of the sample. Therefore, in some embodiments, the predicted value of a sample may differ from its reconstructed value.
FIG. 1B is a flowchart of a video decoding method 10 according to an embodiment. Specifically, FIG. 1B is a flowchart of a decoding method performed in the intra skip mode according to an embodiment.
In operation S12, an intra skip flag corresponding to a current block is obtained from a bitstream. The current block is included in a depth image of a multi-view image. The intra skip flag indicates whether the current block is predicted based on the intra skip mode. The intra skip flag may be obtained only when the current block is included in a depth image.
In one embodiment, the intra skip flag may be obtained when an intra skip enable flag corresponding to the current block indicates that the intra skip mode can be applied to the current block.
In one embodiment, the intra skip flag may be obtained when it is determined that the inter skip mode is not applied to the current block.
In operation S14, intra skip prediction mode information for the current block is obtained from the bitstream. The intra skip prediction mode information indicates the intra prediction mode to be used for the current block from among the plurality of intra prediction modes available in the intra skip mode.
In one embodiment, the intra skip prediction mode information for the current block may be obtained from the bitstream when the intra skip flag indicates that the current block is reconstructed in the intra skip mode.
In operation S16, the predicted values of the samples included in the current block are determined according to the intra prediction method indicated by the intra skip prediction mode information.
In operation S18, the current block is reconstructed by determining the reconstructed values of the samples based on their predicted values.
The video decoding method 10 according to the embodiment described above may be performed by the video decoding apparatus 100.
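Operations S12 to S18 can be condensed into a short control-flow sketch. The parsing and prediction details are simplified into hypothetical helpers and a two-candidate mode table; only the ordering of the operations follows the method described above.

```python
def decode_block_intra_skip(bits, left_neighbors, top_neighbors, size=4):
    """Sketch of operations S12 to S18 for one block of a depth image,
    assuming a two-candidate mode table (0: vertical, 1: horizontal)."""
    # S12: obtain the intra skip flag.
    if bits.pop(0) == 0:
        return None  # block is decoded by another mode (not shown)
    # S14: obtain the intra skip prediction mode information.
    mode = bits.pop(0)
    # S16: determine predicted values from the adjacent samples.
    if mode == 0:   # vertical: each column repeats its top neighbor
        pred = [list(top_neighbors) for _ in range(size)]
    else:           # horizontal: each row repeats its left neighbor
        pred = [[left_neighbors[r]] * size for r in range(size)]
    # S18: no residual in intra skip mode, so reconstruction = prediction.
    return pred

block = decode_block_intra_skip([1, 0], [10, 20, 30, 40], [5, 6, 7, 8])
print(block[0])  # -> [5, 6, 7, 8] (every row repeats the top neighbors)
```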
FIG. 2A is a block diagram of a video encoding apparatus 200 according to an embodiment. Specifically, FIG. 2A is a block diagram of an encoding apparatus using the intra skip mode according to an embodiment.
The video encoding apparatus 200 may include an encoding method determiner 210, an intra skip flag generator 220, an intra skip prediction mode information generator 230, and a coding information transmitter 240. Although FIG. 2A shows the encoding method determiner 210, the intra skip flag generator 220, the intra skip prediction mode information generator 230, and the coding information transmitter 240 as separate elements, in one embodiment they may be integrated into one element. In one embodiment, the functions of the intra skip flag generator 220 and the intra skip prediction mode information generator 230 may be performed by a coding information generator (not shown).
Although FIG. 2A shows the encoding method determiner 210, the intra skip flag generator 220, the intra skip prediction mode information generator 230, and the coding information transmitter 240 as elements included in one apparatus, the devices performing their respective functions need not be physically adjacent to one another. Therefore, in one embodiment, the encoding method determiner 210, the intra skip flag generator 220, the intra skip prediction mode information generator 230, and the coding information transmitter 240 may be distributed.
In one embodiment, the encoding method determiner 210, the intra skip flag generator 220, the intra skip prediction mode information generator 230, and the coding information transmitter 240 of FIG. 2A may be implemented as one processor. Alternatively, in another embodiment, they may be implemented as a plurality of processors.
The video encoding apparatus 200 may include a storage unit (not shown) for storing data generated by the encoding method determiner 210, the intra skip flag generator 220, the intra skip prediction mode information generator 230, and the coding information transmitter 240. In addition, the encoding method determiner 210, the intra skip flag generator 220, the intra skip prediction mode information generator 230, and the coding information transmitter 240 may extract the data stored in the storage unit and use the data.
The video encoding apparatus 200 of FIG. 2A is not limited to a physical apparatus. For example, some of the functions of the video encoding apparatus 200 may be implemented as software rather than hardware.
The encoding method determiner 210 determines a method of encoding a current block included in a depth image of a multi-view image. The encoding method determiner 210 according to an embodiment determines, from among the intra mode, the inter mode, the intra skip mode, and the inter skip mode, the coding mode to be used to encode the current block. In addition, when the current block is encoded in the intra skip mode, the encoding method determiner 210 according to an embodiment determines, from among the plurality of intra prediction modes available in the intra skip mode, the intra prediction mode to be used to predict the current block.
After the intra prediction mode is determined, the encoding method determiner 210 according to an embodiment may determine, between the intra skip mode and the intra mode, the coding mode to be used to encode the current block.
The encoding method determiner 210 may determine the coding mode and the intra prediction mode most suitable for the current block through rate-distortion optimization. Therefore, when the rate-distortion cost is lower when the current block is encoded in the intra skip mode, the encoding method determiner 210 determines the intra skip mode as the coding mode of the current block. Similarly, the encoding method determiner 210 may determine the intra prediction mode having a low rate-distortion cost as the intra prediction mode of the current block.
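Rate-distortion optimization compares candidate modes by the Lagrangian cost J = D + λ·R and keeps the cheapest. A minimal sketch, with made-up distortion and rate figures:

```python
def rd_cost(distortion, rate_bits, lam):
    # Lagrangian rate-distortion cost: J = D + lambda * R.
    return distortion + lam * rate_bits

def choose_mode(candidates, lam):
    # candidates maps a mode name to a (distortion, rate_bits) pair.
    return min(candidates, key=lambda m: rd_cost(*candidates[m], lam))

# Hypothetical figures: intra skip mode distorts more but spends far
# fewer bits, because residual and other coding information are skipped.
candidates = {
    "intra":      (40.0, 96),  # low distortion, many bits
    "intra_skip": (90.0, 3),   # higher distortion, very few bits
}
print(choose_mode(candidates, lam=1.0))  # -> intra_skip (93.0 < 136.0)
print(choose_mode(candidates, lam=0.1))  # -> intra (49.6 < 90.3)
```

A larger λ weights the bit rate more heavily, which favors the intra skip mode for depth blocks whose distortion is hard to perceive, consistent with the discussion above.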
Based on the encoding method determined by the encoding method determiner 210, the intra skip flag generator 220 generates an intra skip flag indicating whether the current block is encoded in the intra skip mode. In one embodiment, when the current block is encoded in the intra skip mode, the intra skip flag generator 220 may determine the intra skip flag to be '1'. Conversely, in one embodiment, when the current block is not encoded in the intra skip mode, the intra skip flag generator 220 may determine the intra skip flag to be '0'.
In one embodiment, the intra skip flag generator 220 may generate the intra skip flag according to the intra skip enable flag corresponding to the current block. When it is determined from the intra skip enable flag that the intra skip mode is not permitted for the current block, the intra skip flag generator 220 does not generate the intra skip flag for the current block. Conversely, when it is determined from the intra skip enable flag that the intra skip mode is permitted for the current block, the intra skip flag generator 220 generates the intra skip flag for the current block.
In one embodiment, the intra skip enable flag may be defined at the level of a slice segment, a depth image, a sequence unit, or all videos. For example, when the intra skip enable flag is defined for a sequence unit, the intra skip enable flag may be applied to all blocks included in the sequence unit.
In one embodiment, when the coding mode of the current block is not the inter skip mode, the intra skip flag generator 220 may generate the intra skip flag. The video encoding apparatus 200 may first determine an inter skip flag indicating whether the coding mode of the current block is the inter skip mode. When the inter skip flag indicates that the current block is encoded in the inter skip mode, the intra skip flag generator 220 may skip generating the intra skip flag. When the inter skip flag indicates that the current block is not encoded in the inter skip mode, the intra skip flag generator 220 may generate the intra skip flag.
In another embodiment, the intra skip flag generator 220 may generate the intra skip flag regardless of whether the coding mode of the current block is the inter skip mode. The video encoding apparatus 200 may be configured to generate the inter skip flag when the coding mode of the current block is not the intra skip mode.
Based on the encoding method determined by the encoding method determiner 210, the intra skip prediction mode information generator 230 generates intra skip prediction mode information indicating the intra prediction mode used to predict the current block.
In one embodiment, when the current block is encoded in the intra skip mode, the intra skip prediction mode information generator 230 generates the intra skip prediction mode information. Conversely, when the current block is encoded in a mode other than the intra skip mode, the intra skip prediction mode information generator 230 does not generate the intra skip prediction mode information.
In one embodiment, the intra skip prediction mode information is determined so as to indicate the intra prediction mode used to predict the current block from among the plurality of intra prediction modes available in the intra skip mode.
In one embodiment, when the intra prediction modes available in the intra skip mode are the vertical mode, the horizontal mode, the vertical single mode, and the horizontal single mode, a value in the range of 0 to 3 may be assigned to the intra skip prediction mode information. For example, when '0' is assigned to the vertical mode, the intra skip prediction mode information generator 230 may determine the intra skip prediction mode information to be '0' when the vertical mode is used to predict the current block.
The intra skip prediction mode information generator 230 may indicate the intra prediction mode in different ways according to the type and number of intra prediction modes available in the intra skip mode.
The coding information transmitter 240 transmits a bitstream including the intra skip flag and the intra skip prediction mode information.
In the intra skip mode, a large amount of the coding information generated in the intra mode is skipped. Therefore, in the intra skip mode, the bitstream transmitted by the coding information transmitter 240 does not include residual data indicating the difference between the current block and the prediction block. However, in some embodiments, the coding information transmitter 240 may include additional coding information in the bitstream beyond the intra skip flag and the intra skip prediction mode information.
FIG. 2B is a flowchart of a video encoding method 20 according to an embodiment. Specifically, FIG. 2B is a flowchart of an encoding method based on the intra skip mode according to an embodiment.
In operation S22, a method of encoding a current block included in a depth image of a multi-view image is determined. In one embodiment, the coding mode and prediction mode to be used to encode the current block are determined. Operation S22 may be performed by using rate-distortion optimization. In the rate-distortion optimization, if encoding the current block in the intra skip mode is the most effective method, the intra skip mode is determined as the method of encoding the current block.
In operation S24, an intra skip flag indicating whether the current block is encoded by using the intra skip mode is generated. In one embodiment, the intra skip flag may be generated according to the intra skip enable flag corresponding to the current block. In one embodiment, the intra skip flag may be generated when the current block is not encoded in the inter skip mode.
In operation S26, when the current block is encoded in the intra skip mode, intra skip prediction mode information indicating the intra prediction mode used to predict the current block from among a plurality of intra prediction modes is generated.
In operation S28, a bitstream including the intra skip flag and the intra skip prediction mode information is transmitted. Residual data indicating the difference between the current block and the prediction block is not included in the bitstream.
The video encoding method 20 described above according to the embodiment may be performed by the video encoding apparatus 200.
The prediction methods performed in the horizontal mode, the vertical mode, the horizontal single mode, and the vertical single mode are described below with reference to FIGS. 3A to 4B.
In the horizontal mode, the predicted value of a sample included in the current block is determined to be identical to the value of the sample that is located in the same row as the sample included in the current block, from among the samples adjacent to the left side of the current block. Therefore, all samples in the same row of the current block have the same predicted value. The application of the prediction performed in the horizontal mode is described in detail below with reference to FIG. 3A.
In FIG. 3A, the current block 300 is a 4×4 block. The current block 300 includes four rows 312, 314, 316, and 318. The samples 302, 304, 306, and 308 are adjacent to the left side of the current block 300. The predicted values of the samples in the first row 312 (the uppermost row of the current block 300) are determined to be identical to the value of the sample 302, which is adjacent to the left side of the current block 300 in the first row 312. Similarly, the predicted values of the samples in the second row 314 of the current block 300 are determined to be identical to the value of the sample 304, which is adjacent to the left side of the current block 300 in the second row 314.
In the vertical mode, the predicted value of a sample included in the current block is determined to be identical to the value of the sample that is located in the same column as the sample included in the current block, from among the samples adjacent to the upper side of the current block. Therefore, the samples of the current block in the same column have the same predicted value. The application of the prediction performed in the vertical mode is described in detail below with reference to FIG. 3B.
In FIG. 3B, the current block 320 is a 4×4 block. The current block 320 includes four columns 332, 334, 336, and 338. The samples 322, 324, 326, and 328 are adjacent to the upper side of the current block 320. The predicted values of the samples in the first column 332 (the leftmost column of the current block 320) are determined to be identical to the value of the sample 322, which is adjacent to the upper side of the current block 320 in the first column 332. Similarly, the predicted values of the samples in the second column 334 are determined to be identical to the value of the sample 324, which is adjacent to the upper side of the current block 320 in the second column 334.
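The row/column copy rules of FIGS. 3A and 3B translate directly into code. A sketch for a 4×4 block, with made-up neighbor values:

```python
def predict_horizontal(left_neighbors, size=4):
    # Each row repeats the left-adjacent sample of that row.
    return [[left_neighbors[r]] * size for r in range(size)]

def predict_vertical(top_neighbors, size=4):
    # Each column repeats the top-adjacent sample of that column.
    return [list(top_neighbors) for _ in range(size)]

left = [50, 60, 70, 80]  # samples 302, 304, 306, 308 (illustrative values)
top = [11, 12, 13, 14]   # samples 322, 324, 326, 328 (illustrative values)
print(predict_horizontal(left)[1])  # -> [60, 60, 60, 60] (second row 314)
print(predict_vertical(top)[0])     # -> [11, 12, 13, 14] (every row alike)
```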
The single mode should be understood as an intra prediction mode in which only a sample located at a predetermined position, from among the samples adjacent to the current block, is determined as a reference sample, and the value of the reference sample is determined as the predicted value of the samples included in the current block. The single mode may be classified into various types of modes according to the type of the adjacent sample.
Under the horizontal single mode, the predicted value of a sample included in the current block is determined to be identical to the value of the sample located at a predetermined position from among the samples adjacent to the left side of the current block. Therefore, the predicted values of all the samples included in the current block are identical. Application of prediction performed according to the horizontal single mode is described in detail below with reference to FIG. 4A.
Referring to FIG. 4A, the current block 400 is a 4×4 block. The predicted values of the samples included in the current block 400 are determined by using the reference sample 402, which is located in the second row counted downward from the top of the current block 400, from among the samples adjacent to the left side of the current block 400. For example, when the reference sample 402 has a value of 115, the predicted values of all the samples included in the current block 400 are determined to be '115'.
Under the vertical single mode, the predicted value of a sample included in the current block is determined to be identical to the value of the sample located at a predetermined position from among the samples adjacent to the upper side of the current block. Therefore, the predicted values of all the samples included in the current block are identical. Application of prediction performed according to the vertical single mode is described in detail below with reference to FIG. 4B.
Referring to FIG. 4B, the current block 420 is a 4×4 block. The predicted values of the samples included in the current block 420 are determined by using the reference sample 422, which is the second sample counted from the left side of the current block 420, from among the samples adjacent to the upper side of the current block 420. For example, when the reference sample 422 has a value of 130, the predicted values of all the samples included in the current block 420 are determined to be '130'.
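The single modes of FIGS. 4A and 4B can be sketched as follows. The reference-sample positions (the second left neighbour, the second top neighbour) follow this document's examples only; as noted below, other positions may be used.

```python
import numpy as np

def predict_single_mode(left, top, mode):
    """Sketch of the single modes: one predetermined neighbour fills the whole block.

    Following FIGS. 4A and 4B, the horizontal single mode uses the second
    left-neighbour sample and the vertical single mode uses the second
    top-neighbour sample; the positions are illustrative, not fixed.
    """
    if mode == 'horizontal_single':
        ref = left[1]   # second sample counted downward from the top (FIG. 4A)
    elif mode == 'vertical_single':
        ref = top[1]    # second sample counted from the left (FIG. 4B)
    else:
        raise ValueError(mode)
    # Every sample of the 4x4 block receives the reference sample's value.
    return np.full((4, 4), ref)

print(predict_single_mode([110, 115, 120, 125], None, 'horizontal_single')[0, 0])  # 115
print(predict_single_mode(None, [128, 130, 132, 134], 'vertical_single')[3, 3])    # 130
```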
The positions of the reference samples described above with reference to FIGS. 4A and 4B are not fixed, and in some embodiments, an adjacent sample located at another position may be used as the reference sample.
FIG. 5 is a flowchart of a video decoding method 500 to which an intra skip mode is applied, according to another embodiment.
In the video decoding method 500 to which the intra skip mode is applied, it is checked in order whether the coding mode of the current block is the inter skip mode, the intra skip mode, or the intra mode. Specifically, flags are obtained from a bitstream to determine whether the coding mode of the current block is the inter skip mode, the intra skip mode, or the intra mode.
A skip mode is a coding mode that skips the transmission of coding information so as to maximize the compression ratio. Therefore, whether the current block is decoded according to a skip mode is checked first, so that the flag indicating that the coding mode is the intra mode can be skipped. Then, when the coding mode is not a skip mode, whether the intra mode or the inter mode is applied to the current block is checked. The skip modes include the intra skip mode and the inter skip mode. First, it may be checked whether the applied mode is the inter skip mode, which has a higher application frequency, or the intra skip mode.
In operation S510, it is determined whether the current block is to be decoded according to the inter skip mode. When it is determined that the current block is to be decoded according to the inter skip mode, operation S515 is performed. When it is determined that the current block is not to be decoded according to the inter skip mode, operation S520 is performed.
In operation S515, the current block is decoded according to the inter skip mode. In an embodiment, under the inter skip mode, all coding information except a merge index is skipped. Therefore, the current block is predicted by using the coding information of the block candidate indicated by the merge index, from among block candidates including spatial neighboring blocks and temporal neighboring blocks. Since no residual data is obtained, a block obtained by performing in-loop filtering on the prediction block of the current block is determined as the reconstructed block of the current block.
In operation S520, it is checked whether the current block is to be decoded according to the intra skip mode. When it is determined that the current block is to be decoded according to the intra skip mode, operation S525 is performed. When it is determined that the current block is not to be decoded according to the intra skip mode, operation S530 is performed.
In operation S525, the current block is decoded according to the intra skip mode. In an embodiment, under the intra skip mode, all coding information except intra skip prediction mode information is skipped. Therefore, the current block is predicted according to the intra prediction mode indicated by the intra skip prediction mode information. As under the inter skip mode, since no residual data is obtained, a block obtained by performing in-loop filtering on the prediction block of the current block is determined as the decoded block of the current block.
In operation S530, it is checked whether the current block is to be decoded according to the intra mode. When it is determined that the current block is to be decoded according to the intra mode, operation S535 is performed. When it is determined that the current block is not to be decoded according to the intra mode, operation S540 is performed.
In operation S535, the current block is decoded according to the intra mode. In an embodiment, under the intra mode, partition information, intra prediction mode information, and residual data are obtained to predict the current block. The current block is split into one or more sub-blocks according to the partition information. Then, the intra prediction mode to be applied to each of the sub-blocks is determined according to the intra prediction mode information.
The intra prediction mode information may include a most probable mode (MPM) flag, which indicates whether the current block is predicted according to an intra prediction mode corresponding to an MPM. When the current block is predicted according to an intra prediction mode corresponding to an MPM, the intra prediction mode information may further include MPM information indicating one of a plurality of MPM intra prediction modes. When the current block is not predicted according to an intra prediction mode corresponding to an MPM, the intra prediction mode information may include information indicating the intra prediction mode to be used for the current block from among the intra prediction modes that are not MPMs.
Each of the sub-blocks is predicted according to the intra prediction mode. The reconstructed block of each sub-block is determined by using the prediction block and the residual data of the sub-block.
In operation S540, the current block is decoded according to the inter mode. In an embodiment, under the inter mode, partition information, a merge flag, and residual data are obtained to predict the current block. The current block is split into one or more sub-blocks according to the partition information. Then, whether a merge mode is applied to each sub-block is determined according to the merge flag.
When the merge mode is applied, a merge index may additionally be obtained. The prediction block of each sub-block may be determined according to the coding information of the candidate block indicated by the merge index. Then, the reconstructed block of each sub-block may be determined by using the prediction block and the residual data.
When the merge mode is not applied, first motion candidate information, first motion difference information, second motion candidate information, and second motion difference information may be obtained. Based on the obtained information, at least one of a first reference block and a second reference block may be obtained. The prediction block of each sub-block may be determined based on the at least one of the first reference block and the second reference block. Then, the reconstructed block of each sub-block may be determined by using the prediction block and the residual data.
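The ordered checks of operations S510 through S540 can be sketched as a simple cascade. The `Bits` reader and the element names passed to `parse` are hypothetical stand-ins for bitstream parsing, used only to show the order of the mode checks described above.

```python
class Bits:
    """Hypothetical bitstream reader: flags present in `flags` are true."""
    def __init__(self, flags):
        self.flags = flags
    def flag(self, name):
        return self.flags.get(name, False)
    def parse(self, name):
        return name  # stand-in for actually parsing the named elements

def decode_block(bits):
    """Sketch of the mode cascade of FIG. 5 (operations S510-S540)."""
    if bits.flag('cu_skip_flag'):          # S510 -> S515: inter skip
        return bits.parse('merge_index')   # no residual data is read
    if bits.flag('skip_intra_flag'):       # S520 -> S525: intra skip
        return bits.parse('skip_intra_mode_idx')
    if bits.flag('intra_mode_flag'):       # S530 -> S535: intra
        return bits.parse('intra_prediction_mode_and_residual')
    return bits.parse('inter_mode_info_and_residual')  # S540: inter

print(decode_block(Bits({'skip_intra_flag': True})))  # skip_intra_mode_idx
```

Checking the inter skip mode first reflects the document's point that the more frequently applied mode is tested before the intra skip mode.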
The video decoding method 500 of FIG. 5 is only one of various embodiments, and the inter skip mode, the intra skip mode, the inter mode, and the intra mode described above may be implemented in various other ways.
FIGS. 6A to 6C illustrate embodiments of syntax structures for the intra skip mode.
FIG. 6A illustrates a 3D-image extension syntax structure 600 of a sequence parameter set according to an embodiment (hereinafter referred to as the 'SPS extension syntax structure'). The SPS extension syntax structure 600 includes syntax elements commonly used for a sequence unit of a multi-view video.
In the SPS extension syntax structure 600 according to an embodiment, 'd' denotes an index for distinguishing a texture image and a depth image from each other. When 'd' is '0', information related to a texture image is obtained. When 'd' is '1', information related to a depth image is obtained.
The SPS extension syntax structure 600 according to an embodiment includes skip_intra_enabled_flag[d]. skip_intra_enabled_flag[d] denotes an intra skip enabled flag. When 'd' is '1' (that is, for a depth image), skip_intra_enabled_flag[1] is obtained. skip_intra_enabled_flag[1] may be '0' or '1'. When skip_intra_enabled_flag[1] is '0', an intra skip flag is not obtained for any of the blocks (coding units) included in the sequence unit. When skip_intra_enabled_flag[1] is '1', the intra skip flag is obtained for all of the blocks (coding units) included in the sequence unit.
In the SPS extension syntax structure 600 according to an embodiment, when 'd' is '0' (that is, for a texture image), skip_intra_enabled_flag[0] is not obtained from the bitstream. Accordingly, skip_intra_enabled_flag[0] is fixed to '0', and thus the intra skip mode is not applied to texture images.
In FIG. 6A, skip_intra_enabled_flag[d] is one embodiment of the intra skip enabled flag, and therefore the intra skip enabled flag may be expressed differently.
FIG. 6B illustrates a coding unit syntax structure 610 according to an embodiment. The coding unit syntax structure 610 includes syntax elements obtained for a coding unit, and includes the intra skip flag as a syntax element. The coding unit syntax structure 610 according to an embodiment includes skip_intra_flag[x0][y0] as the intra skip flag. According to skip_intra_flag[x0][y0], syntax elements otherwise obtained in the coding unit syntax structure 610 may be skipped.
The coding unit syntax structure 610 according to an embodiment includes:
if(slice_type != I)
    cu_skip_flag[x0][y0]    (Syntax 1)
Syntax 1 indicates that cu_skip_flag[x0][y0], which represents the inter skip flag, is obtained when the slice type of the depth image to which the current block belongs is not intra (I).
else if(SkipIntraEnabledFlag)
    skip_intra_flag[x0][y0]    (Syntax 2)
Syntax 2 indicates that skip_intra_flag[x0][y0], which represents the intra skip flag, is obtained when it is determined that the current block is not to be reconstructed according to the inter skip mode and the intra skip enabled flag obtained in FIG. 6A indicates that the intra skip mode is permitted for the current block.
if(!cu_skip_flag[x0][y0] && !skip_intra_flag[x0][y0]){
    ... (syntax elements for the intra mode and the inter mode)
}    (Syntax 3)
Syntax 3 indicates that the current block is reconstructed according to the inter mode or the intra mode when the current block is reconstructed according to neither the inter skip mode nor the intra skip mode. Therefore, when the current block is reconstructed according to the intra skip mode, all the syntax elements needed to reconstruct the current block according to the intra mode or the inter mode are skipped.
cu_extension(x0, y0, log2CbSize)    (Syntax 4)
Syntax 4 is cu_extension(x0, y0, log2CbSize), a call to the coding unit extension syntax structure of the intra skip mode. The coding unit extension syntax structure is described below with reference to FIG. 6C.
if(DcOnlyFlag[x0][y0] ||
   (!skip_intra_flag[x0][y0] && CuPredMode[x0][y0] == MODE_INTRA))
    depth_dcs(x0, y0, log2CbSize)
if(!cu_skip_flag[x0][y0] && !skip_intra_flag[x0][y0] && !dc_only_flag[x0][y0] && !pcm_flag[x0][y0]){
    ......
    transform_tree(x0, y0, x0, y0, log2CbSize, 0, 0)
}    (Syntax 5)
Syntax 5 includes the calls to the syntax structures for obtaining residual data and the conditions for executing those calls. depth_dcs(x0, y0, log2CbSize) and transform_tree(x0, y0, x0, y0, log2CbSize, 0, 0) are calls to the syntax structures for obtaining residual data.
!skip_intra_flag[x0][y0] is one of the conditions for executing these calls. !skip_intra_flag[x0][y0] means that the intra skip flag is '0'. Therefore, when the current block is reconstructed according to the intra skip mode (that is, when the intra skip flag is '1'), no residual data is obtained.
In FIG. 6B, skip_intra_flag[x0][y0] is one embodiment of the intra skip flag, and therefore the intra skip flag may be expressed differently.
FIG. 6C illustrates a coding unit extension syntax structure 610 according to an embodiment. In a multi-view video, the coding unit extension syntax structure includes syntax elements that are additionally needed beyond the coding unit syntax structure. For example, the coding unit extension syntax structure 610 includes the intra skip prediction mode information of the intra skip mode. In the coding unit extension syntax structure 610 according to an embodiment, the intra skip prediction mode information is expressed as skip_intra_mode_idx[x0][y0].
if(skip_intra_flag[x0][y0])
    skip_intra_mode_idx[x0][y0]    (Syntax 6)
Syntax 6 indicates that skip_intra_mode_idx[x0][y0], which represents the intra skip prediction mode information, is obtained when skip_intra_flag[x0][y0], which represents the intra skip flag, is '1'. The intra prediction mode to be used for predicting the current block is determined according to the value of skip_intra_mode_idx[x0][y0].
In an embodiment, when skip_intra_mode_idx[x0][y0] has a value of 0, the current block may be predicted according to the vertical mode. In an embodiment, when skip_intra_mode_idx[x0][y0] has a value of 1, the current block may be predicted according to the horizontal mode. In an embodiment, when skip_intra_mode_idx[x0][y0] has a value of 2, the current block may be predicted according to the vertical single mode. In an embodiment, when skip_intra_mode_idx[x0][y0] has a value of 3, the current block may be predicted according to the horizontal single mode.
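The index-to-mode assignment described above can be written out directly. This is one embodiment's mapping only, as the document emphasizes that the information may be expressed differently.

```python
# Index-to-mode mapping for skip_intra_mode_idx, per one embodiment above.
SKIP_INTRA_MODES = {
    0: 'vertical',
    1: 'horizontal',
    2: 'vertical_single',
    3: 'horizontal_single',
}

def skip_intra_mode(idx):
    """Return the intra prediction mode selected by skip_intra_mode_idx."""
    return SKIP_INTRA_MODES[idx]

print(skip_intra_mode(2))  # vertical_single
```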
In FIG. 6C, skip_intra_mode_idx[x0][y0] is one embodiment of the intra skip prediction mode information, and therefore the intra skip prediction mode information may be expressed differently.
The syntax structures illustrated in FIGS. 6A to 6C are only some of various embodiments for expressing the video decoding method. Therefore, the syntax structures of the intra skip mode may be expressed differently.
FIG. 7 illustrates a multi-view video prediction structure according to an embodiment.
The video encoding apparatus 100 according to an embodiment may predict base-view images and dependent-view images according to the reproduction order 700 of the multi-view video prediction structure of FIG. 7. In FIG. 7, although a base-view image and two dependent-view images are encoded, in another embodiment, three or more dependent-view images may be encoded.
According to the multi-view video prediction structure 700 of FIG. 7, images of the same view are arranged in the horizontal direction. Therefore, the left-view images 'Left' are arranged in a row in the horizontal direction, the center-view images 'Center' are arranged in a row in the horizontal direction, and the right-view images 'Right' are arranged in a row in the horizontal direction. According to an embodiment, the center-view images may be base-view images, and the left-view images and the right-view images may be dependent-view images. According to another embodiment, the left-view images or the right-view images may be base-view images.
In addition, images having the same picture order count (POC) are arranged in the vertical direction. The POC indicates the reproduction order of the images included in the video. 'POC X' marked in the multi-view video prediction structure 700 indicates the relative reproduction order of the images located in the corresponding column; a smaller value of X indicates an earlier reproduction order, and a larger value indicates a later reproduction order. Therefore, based on the multi-view video prediction structure 700 of FIG. 7, the left-view images marked 'Left' are arranged in the horizontal direction according to the POC, the base-view images marked 'Center' are arranged in the horizontal direction according to the POC, and the right-view images marked 'Right' are arranged in the horizontal direction according to the POC. In addition, the left-view image and the right-view image located in the same column as a base-view image have different views but the same POC.
In FIG. 7, for each view, four consecutive images constitute one group of pictures (GOP). Each GOP includes the images located between two consecutive anchor pictures plus one anchor picture (key picture). According to an embodiment, a GOP may include more than four images. In addition, according to an embodiment, a different number of images may be included in each GOP. The number of images included in a GOP may be determined according to coding/decoding efficiency.
An anchor picture is a random access point (RAP) picture. When a video is reproduced, if a reproduction position is arbitrarily selected from among the images arranged according to a certain reproduction order (that is, the POC), the anchor picture whose POC is closest to the reproduction position is reproduced. The base-view images include base-view anchor pictures 711, 712, 713, 714, and 715, the left-view images include left-view anchor pictures 721, 722, 723, 724, and 725, and the right-view images include right-view anchor pictures 731, 732, 733, 734, and 735. The anchor pictures shown in FIG. 7 are only an example and may be located at different POCs according to coding/decoding efficiency.
Multi-view images may be reproduced and predicted (reconstructed) in GOP order. First, according to the reproduction order of the multi-view video prediction structure 700, for each view, the images included in GOP 0 may be reproduced, and then the images included in GOP 1 may be reproduced. In other words, the images included in each GOP may be reproduced in the order of GOP 0, GOP 1, GOP 2, and GOP 3. In addition, based on the coding order of the multi-view video prediction structure, for each view, the images included in GOP 0 may be predicted (reconstructed), and then the images included in GOP 1 may be predicted (reconstructed). In other words, the images included in each GOP may be predicted (reconstructed) in the order of GOP 0, GOP 1, GOP 2, and GOP 3.
The inter prediction of the multi-view video prediction structure 700 includes motion-compensated prediction (MCP) and disparity-compensated prediction (DCP). MCP is inter prediction that uses, as reference pictures, images that have the same view as the current image and temporally precede or follow it. When the current block is predicted by MCP, the prediction unit of the current image is determined according to a motion vector and a reference picture index related to the prediction unit of the current image. DCP is inter prediction that uses, as reference pictures, images of different views that have the same POC as the current image. When the current block is predicted by DCP, the prediction unit of the current image is determined according to a disparity vector and a reference picture index related to the prediction unit of the current image. In FIG. 7, the image from which an arrow starts is the reference picture, and the image to which the arrow points is the image to be predicted by using the reference picture.
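The distinction between MCP and DCP reference pictures can be sketched as a simple classification over the view/POC grid of FIG. 7. This is an illustration of the definitions above, not part of any codec.

```python
def reference_kind(cur_view, cur_poc, ref_view, ref_poc):
    """Classify a reference picture as usable by MCP or DCP (per FIG. 7).

    MCP references share the current picture's view but differ in POC (time);
    DCP references share the POC but come from a different view.
    """
    if ref_view == cur_view and ref_poc != cur_poc:
        return 'MCP'
    if ref_view != cur_view and ref_poc == cur_poc:
        return 'DCP'
    return 'invalid'

print(reference_kind('left', 2, 'left', 0))    # MCP: same view, earlier POC
print(reference_kind('left', 2, 'center', 2))  # DCP: same POC, different view
```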
The result of predicting the base-view images may be encoded and then output in the form of a base-view bitstream, and the result of predicting the dependent-view images may be encoded and then output in the form of a dependent-view bitstream. For example, the result of prediction-encoding the center-view images of FIG. 7 may be output in the form of a base-view bitstream, the result of prediction-encoding the left-view images may be output in the form of a first dependent bitstream, and the result of prediction-encoding the right-view images may be output in the form of a second dependent bitstream.
Only MCP is performed on the base-view images. Therefore, only MCP is performed on the center-view images. In other words, although the I-type anchor pictures 711, 712, 713, 714, and 715 do not refer to other pictures, the other B-type and b-type pictures are predicted with reference to other base-view images. A B-type picture is predicted with reference to an I-type anchor picture that precedes it in POC order and an I-type anchor picture that follows it in POC order. A b-type picture is predicted with reference to an I-type anchor picture that precedes it in POC order and a B-type picture that follows it in POC order, or with reference to a B-type picture that precedes it in POC order and an I-type anchor picture that follows it in POC order.
MCP or DCP is performed on the left-view images and the right-view images. Therefore, a left-view image may refer to a center-view image having the same POC or a right-view image having the same POC. Similarly, a right-view image may refer to a center-view image having the same POC or a left-view image having the same POC.
Inter-view prediction (inter-layer prediction) may be performed on the left-view anchor pictures 721, 722, 723, 724, and 725 with reference to the base-view anchor pictures 711, 712, 713, 714, and 715 having corresponding POCs. Inter-view prediction may be performed on the right-view anchor pictures 731, 732, 733, 734, and 735 with reference to the base-view anchor pictures 711, 712, 713, 714, and 715 or the left-view anchor pictures 721, 722, 723, 724, and 725 having corresponding POCs. Furthermore, inter-view prediction may be performed on left-view non-anchor pictures and right-view non-anchor pictures with reference to other-view images having corresponding POCs.
Left-view non-anchor pictures and right-view non-anchor pictures may also be predicted with reference to images of the same view. Therefore, left-view non-anchor pictures and right-view non-anchor pictures may be predicted by MCP or DCP.
The video decoding apparatuses 200 and 400 according to embodiments may reconstruct the center-view images, the left-view images, and the right-view images based on the multi-view video prediction structure 700 shown in FIG. 7.
The prediction order of the images shown in FIG. 7 is only an embodiment. For coding/decoding efficiency, the images may be predicted according to a different prediction order.
FIG. 8A is a block diagram of a video encoding apparatus 800 based on coding units of a tree structure, according to an embodiment.
The video encoding apparatus 800 involving video prediction based on coding units of a tree structure includes an encoder 810 and an output unit 820. Hereinafter, for convenience of description, the video encoding apparatus 800 involving video prediction based on coding units of a tree structure is referred to as the 'video encoding apparatus 800'.
The encoder 810 may split a current picture based on a largest coding unit, which is a coding unit having a maximum size for the current picture of an image. If the current picture is larger than the largest coding unit, the image data of the current picture may be split into at least one largest coding unit. The largest coding unit according to an embodiment may be a data unit having a size of 32×32, 64×64, 128×128, 256×256, or the like, wherein the shape of the data unit is a square whose width and height are each a power of 2.
A coding unit according to an embodiment may be characterized by a maximum size and a depth. The depth denotes the number of times the coding unit is spatially split from the largest coding unit, and as the depth deepens, deeper coding units according to depths may be split from the largest coding unit down to a smallest coding unit. The depth of the largest coding unit may be defined as an uppermost depth, and the depth of the smallest coding unit may be defined as a lowermost depth. Since the size of the coding unit corresponding to each depth decreases as the depth of the largest coding unit deepens, a coding unit corresponding to an upper depth may include a plurality of coding units corresponding to lower depths.
As described above, the image data of the current picture is split into the largest coding units according to the maximum size of the coding unit, and each of the largest coding units may include deeper coding units that are split according to depths. Since the largest coding unit according to an embodiment is split according to depths, the image data of a spatial domain included in the largest coding unit may be hierarchically classified according to depths.
A maximum depth and a maximum size of a coding unit, which limit the total number of times the height and width of the largest coding unit can be hierarchically split, may be predetermined.
The encoder 810 encodes at least one split region obtained by splitting a region of the largest coding unit according to depths, and determines a depth at which the finally encoded image data is to be output, according to the at least one split region. In other words, the encoder 810 determines a coded depth by encoding the image data in the deeper coding units according to depths, according to the largest coding unit of the current picture, and selecting a depth having the least encoding error. The determined coded depth and the image data according to the largest coding unit are output to the output unit 820.
The image data in the largest coding unit is encoded based on the deeper coding units corresponding to at least one depth equal to or below the maximum depth, and the results of encoding the image data are compared based on each of the deeper coding units. After the encoding errors of the deeper coding units are compared, a depth having the least encoding error may be selected. At least one coded depth may be selected for each largest coding unit.
As coding units are hierarchically split according to depths and the number of coding units increases, the size of the largest coding unit is split. Also, even if coding units in one largest coding unit correspond to the same depth, whether each of the coding units corresponding to the same depth is to be split to a lower depth is determined by separately measuring the encoding error of the image data of each coding unit. Accordingly, even when image data is included in one largest coding unit, the encoding errors may differ according to regions in the one largest coding unit, and thus the coded depths may differ according to regions in the image data. Therefore, one or more coded depths may be determined in one largest coding unit, and the image data of the largest coding unit may be divided according to coding units of at least one coded depth.
Accordingly, the encoder 810 according to an embodiment may determine coding units having a tree structure included in the largest coding unit. The 'coding units having a tree structure' according to an embodiment include coding units corresponding to the depth determined to be the coded depth, from among all the deeper coding units included in the largest coding unit. A coding unit of a coded depth may be hierarchically determined according to depths in the same region of the largest coding unit, and may be independently determined in different regions. Similarly, a coded depth in a current region may be determined independently of a coded depth in another region.
A maximum depth according to an embodiment is an index related to the number of splitting times from a largest coding unit to a smallest coding unit. A maximum depth according to an embodiment may denote the total number of splitting times from the largest coding unit to the smallest coding unit. For example, when the depth of the largest coding unit is 0, the depth of a coding unit obtained by splitting the largest coding unit once may be set to 1, and the depth of a coding unit obtained by splitting the largest coding unit twice may be set to 2. In this case, if the smallest coding unit is a coding unit obtained by splitting the largest coding unit four times, depth levels of 0, 1, 2, 3, and 4 exist, and thus the maximum depth may be set to 4.
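The relationship between depth and coding unit size in the example above can be written out directly: each split halves the width and the height of the square coding unit. The 64×64 starting size is an illustrative choice from the sizes listed earlier.

```python
def coding_unit_size(max_cu_size, depth):
    """Side length of a square coding unit after `depth` splits of the largest
    coding unit; each split halves both the width and the height."""
    return max_cu_size >> depth

# With a 64x64 largest coding unit (depth 0) and maximum depth 4, the depth
# levels 0..4 yield sizes 64, 32, 16, 8, and 4.
print([coding_unit_size(64, d) for d in range(5)])  # [64, 32, 16, 8, 4]
```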
Prediction encoding and transformation may be performed according to the largest coding unit. The prediction encoding and the transformation are also performed based on the deeper coding units according to a depth equal to or less than the maximum depth, according to the largest coding unit.
Since the number of deeper coding units increases whenever the largest coding unit is split according to depths, encoding, including the prediction encoding and the transformation, is performed on all of the deeper coding units generated as the depth deepens. Hereinafter, for convenience of description, the prediction encoding and the transformation will be described based on a coding unit of a current depth in at least one largest coding unit.
The video encoding apparatus 800 according to an embodiment may variously select a size or shape of a data unit for encoding the image data. In order to encode the image data, operations such as prediction encoding, transformation, and entropy encoding are performed, and at this time, the same data unit may be used for all operations or different data units may be used for each operation. For example, the video encoding apparatus 800 may select not only a coding unit for encoding the image data, but also a data unit different from the coding unit, in order to perform prediction encoding on the image data in the coding unit.
In order to perform prediction encoding in the maximum coding unit, the prediction encoding may be performed based on a coding unit of a coded depth, i.e., a coding unit that is no longer split. Hereinafter, the coding unit that is no longer split and becomes a basic unit for prediction encoding will be referred to as a 'prediction unit'. The partitions obtained by splitting the prediction unit may include the prediction unit and a data unit obtained by splitting at least one of a height and a width of the prediction unit. A partition is a data unit into which the prediction unit of a coding unit is split, and the prediction unit may be a partition having the same size as the coding unit.
For example, when a coding unit of 2N×2N (where N is a positive integer) is no longer split and becomes a prediction unit of 2N×2N, a size of a partition may be 2N×2N, 2N×N, N×2N, or N×N. Examples of a partition type may selectively include symmetric partitions obtained by symmetrically splitting a height or a width of the prediction unit, partitions obtained by asymmetrically splitting the height or the width of the prediction unit (such as 1:n or n:1), partitions obtained by geometrically splitting the prediction unit, and partitions having arbitrary shapes.
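As an illustration of the symmetric partition types named above, the following sketch (the function name is an assumption for illustration, not from the patent) enumerates the partitions a 2N×2N prediction unit may yield:

```python
def symmetric_partitions(n: int) -> dict:
    """Partition sizes for a 2Nx2N prediction unit, per partition type."""
    two_n = 2 * n
    return {
        "2Nx2N": [(two_n, two_n)],       # unsplit
        "2NxN":  [(two_n, n)] * 2,       # height split symmetrically
        "Nx2N":  [(n, two_n)] * 2,       # width split symmetrically
        "NxN":   [(n, n)] * 4,           # both height and width split
    }

parts = symmetric_partitions(16)         # a 32x32 prediction unit (N = 16)
assert parts["2NxN"] == [(32, 16), (32, 16)]
assert len(parts["NxN"]) == 4
```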
A prediction mode of the prediction unit may be at least one of an intra mode, an inter mode, and a skip mode. For example, the intra mode and the inter mode may be performed on a partition of 2N×2N, 2N×N, N×2N, or N×N. Also, the skip mode may be performed only on a partition of 2N×2N. Encoding may be independently performed on one prediction unit in a coding unit, so that a prediction mode having a least encoding error may be selected.
The video encoding apparatus 800 according to an embodiment may also perform transformation on the image data in a coding unit based not only on the coding unit for encoding the image data, but also based on a data unit that is different from the coding unit. In order to perform the transformation in the coding unit, the transformation may be performed based on a data unit having a size smaller than or equal to the coding unit. For example, the transformation unit may include a data unit for an intra mode and a transformation unit for an inter mode. The transformation unit in the coding unit may be recursively split into smaller-sized regions in a manner similar to that of the coding unit according to the tree structure, and thus residual data of the coding unit may be divided according to the transformation unit having the tree structure according to transformation depths.
A transformation depth indicating the number of splitting times to reach the transformation unit by splitting the height and the width of the coding unit may also be set in the transformation unit. For example, in a current coding unit of 2N×2N, a transformation depth may be 0 when a size of a transformation unit is 2N×2N, may be 1 when the size of the transformation unit is N×N, and may be 2 when the size of the transformation unit is N/2×N/2. In other words, the transformation unit having the tree structure may be set according to the transformation depths.
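The transformation-depth example above can be made concrete with a small sketch; `transform_depth` is a hypothetical helper, not a name from the patent:

```python
def transform_depth(cu_size: int, tu_size: int) -> int:
    """Number of times the coding unit's height and width are split
    to reach the transformation unit size."""
    depth = 0
    size = cu_size
    while size > tu_size:
        size //= 2
        depth += 1
    return depth

# For a 2Nx2N current coding unit with 2N = 64 (N = 32):
assert transform_depth(64, 64) == 0   # transformation unit of 2Nx2N
assert transform_depth(64, 32) == 1   # transformation unit of NxN
assert transform_depth(64, 16) == 2   # transformation unit of N/2xN/2
```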
Encoding information according to coded depths requires not only information about a coded depth, but also information related to prediction and transformation. Accordingly, the encoder 810 may determine not only a depth generating a minimum encoding error, but also a partition mode obtained by splitting a prediction unit into partitions, a prediction type according to prediction units, and a size of a transformation unit for transformation.
Coding units according to a tree structure in a maximum coding unit, and methods of determining a prediction unit/partition and a transformation unit, according to embodiments, will be described in detail later with reference to FIGS. 15 through 24. The encoder 810 may measure an encoding error of deeper coding units according to depths by using rate-distortion optimization based on Lagrangian multipliers.
The output unit 820 outputs, in the form of a bitstream, the image data of the maximum coding unit, which is encoded based on the at least one coded depth determined by the encoder 810, and information about encoding modes according to depths. The encoded image data may correspond to a result obtained by encoding residual data of an image. The information about the encoding modes according to depths may include coded depth information, partition type information of the prediction unit, prediction mode information, and size information of the transformation unit.
The coded depth information may be defined by using split information according to depths, which indicates whether encoding is performed on coding units of a lower depth instead of a current depth. If the current depth of a current coding unit is the coded depth, the current coding unit is encoded by using the coding unit of the current depth, and thus the split information of the current depth may be defined not to split the current coding unit to a lower depth. On the contrary, if the current depth of the current coding unit is not the coded depth, encoding must be performed on the coding unit of the lower depth, and thus the split information of the current depth may be defined to split the current coding unit into the coding units of the lower depth. If the current depth is not the coded depth, encoding is performed on the coding unit that is split into the coding unit of the lower depth. Since at least one coding unit of the lower depth exists in one coding unit of the current depth, the encoding is repeatedly performed on each coding unit of the lower depth, and thus the encoding may be recursively performed on the coding units having the same depth.
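The recursion described above, in which split information at each depth decides whether a coding unit is coded as-is or handed to four lower-depth coding units, can be sketched like this (function and parameter names are invented for illustration; the split policy is a toy stand-in for the encoder's decision):

```python
def visit_coding_units(x, y, size, depth, split_flag, leaves):
    """split_flag(x, y, size, depth) -> bool; leaves collects the
    (x, y, size, depth) tuples of coding units that are coded as-is."""
    if split_flag(x, y, size, depth):
        half = size // 2
        for dy in (0, half):          # four lower-depth coding units
            for dx in (0, half):
                visit_coding_units(x + dx, y + dy, half, depth + 1,
                                   split_flag, leaves)
    else:
        leaves.append((x, y, size, depth))   # coded depth reached here

leaves = []
# Toy policy: split only the whole 64x64 maximum coding unit once.
visit_coding_units(0, 0, 64, 0, lambda x, y, s, d: d == 0, leaves)
assert leaves == [(0, 0, 32, 1), (32, 0, 32, 1),
                  (0, 32, 32, 1), (32, 32, 32, 1)]
```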
Since coding units having a tree structure are determined for one maximum coding unit, and at least one piece of information about an encoding mode must be determined for a coding unit of a coded depth, at least one piece of information about an encoding mode may be determined for one maximum coding unit. Also, since data is hierarchically split according to coded depths, a coded depth of the data of the maximum coding unit may vary according to locations, and thus the coded depth and the information about the encoding mode may be set for the data.
Accordingly, the output unit 820 according to the embodiment may assign encoding information about a corresponding coded depth and an encoding mode to at least one of the coding unit, the prediction unit, and a minimum unit included in the maximum coding unit. The minimum unit according to an embodiment is a square data unit obtained by splitting into 4 the minimum coding unit constituting the lowermost coded depth. Alternatively, the minimum unit according to an embodiment may be a maximum square data unit that may be included in all of the coding units, prediction units, partition units, and transformation units included in the maximum coding unit.
For example, the encoding information output by the output unit 820 may be classified into encoding information according to deeper coding units and encoding information according to prediction units. The encoding information according to the deeper coding units may include information about the prediction mode and information about the size of the partitions. The encoding information according to the prediction units may include information about an estimated direction during an inter mode, information about a reference image index of the inter mode, information about a motion vector, information about a chroma component of an intra mode, and information about an interpolation method during the intra mode.
Information about a maximum size of the coding unit, which is defined according to pictures, slices, or GOPs, and information about a maximum depth may be inserted into a header of a bitstream, a sequence parameter set, or a picture parameter set. Information about a maximum size of the transformation unit permitted with respect to a current video, and information about a minimum size of the transformation unit, may also be output through a header of a bitstream, a sequence parameter set, or a picture parameter set. The output unit 820 may encode and output reference information related to prediction, prediction information, and slice type information.
According to the simplest embodiment of the video encoding apparatus 800, the deeper coding unit may be a coding unit obtained by dividing a height or a width of a coding unit of an upper depth (a coding unit of an upper layer) into two. In other words, when a size of the coding unit of the current depth is 2N×2N, a size of the coding unit of the lower depth is N×N. Also, the current coding unit having the size of 2N×2N may include a maximum of four lower-depth coding units having the size of N×N.
Accordingly, the video encoding apparatus 800 may form the coding units having the tree structure by determining coding units having an optimum shape and an optimum size for each maximum coding unit, based on the size of the maximum coding unit and the maximum depth determined considering characteristics of the current picture. Also, since encoding may be performed on each maximum coding unit by using any one of various prediction modes and transformations, an optimum encoding mode may be determined by considering characteristics of the coding unit of various image sizes.
Thus, if an image having a high resolution or a large data amount is encoded in units of conventional macroblocks, the number of macroblocks per picture excessively increases. Accordingly, the amount of compressed information generated for each macroblock increases, and thus it is difficult to transmit the compressed information and data compression efficiency decreases. However, by using the video encoding apparatus according to the embodiment, image compression efficiency may be increased since a coding unit is adjusted while considering characteristics of an image while, at the same time, a maximum size of a coding unit is increased while considering a size of the image.
The video encoding apparatus described above with reference to FIG. 2A may include as many video encoding apparatuses 800 as the number of viewpoints, so as to encode texture images and depth images included in a plurality of viewpoints. For example, since three viewpoints are used in FIG. 7, three video encoding apparatuses 800 may be used to encode the multi-view image of FIG. 7.
When the video encoding apparatus 800 encodes independent-view images, the encoder 810 may determine a prediction unit for inter-image prediction for each coding unit having a tree structure in each maximum coding unit, and may perform inter-image prediction on each prediction unit. When the video encoding apparatus 800 encodes dependent-view images, the encoder 810 may determine prediction units and coding units having a tree structure in each maximum coding unit, and may perform inter-image prediction or inter-view prediction on each prediction unit.
The video encoding apparatus 800 may encode an inter-layer prediction error for predicting a current layer image by using SAO. Accordingly, based on a sample value distribution of the prediction error, the prediction error of the current layer image may be encoded by using only SAO types and information about offsets, without encoding the prediction error according to sample locations.
In one embodiment, the encoder 810 may perform the functions of the encoding method determiner 210, the intra skip flag generator 220, and the intra skip prediction mode information generator 230 of FIG. 2A. In one embodiment, the output unit 820 may perform the function of the encoding information transmitter 240 of FIG. 2A.
Fig. 8 B show the frame of the video decoder 850 based on the coding unit with tree structure according to embodiment
Figure.
According to the video decoder on video estimation based on the coding unit with tree structure of embodiment
850 include view data receives extractor 860 and decoder 870 with coding information.Hereinafter, for ease of description, according to reality
The video decoder 850 on video estimation based on the coding unit with tree structure for applying mode is referred to as ' video
Decoding apparatus 850 '.
Definitions of various terms, such as a coding unit, a depth, a prediction unit, a transformation unit, and various types of information about encoding modes, for decoding operations of the video decoding apparatus 850 according to the embodiment are identical to those described with reference to FIG. 8A and the video encoding apparatus 800.
The image data and encoding information extractor 860 receives and parses a bitstream of an encoded video. The image data and encoding information extractor 860 extracts, from the parsed bitstream, encoded image data for each coding unit, wherein the coding units have a tree structure according to each maximum coding unit, and outputs the extracted image data to the decoder 870. The image data and encoding information extractor 860 may extract information about a maximum size of a coding unit of a current picture from a header about the current picture, a sequence parameter set, or a picture parameter set.
Also, the image data and encoding information extractor 860 extracts, from the parsed bitstream, information about an encoding mode and a coded depth for the coding units having a tree structure according to each maximum coding unit. The extracted information about the encoding mode and the coded depth is output to the decoder 870. In other words, the image data in the bitstream is split into the maximum coding units so that the decoder 870 decodes the image data for each maximum coding unit.
The information about the encoding mode and the coded depth according to each maximum coding unit may be set with respect to one or more pieces of coded depth information, and the information about the encoding mode according to coded depths may include partition type information of a corresponding coding unit, prediction mode information, and size information of a transformation unit. Also, information about the encoding mode according to depths may be extracted as the coded depth information.
The information about the encoding mode and the coded depth according to each maximum coding unit, which is extracted by the image data and encoding information extractor 860, is information about an encoding mode and a coded depth determined to generate a minimum encoding error when an encoder, such as the video encoding apparatus 800, repeatedly performs encoding on each deeper coding unit according to depths for each maximum coding unit. Accordingly, the video decoding apparatus 850 may reconstruct an image by decoding data according to an encoding method that generates the minimum encoding error.
Since the encoding information about the coded depth and the encoding mode may be assigned to a predetermined data unit from among a corresponding coding unit, a prediction unit, and a minimum unit, the image data and encoding information extractor 860 may extract the information about the coded depth and the encoding mode according to the predetermined data units. If the information about the encoding mode and the coded depth of a corresponding maximum coding unit is recorded according to each predetermined data unit, predetermined data units having the same depth and split information may be inferred to be the data units included in the same maximum coding unit.
The decoder 870 reconstructs the current picture by decoding the image data in each maximum coding unit based on the information about the encoding mode and the coded depth according to each maximum coding unit. In other words, the decoder 870 may decode the encoded image data based on the read partition type, prediction mode, and transformation unit for each coding unit from among the coding units having the tree structure included in each maximum coding unit. The decoding process may include a prediction process, which includes intra prediction and motion compensation, and an inverse transformation process.
Based on the information about the partition type and the prediction mode of the prediction unit of the coding unit according to coded depths, the decoder 870 may perform intra prediction or motion compensation according to a partition and a prediction mode of each coding unit. In addition, for inverse transformation of each maximum coding unit, the decoder 870 may read information about a transformation unit according to a tree structure for each coding unit so as to perform inverse transformation based on the transformation unit of each coding unit. Through the inverse transformation, a pixel value of a spatial domain of the coding unit may be reconstructed.
The decoder 870 may determine a coded depth of a current maximum coding unit by using split information according to depths. If the split information indicates that the image data is no longer split at the current depth, the current depth is the coded depth. Accordingly, the decoder 870 may decode the image data of the current maximum coding unit by using the information about the partition type of the prediction unit, the prediction mode, and the size of the transformation unit for each coding unit corresponding to the current depth.
In other words, data units containing the encoding information including the same split information may be gathered by observing the encoding information set assigned to the predetermined data unit from among the coding unit, the prediction unit, and the minimum unit, and the gathered data units may be considered to be one data unit to be decoded by the decoder 870 in the same encoding mode. Thus, the current coding unit may be decoded by obtaining the information about the encoding mode for each coding unit.
The video decoding apparatus described above with reference to FIGS. 1A and 3A may include as many video decoding apparatuses 850 as the number of viewpoints, so as to decode texture images and depth images included in a plurality of viewpoints. For example, since three viewpoints are used in FIG. 7, three video decoding apparatuses 850 may be used to decode the multi-view image of FIG. 7.
When an independent-view image stream related to an independent-view image is received, the decoder 870 of the video decoding apparatus 850 may split samples of the independent-view image, which are extracted from the independent-view image stream by the image data and encoding information extractor 860, into coding units according to a tree structure of a maximum coding unit. The decoder 870 may perform inter-image prediction on each of the coding units according to the tree structure of the samples of a reference layer image, and may reconstruct the independent-view image.
When a dependent-view image stream is received, the decoder 870 of the video decoding apparatus 850 may split samples of the dependent-view image, which are extracted from the dependent-view image stream by the image data and encoding information extractor 860, into coding units according to a tree structure of a maximum coding unit. The decoder 870 may perform inter-image prediction or inter-view prediction on each of the coding units of the samples of the second layer image, and may reconstruct the dependent-view image.
The image data and encoding information extractor 860 may obtain SAO types and offsets from a received current-layer bitstream, and may determine an SAO category according to a distribution of sample values of each sample of a current-layer prediction image, so as to obtain an offset for each SAO category by using the SAO types and the offsets. Accordingly, even without receiving a prediction error for each sample, the decoder 870 may compensate each sample of the current-layer prediction image by the offset of the corresponding category, and may determine a current-layer reconstruction image with reference to the compensated current-layer prediction image.
Thus, the video decoding apparatus 850 may obtain information about at least one coding unit that generates the minimum encoding error when encoding is recursively performed for each maximum coding unit, and may use the information to decode the current picture. In other words, the coding units having the tree structure determined to be the optimum coding units in each maximum coding unit may be decoded. Accordingly, even if an image has a high resolution or an excessively large amount of data, the image may be efficiently decoded and reconstructed according to a size and an encoding mode of a coding unit, which are adaptively determined according to characteristics of the image data, by using information about an optimum encoding mode received from an encoding end.
In one embodiment, the image data and encoding information extractor 860 may perform the functions of the intra skip flag obtainer 110 and the intra skip prediction mode information obtainer 120 of FIG. 1A. In one embodiment, the decoder 870 may perform the functions of the prediction value determiner 130 and the reconstructor 140 of FIG. 1A.
FIG. 9 illustrates a concept of coding units, according to an embodiment.
A size of a coding unit may be expressed by width × height, and may be 64×64, 32×32, 16×16, and 8×8. A coding unit of 64×64 may be split into partitions of 64×64, 64×32, 32×64, or 32×32; a coding unit of 32×32 may be split into partitions of 32×32, 32×16, 16×32, or 16×16; a coding unit of 16×16 may be split into partitions of 16×16, 16×8, 8×16, or 8×8; and a coding unit of 8×8 may be split into partitions of 8×8, 8×4, 4×8, or 4×4.
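The partition lists above all follow one rule: a square coding unit of a given size may be split into partitions of the full size, half the height, half the width, or half of both. A minimal sketch (the helper name is an assumption for illustration):

```python
def partition_sizes(size: int):
    """Partitions available to a square coding unit of the given size:
    full, height-halved, width-halved, and both-halved."""
    half = size // 2
    return [(size, size), (size, half), (half, size), (half, half)]

assert partition_sizes(64) == [(64, 64), (64, 32), (32, 64), (32, 32)]
assert partition_sizes(8)  == [(8, 8), (8, 4), (4, 8), (4, 4)]
```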
In video data 910, a resolution is 1920×1080, a maximum size of a coding unit is 64, and a maximum depth is 2. In video data 920, a resolution is 1920×1080, a maximum size of a coding unit is 64, and a maximum depth is 3. In video data 930, a resolution is 352×288, a maximum size of a coding unit is 16, and a maximum depth is 1. The maximum depth shown in FIG. 9 denotes a total number of splits from a maximum coding unit to a minimum decoding unit.
If a resolution is high or a data amount is large, it is preferable that a maximum size of a coding unit is large so as to not only increase encoding efficiency but also accurately reflect characteristics of an image. Accordingly, the maximum size of the coding unit of the video data 910 and 920 having a higher resolution than the video data 930 may be selected as 64.
Since the maximum depth of the video data 910 is 2, coding units 915 of the video data 910 may include a maximum coding unit having a long-axis size of 64, and coding units having long-axis sizes of 32 and 16, since depths are deepened to two layers by splitting the maximum coding unit twice. On the other hand, since the maximum depth of the video data 930 is 1, coding units 935 of the video data 930 may include a maximum coding unit having a long-axis size of 16, and coding units having a long-axis size of 8, since depths are deepened to one layer by splitting the maximum coding unit once.
Since the maximum depth of the video data 920 is 3, coding units 925 of the video data 920 may include a maximum coding unit having a long-axis size of 64, and coding units having long-axis sizes of 32, 16, and 8, since the depths are deepened to three layers by splitting the maximum coding unit three times. As a depth deepens, the capability of expressing detailed information may be improved.
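The long-axis sizes given for FIG. 9 follow directly from the maximum coding unit size and the maximum depth: each additional depth level halves the long axis. A minimal sketch (the helper name is invented for illustration):

```python
def long_axis_sizes(max_size: int, max_depth: int):
    """Long-axis sizes of coding units present from depth 0 to max_depth."""
    return [max_size >> d for d in range(max_depth + 1)]

assert long_axis_sizes(64, 2) == [64, 32, 16]      # video data 910
assert long_axis_sizes(64, 3) == [64, 32, 16, 8]   # video data 920
assert long_axis_sizes(16, 1) == [16, 8]           # video data 930
```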
FIG. 10A is a block diagram of an image encoder 1000 based on coding units, according to an embodiment.
The image encoder 1000 according to the embodiment performs operations of the encoder 810 of the video encoding apparatus 800 so as to encode image data. In other words, an intra predictor 1004 performs intra prediction on coding units of an intra mode in a current frame 1002, and a motion estimator 1006 and a motion compensator 1008 perform inter estimation and motion compensation by using the current frame 1002 of an inter mode and a reference frame 1026.
The data output from the intra predictor 1004, the motion estimator 1006, and the motion compensator 1008 is output as a quantized transformation coefficient through a transformer 1010 and a quantizer 1012. The quantized transformation coefficient is reconstructed as data in a spatial domain through an inverse quantizer 1018 and an inverse transformer 1020. The reconstructed data in the spatial domain is post-processed through a deblocking unit 1022 and an offset compensator 1024, and is output as the reference frame 1026. The quantized transformation coefficient may be output as a bitstream 1016 through an entropy encoder 1014.
In order to be applied to the video encoding apparatus 800 according to the embodiment, all elements of the image encoder 1000, i.e., the intra predictor 1004, the motion estimator 1006, the motion compensator 1008, the transformer 1010, the quantizer 1012, the entropy encoder 1014, the inverse quantizer 1018, the inverse transformer 1020, the deblocking unit 1022, and the offset compensator 1024, must perform operations based on each coding unit from among coding units having a tree structure according to each maximum coding unit while considering the maximum depth.
In particular, the intra predictor 1004, the motion estimator 1006, and the motion compensator 1008 may determine partitions and a prediction mode of each coding unit from among the coding units having a tree structure while considering the maximum size and the maximum depth of a current maximum coding unit, and the transformer 1010 must determine a size of the transformation unit in each coding unit from among the coding units having a tree structure.
FIG. 10B is a block diagram of an image decoder 1050 based on coding units, according to an embodiment.
A parser 1054 parses, from a bitstream 1052, encoded image data to be decoded and information about encoding required for decoding. The encoded image data is output as inversely quantized data through an entropy decoder 1056 and an inverse quantizer 1058, and image data in a spatial domain is reconstructed through an inverse transformer 1060.
With respect to the image data in the spatial domain, an intra predictor 1062 performs intra prediction on coding units of an intra mode, and a motion compensator 1064 performs motion compensation on coding units of an inter mode by using a reference frame 1070.
The data in the spatial domain, which has passed through the intra predictor 1062 and the motion compensator 1064, may be post-processed through a deblocking unit 1066 and an offset adjuster 1068, and may be output as a reconstructed frame 1072. Also, the data that is post-processed through the deblocking unit 1066 and the offset adjuster 1068 may be output as the reference frame 1070.
In order for the decoder 870 of the video decoding apparatus 850 to decode image data, operations after the parser 1054 of the image decoder 1050 according to the embodiment may be performed.
In order to be applied to the video decoding apparatus 850 according to the embodiment, all elements of the image decoder 1050, i.e., the parser 1054, the entropy decoder 1056, the inverse quantizer 1058, the inverse transformer 1060, the intra predictor 1062, the motion compensator 1064, the deblocking unit 1066, and the offset adjuster 1068, must perform operations based on coding units having a tree structure for each maximum coding unit.
In particular, the intra predictor 1062 and the motion compensator 1064 may determine partitions and a prediction mode for each of the coding units according to the tree structure, and the inverse transformer 1060 must determine a size of the transformation unit for each coding unit.
The encoding operation of FIG. 10A and the decoding operation of FIG. 10B respectively describe a video stream encoding operation and a video stream decoding operation in a single layer. Thus, if the scalable video encoding apparatus 1200 of FIG. 12A encodes a video stream of two or more layers, the image encoder 1000 may be included for each layer. Similarly, if the scalable video decoding apparatus 1250 of FIG. 12B decodes a video stream of two or more layers, the image decoder 1050 may be included for each layer.
FIG. 11 illustrates deeper coding units according to depths, and partitions, according to an embodiment.
The video encoding apparatus 800 according to the embodiment and the video decoding apparatus 850 according to the embodiment use hierarchical coding units so as to consider characteristics of an image. A maximum height, a maximum width, and a maximum depth of coding units may be adaptively determined according to the characteristics of the image, or may be variously set according to user requirements. Sizes of deeper coding units according to depths may be determined according to a predetermined maximum size of the coding unit.
In a hierarchical structure 1100 of coding units according to an embodiment, the maximum height and the maximum width of the coding units are each 64, and the maximum depth is 3. In this case, the maximum depth refers to a total number of times the coding unit is split from the maximum coding unit to the minimum coding unit. Since a depth deepens along a vertical axis of the hierarchical structure 1100 of the coding units, a height and a width of the deeper coding units are each split. Also, a prediction unit and partitions, which are bases for prediction encoding of each deeper coding unit, are shown along a horizontal axis of the hierarchical structure 1100 of the coding units.
In other words, a coding unit 1110 is a maximum coding unit in the hierarchical structure 1100 of the coding units, wherein a depth is 0 and a size (i.e., a height multiplied by a width) is 64×64. The depth deepens along the vertical axis; a coding unit 1120 has a size of 32×32 and a depth of 1, a coding unit 1130 has a size of 16×16 and a depth of 2, and a coding unit 1140 has a size of 8×8 and a depth of 3. The coding unit 1140 having the size of 8×8 and the depth of 3 is a minimum coding unit.
The prediction unit and the partitions of a coding unit are arranged along the horizontal axis according to each depth. In other words, if the coding unit 1110 having the size of 64×64 and the depth of 0 is a prediction unit, the prediction unit may be split into partitions included in the coding unit 1110 having the size of 64×64, i.e., a partition 1110 having a size of 64×64, partitions 1112 having a size of 64×32, partitions 1114 having a size of 32×64, or partitions 1116 having a size of 32×32.
Similarly, a prediction unit of the coding unit 1120 having the size of 32×32 and the depth of 1 may be split into partitions included in the coding unit 1120 having the size of 32×32, i.e., a partition 1120 having a size of 32×32, partitions 1122 having a size of 32×16, partitions 1124 having a size of 16×32, and partitions 1126 having a size of 16×16.
Similarly, a prediction unit of the coding unit 1130 having the size of 16×16 and the depth of 2 may be split into partitions included in the coding unit 1130 having the size of 16×16, i.e., a partition 1130 having a size of 16×16 included in the coding unit 1130, partitions 1132 having a size of 16×8, partitions 1134 having a size of 8×16, and partitions 1136 having a size of 8×8.
Similarly, a prediction unit of the coding unit 1140 having the size of 8×8 and the depth of 3 may be split into partitions included in the coding unit 1140 having the size of 8×8, i.e., a partition 1140 having a size of 8×8 included in the coding unit 1140, partitions 1142 having a size of 8×4, partitions 1144 having a size of 4×8, and partitions 1146 having a size of 4×4.
In order to determine the depth of the maximum coding unit 1110, the encoder 810 of the video encoding apparatus 800 according to the embodiment has to perform encoding on each of the coding units corresponding to the respective depths included in the maximum coding unit 1110.
The number of deeper coding units according to depth that cover data of the same range and the same size increases as the depth deepens. For example, four coding units corresponding to depth 2 are required to cover the data included in one coding unit corresponding to depth 1. Accordingly, in order to compare the encoding results of the same data according to depth, the data has to be encoded by using the one coding unit corresponding to depth 1 and by using each of the four coding units corresponding to depth 2.
In order to perform encoding according to each depth, a representative encoding error, which is the minimum encoding error at the respective depth, may be selected by performing encoding on each of the prediction units of the coding units according to depth, along the horizontal axis of the hierarchical structure 1100 of coding units. Alternatively, as the depth deepens along the vertical axis of the hierarchical structure 1100, encoding may be performed for each depth and the minimum encoding error may be searched for by comparing the representative encoding errors according to depth. The depth and the partition generating the minimum encoding error in the maximum coding unit 1110 may be selected as the coded depth and the partition type of the maximum coding unit 1110.
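The depth selection described above can be sketched as a small recursive search. This is a hypothetical illustration, not the patent's encoder: `encode_error` is an assumed cost function standing in for the rate-distortion measurement that yields the representative encoding error per depth.

```python
# Sketch: pick the coded depth of a block by comparing the cost of coding it
# at the current depth against the summed cost of its four sub-blocks at the
# next depth, recursively down to the maximum depth.

def best_depth(block_size, depth, max_depth, encode_error):
    """Return (min_error, depth) for coding this block at `depth` or deeper."""
    here = encode_error(block_size, depth)        # cost at the current depth
    if depth == max_depth:
        return here, depth
    half = block_size // 2
    # Splitting yields four sub-blocks of the next depth; sum their best costs.
    sub_error, sub_depth = best_depth(half, depth + 1, max_depth, encode_error)
    split = 4 * sub_error
    return (here, depth) if here <= split else (split, sub_depth)

# Toy cost model: pretend deeper (smaller) blocks fit this content better.
err, depth = best_depth(64, 0, 3, lambda size, d: size * size // (d + 1))
print(err, depth)
```

The same comparison is performed once per maximum coding unit, so the coded depth may differ from one maximum coding unit to the next.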
Figure 12 shows the relationship between a coding unit and transformation units, according to an embodiment.
The video encoding apparatus 800 according to the embodiment or the video decoding apparatus 850 according to the embodiment encodes or decodes an image according to coding units having sizes smaller than or equal to a maximum coding unit, for each maximum coding unit. During the encoding process, the sizes of the transformation units used for transformation may be selected based on data units that are no larger than the corresponding coding unit.
For example, in the video encoding apparatus 800 or the video decoding apparatus 850, when the size of the coding unit 1210 is 64×64, transformation may be performed by using the transformation unit 1220 of size 32×32.
Also, the data of the coding unit 1210 of size 64×64 may be encoded by performing transformation on each of the transformation units of sizes 32×32, 16×16, 8×8, and 4×4 (each smaller than 64×64), and then the transformation unit having the minimum encoding error with respect to the original image may be selected.
Figure 13 shows multiple pieces of encoding information according to depth, according to various embodiments.
The output unit 820 of the video encoding apparatus 800 according to the embodiment may encode and transmit, for each coding unit corresponding to a coded depth, partition type information 1300, prediction mode information 1310, and transformation unit size information 1320, as the information about the encoding mode.
The partition type information 1300 indicates information about the shape of a partition obtained by splitting the prediction unit of a current coding unit, where the partition is a data unit for prediction-encoding the current coding unit. For example, a current coding unit CU_0 of size 2N×2N may be split into any one of the following partitions: a partition 1302 of size 2N×2N, a partition 1304 of size 2N×N, a partition 1306 of size N×2N, and a partition 1308 of size N×N. In this case, the partition type information 1300 about the current coding unit is set to indicate one of the partition 1302 of size 2N×2N, the partition 1304 of size 2N×N, the partition 1306 of size N×2N, and the partition 1308 of size N×N.
The prediction mode information 1310 indicates a prediction mode of each partition. For example, the prediction mode information 1310 may indicate the mode of prediction encoding performed on the partition indicated by the partition type information 1300, i.e., an intra mode 1312, an inter mode 1314, or a skip mode 1316.
The transformation unit size information 1320 indicates the transformation unit on which transformation of the current coding unit is to be based. For example, the transformation unit may be one of a first intra transformation unit 1322, a second intra transformation unit 1324, a first inter transformation unit 1326, and a second inter transformation unit 1328.
The image data and encoding information receiving extractor 860 of the video decoding apparatus 850 may extract and use, for each deeper coding unit, the partition type information 1300, the prediction mode information 1310, and the transformation unit size information 1320 for decoding.
Figure 14 shows deeper coding units according to depths, according to various embodiments.
Split information may be used to indicate a change of depth. The split information specifies whether a coding unit of a current depth is split into coding units of a lower depth.
A prediction unit 1410 for prediction-encoding a coding unit 1400 having a depth of 0 and a size of 2N_0×2N_0 may include partitions of the following partition types: a partition type 1412 of size 2N_0×2N_0, a partition type 1414 of size 2N_0×N_0, a partition type 1416 of size N_0×2N_0, and a partition type 1418 of size N_0×N_0. Only the partition types 1412, 1414, 1416, and 1418 obtained by symmetrically splitting the prediction unit are illustrated, but as described above, the partition types are not limited thereto and may include asymmetric partitions, partitions having a predetermined shape, and partitions having a geometric shape.
According to each partition type, prediction encoding has to be repeatedly performed on one partition of size 2N_0×2N_0, two partitions of size 2N_0×N_0, two partitions of size N_0×2N_0, and four partitions of size N_0×N_0. Prediction encoding in the intra mode and the inter mode may be performed on the partitions of sizes 2N_0×2N_0, N_0×2N_0, 2N_0×N_0, and N_0×N_0. Prediction encoding in the skip mode may be performed only on the partition of size 2N_0×2N_0.
If the encoding error of one of the partition type 1412 of size 2N_0×2N_0, the partition type 1414 of size 2N_0×N_0, and the partition type 1416 of size N_0×2N_0 is the smallest, the prediction unit 1410 may not be split into a lower depth.
If the encoding error of the partition type 1418 of size N_0×N_0 is the smallest, the depth is changed from 0 to 1 and splitting is performed (operation 1420), and encoding is repeatedly performed on coding units 1430 having a depth of 2 and a partition type of size N_0×N_0, so as to search for a minimum encoding error.
A prediction unit 1440 for prediction-encoding the coding unit 1430 having a depth of 1 and a size of 2N_1×2N_1 (=N_0×N_0) may include a partition type 1442 of size 2N_1×2N_1, a partition type 1444 of size 2N_1×N_1, a partition type 1446 of size N_1×2N_1, and a partition type 1448 of size N_1×N_1.
If the encoding error of the partition type 1448 of size N_1×N_1 is the smallest, the depth is changed from 1 to 2 and splitting is performed (in operation 1450), and encoding is repeatedly performed on coding units 1460 having a depth of 2 and a size of N_2×N_2, so as to search for a minimum encoding error.
When the maximum depth is d, deeper coding units according to depths may be set until the depth corresponds to d−1, and split information may be set until the depth corresponds to d−2. In other words, when encoding is performed up to the depth of d−1 after a coding unit corresponding to the depth d−2 is split (in operation 1470), a prediction unit 1490 for prediction-encoding a coding unit 1480 having a depth of d−1 and a size of 2N_(d−1)×2N_(d−1) may include partitions of the following partition types: a partition type 1492 of size 2N_(d−1)×2N_(d−1), a partition type 1494 of size 2N_(d−1)×N_(d−1), a partition type 1496 of size N_(d−1)×2N_(d−1), and a partition type 1498 of size N_(d−1)×N_(d−1).
Prediction encoding may be performed on one partition of size 2N_(d−1)×2N_(d−1), two partitions of size 2N_(d−1)×N_(d−1), two partitions of size N_(d−1)×2N_(d−1), and four partitions of size N_(d−1)×N_(d−1) among the partition types, so as to search for a partition type generating a minimum encoding error.
Even when the partition type 1498 of size N_(d−1)×N_(d−1) has the minimum encoding error, since the maximum depth is d, a coding unit CU_(d−1) having a depth of d−1 is no longer split into a lower depth; the depth of the coding units constituting the current maximum coding unit 1400 is determined to be d−1, and the partition type of the current maximum coding unit 1400 may be determined to be N_(d−1)×N_(d−1). Also, since the maximum depth is d, split information is not set for the coding unit 1452 corresponding to the depth d−1.
The data unit 1499 may be a 'minimum unit' of the current maximum coding unit. The minimum unit according to the embodiment may be a square data unit obtained by splitting the minimum coding unit having a lowermost coded depth into 4.
By repeatedly performing encoding, the video encoding apparatus 100 according to the embodiment may determine a depth by comparing the encoding errors according to the depths of the coding unit 1400 and selecting the depth generating the minimum encoding error as the coded depth, and may set the corresponding partition type and prediction mode as the encoding mode of the coded depth.
Accordingly, the minimum encoding errors according to depth are compared in all of the depths 0, 1, ..., d−1, and d, and the depth having the minimum encoding error may be determined as the coded depth. The coded depth, the partition type of the prediction unit, and the prediction mode may be encoded and transmitted as the information about the encoding mode. Since a coding unit has to be split from a depth of 0 to the coded depth, only the split information of the coded depth is set to '0', and the split information of the depths excluding the coded depth is set to '1'.
The image data and encoding information receiving extractor 860 of the video decoding apparatus 850 according to the embodiment may extract and use the information about the coded depth and the prediction unit of the coding unit 1400 so as to decode the coding unit 1412. The video decoding apparatus 850 according to the embodiment may determine, by using the split information according to depths, the depth whose split information is '0' as the coded depth, and may use the information about the encoding mode of the corresponding depth for decoding.
Figures 15, 16, and 17 show the relationship between coding units, prediction units, and transformation units, according to various embodiments.
Coding units 1510 are deeper coding units according to coded depths determined by the video encoding apparatus 800 in a maximum coding unit. Prediction units 1560 are partitions of the prediction units of each of the coding units 1510 according to coded depths, and transformation units 1570 are the transformation units of each of the coding units according to coded depths.
When the depth of the maximum coding unit among the deeper coding units 1510 is 0, the depth of the coding unit 1512 is 1, the depths of the coding units 1514, 1516, 1518, 1528, 1550, and 1552 are 2, the depths of the coding units 1520, 1522, 1524, 1526, 1530, 1532, and 1548 are 3, and the depths of the coding units 1540, 1542, 1544, and 1546 are 4.
Some partitions 1514, 1516, 1522, 1532, 1548, 1550, 1552, and 1554 among the prediction units 1560 are obtained by splitting the coding units. In other words, the partitions 1514, 1522, 1550, and 1554 are of a partition type of size 2N×N, the partitions 1516, 1548, and 1552 are of a partition type of size N×2N, and the partition 1532 is of a partition type of size N×N. The prediction units and partitions of the deeper coding units 1510 are smaller than or equal to each coding unit.
Transformation or inverse transformation is performed on the image data of the coding unit 1552 among the transformation units 1570 in a data unit that is smaller than the coding unit 1552. The coding units 1514, 1516, 1522, 1532, 1548, 1550, 1552, and 1554 among the transformation units 1560 are also data units that differ in size or shape from those among the prediction units 1560. In other words, the video encoding apparatus 800 and the video decoding apparatus 850 according to the embodiment may perform intra prediction, motion estimation, motion compensation, transformation, and inverse transformation on individual data units even in the same coding unit.
Accordingly, encoding is recursively performed on each of the coding units having a hierarchical structure in each region of a maximum coding unit to determine an optimum coding unit, and thus coding units according to a recursive tree structure may be obtained. The encoding information may include split information about a coding unit, partition type information, prediction mode information, and transformation unit size information.
The output unit 820 of the video encoding apparatus 100 according to the embodiment may output the encoding information about the coding units having a tree structure, and the image data and encoding information receiving extractor 860 of the video decoding apparatus 850 according to the embodiment may extract the encoding information about the coding units having a tree structure from a received bitstream.
Split information specifies whether a current coding unit is split into coding units of a lower depth. If the split information of a current depth d is 0, the depth at which the current coding unit is no longer split into lower depths is the coded depth, and thus partition type information, prediction mode information, and transformation unit size information may be defined for the coded depth. If the current coding unit has to be further split according to the split information, encoding has to be independently performed on each of the four split coding units of the lower depth.
The prediction mode may be one of an intra mode, an inter mode, and a skip mode. The intra mode and the inter mode may be defined in all partition types, while the skip mode is defined only in a partition type of size 2N×2N.
The partition type information may indicate symmetric partition types of sizes 2N×2N, 2N×N, N×2N, and N×N, which are obtained by symmetrically splitting the height or width of the prediction unit, and asymmetric partition types of sizes 2N×nU, 2N×nD, nL×2N, and nR×2N, which are obtained by asymmetrically splitting the height or width of the prediction unit. The asymmetric partition types of sizes 2N×nU and 2N×nD may be obtained by splitting the height of the prediction unit at 1:3 and 3:1, respectively, and the asymmetric partition types of sizes nL×2N and nR×2N may be obtained by splitting the width of the prediction unit at 1:3 and 3:1, respectively.
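The partition geometries just listed can be tabulated in a small helper. This is a hypothetical illustration of the geometry only (the function name and return convention are assumptions, not part of the patent): symmetric types split the 2N×2N prediction unit evenly, while the asymmetric types use a 1:3 or 3:1 split of its height or width.

```python
# Width×height of the *first* partition of a 2n×2n prediction unit, per
# partition type. The second (complementary) partition fills the remainder.

def partition_size(mode, n):
    w = h = 2 * n
    return {
        "2Nx2N": (w, h),
        "2NxN":  (w, h // 2),
        "Nx2N":  (w // 2, h),
        "NxN":   (w // 2, h // 2),
        "2NxnU": (w, h // 4),        # 1:3 split of the height
        "2NxnD": (w, 3 * h // 4),    # 3:1 split of the height
        "nLx2N": (w // 4, h),        # 1:3 split of the width
        "nRx2N": (3 * w // 4, h),    # 3:1 split of the width
    }[mode]

print(partition_size("2NxnU", 16))   # (32, 8) for a 32×32 prediction unit
```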
The size of the transformation unit may be set to two types in the intra mode and to two types in the inter mode. In other words, if the split information of the transformation unit is 0, the size of the transformation unit may be 2N×2N, which is the size of the current coding unit. If the split information of the transformation unit is 1, transformation units may be obtained by splitting the current coding unit. Also, if the partition type of the current coding unit of size 2N×2N is a symmetric partition type, the size of the transformation unit may be N×N, and if the partition type of the current coding unit is an asymmetric partition type, the size of the transformation unit may be N/2×N/2.
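The transformation-unit sizing rule above can be sketched as follows (an illustration under the stated rule, with assumed names; the real signaling is of course carried in the bitstream):

```python
# With TU split information 0 the transformation unit equals the current
# coding unit (2N×2N); with split information 1 it becomes N×N for symmetric
# partition types and N/2×N/2 for asymmetric ones.

def transform_unit_size(cu_size, tu_split_info, symmetric):
    if tu_split_info == 0:
        return cu_size                      # 2N×2N: the whole coding unit
    return cu_size // 2 if symmetric else cu_size // 4

print(transform_unit_size(32, 1, symmetric=True))    # 16 (N×N)
print(transform_unit_size(32, 1, symmetric=False))   # 8  (N/2×N/2)
```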
The encoding information about the coding units having a tree structure according to the embodiment may be assigned to at least one of a coding unit corresponding to the coded depth, a prediction unit, and a minimum unit. The coding unit corresponding to the coded depth may include one or more prediction units and minimum units containing the same encoding information.
Accordingly, whether adjacent data units are included in the same coding unit corresponding to the coded depth may be determined by comparing the encoding information of the adjacent data units. Also, the coding unit corresponding to the coded depth may be determined by using the encoding information of a data unit, and thus the distribution of coded depths in a maximum coding unit may be inferred.
Accordingly, in this case, if prediction of a current coding unit is performed based on adjacent data units, the encoding information of data units in the deeper coding units adjacent to the current coding unit may be directly referred to and used.
In another embodiment, if prediction encoding of a current coding unit is performed based on adjacent coding units, data adjacent to the current coding unit may be searched for in the deeper coding units by using the encoding information of adjacent deeper coding units, and the adjacent coding units may thus be inferred.
Figure 18 shows the relationship between a coding unit, a prediction unit, and a transformation unit, according to the encoding mode information of Table 1.
A maximum coding unit 1800 includes coding units 1802, 1804, 1806, 1812, 1814, and 1816, and a coding unit 1818 having a coded depth. Here, since the coding unit 1818 is a coding unit having a coded depth, its split information may be set to 0. The partition type information of the coding unit 1818 of size 2N×2N may be set to one of the following partition types: 2N×2N 1822, 2N×N 1824, N×2N 1826, N×N 1828, 2N×nU 1832, 2N×nD 1834, nL×2N 1836, and nR×2N 1838.
Transformation unit split information (a TU size flag) is a type of transformation index, and the size of the transformation unit corresponding to the transformation index may be changed according to the prediction unit type or partition type of the coding unit.
For example, when the partition type information is set to one of the symmetric partition types 2N×2N 1822, 2N×N 1824, N×2N 1826, and N×N 1828, a transformation unit 1842 of size 2N×2N is set if the transformation unit split information is 0, and a transformation unit 1844 of size N×N may be set if the transformation unit split information is 1.
When the partition type information is set to one of the asymmetric partition types 2N×nU 1832, 2N×nD 1834, nL×2N 1836, and nR×2N 1838, a transformation unit 1852 of size 2N×2N may be set if the transformation unit split information (TU size flag) is 0, and a transformation unit 1854 of size N/2×N/2 may be set if the transformation unit split information is 1.
The transformation unit split information (TU size flag) described above with reference to Figure 12 is a flag having a value of 0 or 1, but the transformation unit split information according to the embodiment is not limited to a 1-bit flag: the transformation unit may be hierarchically split while the transformation unit split information increases as 0, 1, 2, 3, etc., according to a setting. The transformation unit split information may be an example of a transformation index.
In this case, the size of the transformation unit actually used may be expressed by using the transformation unit split information according to the embodiment together with the maximum size and the minimum size of the transformation unit. The video encoding apparatus 100 according to the embodiment may encode maximum transformation unit size information, minimum transformation unit size information, and maximum transformation unit split information. The result of encoding the maximum transformation unit size information, the minimum transformation unit size information, and the maximum transformation unit split information may be inserted into an SPS. The video decoding apparatus 850 according to the embodiment may decode a video by using the maximum transformation unit size information, the minimum transformation unit size information, and the maximum transformation unit split information.
For example, (a) if the size of a current coding unit is 64×64 and the maximum transformation unit size is 32×32, then: (a-1) when the TU size flag is 0, the size of the transformation unit may be 32×32; (a-2) when the TU size flag is 1, it may be 16×16; and (a-3) when the TU size flag is 2, it may be 8×8.
As another example, (b) if the size of the current coding unit is 32×32 and the minimum transformation unit size is 32×32, then (b-1) when the TU size flag is 0, the size of the transformation unit may be 32×32. Here, since the size of the transformation unit cannot be smaller than 32×32, the TU size flag cannot be set to a value other than 0.
As another example, (c) if the size of the current coding unit is 64×64 and the maximum TU size flag is 1, the TU size flag may be 0 or 1. Here, the TU size flag cannot be set to a value other than 0 or 1.
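Examples (a) through (c) above all follow one rule: start from the root transformation size (bounded by the maximum transformation unit size) and halve it once per TU size flag increment. A minimal sketch under that assumption (the function name is hypothetical):

```python
# Transformation unit size implied by a TU size flag: the root size is the
# smaller of the coding unit size and the maximum TU size, halved once per
# flag increment.

def tu_size(cu_size, max_tu_size, tu_size_flag):
    root = min(cu_size, max_tu_size)
    return root >> tu_size_flag

assert tu_size(64, 32, 0) == 32   # example (a-1)
assert tu_size(64, 32, 1) == 16   # example (a-2)
assert tu_size(64, 32, 2) == 8    # example (a-3)
```

In example (b), the minimum transformation unit size of 32×32 is what forbids any flag value other than 0; a conformant encoder would clamp the flag range accordingly.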
Accordingly, if the maximum TU size flag is defined as 'MaxTransformSizeIndex', the minimum transformation unit size is defined as 'MinTransformSize', and the transformation unit size when the TU size flag is 0 is 'RootTuSize', then the current minimum transformation unit size 'CurrMinTuSize' that can be determined in the current coding unit may be defined by equation (1):
CurrMinTuSize
= max(MinTransformSize, RootTuSize/(2^MaxTransformSizeIndex)) ... (1)
Compared with the current minimum transformation unit size 'CurrMinTuSize' that can be determined in the current coding unit, the transformation unit size 'RootTuSize' when the TU size flag is 0 may denote the maximum transformation unit size that can be selected in the system. In other words, in equation (1), 'RootTuSize/(2^MaxTransformSizeIndex)' denotes the transformation unit size obtained when the transformation unit size 'RootTuSize' at a TU size flag of 0 is split the number of times corresponding to the maximum TU size flag, and 'MinTransformSize' denotes the minimum transformation size. Accordingly, the smaller of 'RootTuSize/(2^MaxTransformSizeIndex)' and 'MinTransformSize' may be the current minimum transformation unit size 'CurrMinTuSize' that can be determined in the current coding unit.
According to an embodiment, the maximum transformation unit size RootTuSize may vary according to the type of prediction mode.
For example, if the current prediction mode is an inter mode, 'RootTuSize' may be determined by using equation (2) below. In equation (2), 'MaxTransformSize' denotes the maximum transformation unit size, and 'PUSize' denotes the current prediction unit size.
RootTuSize = min(MaxTransformSize, PUSize) ...... (2)
In other words, if the current prediction mode is an inter mode, the transformation unit size 'RootTuSize' when the TU size flag is 0 may be the smaller of the maximum transformation unit size and the current prediction unit size.
If the prediction mode of the current partition unit is an intra mode, 'RootTuSize' may be determined by using equation (3) below. In equation (3), 'PartitionSize' denotes the size of the current partition unit.
RootTuSize = min(MaxTransformSize, PartitionSize) ........ (3)
In other words, if the current prediction mode is an intra mode, the transformation unit size 'RootTuSize' when the TU size flag is 0 may be the smaller of the maximum transformation unit size and the size of the current partition unit.
However, the current maximum transformation unit size 'RootTuSize', which varies according to the type of prediction mode in a partition unit, is merely an embodiment, and the factors for determining the current maximum transformation unit size are not limited thereto.
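Equations (1) through (3) can be combined into one helper. This is a sketch whose names mirror the variables in the text; `unit_size` stands for PUSize in the inter mode (equation (2)) or PartitionSize in the intra mode (equation (3)):

```python
def curr_min_tu_size(min_transform_size, max_transform_size,
                     max_transform_size_index, unit_size):
    # Equations (2)/(3): RootTuSize is the smaller of the maximum
    # transformation unit size and the prediction/partition unit size.
    root_tu_size = min(max_transform_size, unit_size)
    # Equation (1): split RootTuSize once per allowed level, but never
    # below the minimum transformation unit size.
    return max(min_transform_size,
               root_tu_size // (2 ** max_transform_size_index))

print(curr_min_tu_size(4, 32, 2, 64))   # 8
```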
According to the video encoding method based on coding units having a tree structure described above with reference to Figures 15 through 18, the image data of the spatial domain is encoded in each of the coding units having the tree structure, and the image data of the spatial domain is reconstructed in such a manner that decoding is performed on each maximum coding unit according to the video decoding method based on coding units having a tree structure, so that a video formed of pictures and picture sequences may be reconstructed. The reconstructed video may be reproduced by a reproducing apparatus, stored in a storage medium, or transmitted via a network.
The embodiments of the present disclosure may be written as computer programs and may be implemented in general-purpose digital computers that execute the programs by using a non-transitory computer-readable recording medium. Examples of the non-transitory computer-readable recording medium include magnetic storage media (e.g., ROM, floppy discs, hard discs, etc.), optical recording media (e.g., CD-ROMs or DVDs), and the like.
For convenience of description, the scalable video encoding methods and/or the video encoding method described with reference to Figures 6A through 18 will be collectively referred to as the 'video encoding method of the present disclosure'. Also for convenience of description, the scalable video decoding methods and/or the video decoding method described with reference to Figures 6A through 18 will be collectively referred to as the 'video decoding method of the present disclosure'.
The video encoding apparatus described with reference to Figures 6A through 18 (including the scalable video encoding apparatus 1200, the video encoding apparatus 800, or the image encoder 1000) will also be collectively referred to as the 'video encoding apparatus of the present disclosure'. The video decoding apparatus described with reference to Figures 6A through 18 (including the scalable video decoding apparatus 1250, the video decoding apparatus 850, or the image decoder 1050) will also be collectively referred to as the 'video decoding apparatus of the present disclosure'.
A non-transitory computer-readable recording medium storing a program according to an embodiment, such as a disc 26000, will now be described in detail.
Figure 19 shows the physical structure of the disc 26000 in which a program is stored, according to an embodiment. As a storage medium, the disc 26000 may be a hard drive, a compact disc read-only memory (CD-ROM) disc, a Blu-ray disc, or a digital versatile disc (DVD). The disc 26000 includes a plurality of concentric tracks Tr, each of which is divided into a certain number of sectors Se in a circumferential direction of the disc 26000. In a specific region of the disc 26000, a program that executes the quantization parameter determining method, the video encoding method, and the video decoding method described above may be assigned and stored.
A computer system embodied using a storage medium that stores a program for executing the video encoding method and the video decoding method described above will now be described with reference to Figure 21.
Figure 20 shows a disc drive 26800 for recording and reading a program by using the disc 26000. A computer system 26700 may store, in the disc 26000 via the disc drive 26800, a program that executes at least one of the video encoding method and the video decoding method of the present disclosure. In order to run the program stored in the disc 26000 in the computer system 26700, the program may be read from the disc 26000 and transmitted to the computer system 26700 by using the disc drive 26800.
The program that executes at least one of the video encoding method and the video decoding method of the present disclosure may be stored not only in the disc 26000 shown in Figures 19 and 21, but also in a memory card, a ROM cassette, or a solid-state drive (SSD).
A system to which the video encoding method and the video decoding method described above are applied, according to an embodiment, will be described below.
Figure 21 shows the overall structure of a content supply system 11000 for providing a content distribution service. A service area of a communication system is divided into cells of a predetermined size, and wireless base stations 11700, 11800, 11900, and 12000 are respectively installed in these cells.
The content supply system 11000 includes a plurality of independent devices. For example, the plurality of independent devices, such as a computer 12100, a personal digital assistant (PDA) 12200, a video camera 12300, and a mobile phone 12500, are connected to the Internet 11100 via an internet service provider 11200, a communication network 11400, and the wireless base stations 11700, 11800, 11900, and 12000.
However, the content supply system 11000 is not limited to the structure shown in Figure 21, and devices may be selectively connected to the content supply system. The plurality of independent devices may be directly connected to the communication network 11400, not via the wireless base stations 11700, 11800, 11900, and 12000.
The video camera 12300 is an imaging device capable of capturing video images, e.g., a digital video camera. The mobile phone 12500 may employ at least one communication method among various protocols, such as Personal Digital Communications (PDC), Code Division Multiple Access (CDMA), Wideband Code Division Multiple Access (W-CDMA), Global System for Mobile Communications (GSM), and Personal Handyphone System (PHS).
The video camera 12300 may be connected to a streaming server 11300 via the wireless base station 11900 and the communication network 11400. The streaming server 11300 allows content received from a user via the video camera 12300 to be streamed via a real-time broadcast. The content received from the video camera 12300 may be encoded by the video camera 12300 or the streaming server 11300. Video data captured by the video camera 12300 may be transmitted to the streaming server 11300 via the computer 12100.
Video data captured by a camera 12600 may also be transmitted to the streaming server 11300 via the computer 12100. Like a digital camera, the camera 12600 is an imaging device capable of capturing both still images and video images. The video data captured by the camera 12600 may be encoded by using the camera 12600 or the computer 12100. Software that performs encoding and decoding of video may be stored in a non-transitory computer-readable recording medium (e.g., a CD-ROM disc, a floppy disc, a hard disc drive, an SSD, or a memory card) that can be accessed by the computer 12100.
If video data is captured by a camera built into the mobile phone 12500, the video data may be received from the mobile phone 12500.
The video data may be encoded by a large-scale integrated (LSI) circuit system installed in the video camera 12300, the mobile phone 12500, or the camera 12600.
In the content supply system 11000 according to the embodiment, content data recorded by a user using the video camera 12300, the camera 12600, the mobile phone 12500, or another imaging device (e.g., content recorded during a concert) is encoded and transmitted to the streaming server 11300. The streaming server 11300 may transmit the encoded content data as streaming content to other clients that request the content data.
The clients are devices capable of decoding the encoded content data, e.g., the computer 12100, the PDA 12200, the video camera 12300, or the mobile phone 12500. Thus, the content supply system 11000 allows the clients to receive and reproduce the encoded content data. The content supply system 11000 also allows the clients to receive the encoded content data and to decode and reproduce the encoded content data in real time, thereby enabling personal broadcasting.
The video encoding apparatus and the video decoding apparatus of the present disclosure may be applied to the encoding operations and the decoding operations of the plurality of independent devices included in the content supply system 11000.
The mobile phone 12500 included in the content supply system 11000 according to the embodiment will now be described in detail with reference to Figures 22 and 24.
Figure 22 shows the external structure of the mobile phone 12500 to which the video encoding method and the video decoding method are applied, according to an embodiment. The mobile phone 12500 may be a smartphone, the functions of which are not limited and a large number of the functions of which may be changed or extended.
The mobile phone 12500 includes an external antenna 12510 via which radio-frequency (RF) signals may be exchanged with the wireless base station 12000, and includes a display screen 12520, e.g., a liquid crystal display (LCD) or an organic light-emitting diode (OLED) screen, for displaying images captured by a camera 12530 or images received via the antenna 12510 and decoded. The mobile phone 12500 includes an operation panel 12540 including a control button and a touch panel. If the display screen 12520 is a touch screen, the operation panel 12540 further includes a touch-sensing panel of the display screen 12520. The mobile phone 12500 includes a speaker 12580 for outputting voice and sound or another type of sound output unit, and a microphone 12550 for inputting voice and sound or another type of sound input unit. The mobile phone 12500 further includes the camera 12530, such as a charge-coupled device (CCD) camera, to capture video or still images. The mobile phone 12500 may further include a storage medium 12570 for storing encoded/decoded data, e.g., video or still images captured by the camera 12530, received via e-mail, or obtained in various ways, and a slot 12560 via which the storage medium 12570 is loaded into the mobile phone 12500. The storage medium 12570 may be a flash memory, e.g., a secure digital (SD) card, or an electrically erasable and programmable read-only memory (EEPROM) included in a plastic case.
Figure 23 shows the internal structure of the mobile phone 12500. In order to systematically control the parts of the mobile phone 12500, including the display screen 12520 and the operation panel 12540, a power supply circuit 12700, an operation input controller 12640, an image encoder 12720, a camera interface 12630, an LCD controller 12620, an image decoder 12690, a multiplexer/demultiplexer 12680, a recording/reading unit 12670, a modulation/demodulation unit 12660, and a sound processor 12650 are connected to a central controller 12710 via a synchronization bus 12730.
If a user operates a power button to switch from a 'power off' state to a 'power on' state, the power supply circuit 12700 supplies power to all the parts of the mobile phone 12500 from a battery pack, thereby setting the mobile phone 12500 to an operation mode.
The central controller 12710 includes a CPU, a read-only memory (ROM), and a random access memory (RAM).
While the mobile phone 12500 transmits communication data to the outside, a digital signal is generated by the mobile phone 12500 under the control of the central controller 12710. For example, the sound processor 12650 may generate a digital sound signal, the image encoder 12720 may generate a digital image signal, and text data of a message may be generated via the operation panel 12540 and the operation input controller 12640. When a digital signal is transferred to the modulation/demodulation unit 12660 under the control of the central controller 12710, the modulation/demodulation unit 12660 modulates a frequency band of the digital signal, and a communication circuit 12610 performs digital-to-analog conversion (DAC) and frequency conversion on the band-modulated digital sound signal. A transmission signal output from the communication circuit 12610 may be transmitted to a voice communication base station or the wireless base station 12000 via the antenna 12510.
For example, when the mobile phone 12500 is in a conversation mode, a sound signal obtained via the microphone 12550 is converted into a digital sound signal by the sound processor 12650 under the control of the central controller 12710. The digital sound signal may be converted into a transmission signal via the modulation/demodulation unit 12660 and the communication circuit 12610, and may be transmitted via the antenna 12510.
When a text message, for example an e-mail, is transmitted in a data communication mode, text data of the text message is input via the operation panel 12540 and is transferred to the central controller 12710 via the operation input controller 12640. Under the control of the central controller 12710, the text data is converted into a transmission signal via the modulation/demodulation unit 12660 and the communication circuit 12610, and is transmitted to the wireless base station 12000 via the antenna 12510.
To transmit image data in the data communication mode, image data captured by the camera 12530 is provided to the image encoder 12720 via the camera interface 12630. The captured image data may be directly displayed on the display screen 12520 via the camera interface 12630 and the LCD controller 12620.
The structure of the image encoder 12720 may correspond to that of the video encoding apparatus 100 described above. The image encoder 12720 may transform the image data received from the camera 12530 into compressed and encoded image data according to the video encoding method described above, and may then output the encoded image data to the multiplexer/demultiplexer 12680. During a recording operation of the camera 12530, a sound signal obtained by the microphone 12550 of the mobile phone 12500 may be converted into digital sound data via the sound processor 12650, and the digital sound data may be transferred to the multiplexer/demultiplexer 12680.
The multiplexer/demultiplexer 12680 multiplexes the encoded image data received from the image encoder 12720 together with the sound data received from the sound processor 12650. The result of multiplexing the data may be converted into a transmission signal via the modulation/demodulation unit 12660 and the communication circuit 12610, and may then be transmitted via the antenna 12510.
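The interleave-and-split idea behind the multiplexer/demultiplexer 12680 can be illustrated with a small sketch. This is a hypothetical simplification (the function names `mux` and `demux` and the timestamp-tagged packet format are invented for illustration, not the transport format actually used by the device):

```python
# Hypothetical sketch: interleave encoded video and audio packets into one
# stream ordered by presentation timestamp, then split the stream back apart.
# Real systems use a transport format such as MPEG-2 TS; this only shows
# the multiplex/demultiplex idea.

def mux(video_packets, audio_packets):
    """Interleave (timestamp, payload) packets from two elementary streams."""
    tagged = [("V", ts, p) for ts, p in video_packets]
    tagged += [("A", ts, p) for ts, p in audio_packets]
    # Order the combined stream by timestamp so audio and video stay in sync.
    return sorted(tagged, key=lambda pkt: pkt[1])

def demux(stream):
    """Split a multiplexed stream back into video and audio packet lists."""
    video = [(ts, p) for kind, ts, p in stream if kind == "V"]
    audio = [(ts, p) for kind, ts, p in stream if kind == "A"]
    return video, audio
```

Because Python's `sorted` is stable, packets that share a timestamp keep their relative order, and `demux(mux(v, a))` recovers the original per-stream packet order.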
While the mobile phone 12500 receives communication data from the outside, frequency recovery and analog-to-digital conversion (ADC) are performed on a signal received via the antenna 12510 to transform the signal into a digital signal. The modulation/demodulation unit 12660 demodulates the frequency band of the digital signal. The band-demodulated digital signal is transmitted to the image decoder 12690, the sound processor 12650, or the LCD controller 12620, according to the type of the digital signal.
In the conversation mode, the mobile phone 12500 amplifies a signal received via the antenna 12510, and obtains a digital sound signal by performing frequency conversion and ADC on the amplified signal. Under the control of the central controller 12710, the received digital sound signal is converted into an analog sound signal via the modulation/demodulation unit 12660 and the sound processor 12650, and the analog sound signal is output via the speaker 12580.
In the data communication mode, when data of a video file accessed at an Internet website is received, a signal received from the wireless base station 12000 via the antenna 12510 is output as multiplexed data via the modulation/demodulation unit 12660, and the multiplexed data is transmitted to the multiplexer/demultiplexer 12680.
To decode the multiplexed data received via the antenna 12510, the multiplexer/demultiplexer 12680 demultiplexes the multiplexed data into an encoded video data stream and an encoded audio data stream. Via the synchronization bus 12730, the encoded video data stream and the encoded audio data stream are provided to the image decoder 12690 and the sound processor 12650, respectively.
The structure of the image decoder 12690 may correspond to that of the video decoding apparatus described above. By using the video decoding method of the present disclosure described above, the image decoder 12690 may decode the encoded video data to obtain reconstructed video data, and may provide the reconstructed video data to the display screen 12520 via the LCD controller 12620.
Thus, the data of the video file accessed at the Internet website may be displayed on the display screen 12520. At the same time, the sound processor 12650 may convert the audio data into an analog sound signal and provide the analog sound signal to the speaker 12580. Thus, the audio data contained in the video file accessed at the Internet website may also be reproduced via the speaker 12580.
The mobile phone 12500 or another type of communication terminal may be a transceiving terminal including both a video encoding apparatus and a video decoding apparatus according to exemplary embodiments, may be a transmitting terminal including only the video encoding apparatus, or may be a receiving terminal including only the video decoding apparatus.
A communication system according to an embodiment is not limited to the communication system described above with reference to Figure 21. For example, Figure 24 shows a digital broadcasting system employing a communication system according to an embodiment. The digital broadcasting system of Figure 24 may receive a digital broadcast transmitted via a satellite or terrestrial network by using a video encoding apparatus and a video decoding apparatus according to embodiments.
More specifically, the broadcasting station 12890 transmits a video data stream to a communication satellite or a broadcasting satellite 12900 by using radio waves. The broadcasting satellite 12900 transmits a broadcast signal, and the broadcast signal is transmitted to a satellite broadcast receiver via a household antenna 12860. In every house, the encoded video stream may be decoded and reproduced by a TV receiver 12810, a set-top box 12870, or another device.
When a video decoding apparatus of the present disclosure is implemented in a reproducing apparatus 12830, the reproducing apparatus 12830 may parse and decode an encoded video stream recorded on a storage medium 12820, such as a disc or a memory card, to reconstruct digital signals. Thus, the reconstructed video signal may be reproduced, for example, on a monitor 12840.
The video decoding apparatus of the present disclosure may be installed in the set-top box 12870 connected to the antenna 12860 for a satellite/terrestrial broadcast or to a cable antenna 12850 for receiving a cable television (TV) broadcast. Data output from the set-top box 12870 may also be reproduced on a TV monitor 12880.
As another example, the video decoding apparatus of the present disclosure may be installed in the TV receiver 12810 itself instead of the set-top box 12870.
An automobile 12920 having an appropriate antenna 12910 may receive a signal transmitted from the satellite 12900 or the wireless base station 11700. A decoded video may be reproduced on a display screen of an automobile navigation system 12930 installed in the automobile 12920.
A video signal may be encoded by a video encoding apparatus of the present disclosure and may then be recorded and stored in a storage medium. More specifically, an image signal may be stored on a DVD disc 12960 by a DVD recorder, or may be stored in a hard disk by a hard disk recorder 12950. As another example, the video signal may be stored in an SD card 12970. If the hard disk recorder 12950 includes a video decoding apparatus according to exemplary embodiments, a video signal recorded on the DVD disc 12960, the SD card 12970, or another storage medium may be reproduced on the TV monitor 12880.
The automobile navigation system 12930 may not include the camera 12530, the camera interface 12630, and the image encoder 12720 of Figure 23. For example, the computer 12100 and the TV receiver 12810 may not include the camera 12530, the camera interface 12630, and the image encoder 12720 of Figure 23.
Figure 25 shows a network structure of a cloud computing system using a video encoding apparatus and a video decoding apparatus according to various embodiments.
The cloud computing system may include a cloud computing server 14000, a user database (DB) 14100, a plurality of computing resources 14200, and user terminals.
In response to a request from a user terminal, the cloud computing system provides an on-demand outsourcing service of the plurality of computing resources 14200 via a data communication network, for example, the Internet. Under a cloud computing environment, a service provider provides users with desired services by combining, via virtualization technology, computing resources at data centers located at physically different locations. A service user does not have to install computing resources, such as an application, storage, an operating system (OS), and security software, in his or her own terminal in order to use them, but may, at a desired point in time, select and use desired services from among the services in a virtual space generated through the virtualization technology.
A user terminal of a specified service user connects to the cloud computing server 14000 via a data communication network including the Internet and a mobile telecommunication network. User terminals may be provided with cloud computing services, in particular video reproduction services, from the cloud computing server 14000. The user terminals may be various types of electronic devices capable of connecting to the Internet, for example, a desktop PC 14300, a smart TV 14400, a smartphone 14500, a notebook computer 14600, a portable media player (PMP) 14700, a tablet PC 14800, and the like.
The cloud computing server 14000 may combine the plurality of computing resources 14200 distributed in a cloud network and provide the user terminals with a result of the combination. The plurality of computing resources 14200 may include various data services, and may include data uploaded from user terminals. As described above, the cloud computing server 14000 may provide user terminals with desired services by combining video databases distributed in different regions according to the virtualization technology.
User information about users who have subscribed to the cloud computing service is stored in the user DB 14100. The user information may include login information, addresses, names, and personal credit information of the users. The user information may further include indexes of videos. Here, the indexes may include a list of videos that have already been reproduced, a list of videos that are being reproduced, a pausing point of a video that was being reproduced, and the like.
Information about a video stored in the user DB 14100 may be shared between user devices. For example, when a video service is provided to the notebook computer 14600 in response to a request from the notebook computer 14600, a reproduction history of the video service is stored in the user DB 14100. When a request to reproduce the same video service is received from the smartphone 14500, the cloud computing server 14000 searches for and reproduces the video service based on the user DB 14100. When the smartphone 14500 receives a video data stream from the cloud computing server 14000, the process of reproducing video by decoding the video data stream is similar to the operation of the mobile phone 12500 described above with reference to Figure 23.
The cloud computing server 14000 may refer to the reproduction history of the desired video service stored in the user DB 14100. For example, the cloud computing server 14000 receives, from a user terminal, a request to reproduce a video stored in the user DB 14100. If this video was being reproduced, the method of streaming the video performed by the cloud computing server 14000 may vary according to the request from the user terminal, that is, according to whether the video is to be reproduced from its beginning or from its pausing point. For example, if the user terminal requests reproduction of the video from its beginning, the cloud computing server 14000 transmits streaming data of the video starting from its first frame to the user terminal. On the other hand, if the user terminal requests reproduction of the video from its pausing point, the cloud computing server 14000 transmits streaming data of the video starting from a frame corresponding to the pausing point to the user terminal.
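The resume-from-pausing-point decision described above can be sketched as follows. This is a minimal illustration; the function name, the `resume` flag, and the assumption of a constant frame rate are invented here and are not part of the disclosure:

```python
def start_frame(resume, pausing_point_sec, fps=30):
    """Return the frame index from which the server should start streaming.

    resume            -- True if the terminal asked to continue from the pausing point
    pausing_point_sec -- pausing point recorded in the user DB, in seconds
    fps               -- assumed constant frame rate of the stored video
    """
    if not resume:
        return 0  # stream from the first frame of the video
    # Map the recorded pausing point to the corresponding frame index.
    return int(pausing_point_sec * fps)
```

For example, under these assumptions a request to resume a video paused at 2.5 seconds at 30 fps would start streaming from frame 75, while a request to play from the beginning always starts from frame 0.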
In this regard, the user terminal may include a video decoding apparatus as described above with reference to Figures 1A to 18. As another example, the user terminal may include a video encoding apparatus as described above with reference to Figures 1A to 18. Alternatively, the user terminal may include both the video encoding apparatus and the video decoding apparatus described above with reference to Figures 1A to 18.
Various applications of the video encoding methods, video decoding methods, video encoding apparatuses, and video decoding apparatuses described above with reference to Figures 1A to 18 have been described above with reference to Figures 19 to 25. However, embodiments in which the video encoding methods and video decoding methods described above with reference to Figures 1A to 18 are stored in a storage medium, or in which the video encoding apparatuses and video decoding apparatuses are implemented in a device, are not limited to the embodiments of Figures 19 to 25.
While the present disclosure has been particularly shown and described with reference to embodiments thereof, it will be understood by those of ordinary skill in the art that various changes in form and details may be made therein without departing from the spirit and scope of the disclosure as defined by the appended claims. The embodiments should be considered in a descriptive sense only and not for purposes of limitation. Therefore, the scope of the disclosure is defined not by the detailed description of the disclosure but by the appended claims, and all differences within the scope will be construed as being included in the present disclosure.
Claims (14)
1. A method of decoding a multi-view image, the method comprising:
obtaining an intra skip flag from a bitstream, the intra skip flag indicating whether a current block included in a depth image of the multi-view image is to be reconstructed based on an intra skip mode;
when the intra skip flag indicates that the current block is to be reconstructed based on the intra skip mode, obtaining intra skip prediction mode information from the bitstream, the intra skip prediction mode information indicating an intra prediction mode to be used for the current block from among a plurality of intra prediction modes;
determining predicted values of samples included in the current block according to an intra prediction method indicated by the intra skip prediction mode information; and
reconstructing the current block by determining reconstructed values of the samples based on the predicted values of the samples.
2. The method of claim 1, further comprising obtaining an intra skip enable flag, the intra skip enable flag indicating whether the intra skip mode is usable for a parent image data unit including the depth image,
wherein the obtaining of the intra skip flag comprises obtaining the intra skip flag when the intra skip enable flag indicates that the intra skip mode is usable for the parent image data unit.
3. The method of claim 1, wherein, when the intra skip flag indicates that the current block is to be reconstructed based on the intra skip mode, residual data is not obtained.
4. The method of claim 1, wherein the intra skip prediction mode information indicates a horizontal mode, a vertical mode, a horizontal single mode, or a vertical single mode.
5. The method of claim 4, wherein the determining of the predicted values of the samples comprises:
when the intra skip prediction mode information indicates the horizontal mode, determining the predicted values of the samples included in the current block to be equal to values of samples that are located in the same rows as the samples included in the current block, from among samples adjacent to a left side of the current block;
when the intra skip prediction mode information indicates the vertical mode, determining the predicted values of the samples included in the current block to be equal to values of samples that are located in the same columns as the samples included in the current block, from among samples adjacent to an upper side of the current block;
when the intra skip prediction mode information indicates the horizontal single mode, determining the predicted values of the samples included in the current block to be equal to a value of a sample located at a predetermined position from among the samples adjacent to the left side of the current block; and
when the intra skip prediction mode information indicates the vertical single mode, determining the predicted values of the samples included in the current block to be equal to a value of a sample located at a predetermined position from among the samples adjacent to the upper side of the current block.
6. An apparatus for decoding a multi-view image, the apparatus comprising:
an intra skip flag obtainer configured to obtain an intra skip flag from a bitstream, the intra skip flag indicating whether a current block included in a depth image of the multi-view image is to be reconstructed based on an intra skip mode;
an intra skip prediction mode information obtainer configured to, when the intra skip flag indicates that the current block is to be reconstructed based on the intra skip mode, obtain intra skip prediction mode information from the bitstream, the intra skip prediction mode information indicating an intra prediction mode to be used for the current block from among a plurality of intra prediction modes;
a predicted value determiner configured to determine predicted values of samples included in the current block according to an intra prediction method indicated by the intra skip prediction mode information; and
a reconstructor configured to determine reconstructed values of the samples based on the predicted values of the samples.
7. A method of encoding a multi-view image, the method comprising:
determining a method of encoding a current block included in a depth image of the multi-view image;
generating an intra skip flag based on the determined encoding method, the intra skip flag indicating whether the current block is encoded according to an intra skip mode;
when the current block is encoded according to the intra skip mode, generating intra skip prediction mode information based on the determined encoding method, the intra skip prediction mode information indicating an intra prediction mode used to predict the current block from among a plurality of intra prediction modes; and
transmitting a bitstream including the intra skip flag and the intra skip prediction mode information.
8. The method of claim 7, further comprising determining an intra skip enable flag, the intra skip enable flag indicating whether the intra skip mode is usable for a parent image data unit including the depth image,
wherein the generating of the intra skip flag comprises generating the intra skip flag when the intra skip enable flag indicates that the intra skip mode is usable for the parent image data unit.
9. The method of claim 7, wherein the transmitting of the bitstream comprises transmitting a bitstream that does not include residual data of the current block.
10. The method of claim 7, wherein the intra skip prediction mode information indicates a horizontal mode, a vertical mode, a horizontal single mode, or a vertical single mode.
11. The method of claim 10, wherein:
the horizontal mode is an intra prediction mode in which the predicted values of the samples included in the current block are determined to be equal to values of samples that are located in the same rows as the samples included in the current block, from among samples adjacent to a left side of the current block;
the vertical mode is an intra prediction mode in which the predicted values of the samples included in the current block are determined to be equal to values of samples that are located in the same columns as the samples included in the current block, from among samples adjacent to an upper side of the current block;
the horizontal single mode is an intra prediction mode in which the predicted values of the samples included in the current block are determined to be equal to a value of a sample located at a predetermined position from among the samples adjacent to the left side of the current block; and
the vertical single mode is an intra prediction mode in which the predicted values of the samples included in the current block are determined to be equal to a value of a sample located at a predetermined position from among the samples adjacent to the upper side of the current block.
12. An apparatus for encoding a multi-view image, the apparatus comprising:
an encoding method determiner configured to determine a method of encoding a current block included in a depth image of the multi-view image;
an intra skip flag generator configured to generate an intra skip flag based on the determined encoding method, the intra skip flag indicating whether the current block is encoded according to an intra skip mode;
an intra skip prediction mode information generator configured to, when the current block is encoded according to the intra skip mode, generate intra skip prediction mode information based on the determined encoding method, the intra skip prediction mode information indicating an intra prediction mode used to predict the current block from among a plurality of intra prediction modes; and
an encoding information transmitter configured to transmit a bitstream including the intra skip flag and the intra skip prediction mode information.
13. A non-transitory computer-readable recording medium having recorded thereon a program for performing the method of claim 1.
14. A non-transitory computer-readable recording medium having recorded thereon a program for performing the method of claim 7.
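The four intra skip prediction modes recited in claims 4, 5, 10, and 11 can be sketched in Python as follows. This is an illustrative reading of the claim language, not the normative 3D-HEVC decoding process; the function name, the list-based block representation, and the choice of `pos` as the 'predetermined position' are assumptions made here:

```python
def predict_block(mode, left, top, size, pos=0):
    """Fill a size x size block of predicted sample values.

    left -- samples adjacent to the left side of the current block (one per row)
    top  -- samples adjacent to the upper side of the current block (one per column)
    pos  -- index standing in for the 'predetermined position' of the single modes
    """
    if mode == "horizontal":
        # Each row copies the left-neighbour sample located in the same row.
        return [[left[r]] * size for r in range(size)]
    if mode == "vertical":
        # Each column copies the upper-neighbour sample located in the same column.
        return [[top[c] for c in range(size)] for _ in range(size)]
    if mode == "horizontal_single":
        # The whole block takes one left-neighbour sample at a predetermined position.
        return [[left[pos]] * size for _ in range(size)]
    if mode == "vertical_single":
        # The whole block takes one upper-neighbour sample at a predetermined position.
        return [[top[pos]] * size for _ in range(size)]
    raise ValueError(f"unknown mode: {mode}")
```

Because no residual data is signalled in the intra skip mode (claims 3 and 9), the predicted values produced this way would serve directly as the reconstructed values of the block.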
Applications Claiming Priority (5)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201462060729P | 2014-10-07 | 2014-10-07 | |
US62/060,729 | 2014-10-07 | ||
US201562105995P | 2015-01-21 | 2015-01-21 | |
US62/105,995 | 2015-01-21 | ||
PCT/KR2015/009892 WO2016056772A1 (en) | 2014-10-07 | 2015-09-21 | Multi-view image encoding/decoding method and apparatus |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107005710A true CN107005710A (en) | 2017-08-01 |
CN107005710B CN107005710B (en) | 2020-03-31 |
Family
ID=55653337
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201580065653.0A Expired - Fee Related CN107005710B (en) | 2014-10-07 | 2015-09-21 | Multi-view image encoding/decoding method and apparatus |
Country Status (5)
Country | Link |
---|---|
US (1) | US10554966B2 (en) |
EP (1) | EP3193504A4 (en) |
KR (1) | KR101919015B1 (en) |
CN (1) | CN107005710B (en) |
WO (1) | WO2016056772A1 (en) |
Families Citing this family (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116708776A (en) * | 2016-07-18 | 2023-09-05 | 韩国电子通信研究院 | Image encoding/decoding method and apparatus, and recording medium storing bit stream |
KR20230070062A (en) | 2016-10-04 | 2023-05-19 | 주식회사 비원영상기술연구소 | Image data encoding/decoding method and apparatus |
US20190253624A1 (en) * | 2017-07-17 | 2019-08-15 | Ki Baek Kim | Image data encoding/decoding method and apparatus |
CN116233416A (en) * | 2017-01-16 | 2023-06-06 | 世宗大学校产学协力团 | Image coding/decoding method |
KR20220154989A (en) | 2021-05-14 | 2022-11-22 | 윤상은 | uv disinfection unmanned delivery box |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20140003518A1 (en) * | 2011-04-12 | 2014-01-02 | Electronics And Telecommunications Research Institute | Image encoding method using a skip mode, and a device using the method |
CN103503453A (en) * | 2011-01-13 | 2014-01-08 | 索尼公司 | Encoding device, encoding method, decoding device, and decoding method |
Family Cites Families (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101371585B (en) | 2006-01-09 | 2014-07-30 | 汤姆森特许公司 | Method and apparatus for providing reduced resolution update mode for multi-view video coding |
ZA200805337B (en) | 2006-01-09 | 2009-11-25 | Thomson Licensing | Method and apparatus for providing reduced resolution update mode for multiview video coding |
KR20090129926A (en) * | 2008-06-13 | 2009-12-17 | 삼성전자주식회사 | Method and apparatus for image encoding by dynamic unit grouping, and method and apparatus for image decoding by dynamic unit grouping |
US8213503B2 (en) | 2008-09-05 | 2012-07-03 | Microsoft Corporation | Skip modes for inter-layer residual video coding and decoding |
KR20130086980A (en) | 2012-01-26 | 2013-08-05 | 한국전자통신연구원 | Methods and apparatuses of deblocking on intra prediction block |
KR20130118250A (en) | 2012-04-16 | 2013-10-29 | 삼성전자주식회사 | Apparatus and method for encdoing and decoding of depth image |
US20140002594A1 (en) | 2012-06-29 | 2014-01-02 | Hong Kong Applied Science and Technology Research Institute Company Limited | Hybrid skip mode for depth map coding and decoding |
2015
- 2015-09-21 WO PCT/KR2015/009892 patent/WO2016056772A1/en active Application Filing
- 2015-09-21 US US15/517,794 patent/US10554966B2/en active Active
- 2015-09-21 EP EP15849437.7A patent/EP3193504A4/en not_active Ceased
- 2015-09-21 KR KR1020177009544A patent/KR101919015B1/en active IP Right Grant
- 2015-09-21 CN CN201580065653.0A patent/CN107005710B/en not_active Expired - Fee Related
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103503453A (en) * | 2011-01-13 | 2014-01-08 | 索尼公司 | Encoding device, encoding method, decoding device, and decoding method |
US20140003518A1 (en) * | 2011-04-12 | 2014-01-02 | Electronics And Telecommunications Research Institute | Image encoding method using a skip mode, and a device using the method |
Non-Patent Citations (2)
Title |
---|
KWAN-JUNG OH ET AL: "Depth Intra Skip Prediction for 3D Video Coding", Proceedings of the 2012 Asia Pacific Signal and Information Processing Association Annual Summit and Conference |
YI-WEN CHEN ET AL: "3D-CE2: Single depth intra mode for 3D-HEVC", 《JOINT COLLABORATIVE TEAM ON 3D VIDEO CODING EXTENSIONS OF ITU-T SG 16 WP 3 AND ISO/IEC JTC 1/SC 29/WG 11》 * |
Also Published As
Publication number | Publication date |
---|---|
EP3193504A1 (en) | 2017-07-19 |
KR101919015B1 (en) | 2018-11-16 |
EP3193504A4 (en) | 2017-11-01 |
CN107005710B (en) | 2020-03-31 |
US20170318287A1 (en) | 2017-11-02 |
WO2016056772A1 (en) | 2016-04-14 |
KR20170057315A (en) | 2017-05-24 |
US10554966B2 (en) | 2020-02-04 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN104396252B (en) | Multi-view video decoding method using a reference picture set for multi-view video prediction, and device therefor | |
CN105532001B (en) | Method and apparatus for inter-layer video encoding and decoding using a depth-based disparity vector | |
CN104768010B (en) | Method and apparatus for determining motion vectors in video encoding or decoding | |
CN104081779B (en) | Method and device for inter prediction, and method and device for motion compensation | |
CN105308966B (en) | Video encoding method and apparatus therefor, and video decoding method and apparatus therefor | |
CN104902272B (en) | Video decoding apparatus | |
CN104365104B (en) | Method and apparatus for multi-view video encoding and decoding | |
CN103875249B (en) | Method and apparatus for multi-view video predictive encoding, and method and apparatus for multi-view video predictive decoding | |
CN106464908A (en) | Method and device for transmitting prediction mode of depth image for inter-layer video encoding and decoding | |
CN105594212B (en) | Method for determining a motion vector, and apparatus therefor | |
CN107409214A (en) | Method and apparatus for decoding inter-layer video, and method and apparatus for encoding inter-layer video | |
CN105532004B (en) | Inter-layer video decoding method and apparatus | |
CN106464889A (en) | Inter-layer video decoding method and apparatus therefor performing sub-block-based prediction, and inter-layer video encoding method and apparatus therefor performing sub-block-based prediction | |
CN105103552A (en) | Method for encoding inter-layer video to compensate for luminance difference and device therefor, and method for decoding video and device therefor | |
CN106031175B (en) | Inter-layer video encoding method using luminance compensation and device therefor, and video decoding method and device therefor | |
CN105264894A (en) | Method for determining an inter-prediction candidate for inter-layer decoding, and encoding method and apparatus | |
CN107005711A (en) | Sample-by-sample predictive encoding apparatus and method | |
CN107005710A (en) | Multi-view image coding/decoding method and device | |
CN105325002A (en) | Multi-view video encoding method using view synthesis prediction and apparatus therefor, and multi-view video decoding method and apparatus therefor | |
CN105340275A (en) | Multi-view video encoding method using view synthesis prediction and apparatus therefor, and multi-view video decoding method and apparatus therefor | |
CN105556972A (en) | Intra prediction method for depth image in inter-layer video decoding, and encoding apparatus and method | |
CN104396251A (en) | Method for encoding multi-view video using a reference list for multi-view video prediction and device therefor, and method for decoding multi-view video using a reference list for multi-view video prediction and device therefor | |
CN106576175A (en) | Multi-view image encoding/decoding methods and devices | |
CN105165011A (en) | Device and method for scalable video encoding considering memory bandwidth and computational load, and device and method for scalable video decoding | |
CN105340274A (en) | Depth map encoding method and apparatus therefor, and depth map decoding method and apparatus therefor |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | ||
Granted publication date: 2020-03-31; Termination date: 2021-09-21 |