Conditionally parsed extension syntax for HEVC extension processing
Cross-reference to related applications
This application claims the benefit of U.S. Provisional Patent Application Serial No. 61/923,334, entitled "CONDITIONALLY PARSING EXTENSION SYNTAX OF PICTURE PARAMETER SET (PPS) FOR HEVC RANGE EXTENSION AND MV-HEVC," by Yue Yu and Limin Wang, filed January 3, 2014, which application is hereby incorporated by reference herein.
Technical field
The present invention relates to systems and methods for encoding and decoding data, and in particular to a system and method for generating and processing slice headers with High Efficiency Video Coding data.
Background
There has been rapid growth in the technologies associated with the generation, transmission, and reproduction of media programs. These technologies include coding schemes that allow digital versions of media programs to be compressed to a much smaller size, which facilitates their transmission, storage, reception, and playback. These technologies have application in personal video recorders (PVRs), video on demand (VOD), multiple channel media program offerings, interactivity, mobile telephony, and media program transmission.
Without compression, digital media programs are typically too large to transmit and/or store at a commercially acceptable cost. Compression of such programs, however, has made the transmission and storage of digital media programs not only commercially feasible, but commonplace.
Initially, the transmission of media programs involved low to medium resolution pictures transmitted over high bandwidth transmission media such as cable television and satellite. Such transmission has since evolved to include lower bandwidth transmission media, such as Internet transmission to fixed and mobile devices via computer networks, WiFi, mobile TV, and third and fourth generation (3G and 4G) networks. Further, such transmission has also evolved to include high definition media programs, such as high definition television (HDTV), which have significant transmission bandwidth and storage requirements.
The High Efficiency Video Coding (HEVC) coding standard (or H.265) is the most recent coding standard promulgated by the ISO/IEC MPEG standardization organizations. The coding standards preceding HEVC include H.262/MPEG-2 and the subsequent H.264/MPEG-4 Advanced Video Coding (AVC) standard. H.264/MPEG-4 has substantially replaced H.262/MPEG-2 in many applications, including high definition (HD) television. HEVC supports resolutions higher than HD, even in stereo or multi-view embodiments, and is more suitable for mobile devices such as tablet personal computers. Further information regarding HEVC can be found in the publication "Overview of the High Efficiency Video Coding (HEVC) Standard," by Gary J. Sullivan, Jens-Rainer Ohm, Woo-Jin Han and Thomas Wiegand, IEEE Transactions on Circuits and Systems for Video Technology, December 2012, which is hereby incorporated by reference herein.
As in other coding standards, the bitstream structure and syntax of HEVC-compliant data are standardized, so that every decoder conforming to the standard produces the same output when provided with the same input. Some of the features incorporated into the HEVC standard include the definition and processing of a slice, one or more of which may together comprise one of the pictures in a video sequence. A video sequence comprises a plurality of pictures, and each picture may comprise one or more slices. Slices include non-dependent slices and dependent slices. A non-dependent slice (hereinafter simply referred to as a slice) is a data structure that can be decoded independently of other slices of the same picture in terms of entropy encoding, signal prediction, and residual signal construction. This data structure permits resynchronization in the event of data loss. A "dependent slice" is a structure that permits information about the slice (such as information related to tiles within the slice or to wavefront entries) to be carried to the network layer, thus making that data available to a system so that it can process fragmented slices more quickly. Dependent slices are mostly useful for low-delay encoding.
HEVC and legacy coding standards define a parameter set structure that offers improved flexibility for operation over a wide variety of applications and network environments, and improved robustness to data losses. Parameter sets contain information that can be shared for decoding different portions of the encoded video. The parameter set structure provides a secure mechanism for conveying data that is essential to the decoding process. H.264 defined both sequence parameter sets (SPS), which describe parameters for decoding a sequence of pictures, and picture parameter sets (PPS), which describe parameters for decoding a picture of the sequence of pictures. HEVC introduces a new parameter set, the video parameter set (VPS).
The encoding and decoding of slices is performed according to information included in a slice header. The slice header includes syntax and logic for reading flags and data that are used in decoding the slice.
Like its predecessors, HEVC supports both temporal and spatial encoding of picture slices. HEVC defines slices to include I-slices, which are spatially, but not temporally, encoded with reference to another slice. I-slices are alternatively described as "intra" slice encoded. HEVC also defines slices to include P (predictive) slices, which are spatially encoded and temporally encoded with reference to another slice. P-slices are alternatively described as "inter" slice encoded. HEVC also describes slices to include bi-predictive (B) slices. B-slices are spatially encoded and temporally encoded with reference to two or more other slices. Further, HEVC consolidates the notion of P and B slices into general B slices that can be used as reference slices.
Currently, the HEVC syntax includes provision for extensions that expand the capabilities or capacities of HEVC beyond the baseline. Such extensions include the range extension (RExt), the scalability extension (SHVC), and the multi-view extension (MV-HEVC). Extensions may be signaled in the VPS, SPS, PPS, or a combination thereof.
"High Efficiency Video Coding (HEVC) Range Extensions text specification: Draft 4," published by the Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, 13th Meeting: Incheon, KR, 18-26 April 2013, by David Flynn et al. (which is hereby incorporated by reference herein), defines a PPS syntax that controls the performance of a plurality of extension functions by use of extension function-unique flags, each uniquely associated with one extension function. However, such flags are not read independently. For example, a first flag in the PPS syntax that signals the performance of one extension function may be read within syntax that is parsed and performed only if another (second) flag, for a previously performed extension function, has a particular state or value (e.g., a flag may not be read unless a previously read flag tests "true"). This is not a problem when an extension function is never to be performed unless the syntax of the preceding extension function has been performed. It is a problem, however, in situations where independent control of the parsing or performance of extension functions is needed. What is needed is an improved system and method for parsing syntax that permits independent control of the parsing of extension functions. This disclosure describes such a system and method.
Summary of the invention
To address the requirements described above, this document discloses a device and method for signaling extension functions used in decoding a sequence comprising a plurality of pictures, each picture being processed at least in part according to a picture parameter set. In one embodiment, the method comprises: reading an extension presence signaling flag; determining whether the read extension presence signaling flag indicates that the picture is to be processed at least in part according to at least one extension function; and, only if the read extension presence signaling flag so indicates, reading a first extension function signaling flag that signals a first extension function and, independently of the value of the read first extension function signaling flag, reading a second extension function signaling flag that signals a second extension function. The method may be performed with additional extension function signaling flags if desired. Another embodiment is disclosed in which a device features a processor having a communicatively coupled memory, the memory storing instructions for performing the foregoing operations.
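The flag-reading order summarized above can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions: the `BitReader` class, the `ExtensionFlags` fields, and the one-bit-per-flag layout are hypothetical names and choices made for this example, and do not reproduce the PPS syntax of the figures or of any HEVC draft.

```python
from dataclasses import dataclass


class BitReader:
    """Hypothetical helper: reads single bits, most significant bit first."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0  # current bit position

    def read_flag(self) -> bool:
        byte = self.data[self.pos // 8]
        bit = (byte >> (7 - self.pos % 8)) & 1
        self.pos += 1
        return bool(bit)


@dataclass
class ExtensionFlags:
    # Field names are illustrative assumptions, not syntax element names.
    extension_present: bool = False
    first_extension: bool = False
    second_extension: bool = False


def parse_extension_flags(reader: BitReader) -> ExtensionFlags:
    flags = ExtensionFlags()
    # Step 1: read the extension presence signaling flag.
    flags.extension_present = reader.read_flag()
    # Step 2: only if it is set, read the per-function flags.
    if flags.extension_present:
        flags.first_extension = reader.read_flag()
        # Read the second flag regardless of the value of the first one.
        flags.second_extension = reader.read_flag()
    return flags
```

Because the second flag is read outside any test of the first flag, a bit pattern of 1, 0, 1 enables the second extension function while leaving the first disabled, which is the independent control the summary calls for.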
Brief description of the drawings
Referring now to the drawings, in which like reference numbers represent corresponding parts throughout:
Fig. 1 is a diagram depicting an exemplary embodiment of a video coding-decoding system that can be used for transmission and/or storage and retrieval of audio and/or video information;
Fig. 2A is a diagram of one embodiment of a codec system in which the encoded AV information is transmitted to and received at another location;
Fig. 2B is a diagram depicting an exemplary embodiment of a codec system in which the encoded information is stored and later retrieved for presentation, hereinafter referred to as the codec storage system;
Fig. 3 is a block diagram illustrating one embodiment of the source encoder;
Fig. 4 is a diagram depicting a picture of AV information, such as one of the pictures in the picture sequence;
Fig. 5 is a diagram showing an exemplary partition of a coding tree block into coding units;
Fig. 6 is a diagram illustrating a representative quadtree and data parameters for the coding tree block partitioning shown in Fig. 5;
Fig. 7 is a diagram illustrating the partition of a coding unit into one or more prediction units;
Fig. 8 is a diagram showing a coding unit partitioned into four prediction units and an associated set of transform units;
Fig. 9 is a diagram showing the RQT coding tree for the transform units associated with the coding unit in the example of Fig. 8;
Fig. 10 is a diagram illustrating spatial prediction of prediction units;
Fig. 11 is a diagram illustrating temporal prediction;
Fig. 12 is a diagram illustrating the use of motion vector predictors (MVPs);
Fig. 13 illustrates an example of the use of reference picture lists;
Fig. 14 is a diagram illustrating processes performed by the encoder according to the aforementioned standard;
Fig. 15 depicts the use of collocated_from_l0_flag by the decoder in decoding according to the emerging HEVC standard;
Figs. 16A and 16B are diagrams presenting a baseline PPS syntax;
Figs. 16C and 16D are diagrams presenting an improved PPS syntax;
Figs. 17A-17D illustrate an exemplary improved process flow and syntax for extension processing;
Fig. 18 is a diagram presenting an exemplary PPS syntax for the HEVC range extension;
Figs. 19A-19C show further alternative embodiments of the extension signaling syntax; and
Fig. 20 illustrates an exemplary processing system that could be used to implement the disclosed embodiments.
Detailed description of the invention
In the following description, reference is made to the accompanying drawings, which form a part hereof and which show, by way of illustration, several embodiments of the present invention. It is understood that other embodiments may be utilized and structural changes may be made without departing from the scope of the present invention.
Audio-visual information transception and storage
Fig. 1 is a diagram depicting an exemplary embodiment of a video coding-decoding (codec) system 100 that can be used for transmission and/or storage and retrieval of audio and/or video information. The codec system 100 comprises an encoding system 104, which accepts audio-visual (AV) information 102 and processes the AV information 102 to generate encoded (compressed) AV information 106, and a decoding system 112, which processes the encoded AV information 106 to produce recovered AV information 114. Since the encoding and decoding processes are not lossless, the recovered AV information 114 is not identical to the initial AV information 102, but with judicious selection of the encoding processes and parameters, the differences between the recovered AV information 114 and the unprocessed AV information 102 are acceptable to human perception.
The encoded AV information 106 is typically transmitted or stored and retrieved before decoding and presentation, as performed by the transception (transmission and reception) or storage/retrieval system 108. Transception losses may be significant, but storage/retrieval losses are typically minimal or non-existent; hence, the transcepted AV information 110 provided to the decoding system 112 is typically the same as, or substantially the same as, the encoded AV information 106.
Fig. 2 A is that the AV information 106 of coding is sent to another location and the coding/decoding system 200A received in this position
The figure of one embodiment.The AV information 102 of input is converted into and is suitable to the signal of transmission and at transmission channel by transmission segmentation 230
It is sent to the signal of conversion on 212 receive segmentation 232.Receive segmentation 232 and receive the signal of transmission, and the signal that will receive
It is converted into the AV information 114 of the recovery Gong presenting.It is as noted previously, as coding and transmission loss and mistake, the AV information of recovery
The quality of 114 is likely lower than the quality being supplied to transmit the AV information 102 of segmentation 230.But, error correction system can be included in
To lower or to eliminate such mistake.Such as, before the AV information 106 of coding can encode by increasing redundancy
To error correction (FEC), such redundancy can be used to identify and eliminate the mistake received in segmentation.
The transmission segment 230 comprises one or more source encoders 202 to encode multiple sources of AV information 102. The source encoder 202 encodes the AV information 102 primarily for purposes of compression, to produce the encoded AV information 106. As described further below, the source encoder 202 may include, for example, a processor and related memory storing instructions implementing a codec such as MPEG-1, MPEG-2, MPEG-4 AVC/H.264, HEVC, or a similar codec.
The codec system 200A may also include optional elements indicated by the dashed lines in Fig. 2A. These optional elements include a video multiplex encoder 204, an encoding controller 208, and a video demultiplexing decoder 218. The optional video multiplex encoder 204 multiplexes the encoded AV information 106 from an associated plurality of source encoders 202 according to one or more parameters supplied by the optional encoding controller 208. Such multiplexing is typically accomplished in the time domain and is packet based.
In one embodiment, the video multiplex encoder 204 comprises a statistical multiplexer, which combines the encoded AV information 106 from a plurality of source encoders 202 so as to minimize the bandwidth required for transmission. This is possible because the instantaneous bit rate of the encoded AV information 106 from each source encoder 202 can vary greatly over time according to the content of the AV information 102. For example, scenes having a great deal of detail and motion (e.g., sporting events) are typically encoded at higher bit rates than scenes with little motion or detail (e.g., portrait dialog). Since one source encoder 202 may produce information with a high instantaneous bit rate while another source encoder 202 produces information with a low instantaneous bit rate, and since the encoding controller 208 can command the source encoders 202 to encode the AV information 106 according to certain performance parameters that affect the instantaneous bit rate, the signals from each of the source encoders 202 (each having a temporally varying instantaneous bit rate) can be combined in an optimal way to minimize the instantaneous bit rate of the multiplexed stream 205.
As described above, the source encoder 202 and the video multiplex encoder 204 may optionally be controlled by the encoding controller 208 to minimize the instantaneous bit rate of the combined video signal. In one embodiment, this is accomplished using information from a transmission buffer 206, which temporarily stores the encoded video signal and can indicate the fullness of the buffer 206. This allows the encoding performed at the source encoder 202 or the video multiplex encoder 204 to be a function of the storage remaining in the transmission buffer 206.
The transmission segment 230 may also comprise a transmission encoder 210, which further encodes the video signal for transmission to the reception segment 232. Transmission encoding may include, for example, the aforementioned FEC coding and/or coding into a multiplexing scheme for the transmission medium of choice. For example, if the transmission is by satellite or terrestrial transmitters, the transmission encoder 210 may encode the signal into a signal constellation before transmission via quadrature amplitude modulation (QAM) or a similar modulation technique. Also, if the encoded video signal is to be streamed via an Internet protocol device and the Internet, the transmission encoder encodes the signal according to the appropriate protocol. Further, if the encoded signal is to be transmitted via mobile telephony, the appropriate coding protocol is used, as further described below.
The reception segment 232 comprises a transmission decoder 214 to receive the signal that was encoded by the transmission encoder 210, using a decoding scheme complementary to the encoding scheme used in the transmission encoder 210. The decoded received signal may be temporarily stored by an optional reception buffer 216, and if the received signal comprises multiple video signals, the received signal is multiplex decoded by the video multiplex decoder 218 to extract the video signal of interest from the video signals multiplexed by the video multiplex encoder 204. Finally, the video signal of interest is decoded by the source decoder 220 using a decoding scheme or codec complementary to the codec used by the source encoder 202 to encode the AV information 102.
In one embodiment, the transmitted data comprises a packetized video stream transmitted from a server (representing the transmission segment 230) to a client (representing the reception segment 232). In this case, the transmission encoder 210 may packetize the data and embed network abstraction layer (NAL) units in network packets. NAL units define a data container that has header and coded elements, and may correspond to a video frame or other slice of video data.
The compressed data to be transmitted may be packetized and transmitted via the transmission channel 212, which may include a wide area network (WAN) or a local area network (LAN). Such a network may comprise, for example, a wireless network such as WiFi, an Ethernet network, the Internet, or a mixed network composed of several different networks. Such communication may be effected via a communication protocol, for example the Real-time Transport Protocol (RTP), the User Datagram Protocol (UDP), or any other type of communication protocol. Different packetization methods may be used for each network abstraction layer (NAL) unit of the bitstream. In one case, the size of one NAL unit is smaller than the maximum transmission unit (MTU) size, which corresponds to the largest packet size that can be transmitted over the network without being fragmented. In this case, the NAL unit is embedded into a single network packet. In another case, multiple entire NAL units are included in a single network packet. In a third case, one NAL unit may be too large to be transmitted in a single network packet and is therefore split into several fragmented NAL units, with each fragmented NAL unit being transmitted in an individual network packet. Fragmented NAL units are typically sent consecutively for decoding purposes.
The reception segment 232 receives the packetized data and reconstitutes the NAL units from the network packets. For fragmented NAL units, the client concatenates the data from the fragmented NAL units in order to reconstruct the original NAL unit. The client 232 decodes the received and reconstructed data stream, reproduces the video images on a display device, and reproduces the audio data through a loudspeaker.
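The three packetization cases and the client-side reassembly described above can be sketched as follows. This is a simplified illustration, not an implementation of RTP or any real payload format: the function names are hypothetical, packet and NAL header overhead is ignored, and each packet entry carries the index of its NAL unit so that fragments can be concatenated on reception.

```python
def packetize(nal_units, mtu):
    """Return a list of packets; each packet is a list of (nal_index, payload)."""
    packets = []
    pending = []       # whole NAL units waiting to share one packet
    pending_size = 0

    def flush():
        nonlocal pending, pending_size
        if pending:
            packets.append(pending)
            pending, pending_size = [], 0

    for index, nal in enumerate(nal_units):
        if len(nal) > mtu:
            # Third case: the NAL unit is split into consecutive fragments,
            # each sent in its own packet.
            flush()
            for start in range(0, len(nal), mtu):
                packets.append([(index, nal[start:start + mtu])])
        else:
            # First and second cases: one or several whole NAL units per packet.
            if pending_size + len(nal) > mtu:
                flush()
            pending.append((index, nal))
            pending_size += len(nal)
    flush()
    return packets


def reassemble(packets):
    """Concatenate payloads per NAL index to rebuild the original NAL units."""
    rebuilt = {}
    for packet in packets:
        for index, payload in packet:
            rebuilt[index] = rebuilt.get(index, b"") + payload
    return [rebuilt[i] for i in sorted(rebuilt)]
```

With an MTU of 10 bytes, for example, two NAL units of 3 and 4 bytes share one packet, while a 25-byte NAL unit is sent as three consecutive fragments of 10, 10, and 5 bytes.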
Fig. 2 B is to depict storage coding information and retrieve the coding information figure for the exemplary embodiment presented after a while,
Hereafter known as encoding and decoding storage system 200B.This embodiment can be used, for example, in digital VTR (DVR), flash drive
Locally store information in device, hard disk drive or similar devices.In this embodiment, by source encoder 202 to AV information 102
Carry out source code, buffered by storage buffer 234 before storing it in storage device 236 alternatively.Storage sets
Standby 236 can temporarily or in expansion time section storage video signal, and hard disk drive, flash drive can be included
Device, RAM or ROM.The AV information of storage is retrieved subsequently, is carried out buffering and by source decoder by retrieval buffer 238 alternatively
220 are decoded.
Fig. 2 C be depict include coding system or encoder 202 conciliate code system or decoder 220, can be used for transmitting
Another figure with the example content dissemination system 200C receiving HEVC data.In certain embodiments, coding system 202 is permissible
Including input interface 256, controller 241, enumerator 242, frame memory 243, coding unit 244, transmitter buffer 267 and
Output interface 257.Solve code system 220 to include receptor buffer 259, decoding unit 260, frame memory 261 and control
Device 267.Coding system 202 is conciliate code system 220 and can be intercoupled via the transmission path that can carry compression bit stream.
The controller 241 of the coding system 202 can control the amount of data to be transmitted on the basis of the capacity of the transmitter buffer 267 or the receiver buffer 259, and can take into account other parameters such as the amount of data per unit of time. The controller 241 can control the encoding unit 244 to prevent a failure of the received signal decoding operation of the decoding system 220. The controller 241 can be a processor or can include, by way of non-limiting example, a microcomputer having a processor, a random access memory, and a read only memory.
Source pictures 246 supplied from, by way of non-limiting example, a content provider can include a video sequence of frames including the source pictures in the video sequence. The source pictures 246 can be uncompressed or compressed. If the source pictures 246 are uncompressed, the coding system 202 can have an encoding function. If the source pictures 246 are compressed, the coding system 202 can have a transcoding function. Coding units can be derived from the source pictures using the controller 241. The frame memory 243 can have a first area, which can be used for storing the incoming frames from the source pictures 246, and a second area, which can be used for reading out the frames and outputting them to the encoding unit 244. The controller 241 can output an area switching control signal 249 to the frame memory 243. The area switching control signal 249 can indicate whether the first area or the second area is to be used.
The controller 241 can output an encoding control signal 250 to the encoding unit 244. The encoding control signal 250 can cause the encoding unit 244 to start an encoding operation, such as preparing the coding units based on a source picture. In response to the encoding control signal 250 from the controller 241, the encoding unit 244 can begin to read out the prepared coding units to a high-efficiency encoding process, such as a predictive coding process or a transform coding process, which processes the prepared coding units to generate video compression data based on the source pictures associated with the coding units.
The encoding unit 244 can package the generated video compression data in a packetized elementary stream (PES) including video packets. The encoding unit 244 can map the video packets into an encoded video signal 248 using control information and a program time stamp (PTS), and the encoded video signal 248 can be transmitted to the transmitter buffer 267.
The encoded video signal 248, including the generated video compression data, can be stored in the transmitter buffer 267. The information amount counter 242 can be incremented to indicate the total amount of data in the transmitter buffer 267. As data is retrieved and removed from the buffer, the counter 242 can be decremented to reflect the amount of data in the transmitter buffer 267. An occupied area information signal 253 can be transmitted to the counter 242 to indicate whether data from the encoding unit 244 has been added to or removed from the transmitter buffer 267, so that the counter 242 can be incremented or decremented accordingly. The controller 241 can control the production of video packets by the encoding unit 244 on the basis of the occupied area information 253, which can be communicated in order to anticipate, avoid, prevent, and/or detect an overflow or underflow in the transmitter buffer 267.
The information amount counter 242 can be reset in response to a preset signal 254 generated and output by the controller 241. After the information amount counter 242 is reset, it can count the data output by the encoding unit 244 and obtain the amount of video compression data and/or video packets that have been generated. The information amount counter 242 can supply the controller 241 with an information amount signal 255 representing the amount of information obtained. The controller 241 can control the encoding unit 244 so that no overflow occurs at the transmitter buffer 267.
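The counter-based buffer bookkeeping described above can be illustrated with a minimal sketch. The class name, the method names, and the byte-granularity accounting are assumptions made for this example; the text does not specify how the counter 242 or the controller 241 is implemented.

```python
class InformationAmountCounter:
    """Hypothetical model of a counter tracking transmitter buffer occupancy."""

    def __init__(self, capacity: int):
        self.capacity = capacity  # buffer size, in bytes (assumed unit)
        self.occupied = 0         # amount of data currently in the buffer

    def add(self, amount: int) -> None:
        # Data written by the encoding unit increments the counter.
        if self.occupied + amount > self.capacity:
            raise OverflowError("transmitter buffer would overflow")
        self.occupied += amount

    def remove(self, amount: int) -> None:
        # Data retrieved from the buffer decrements the counter.
        if amount > self.occupied:
            raise ValueError("transmitter buffer would underflow")
        self.occupied -= amount

    def reset(self) -> None:
        # Corresponds to the counter being reset by a preset signal.
        self.occupied = 0


def allowed_output(counter: InformationAmountCounter, requested: int) -> int:
    """Controller-side check: cap the encoder output to the free buffer space."""
    free = counter.capacity - counter.occupied
    return min(requested, free)
```

A controller that calls `allowed_output` before each write never triggers the overflow error, which mirrors the idea of using the reported occupancy to keep the encoder from overfilling the buffer.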
In some embodiments, the decoding system 220 can comprise an input interface 266, a receiver buffer 259, a controller 267, a frame memory 261, a decoding unit 260, and an output interface 267. The receiver buffer 259 of the decoding system 220 can temporarily store the compressed bitstream, including the received video compression data and video packets based on the source pictures 246. The decoding system 220 can read the control information and presentation time stamp information associated with the video packets in the received data, and can output a frame number signal 263 that can be applied to the controller 267. The controller 267 can supervise the counted number of frames at a predetermined interval. By way of non-limiting example, the controller 267 can supervise the counted number of frames each time the decoding unit 260 completes a decoding operation.
In some embodiments, when the frame number signal 263 indicates that the receiver buffer 259 is at a predetermined capacity, the controller 267 can output a decoding start signal 264 to the decoding unit 260. When the frame number signal 263 indicates that the receiver buffer 259 is at less than the predetermined capacity, the controller 267 can wait until the counted number of frames becomes equal to the predetermined amount. When that occurs, the controller 267 can output the decoding start signal 264. By way of non-limiting example, the controller 267 can output the decoding start signal 264 when the frame number signal 263 indicates that the receiver buffer 259 is at the predetermined capacity. The encoded video packets and video compression data can be decoded in a monotonic order (i.e., increasing or decreasing) based on the presentation time stamps associated with the encoded video packets.
In response to the decoding start signal 264, the decoding unit 260 can decode data amounting to one picture associated with a frame, together with the compressed video data associated with that picture and with the video packets from the receiver buffer 259. The decoding unit 260 can write a decoded video signal 269 into the frame memory 261. The frame memory 261 can have a first area, into which the decoded video signal is written, and a second area, used for reading out decoded pictures 262 to the output interface 267.
In various embodiments, the coding system 202 can be incorporated in or otherwise associated with a transcoder or an encoding apparatus at a headend, and the decoding system 220 can be incorporated in or otherwise associated with a downstream device, such as a mobile device, a set top box, or a transcoder.
Source encoding/decoding
As described above, the encoders 202 employ compression algorithms to generate bitstreams and/or files that are smaller than the original video sequences in the AV information 102. Such compression is made possible by reducing the spatial and temporal redundancies in the original sequences.
Prior art encoders 202 include those compliant with the video compression standard H.264/MPEG-4 AVC ("Advanced Video Coding"), developed jointly by the "Video Coding Experts Group" (VCEG) of the ITU and the "Moving Picture Experts Group" (MPEG) of the ISO, in particular in the form of the publication "Advanced Video Coding for Generic Audiovisual Services" (March 2005), which is hereby incorporated by reference herein.
HEVC, "High Efficiency Video Coding" (sometimes known as H.265), is expected to replace H.264/MPEG-4 AVC. As further described below, HEVC introduces new coding tools and entities that are generalizations of the coding entities defined in H.264/AVC.
Fig. 3 is a block diagram illustrating one embodiment of the source encoder 202. The source encoder 202 accepts the AV information 102 and uses a sampler 302 to sample the AV information 102, producing a sequence 303 of successive digital images or pictures, each having a plurality of pixels. A picture can comprise a frame or a field, wherein a frame is a complete image captured during a known time interval, and a field is the set of odd-numbered or even-numbered scanning lines that compose a partial image.
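The relationship between a frame and its two fields can be illustrated with a short sketch. The function name and the representation of a frame as a list of scan lines are assumptions made for this example, with scan lines numbered from 1 so that list index 0 holds line 1.

```python
def split_into_fields(frame):
    """Split a frame (a list of scan lines) into its two fields."""
    odd_field = frame[0::2]   # lines 1, 3, 5, ... (indices 0, 2, 4, ...)
    even_field = frame[1::2]  # lines 2, 4, 6, ... (indices 1, 3, 5, ...)
    return odd_field, even_field


def merge_fields(odd_field, even_field):
    """Interleave two fields back into a full frame."""
    frame = []
    for i in range(len(odd_field) + len(even_field)):
        source = odd_field if i % 2 == 0 else even_field
        frame.append(source[i // 2])
    return frame
```

Each field holds half of the scan lines, and interleaving the two fields restores the complete frame.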
The sampler 302 produces an uncompressed picture sequence 303. Each digital picture can be represented by one or more matrices having a plurality of coefficients that together represent information about the pixels composing the picture. The value of a pixel can correspond to luminance or other information. Where several components are associated with each pixel (for example, red-green-blue components or luminance-chrominance components), each of these components may be processed separately.
Images can be segmented into "slices," which may comprise a portion of the picture or the entire picture. In the H.264 standard, these slices are divided into coding entities called macroblocks (generally blocks of 16 pixels × 16 pixels), and each macroblock may in turn be divided into data blocks 102 of different sizes, for example 4×4, 4×8, 8×4, 8×8, 8×16, and 16×8. HEVC expands and generalizes the notion of the coding entity beyond that of the macroblock.
HEVC coding entity: CTU, CU, PU and TU
Like other video coding standards, HEVC is a block-based hybrid spatial and temporal predictive
coding scheme. However, HEVC introduces new coding entities that are not included in the
H.264/AVC standard. These coding entities include coding tree units (CTUs), coding units (CUs),
prediction units (PUs) and transform units (TUs), each of which is further described below.
Fig. 4 is a diagram depicting a picture 400 of AV information 102, such as one of the pictures in
the picture sequence 303. The picture 400 is spatially divided into non-overlapping square blocks
known as coding tree units, or CTUs 402. Unlike H.264 and previous video coding standards, in
which the basic coding unit is a macroblock of 16x16 pixels, the CTU 402 is the basic coding unit
of HEVC and can be as large as 128x128 pixels. As shown in Fig. 4, the CTUs 402 are typically
referenced within the picture 400 in an order analogous to a progressive scan.
Each CTU 402 may in turn be iteratively divided into smaller, variable-size coding units described
by a "quadtree" decomposition further described below. Coding units are regions formed in the
image to which similar coding parameters are applied and transmitted in the bitstream 314.
Fig. 5 is a diagram showing an exemplary partition of a CTU 402 into coding units (CUs) such as
coding units 502A and 502B (hereinafter alternatively referred to as coding units 502). A single
CTU 402 can be divided into four CUs 502 such as CU 502A, each a quarter of the size of the CTU
402. Each such divided CU 502A can be further divided into four smaller CUs 502B, each a quarter
of the size of the original CU 502A.
The division of CTUs 402 into CUs 502A and into smaller CUs 502B is described by "quadtree" data
parameters (e.g. flags or bits) that are encoded into the output bitstream 314 along with the
encoded data, as overhead known as syntax.
Fig. 6 is a diagram illustrating a representative quadtree 600 and data parameters for the CTU
402 partitioning shown in Fig. 5. The quadtree 600 comprises a plurality of nodes, including a
first node 602A at one hierarchical level and a second node 602B at a lower hierarchical level
(hereinafter, quadtree nodes may be alternatively referred to as "nodes" 602). At each node 602
of the quadtree, a "split flag" or bit "1" is assigned if the node 602 is further split into
sub-nodes; otherwise a bit "0" is assigned.
For example, the CTU 402 partition illustrated in Fig. 5 can be represented by the quadtree 600
presented in Fig. 6, which includes a split flag of "1" associated with node 602A at the top CU
502 level (indicating that there are 4 additional nodes at a lower hierarchical level). The
illustrated quadtree 600 also includes a split flag of "1" associated with node 602B at the
mid-level CU 502, to indicate that this CU is also partitioned into four further CUs 502 at the
next (bottom) CU level. The source encoder 202 may restrict the minimum and maximum CU 502 sizes,
thus changing the maximum possible depth of the CU 502 splitting.
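By way of non-limiting illustration, the split-flag representation described above may be
sketched as follows. The names split_flags and leaf_sizes, the nested-list tree representation,
the depth-first flag order and the 64-pixel CTU size are assumptions made only for this sketch;
the sketch also ignores the minimum and maximum CU size restrictions mentioned above.

```python
def split_flags(node):
    """Return the split flags of a CU quadtree in depth-first order.

    A node is None for a CU that is not split (flag 0), or a list of
    four child nodes for a CU that is split (flag 1).
    """
    if node is None:
        return [0]
    flags = [1]
    for child in node:
        flags.extend(split_flags(child))
    return flags


def leaf_sizes(node, size):
    """Return the side length, in pixels, of every final (leaf) CU."""
    if node is None:
        return [size]
    sizes = []
    for child in node:
        # A quarter of the area corresponds to half of the side length.
        sizes.extend(leaf_sizes(child, size // 2))
    return sizes


# Hypothetical CTU: split into four CUs, the second of which is split again.
ctu = [None, [None, None, None, None], None, None]

print(split_flags(ctu))     # [1, 0, 1, 0, 0, 0, 0, 0, 0]
print(leaf_sizes(ctu, 64))  # [32, 16, 16, 16, 16, 32, 32]
```

In this sketch a decoder that reads the same flags in the same order can rebuild the partition
without any CU sizes being transmitted explicitly.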
The encoder 202 generates encoded AV information 106 in the form of a bitstream 314 that includes
a first portion having encoded data for the CUs 502 and a second portion that includes overhead
known as syntax elements. The encoded data includes data corresponding to the encoded CUs 502
(i.e. the encoded residuals together with their associated motion vectors, predictors, or related
residuals, as described further below). The second portion includes syntax elements that may
represent coding parameters which do not directly correspond to the encoded data of the blocks.
For example, the syntax elements may comprise an address and identification of the CU 502 in the
image, a quantization parameter, an indication of the selected inter/intra coding mode, the
quadtree 600 or other information.
CUs 502 correspond to elementary coding elements and include two related sub-units: prediction
units (PUs) and transform units (TUs), both of which have a maximum size equal to the size of the
corresponding CU 502.
Fig. 7 is a diagram illustrating the partition of a CU 502 into one or more PUs 702. A PU 702
corresponds to a partitioned CU 502 and is used to predict pixel values for intra-picture or
inter-picture types. PUs 702 are an extension of the partitioning of H.264/AVC for motion
estimation, and are defined for each CU 502 that is not further subdivided into other CUs ("split
flag" = 0). As shown in Fig. 7, at each leaf 604 of the quadtree 600, a final (bottom level) CU
502 of 2Nx2N can have one of four possible PU patterns: 2Nx2N (702A), 2NxN (702B), Nx2N (702C)
and NxN (702D).
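By way of non-limiting illustration, the four PU patterns may be sketched as the (width, height)
of each PU within a 2Nx2N CU. The function name pu_partitions and the width-by-height reading of
the pattern names are assumptions made only for this sketch.

```python
def pu_partitions(mode, n):
    """Return the (width, height) of each PU of a 2Nx2N CU for a given pattern."""
    layouts = {
        "2Nx2N": [(2 * n, 2 * n)],   # one PU covering the whole CU
        "2NxN": [(2 * n, n)] * 2,    # two PUs, each full width and half height
        "Nx2N": [(n, 2 * n)] * 2,    # two PUs, each half width and full height
        "NxN": [(n, n)] * 4,         # four square PUs
    }
    return layouts[mode]


# A 16x16 CU (N = 8): every pattern covers the same 256-pixel area.
for mode in ("2Nx2N", "2NxN", "Nx2N", "NxN"):
    parts = pu_partitions(mode, 8)
    print(mode, parts, sum(w * h for w, h in parts))
```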
A CU 502 can be either spatially or temporally predictively coded. If a CU 502 is coded in
"intra" mode, each PU 702 of the CU 502 can have its own spatial prediction direction and image
information, as further described below. Also, in "intra" mode, a PU 702 of the CU 502 may depend
on another CU 502, because it may use a spatial neighbor that is in another CU. If a CU 502 is
coded in "inter" mode, each PU 702 of the CU 502 can have its own motion vector(s) and associated
reference picture(s), as further described below.
Fig. 8 is a diagram showing a CU 502 partitioned into four PUs 702 and an associated set of
transform units (TUs) 802. TUs 802 represent the elementary units that are spatially transformed
by a DCT (discrete cosine transform). The size and location of each block transform TU 802 within
a CU 502 is described by a "residual" quadtree (RQT), further illustrated below.
Fig. 9 is a diagram showing the RQT 900 for the TUs 802 of the CU 502 in the example of Fig. 8.
Note that the "1" at the first node 902A of the RQT 900 indicates that there are four branches,
and that the "1" at the second node 902B at the adjacent lower hierarchical level indicates that
the indicated node further has four branches. The data describing the RQT 900 is also coded and
transmitted as overhead in the bitstream 314.
The coding parameters of a video sequence may be stored in dedicated NAL units called parameter
sets. Two types of parameter set NAL units may be employed. The first parameter set type is known
as a sequence parameter set (SPS), and comprises a NAL unit that includes parameters that are
unchanged during the entire video sequence. Typically, an SPS handles the coding profile, the
size of the video frames and other parameters. The second type of parameter set is known as a
picture parameter set (PPS), and codes different values that may change from one image to another.
Spatial and temporal prediction
One of the techniques used to compress a bitstream 314 is to forgo the storage of the pixel values
themselves and instead predict the pixel values using a process that can be repeated at the
decoder 220, and store or transmit the difference between the predicted pixel values and the
actual pixel values (known as the residual). So long as the decoder 220 can compute the same
predicted pixel values from the information provided, the actual picture values can be recovered
by adding the residuals to the predicted values. The same technique can be used to compress other
data as well.
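By way of non-limiting illustration, the residual technique described above may be sketched as
follows. The function names and the sample pixel values are hypothetical, and quantization of the
residual is deliberately left out.

```python
def encode_residual(actual, predicted):
    """Residual = actual value minus predicted value, element by element."""
    return [a - p for a, p in zip(actual, predicted)]


def decode_residual(residual, predicted):
    """The decoder repeats the prediction and adds the residual back."""
    return [r + p for r, p in zip(residual, predicted)]


actual = [120, 122, 125, 131]      # hypothetical pixel values
predicted = [119, 121, 126, 130]   # the same prediction at encoder and decoder

residual = encode_residual(actual, predicted)
print(residual)                               # [1, 1, -1, 1]
print(decode_residual(residual, predicted))   # [120, 122, 125, 131]
```

The residual values are small compared with the pixel values themselves, which is what makes
them cheaper to code when the prediction is good.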
Referring back to Fig. 3, each PU 702 of the CU 502 being processed is provided to a predictor
module 307. The predictor module 307 predicts the values of the PUs 702 based on information in
nearby PUs 702 in the same frame (intra-frame prediction, which is performed by the spatial
predictor 324) and information of PUs 702 in temporally proximate frames (inter-frame prediction,
which is performed by the temporal predictor 330). Temporal prediction, however, may not always
be based on a collocated PU, since collocated PUs are defined to be located in a
reference/non-reference frame at the same x and y coordinates as the current PU 702. These
techniques take advantage of spatial and temporal dependencies between PUs 702.
Encoded units can therefore be categorized into two types: (1) non-temporally predicted units and
(2) temporally predicted units. Non-temporally predicted units are predicted using the current
frame, including adjacent or nearby PUs 702 within the frame (e.g. intra-frame prediction), and
are generated by the spatial predictor 324. Temporally predicted units are predicted from one
temporal picture (e.g. P-frames) or from at least two reference pictures that are temporally
ahead and/or behind (i.e. B-frames).
Spatial prediction
Fig. 10 is a diagram illustrating spatial prediction of PUs 702. A picture may comprise a PU 702
and other spatially proximate PUs 1-4, including nearby PU 702N. The spatial predictor 324
predicts the current block (e.g. block C of Fig. 10) by means of an "intra-frame" prediction,
which uses PUs 702 of already-encoded other blocks of pixels of the current image.
The spatial predictor 324 locates a nearby PU (e.g. PU 1, 2, 3 or 4 of Fig. 10) that is suitable
for spatial coding and determines an angular prediction direction to that nearby PU. In HEVC, 35
directions can be considered, so each PU may have one of 35 directions associated with it,
including horizontal, vertical, 45 degree diagonal, 135 degree diagonal, DC, etc. The spatial
prediction direction of the PU is indicated in the syntax.
Referring back to the spatial predictor 324 of Fig. 3, this located nearby PU is used by element
305 to compute a residual PU 704 (e) as the difference between the pixels of the nearby PU 702N
and the pixels of the current PU 702. The result is an intra-predicted PU element 1006 that
comprises a prediction direction 1002 and the intra-predicted residual PU 1004. The prediction
direction 1002 may be coded by inferring the direction from spatially proximate PUs and the
spatial dependencies of the picture, so that the coding rate of the intra prediction direction
mode can be reduced.
Time prediction
Fig. 11 is a diagram illustrating temporal prediction. Temporal prediction considers information
from temporally neighboring pictures or frames, such as the previous picture, picture i-1.
Generally, temporal prediction includes single prediction (P-type), which predicts the PU 702 by
referring to one reference area from only one reference picture, and multiple prediction (B-type),
which predicts the PU by referring to two reference areas from one or two reference pictures.
Reference pictures are images in the video sequence that have already been coded and then
reconstructed (by decoding).
The temporal predictor 330 identifies, in one or several of these reference areas (one for P-type
or several for B-type), areas of pixels in a temporally nearby frame so that they can be used as
predictors of the current PU 702. Where several area predictors are used (B-type), they may be
merged to generate one single prediction. The reference area 1102 is identified in the reference
frame by a motion vector (MV) 1104, which is defined as the displacement between the current PU
702 in the current frame (picture i) and the reference area 1102 (refIdx) in the reference frame
(picture i-1). A PU in a B-picture may have up to two MVs. Both the MV and refIdx information are
included in the syntax of the HEVC bitstream.
Referring again to Fig. 3, the difference between the pixel values of the reference area 1102 and
the current PU 702 may be computed by element 305 as selected by switch 306. This difference is
referred to as the residual of the inter-predicted PU 1106. At the end of the temporal or
inter-frame prediction process, the current PU 1006 is composed of one motion vector MV 1104 and
a residual 1106.
However, as described above, one technique for compressing data is to generate predicted values
for the data using means repeatable by the decoder 220, compute the difference between the
predicted and actual values of the data (the residual), and transmit the residual for decoding.
So long as the decoder 220 can reproduce the predicted values, the residual values can be used to
determine the actual values.
This technique can be applied to the MVs 1104 used in temporal prediction by generating a
prediction of the MV 1104, computing the difference between the actual MV 1104 and the predicted
MV 1104 (a residual), and transmitting the MV residual in the bitstream 314. So long as the
decoder 220 can reproduce the predicted MV 1104, the actual MV 1104 can be computed from the
residual. HEVC computes a predicted MV for each PU 702 using the spatial correlation of movement
between neighboring PUs 702.
Fig. 12 is a diagram illustrating the use of motion vector predictors (MVPs) in HEVC. Motion
vector predictors V1, V2 and V3 are taken from the MVs 1104 of a plurality of blocks 1, 2 and 3
situated nearby or adjacent to the block to be encoded (C). Because these vectors refer to motion
vectors of spatially neighboring blocks within the same temporal frame and can be used to predict
the motion vector of the block to be encoded, they are known as spatial motion predictors.
Fig. 12 also illustrates a temporal motion vector predictor VT, which is the motion vector of the
co-located block C' in a previously decoded picture (in decoding order) of the sequence (e.g. a
block of picture i-1 located at the same spatial position as the block being coded (block C of
image i)).
The components of the spatial motion vector predictors V1, V2 and V3 and of the temporal motion
vector predictor VT can be used to generate a median motion vector predictor VM. In HEVC, the
three spatial motion vector predictors may be taken, as shown in Fig. 12 and according to a
predetermined rule of availability, from the block situated to the left of the block to be encoded
(V1), the block situated above it (V3), and one of the blocks situated at the respective corners
of the block to be encoded (V2). This MV predictor selection technique is known as Advanced
Motion Vector Prediction (AMVP).
A plurality of (typically five) MV predictor (MVP) candidates, comprising spatial predictors (e.g.
V1, V2 and V3) and temporal predictor(s) VT, is therefore obtained. In order to reduce the
overhead of signaling the motion vector predictor in the bitstream, the set of motion vector
predictors may be reduced by eliminating data for duplicated motion vectors (for example, MVs
that have the same value as other MVs may be eliminated from the candidates).
The encoder 202 may select a "best" motion vector predictor from among the candidates, compute a
motion vector predictor residual as the difference between the selected motion vector predictor
and the actual motion vector, and transmit the motion vector predictor residual in the bitstream
314. To perform this operation, the actual motion vector must be stored for later use by the
decoder 220 (although it is not transmitted in the bitstream 314). Signaling bits or flags are
included in the bitstream 314 to specify which MV residual was computed from the normalized motion
vector predictor, and are later used by the decoder to recover the motion vector. These bits or
flags are further described below.
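By way of non-limiting illustration, the candidate reduction, predictor selection and residual
computation described above may be sketched as follows. The function names, the candidate values
and the selection criterion (smallest sum of absolute component differences) are assumptions made
only for this sketch; the text above does not state how the "best" predictor is chosen, and the
residual is taken here as actual minus predictor.

```python
def dedupe(candidates):
    """Drop candidates whose value repeats an earlier candidate, keeping order."""
    unique = []
    for mv in candidates:
        if mv not in unique:
            unique.append(mv)
    return unique


def encode_mv(actual, candidates):
    """Return (index of the chosen predictor, MV residual)."""
    unique = dedupe(candidates)

    def cost(i):
        px, py = unique[i]
        return abs(actual[0] - px) + abs(actual[1] - py)

    index = min(range(len(unique)), key=cost)
    px, py = unique[index]
    return index, (actual[0] - px, actual[1] - py)


def decode_mv(index, residual, candidates):
    """Rebuild the same reduced candidate set, then add the residual back."""
    px, py = dedupe(candidates)[index]
    return (px + residual[0], py + residual[1])


# Hypothetical candidates (V1, V2, V3, VT and one more), with duplicates.
candidates = [(4, -2), (4, -2), (3, 0), (10, 7), (3, 0)]
index, residual = encode_mv((5, -1), candidates)
print(index, residual)                          # 0 (1, 1)
print(decode_mv(index, residual, candidates))   # (5, -1)
```

Only the index and the residual need to be signaled in this sketch, because the decoder derives
the same candidate set from data it has already decoded.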
Referring back to Fig. 3, the intra-predicted residuals 1004 and the inter-predicted residuals
1106 obtained from the spatial (intra) or temporal (inter) prediction process are then transformed
by transform module 308 into the transform units (TUs) 802 described above. A TU 802 can be
further split into smaller TUs using the RQT decomposition described above with respect to Fig. 9.
In HEVC, 2 or 3 levels of decomposition are generally used, and the authorized transform sizes are
32×32, 16×16, 8×8 and 4×4. As described above, the transform is derived according to a discrete
cosine transform (DCT) or discrete sine transform (DST).
The transformed residual coefficients are then quantized by quantizer 310. Quantization plays a
very important role in data compression. In HEVC, quantization converts the high-precision
transform coefficients into a finite number of possible values. Although quantization permits a
great deal of compression, it is a lossy operation, and the loss caused by quantization cannot be
recovered.
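By way of non-limiting illustration, the lossy nature of quantization may be sketched with a
uniform scalar quantizer. The single fixed step size, the function names and the coefficient
values are assumptions made only for this sketch and do not reproduce the quantizer actually
specified by HEVC.

```python
def quantize(coeffs, step):
    """Map each high-precision coefficient to an integer level."""
    return [int(round(c / step)) for c in coeffs]


def dequantize(levels, step):
    """Inverse operation: scale each level back by the step size."""
    return [level * step for level in levels]


coeffs = [103.0, -47.5, 12.2, -3.9, 0.8]   # hypothetical transform coefficients
levels = quantize(coeffs, 10)
reconstructed = dequantize(levels, 10)

print(levels)          # [10, -5, 1, 0, 0]
print(reconstructed)   # [100, -50, 10, 0, 0]
```

The reconstructed coefficients differ from the originals, and small coefficients collapse to
zero; that difference is the loss that the decoder cannot recover.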
The coefficients of the quantized transformed residual are then coded by entropy coder 312 and
inserted into the compressed bitstream 310 as part of the useful data coding the images of the AV
information. Coding syntax elements may also be coded using spatial dependencies between syntax
elements to increase coding efficiency. HEVC offers context-adaptive binary arithmetic coding
(CABAC). Other forms of entropy or arithmetic coding may also be used.
In order to calculate the predictors used above, the encoder 202 decodes already encoded PUs 702
using a "decoding" loop 315, which includes elements 316, 318, 320, 322 and 328. This decoding
loop 315 reconstructs the PUs and images from the quantized transformed residuals.
The quantized transform residual coefficients E are provided to dequantizer 316, which applies
the inverse of the operation of quantizer 310 to produce dequantized transform coefficients of the
residual PU (E') 708. The dequantized data 708 is then provided to inverse transformer 318, which
applies the inverse of the transform applied by the transform module 308 to generate reconstructed
residual coefficients of the PU (e') 710.
The reconstructed residual coefficients 710 of the PU are then added to the corresponding
coefficients of the corresponding predicted PU (x') 702' selected by selector 306 from the
intra-predicted PU 1004 and the inter-predicted PU 1106. For example, if the reconstructed
residual comes from the "intra" coding process of the spatial predictor 324, the "intra" predictor
(x') is added to this residual in order to recover a reconstructed PU (x'') 712, which
corresponds to the original PU 702 modified by the losses resulting from a transformation (in
this case, for example, the quantization operations). If the residual 710 comes from an "inter"
coding process of the temporal predictor 330, the areas pointed to by the current motion vectors
(these areas belong to the reference images stored in reference buffer 328 referred to by the
current image indices) are merged and then added to this decoded residual. In this way the
original PU 702 is modified by the losses resulting from the quantization operations.
To the extent that the encoder 202 uses motion vector prediction techniques analogous to the image
prediction techniques described above, the motion vector may be stored using motion vector buffer
329 for use in temporally subsequent frames. As further described below, a flag may be set and
transferred in the syntax to indicate that the motion vectors for the currently decoded frame
should be used for at least the subsequently coded frame, instead of replacing the contents of
the MV buffer 329 with the MV for the current frame.
A loop filter 322 is applied to the reconstructed signal (x'') 712 in order to reduce the effects
created by heavy quantization of the residuals obtained, and to improve the signal quality. The
loop filter 322 may comprise, for example, a deblocking filter for smoothing the borders between
PUs to visually attenuate high frequencies created by the coding process, and a linear filter that
is applied after all of the PUs of an image have been decoded, to minimize the sum of squared
differences (SSD) with the original image. The linear filtering process is performed on a
frame-by-frame basis and uses several pixels around the pixel to be filtered, and also uses
spatial dependencies between pixels of the frame. The linear filter coefficients may be coded and
transmitted in one header of the bitstream, typically a picture or slice header.
The filtered images, also known as reconstructed images, are then stored as reference images in
reference image buffer 328 in order to allow subsequent "inter" predictions to take place during
the compression of the subsequent images of the current video sequence.
Reference picture syntax
As described above, to reduce errors and improve compression, HEVC permits the use of several
reference images for estimation and motion compensation of the current image. Given a current PU
702 in a current picture, the collocated PU 1102 for a particular slice resides in an associated
nearby reference/non-reference picture. For example, in Fig. 12, the collocated PU 1102 for the
current PU 702 in picture (i) resides in the associated nearby reference picture (i-1). The best
"inter" or temporal predictors of the current PU 702 are selected from some of the multiple
reference/non-reference images, which may be pictures temporally prior to or after the current
picture in display order (backward and forward prediction, respectively).
For HEVC, the index to reference pictures is defined by reference picture lists that are described
in the slice syntax. Forward prediction is defined by list_0 (RefPicList0), and backward
prediction is defined by list_1 (RefPicList1), and both list 0 and list 1 can contain multiple
reference pictures prior to and/or later than the current picture in display order.
Fig. 13 illustrates an example of the use of the reference picture lists. Consider pictures 0, 2,
4, 5, 6, 8 and 10 shown in Fig. 13, wherein the number of each picture denotes display order and
the current picture is picture 5. In this case, the list_0 reference pictures with ascending
reference picture indices and starting with index equal to zero are 4, 2, 0, 6, 8 and 10, and the
list_1 reference pictures with ascending reference picture indices and starting with index equal
to zero are 6, 8, 10, 4, 2 and 0. A slice in which the motion compensation prediction is
restricted to list_0 prediction is called a predictive or P-slice. Collocated pictures are
indicated by using the collocated_ref_idx index in HEVC. A slice for which the motion compensation
prediction includes more than one reference picture is a bi-predictive or B-slice. For B-slices,
the motion compensation prediction may include reference pictures from list_1 prediction as well
as list_0.
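By way of non-limiting illustration, the list ordering of the Fig. 13 example may be reproduced
as follows. The function name build_reference_lists is hypothetical, and the sketch only mirrors
the ordering stated above (nearest pictures first on the preferred side of the current picture);
it is not a statement of how the standard constructs or modifies the lists.

```python
def build_reference_lists(display_orders, current):
    """Return (list_0, list_1) for the pictures around the current picture."""
    # Earlier pictures, nearest first.
    before = sorted((p for p in display_orders if p < current), reverse=True)
    # Later pictures, nearest first.
    after = sorted(p for p in display_orders if p > current)
    return before + after, after + before


list_0, list_1 = build_reference_lists([0, 2, 4, 5, 6, 8, 10], current=5)
print(list_0)   # [4, 2, 0, 6, 8, 10]
print(list_1)   # [6, 8, 10, 4, 2, 0]
```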
Hence, a collocated PU 1102 is disposed in a reference picture specified in either list_0 or
list_1. A flag (collocated_from_l0_flag) is used to specify whether the collocated partition
should be derived from list_0 or list_1 for a particular slice type. Each of the reference
pictures is also associated with a motion vector.
The storage and retrieval of such reference pictures and related motion vectors for the emerging
HEVC standard is described in paragraph 8.4.1.2.9 of Benjamin Bross, Woo-Jin Han, Jens-Rainer
Ohm, Gary J. Sullivan, Thomas Wiegand, "WD4: Working Draft 4 of High-Efficiency Video Coding,"
Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11,
JCTVC-F803_d5, 6th Meeting: Torino, IT, 14-22 July, 2011 (hereby incorporated by reference
herein).
According to the standard, if the slice_type is equal to B and the collocated_from_l0_flag is 0,
the collocated_ref_idx variable specifies the reference picture as the picture that contains the
co-located partition as specified by RefPicList1. Otherwise (slice_type is equal to B and
collocated_from_l0_flag is equal to 1, or slice_type is equal to P), the collocated_ref_idx
variable specifies the reference picture as the picture that contains the collocated partition as
specified by RefPicList0.
Fig. 14 is a diagram illustrating processes performed by the encoder 202 according to the
aforementioned standard. Block 1402 determines whether the current picture is a reference picture
for another picture. If not, there is no need to store reference picture or motion vector
information. If the current picture is a reference picture for another picture, block 1404
determines whether the "another" picture is a P-type or a B-type picture. If the picture is a
P-type picture, processing is passed to block 1410, which sets the collocated_from_l0_flag to one
and stores the reference picture and motion vector in list 0. If the "another" picture is a B-type
picture, block 1406 directs processing to blocks 1408 and 1410 if the desired reference picture is
to be stored in list 0, and to blocks 1412 and 1414 if the desired reference picture and motion
vector are to be stored in list 1. This decision may be based on whether it is desirable to select
reference pictures from a temporally preceding or succeeding picture. Which of the multiple
possible reference pictures is selected is determined according to the collocated_ref_idx index.
Fig. 15 depicts the use of the collocated_from_l0_flag by the decoder 220 in decoding according
to the previous HEVC standard. Block 1502 determines whether the slice type currently being
computed is an intra or I-type. Such slices do not use temporally nearby slices in the
encoding/decoding process, and hence there is no need to find a temporally nearby reference
picture. If the slice type is not I-type, block 1504 determines whether the slice is a B-slice.
If the slice is not a B-type, it is a P-type slice, and the reference picture that contains the
collocated partition is found in list 0, according to the value of collocated_ref_idx. If the
slice is B-type, the collocated_from_l0_flag determines whether the reference picture is found in
list 0 or list 1. As the index indicates, the collocated picture is therefore defined as the
reference picture having the indicated collocated_ref_idx in either list 0 or list 1, depending
on the slice type (B-type or P-type) and the value of the collocated_from_l0_flag. In one
embodiment of HEVC, the first reference picture (the reference picture having index [0] as shown
in Fig. 13) is selected as the collocated picture.
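By way of non-limiting illustration, the decision flow described above may be sketched as
follows. The function name collocated_picture and the use of Python lists for list 0 and list 1
are assumptions made only for this sketch; the example lists are those of Fig. 13.

```python
def collocated_picture(slice_type, collocated_from_l0_flag, collocated_ref_idx,
                       list_0, list_1):
    """Return the collocated picture, or None for an I-type slice."""
    if slice_type == "I":
        # Intra slices do not use temporally nearby reference pictures.
        return None
    if slice_type == "B" and collocated_from_l0_flag == 0:
        return list_1[collocated_ref_idx]
    # P-type slices, and B-type slices with the flag equal to 1, use list 0.
    return list_0[collocated_ref_idx]


list_0 = [4, 2, 0, 6, 8, 10]
list_1 = [6, 8, 10, 4, 2, 0]

print(collocated_picture("B", 0, 0, list_0, list_1))   # 6
print(collocated_picture("B", 1, 0, list_0, list_1))   # 4
print(collocated_picture("P", 1, 1, list_0, list_1))   # 2
print(collocated_picture("I", 0, 0, list_0, list_1))   # None
```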
Baseline picture parameter set syntax
Figs. 16A and 16B are diagrams presenting a baseline PPS raw byte sequence payload (RBSP) syntax.
Syntax for dealing with extensions in the PPS is shown in Fig. 16B. Logic 1602 determines whether
the media is to be coded/decoded including a first extension, and reads the appropriate signaling
and data. Logic 1602 comprises statements 1606-1616. Statement 1606 reads a pps_extension1_flag,
which indicates whether the first extension has been selected for the coding/decoding process. In
one embodiment, a logical value of "1" indicates that the media is to be processed using the first
extension, and a logical value of "0" indicates that the media is not to be processed using the
first extension. Statement 1608 is a conditional statement that directs execution of statements
1612-1614 depending upon the value of a (previously read) transform_skip_enabled_flag. In
particular, the illustrated logic performs the operations shown in statements 1612-1614 if the
transform_skip_enabled_flag is a logical "1" or true. The transform_skip_enabled_flag 1601 of the
PPS syntax is shown in Fig. 16A.
Transform skipping is an extension that allows the DCT transform of a TU to be skipped under
certain circumstances. Essentially, the DCT transform is advantageous for media with highly
correlated signals, for which it provides outstanding energy compaction. However, for media with
highly uncorrelated signals (e.g. media having a large amount of detail), the compression
performance is much poorer. For some media, the DCT transform provides so little compression
benefit that the process is better skipped in favor of better processing performance. The
transform_skip_enabled_flag indicates when skipping the DCT transform of a TU is permitted. This
is described, for example, in "Early Termination of Transform Skip Mode for High Efficiency Video
Coding," by Do Kyung Lee, Miso Park, Hyung-Do Kim and Je-Chang Jeong in the Proceedings of the
2014 International Conference on Communications, Signal Processing and Computers, which is hereby
incorporated by reference herein. If the transform_skip_enabled_flag is a logical 1 (true),
processing passes to statements 1612 and 1614. Otherwise, processing passes to statement 1618.
Statement 1612 performs the operation of reading a value log2_transform_skip_max_size_minus2,
which indicates the maximum TU size that may be skipped (if the transform_skip_enabled_flag
indicates that skipping the DCT transform of the TU is permitted). Statement 1614 performs the
operation of reading a flag pps_extension2_flag, which indicates whether a further extension
(extension2) is implemented.
Next, logic 1604 is performed. Logic 1604 includes statements 1618-1622. Statement 1618 is a
conditional statement that passes processing to the logic of statements 1620 and 1622 if the
pps_extension2_flag is a logical 1. Statements 1620 and 1622 read additional
pps_extension_data_flags while RBSP data is present.
In the PPS design of the foregoing HEVC range extension, the pps_extension2_flag accounts for
unidentified extension data. According to the logic above, if pps_extension1_flag is true,
pps_extension2_flag is present. If pps_extension1_flag is not true, pps_extension2_flag is not
present. If pps_extension2_flag is not present, pps_extension2_flag is inferred to be equal to 0.
If pps_extension2_flag is 0, there is no additional extension data.
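By way of non-limiting illustration, the baseline extension logic described above may be sketched
as follows. Fig. 16B is not reproduced here, so the nesting of the conditions is an assumption
based on the description; the BitReader class, the function name, the use of an Exp-Golomb code
for log2_transform_skip_max_size_minus2, and the simplified "bits remain" test standing in for
the RBSP data check are likewise assumptions made only for this sketch.

```python
class BitReader:
    """Minimal reader over a list of bits (hypothetical helper)."""

    def __init__(self, bits):
        self.bits = list(bits)
        self.pos = 0

    def u1(self):
        """Read one bit."""
        bit = self.bits[self.pos]
        self.pos += 1
        return bit

    def ue(self):
        """Read an unsigned Exp-Golomb coded value."""
        zeros = 0
        while self.u1() == 0:
            zeros += 1
        value = 0
        for _ in range(zeros):
            value = (value << 1) | self.u1()
        return (1 << zeros) - 1 + value

    def more_data(self):
        """Simplified stand-in for the 'RBSP data is present' test."""
        return self.pos < len(self.bits)


def parse_baseline_pps_extension(reader, transform_skip_enabled_flag):
    # pps_extension2_flag is inferred to be 0 when it is not present.
    pps = {"pps_extension2_flag": 0, "pps_extension_data_flags": []}
    pps["pps_extension1_flag"] = reader.u1()
    if pps["pps_extension1_flag"]:
        if transform_skip_enabled_flag:
            pps["log2_transform_skip_max_size_minus2"] = reader.ue()
        pps["pps_extension2_flag"] = reader.u1()
    if pps["pps_extension2_flag"]:
        while reader.more_data():
            pps["pps_extension_data_flags"].append(reader.u1())
    return pps


# extension1 = 1, max size code "010" (value 1), extension2 = 1, data bits 1 0 1
with_ext = parse_baseline_pps_extension(BitReader([1, 0, 1, 0, 1, 1, 0, 1]), 1)
# extension1 = 0: nothing else is read and extension2 is inferred to be 0
without_ext = parse_baseline_pps_extension(BitReader([0]), 1)
print(with_ext)
print(without_ext)
```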
This logic formulation always checks the value of pps_extension2_flag for possible additional
extension syntax, regardless of the status of pps_extension1_flag. However, if pps_extension1_flag
is 0, there is no need to check pps_extension2_flag, because if pps_extension1_flag is 0,
pps_extension2_flag will not be present, and if pps_extension2_flag is not present, it will be
inferred to be equal to 0, which indicates that there is no further extension data.
Related U.S. utility patent application Serial No. 14/533,386, entitled "MODIFICATION OF PICTURE
PARAMETER SET (PPS) FOR HEVC EXTENSIONS," describes a modification of the foregoing syntax,
wherein logic 1604 of Fig. 16B (statements 1616-1620) is incorporated within the conditional
statement 1608 and is executed only if pps_extension1_flag tests to a logical 1. This allows the
logic of statements 1610-1620 to be skipped if pps_extension1_flag tests to a logical 0, thus
saving execution time.
This design is useful when only one PPS extension (e.g. the transform skip extension) is to be
enabled, and perhaps a second PPS extension that reads additional data (e.g. signaled by
pps_extension2_flag) is to be read only if the first PPS extension is executed. However, if there
are additional PPS extensions, this design can be inefficient, because the syntax requires that
later extensions parse all previous extension syntax, even though the previously executed
extensions and/or syntax may be independent of or unrelated to the later executed extensions
and/or syntax.
Improved picture parameter set syntax
Figs. 17A-17D are diagrams presenting a modified PPS raw byte sequence payload (RBSP) syntax. In
summary, the modified RBSP syntax defines an extension presence signaling flag
(pps_extension_present_flag) that signals whether pictures in the sequence are to be processed at
least in part according to at least one extension function. If the pps_extension_present_flag
tests false, it is known that no PPS extensions follow; the syntax logic defining and processing
such extensions is no longer needed, and the processing associated with executing such syntax
logic is not performed, thus saving processing resources, memory resources and processing time.
The modified PPS RBSP syntax also includes one or more extension signaling flags, each signaling
the presence of an associated PPS extension function. This increases the efficiency of parsing
and executing the PPS syntax, since the one or more extension signaling flags, associated data and
logic instructions need not be stored in the syntax, nor read or executed by the processor.
In one embodiment, the PPS RBSP syntax is further modified so that the extension signaling flags are indexed and read iteratively. For example, n PPS extension signaling flags may be denoted pps_extension_flag[i], where i is an index taking values from 0 to n-1. In one embodiment, seven PPS extension signaling flags may be defined (n = 7). Each such individual PPS extension flag may control the parsing of the syntax of a particular extension function. For example, the first PPS extension flag may control the parsing of HEVC range extension related syntax, and the second PPS extension flag may control the parsing of MV-HEVC related syntax.
In another embodiment, the foregoing may be extended to accommodate more than n extensions (n >= 8) by use of an additional pps_extension_7bits syntax. This additional syntax permits the signaling of further extensions, so that extensions for which seven PPS flags are insufficient can be specified in the future. In a preferred embodiment, the number of extension bits (and the maximum value of the above index) is set to a multiple of 8 bits (0-7) so that byte-by-byte parsing can be accomplished easily.
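The byte-aligned layout described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that the indexed flags and the remaining reserved bits together occupy exactly one byte and are ordered most significant bit first; the function name split_extension_byte and its parameters are hypothetical and do not appear in the syntax.

```python
def split_extension_byte(byte_value, n_flags=1):
    """Split one byte into n_flags indexed flags plus the remaining bits.

    Assumption: bits are ordered most significant bit first, so the
    first bit of the byte is flag index 0.
    """
    if not 0 <= byte_value <= 0xFF:
        raise ValueError("byte_value must fit in 8 bits")
    if not 0 <= n_flags <= 8:
        raise ValueError("n_flags must be between 0 and 8")
    # Expand the byte into a list of 8 bits, MSB first.
    bits = [(byte_value >> (7 - k)) & 1 for k in range(8)]
    flags = bits[:n_flags]
    # Pack whatever is left (7 bits when n_flags == 1) into one integer.
    remaining = 0
    for bit in bits[n_flags:]:
        remaining = (remaining << 1) | bit
    return flags, remaining
```

With n_flags = 1, the single flag plays the role of pps_extension_flag[0] and the remaining seven bits play the role of pps_extension_7bits, so the whole group can be consumed as one byte.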
Figure 17A is a flow chart illustrating exemplary operations that can be used to encode/decode a sequence of a plurality of pictures using one or more extension functions. In block 1700, an extension presence signaling flag is read. The extension presence signaling flag indicates whether the pictures referenced by the PPS syntax are to be processed at least in part according to at least one extension function. In block 1702, a determination is made as to whether the extension presence signaling flag that was read indicates that the pictures associated with the PPS syntax are to be processed at least in part using at least one extension function. In one embodiment, this is accomplished by determining whether the extension presence signaling flag has a first value. A "value" may be a logical value (for example, true or false) or a numeric or alphanumeric value (for example, 1 or 0) that can represent a logical value. If the extension presence signaling flag does not have the first value (indicating that no extension function will be used to process the pictures associated with the PPS syntax), the operations shown in blocks 1704-1708 are skipped. If the extension presence signaling flag is determined to indicate that the pictures are to be decoded at least in part according to at least one extension function, a first extension function signaling flag is read, as shown in block 1704, and a second extension function signaling flag is read, as shown in block 1706, regardless of the value of the first extension function signaling flag. Because the second extension function signaling flag is read whatever the value or state of the previously read first extension function signaling flag, the reading of the second flag is independent of the value of the first flag. This contrasts with the syntax shown in Figure 16B, in which pps_extension2_flag is read by logic 1614 only if pps_extension1_flag tests true in logic 1608. Finally, as depicted in block 1708, the extension functions signaled by the flags read in blocks 1704-1706 are performed.
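The difference between the flow of Figure 17A and the nested reading of Figure 16B can be sketched as follows. This is a simplified illustration rather than the actual syntax: the bitstream is modeled as a plain Python list of 0/1 values, and the function names read_flags_fig17a and read_flags_nested are hypothetical.

```python
def read_flags_fig17a(bits):
    """Read the presence flag, then both extension flags unconditionally."""
    it = iter(bits)
    present = next(it)                 # block 1700: read presence flag
    if not present:                    # block 1702: no extensions follow
        return {"present": 0, "first": 0, "second": 0}
    first = next(it)                   # block 1704
    second = next(it)                  # block 1706: read regardless of first
    return {"present": 1, "first": first, "second": second}


def read_flags_nested(bits):
    """Simplified Figure 16B style: the second flag is read only if the first is set."""
    it = iter(bits)
    first = next(it)
    second = next(it) if first else 0  # second flag depends on the first
    return {"first": first, "second": second}
```

In the first function the second flag can be set while the first is clear; in the nested form that combination cannot be signaled, because the second flag is never read.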
Figure 17B is a diagram presenting an exemplary PPS syntax that can be used to perform the operations shown in Figure 17A for decoding a sequence of a plurality of pictures using one or more extension functions. In the illustrated embodiment, the extension functions include the TU DCT transform skip extension function described above.
Logic 1712 reads the extension presence signaling flag. In the illustrated embodiment, the extension presence signaling flag comprises pps_extension_present_flag. Logic 1714 tests whether pps_extension_present_flag is logically true, and only then is logic 1716-1740 executed. If pps_extension_present_flag is determined to be logically false, processing passes to logic 1740. Importantly, this means that no extension flags are read and no extension function processing is performed.
Figure 17B performs the operations described in blocks 1704 and 1706 of Figure 17A: logic 1716-1720 reads the extension function signaling flags (here, pps_extension_flag[i] and/or pps_extension_7bits). In particular, logic 1716 and 1718 read pps_extension_flag[i] for i = 0 to n-1 (in the exemplary embodiment, n = 1, so only one flag is read, namely pps_extension_flag[0]). Logic 1720 reads the value pps_extension_7bits, which is used to signal additional extension functions beyond the up to seven extension functions referenced by pps_extension_flag[0]-pps_extension_flag[6].
Logic 1722 and 1728 test whether the pps_extension_flag[0] that was read has a logical value indicating that the related extension function (TU DCT skip) is required. If pps_extension_flag[0] has such a value (for example, tests logically true), logic 1724-1730 is executed.
Logic 1724 determines whether transform skip is enabled by testing transform_skip_enabled_flag. If it is enabled (for example, transform_skip_enabled_flag tests true), logic 1726-1728 of the PPS syntax is executed. Logic 1726-1728 reads the value represented by log2_max_transform_skip_block_size_minus2, which specifies the maximum transform unit (TU) block size for which the DCT transform may be skipped.
Logic 1732 tests whether the value of pps_extension_7bits read by logic 1720 tests true. If so, logic 1734-1738 reads such additional signaling bits.
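The parsing order of Figure 17B can be sketched as a small parser. The sketch makes several assumptions that are not stated in the text: the bitstream is modeled as a list of 0/1 values, log2_max_transform_skip_block_size_minus2 is assumed to be coded as an unsigned Exp-Golomb value, transform_skip_enabled_flag is passed in as if it had been parsed earlier in the PPS, and "more data" is approximated as "bits remain" rather than a full RBSP trailing-bits check. BitReader and parse_pps_extension are hypothetical names.

```python
class BitReader:
    """Minimal bit reader over a list of 0/1 values (illustrative only)."""

    def __init__(self, bits):
        self.bits = list(bits)
        self.pos = 0

    def u(self, n):
        """Read n bits as an unsigned integer, MSB first."""
        value = 0
        for _ in range(n):
            value = (value << 1) | self.bits[self.pos]
            self.pos += 1
        return value

    def ue(self):
        """Read an unsigned Exp-Golomb coded value."""
        zeros = 0
        while self.u(1) == 0:
            zeros += 1
        return (1 << zeros) - 1 + self.u(zeros)

    def more_data(self):
        """Simplified stand-in for an RBSP 'more data' check."""
        return self.pos < len(self.bits)


def parse_pps_extension(reader, transform_skip_enabled_flag, n=1):
    pps = {"pps_extension_present_flag": reader.u(1)}             # logic 1712
    if not pps["pps_extension_present_flag"]:                     # logic 1714
        return pps                                                # nothing else is read
    pps["pps_extension_flag"] = [reader.u(1) for _ in range(n)]   # logic 1716-1718
    pps["pps_extension_7bits"] = reader.u(7)                      # logic 1720
    if pps["pps_extension_flag"][0]:                              # logic 1722
        if transform_skip_enabled_flag:                           # logic 1724
            pps["log2_max_transform_skip_block_size_minus2"] = reader.ue()
    if pps["pps_extension_7bits"]:                                # logic 1732
        pps["pps_extension_data_flag"] = []
        while reader.more_data():                                 # logic 1734-1738
            pps["pps_extension_data_flag"].append(reader.u(1))
    return pps
```

Note that all signaling flags are consumed before any extension payload is parsed, which is what lets the payload parsing for each extension be tested independently of the others.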
Figure 17C is a flow chart further illustrating the exemplary PPS syntax presented in Figure 17B. As described above, all of the extension function signaling flags (for example, pps_extension_flag[i] to pps_extension_flag[n-1]) are read first, and each extension function is then performed one by one.
Referring to Figure 17C, block 1750 reads the extension presence signaling flag. Block 1752 determines whether the extension presence signaling flag has a value indicating that at least one extension function is to be performed. If the extension presence signaling flag indicates that no extension function will be performed, processing passes to after block 1758. If the extension presence signaling flag indicates that one or more extension functions are to be performed, processing passes to block 1753, which reads all of the extension function signaling flags (for example, pps_extension_flag[i] to pps_extension_flag[n-1]). Processing then passes to block 1754, which tests whether the first extension function signaling flag has a value signaling that the first extension function is to be performed. Figure 17B illustrates example syntax performing this test in logic 1722.
If the extension function signaling flag indicates that the extension function is not to be performed, processing bypasses blocks 1756 and 1758. Syntax for performing these operations is exemplified by logic 1722 through logic 1730 of Figure 17B. If the extension function signaling flag indicates that the extension function is to be performed, processing passes to block 1756, and at least a portion of the extension function processing is performed. Syntax for performing these operations is illustrated by logic 1724-1728 in Figure 17B, which reads the size of the maximum transform unit block for which the DCT transform may be skipped, if so indicated by transform_skip_enabled_flag 1601.
Block 1758 tests whether all of the extension functions have been considered. If all of the extension functions have been considered, processing ends (analogous to logic 1740 in the syntax shown in Figure 17B). If not all of the extension functions have been considered, processing passes to block 1760, which causes the next function signaling flag to be considered by block 1754.
The foregoing figures illustrate processing logic in which all of the extension function signaling flags are read, after which each extension function is performed one at a time. This embodiment is particularly useful where the extension function signaling flags are read using an incremented index, as shown in logic 1716 and 1718, because it decouples the reading of the (indexed) flags from the performance of the extension functions themselves (which may or may not be indexed). For example, the processing loop represented by blocks 1754-1760 may be performed simply by including the syntax for performing each extension function, one after another (for example, by performing logic 1754-1756 and then performing further logic, inserted as logical statements between logic 1756 and 1758, to perform the next extension function). Alternatively, the loop may be performed using an incremented index, which may be the same index used to read the extension function signaling flags or a different index.
Figure 17D is a flow chart illustrating an alternative embodiment in which, instead of reading all of the extension function signaling flags before beginning to perform the extension functions themselves, each extension function signaling flag is read and its extension function is performed before the next extension function signaling flag is read. Block 1760 reads the first extension function signaling flag (which may be indexed), and block 1762 tests whether the first extension function signaling flag that was read indicates that the first extension function is to be performed. If the function is not to be performed, processing passes to block 1768 and the extension function is not performed. If, however, the first extension function signaling flag indicates that the first extension function is to be performed, processing passes to block 1764, where such processing is performed before processing passes to block 1768. Once this processing is complete, block 1768 determines whether all of the extension function signaling flags have been read. If so, processing exits; if not, the next extension function signaling flag is considered, as illustrated in block 1770. The second extension function signaling flag is read, and the operations of blocks 1760-1768 are repeated for this second extension function signaling flag and its associated second extension function. This may also be accomplished through the use of one or more incremented indices, and different indices may be used for reading the extension function signaling flags and for performing the extension functions themselves.
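The two orderings, reading all flags first (Figure 17C) versus interleaving each read with its extension function (Figure 17D), can be contrasted with a small sketch that records the order of events. The flags are modeled as a list of 0/1 values already available to the caller, and the function names fig17c_order and fig17d_order are hypothetical.

```python
def fig17c_order(flags):
    """Read every signaling flag first, then run the enabled extensions."""
    trace = [f"read flag[{i}]" for i in range(len(flags))]
    trace += [f"run ext[{i}]" for i, flag in enumerate(flags) if flag]
    return trace


def fig17d_order(flags):
    """Read one flag, run its extension if enabled, then move to the next flag."""
    trace = []
    for i, flag in enumerate(flags):
        trace.append(f"read flag[{i}]")
        if flag:
            trace.append(f"run ext[{i}]")
    return trace
```

Both orderings perform the same set of extension functions; only the point at which each flag is read relative to the extension processing differs.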
Figure 18 is a diagram presenting an embodiment of a PPS syntax for the HEVC range extension. As before, the pps_extension_present_flag read in logical statement 1712 indicates that at least one pps_extension_flag[i] is present in the PPS syntax. This pps_extension_present_flag is used in logical statement 1714 to indicate that logical statements 1716 and 1718 should be executed; these logical statements read pps_extension_flag[i] for i = 0 to n. A pps_extension_flag[i] value of 1 indicates that the syntax structure for the associated pps_extension is present, and a pps_extension_flag[i] value of 0 indicates that the syntax structure for the pps_extension associated with the flag is not present.
In the example syntax shown in Figure 18, a pps_extension_flag[0] value of 1 indicates that the following HEVC range extension related elements are present in the PPS RBSP syntax structure, as shown in logical statements 1724, 1726, and 1804-1820:
·log2_max_transform_skip_block_size_minus2
·luma_chroma_prediction_enabled_flag
·chroma_qp_adjustment_enabled_flag
·diff_cu_chroma_qp_adjustment_depth
·chroma_qp_adjustment_table_size_minus1
·cb_qp_adjustment
·cr_qp_adjustment
Accordingly, pps_extension_flag[0] equal to 0 specifies that these syntax elements are not present.
Further, a pps_extension_7bits value of 0 specifies that no further pps_extension_data_flag syntax elements are present in the PPS RBSP syntax structure, and logical statement 1822 skips logical statements 1824 and 1828. pps_extension_7bits should have a value of 0 in bitstreams conforming to the legacy version of the specification, because pps_extension_7bits values not equal to 0 are reserved for future use by ITU-T/ISO/IEC. HEVC decoders should allow the value of pps_extension_7bits to be not equal to 0 and should ignore all pps_extension_data_flag syntax elements in a PPS NAL unit.
Figures 19A-19C show further alternative embodiments of the extension signaling syntax. Figure 19A illustrates a general syntax in which the extension presence signaling flag (pps_extension_present_flag) is used to signal whether further extension function syntax is present in the PPS. As before, logical statement 1712 reads pps_extension_present_flag. Logical statement 1714 directs the execution of logical statements 1716-1742 only if pps_extension_present_flag indicates that syntax for one or more extension functions is present in the PPS. Logical statement 1716 reads pps_extension_flag[i] for all values of i, and logical statement 1720 reads pps_extension_7bits. Logical statements 1732-1740 read pps_extension_data_flag and the associated data.
Figure 19B illustrates a PPS syntax in which the extension function signaling flags are read in separate statements rather than via an index incremented in a processing loop. Specifically, logical statements 1902-1906 read a first flag (pps_range_extension_flag) indicating that range extension processing is to be performed, a second flag (pps_multilayer_extension_flag) indicating that multilayer or multiview (MV-HEVC) extension processing is to be performed, and a third flag (pps_extension_bits6) used for reading further extension data. Logical statements 1910-1912 perform the pps_range_extension() processing indicated by the pps_range_extension_flag read by logical statement 1902 (this processing may be placed in a separate PPS range extension syntax referenced by the pps_range_extension() logical statement). Logical statements 1914-1916 perform pps_multilayer_extension() as indicated by pps_multilayer_extension_flag (this may likewise be specified in a different PPS syntax referenced by the pps_multilayer_extension() logical statement). Logical statements 1918-1926 read pps_extension_data_flag and the associated data.
Figure 19C illustrates a PPS syntax in which the extension function signaling flags are read using an incremented index, rather than being tested and used to perform extension processing with separate, unindexed statements. Specifically, logical statements 1930-1932 read two pps_extension_flag values, namely pps_extension_flag[0] and pps_extension_flag[1], using an index i that takes the values 0 and 1. Logical statement 1934 reads the pps_extension_6bits value. Logical statements 1938-1952 operate analogously to logical statements 1910-1926, except that each pps_extension_flag is referenced and distinguished by the index [0] or [1] rather than by a different name.
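The named-flag style of Figure 19B and the indexed style of Figure 19C can be compared with a short sketch. It assumes the two flags and the 6-bit field arrive as a list of eight 0/1 values, and that index 0 corresponds to the range extension and index 1 to the multilayer extension; the extension bodies are placeholders that only record that they ran, and parse_named, parse_indexed, and HANDLERS are hypothetical names.

```python
def pps_range_extension(log):
    log.append("pps_range_extension")        # placeholder for the real syntax


def pps_multilayer_extension(log):
    log.append("pps_multilayer_extension")   # placeholder for the real syntax


def _pack(bits):
    """Pack a list of 0/1 values into an integer, MSB first."""
    value = 0
    for bit in bits:
        value = (value << 1) | bit
    return value


def parse_named(bits):
    """Figure 19B style: one separately named flag per extension."""
    log = []
    pps_range_extension_flag = bits[0]
    pps_multilayer_extension_flag = bits[1]
    extension_6bits = _pack(bits[2:8])
    if pps_range_extension_flag:
        pps_range_extension(log)
    if pps_multilayer_extension_flag:
        pps_multilayer_extension(log)
    return log, extension_6bits


# Assumed mapping from flag index to extension handler.
HANDLERS = [pps_range_extension, pps_multilayer_extension]


def parse_indexed(bits):
    """Figure 19C style: flags read and tested through an index."""
    log = []
    pps_extension_flag = [bits[i] for i in range(len(HANDLERS))]
    extension_6bits = _pack(bits[len(HANDLERS):8])
    for i, handler in enumerate(HANDLERS):
        if pps_extension_flag[i]:
            handler(log)
    return log, extension_6bits
```

Under these assumptions the two styles produce the same result for the same input; the indexed form simply makes it easier to add another extension by appending to the handler table.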
Other embodiments of the foregoing syntax are also contemplated. For example, the extension presence signaling flags (for example, pps_extension_flag) may be grouped by type or category. This permits extensions having similar data requirements to be signaled and processed together, thereby saving syntax statements and decoder processing.
As described above, the signaled extension functions may be independent or may be functionally related. For example, the second extension function may require the result of the first (previously processed or performed) extension function before the second extension function can be completed. Alternatively, the second extension function may be mutually exclusive with the first extension function (for example, either the first extension function or the second extension function is performed, but not both). Alternatively, the second extension function may be a function that is not performed unless the first extension function is also performed, so that the second extension function is implied, or performed in the processing sequence, only if the first extension function is also performed. For example, a computation may require the output or result of both the first extension function and the second extension function, and the presence of the first extension function therefore necessarily implies the second extension function, and vice versa.
The foregoing operations are described with respect to a decoding process, which may take place in the source decoder 220 or, as part of the encoding process, in the encoder 202. The encoding process may also be expressed as including determining, according to slice type data, whether a slice of one or more slices is an inter-predicted slice and, if the slice is an inter-predicted slice, configuring a first parameter in the slice header associated with the slice to a value signaling the enabled state of weighted prediction of the image data associated with the slice.
Hardware environment
Figure 20 illustrates an exemplary processing system 2000 that can be used to implement embodiments of the invention. The computer 2002 comprises a processor 2004 and a memory, such as random access memory (RAM) 2006. The computer 2002 is operatively coupled to a display 2022, which presents images such as windows to the user on a graphical user interface 2018B. The computer 2002 may be coupled to other devices, such as a keyboard 2014, a mouse 2016, a printer, and so on. Of course, those skilled in the art will recognize that any combination of the above components, or any number of different components, peripherals, or other devices, may be used with the computer 2002.
Generally, the computer 2002 operates under the control of an operating system 2008 stored in the memory 2006, and interfaces with the user to accept inputs and commands and to present results through a graphical user interface (GUI) module 2018A. Although the GUI module 2018A is depicted as a separate module, the instructions performing the GUI functions can be resident or distributed in the operating system 2008 or the computer program 2010, or implemented with special purpose memory and processors. The computer 2002 also implements a compiler 2012, which allows an application program 2010 written in a programming language such as COBOL, C++, FORTRAN, or another language to be translated into code readable by the processor 2004. After completion, the application 2010 accesses and manipulates data stored in the memory 2006 of the computer 2002 using the relationships and logic that were generated using the compiler 2012. The computer 2002 also optionally comprises an external communication device, such as a modem, satellite link, Ethernet card, or other device for communicating with other computers.
In one embodiment, the instructions implementing the operating system 2008, the computer program 2010, and the compiler 2012 are tangibly embodied in a computer-readable medium, for example data storage device 2020, which may include one or more fixed or removable data storage devices, such as a zip drive, floppy disk drive 2024, hard drive, CD-ROM drive, tape drive, and so on. Further, the operating system 2008 and the computer program 2010 consist of instructions which, when read and executed by the computer 2002, cause the computer 2002 to perform the steps necessary to implement and/or use the invention. The computer program 2010 and/or operating instructions may also be tangibly embodied in the memory 2006 and/or data communications devices 2030, thereby making a computer program product or article of manufacture. As such, the terms "article of manufacture", "program storage device", and "computer program product" as used herein are intended to encompass a computer program accessible from any computer-readable device or medium.
The processing system 2000 may be embodied in a desktop computer, a laptop computer, a tablet, a notebook computer, a personal digital assistant (PDA), a cell phone, a smartphone, or any device with suitable processing and memory capability. Further, the processing system 2000 may use special purpose hardware to perform some or all of the foregoing functions. For example, the encoding and decoding processes described above may be performed by a special purpose processor and associated memory.
Those skilled in the art will recognize that many modifications may be made to this configuration without departing from the scope of the present disclosure. For example, those skilled in the art will recognize that any combination of the above components, or any number of different components, peripherals, and other devices, may be used. For example, particular functions described herein may be performed by hardware modules, or by a processor executing instructions stored in the form of software or firmware. Further, the functions described herein may be combined in a single module or expanded to be performed in multiple modules.
Conclusion
The foregoing description of the preferred embodiment has been presented for the purposes of illustration and description. It is not intended to be exhaustive or to limit the disclosure to the precise form disclosed. Many modifications and variations are possible in light of the above teaching. It is intended that the scope of rights be limited not by this detailed description, but rather by the appended claims.