CN106256128A - Conditionally parsed extension syntax for HEVC extension processing

Info

Publication number: CN106256128A
Application number: CN201480074902.8A
Authority: CN
Inventors: Yue Yu (余越); Limin Wang (王利民)
Original assignee: Keen Trend LLC
Current assignee: Commscope UK Ltd
Priority date: 2014-01-03
Filing date: 2014-12-30
Publication date: 2016-12-21
Anticipated expiration: 2034-12-30
Also published as: CN112887738A; CN112887737B; CN112887737A; CN112887736B; CN112887738B; CN112887735B; CN106256128B; CN112887735A; EP3072299A1; CN112887736A

Abstract

A system is disclosed for signaling extension functions used in decoding a sequence comprising a plurality of pictures, each picture being processed at least in part according to a picture parameter set. An extension presence signaling flag is read and used to determine whether the flags signaling the performance of extension functions are to be read. Those flags are read only if the extension presence signaling flag so indicates.

Description

Conditionally parsed extension syntax for HEVC extension processing

Cross-reference to related applications

This application claims the benefit of U.S. Provisional Patent Application Serial No. 61/923,334, entitled "CONDITIONALLY PARSING EXTENSION SYNTAX OF PICTURE PARAMETER SET (PPS) FOR HEVC RANGE EXTENSION AND MV-HEVC," filed January 3, 2014 by Yue Yu and Limin Wang, which application is hereby incorporated by reference herein.

Technical field

The present invention relates to systems and methods for encoding and decoding data, and in particular to systems and methods for generating and processing slice headers with high efficiency video coded data.

Background art

Rapid growth has occurred in the technologies associated with the generation, transmission, and reproduction of media programs. These technologies include coding schemes that allow digital versions of media programs to be encoded so as to compress them to a much smaller size and to facilitate their transmission, storage, reception, and playback. These technologies have application in personal video recorders (PVRs), video on demand (VOD), multi-channel media program offerings, interactivity, mobile telephony, and media program transmission.

Without compression, digital media programs are typically too large to transmit and/or store at a commercially acceptable cost. Compression of such programs, however, has made the transmission and storage of digital media programs not only commercially feasible but commonplace.

Initially, the transmission of media programs involved low- to medium-resolution images transmitted over high-bandwidth transmission media such as cable television and satellite. However, such transmission has evolved to include lower-bandwidth transmission media, such as Internet transmission to fixed and mobile devices via computer networks, WiFi, mobile TV, and third- and fourth-generation (3G and 4G) networks. Further, such transmissions have also evolved to include high-definition media programs, such as high-definition television (HDTV), which have significant transmission bandwidth and storage requirements.

The High Efficiency Video Coding (HEVC) standard (or H.265) is the most recent coding standard promulgated by the ISO/IEC MPEG standardization organizations. Coding standards preceding HEVC include H.262/MPEG-2 and the subsequent H.264/MPEG-4 Advanced Video Coding (AVC) standard. H.264/MPEG-4 AVC has substantially replaced H.262/MPEG-2 in many applications, including high-definition (HD) television. HEVC supports resolutions higher than HD, even in stereo or multi-view embodiments, and is more suitable for mobile devices such as tablet personal computers. Further information regarding HEVC can be found in the publication "Overview of the High Efficiency Video Coding (HEVC) Standard," by Gary J. Sullivan, Jens-Rainer Ohm, Woo-Jin Han and Thomas Wiegand, IEEE Transactions on Circuits and Systems for Video Technology, December 2012, which is hereby incorporated by reference herein.

As in other coding standards, the bitstream structure and syntax of HEVC-compliant data are standardized, such that every decoder conforming to the standard will produce the same output when provided with the same input. Some of the features incorporated into the HEVC standard include the definition and processing of a slice, one or more of which may together comprise one of the pictures in a video sequence. A video sequence comprises a plurality of pictures, and each picture may comprise one or more slices. Slices include non-dependent slices and dependent slices. A non-dependent slice (hereinafter simply referred to as a slice) is a data structure that can be decoded independently from other slices of the same picture in terms of entropy encoding, signal prediction, and residual signal construction. This data structure permits resynchronization of events in case of data losses. A "dependent slice" is a structure that permits information about the slice (such as that related to tiles within the slice or wavefront entries) to be carried to the network layer, thus making that data available to a system to process fragmented slices more quickly. Dependent slices are mostly useful for low-delay encoding.

HEVC and legacy coding standards define a parameter set structure that offers improved flexibility for operation over a wide variety of applications and network environments, and improved robustness to data losses. Parameter sets contain information that can be shared for decoding different portions of the encoded video. The parameter set structure provides a secure mechanism for conveying data that is essential to the decoding process. H.264 defined both sequence parameter sets (SPS), which describe parameters for decoding a sequence of pictures, and picture parameter sets (PPS), which describe parameters for decoding a picture of the sequence of pictures. HEVC introduces a new parameter set, the video parameter set (VPS).

The encoding and decoding of slices is performed according to information included in a slice header. The slice header includes syntax and logic for reading flags and data that are used in decoding the slice.

Like its predecessors, HEVC supports both temporal and spatial encoding of picture slices. HEVC defines slices to include I-slices, which are spatially, but not temporally, encoded with reference to another slice. I-slices are alternatively described as "intra" slice encoded. HEVC also defines slices to include P (predictive) slices, which are spatially encoded and temporally encoded with reference to another slice. P-slices are alternatively described as "inter" slice encoded. HEVC also describes slices to include bi-predictive (B) slices. B-slices are spatially encoded and temporally encoded with reference to two or more other slices. Further, HEVC consolidates the notion of P and B slices into general B slices that can be used as reference slices.

Currently, the HEVC syntax includes provision for extensions that expand the capabilities or capacities of HEVC beyond the baseline. Such extensions include range extensions (RExt), scalability extensions (SHVC), and multi-view extensions (MV-HEVC). Extensions may be signaled in the VPS, SPS, PPS, or a combination thereof.

"High Efficiency Video Coding (HEVC) Range Extensions text specification: Draft 4," published by the Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, 13th Meeting: Incheon, KR, 18-26 April 2013, by David Flynn et al. (which is hereby incorporated by reference herein), defines a PPS syntax that controls the performance of a plurality of extension functions by use of extension function unique flags, each uniquely associated with one extension function. However, such flags are not independently read. For example, a first flag signaling the performance of one extension function in the PPS syntax may be read within syntax that is parsed and performed only if another (second) flag for a previously performed extension function has a particular state or value (e.g., a flag may not be read unless a previously read flag tests "true"). This is not problematic when an extension function need not be performed unless the previous extension function syntax has been performed. It is problematic, however, in situations where independent control of the parsing or performance of extension functions is desired. What is needed is an improved system and method for parsing syntax that permits the parsing of extension functions to be independently controlled. This disclosure describes such a system and method.

Summary of the invention

To address the requirements described above, this document discloses an apparatus and method for signaling extension functions used in decoding a sequence comprising a plurality of pictures, each picture being processed at least in part according to a picture parameter set. In one embodiment, the method comprises: reading an extension presence signaling flag; determining whether the read extension presence signaling flag indicates that the picture is to be processed at least in part according to at least one extension function; and, only if the read extension presence signaling flag so indicates, reading a first extension function signaling flag that signals a first extension function and, independent of the value of the read first extension function signaling flag, reading a second extension function signaling flag that signals a second extension function. The method may be performed with additional extension function signaling flags if desired. Another embodiment is disclosed in which an apparatus is characterized by a processor having a communicatively coupled memory, the memory storing instructions for performing the foregoing operations.
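By way of non-limiting illustration, the reading order of this embodiment may be sketched in Python as follows. The BitReader class, the function name parse_extension_flags, and the layout of one bit per flag are assumptions made for this sketch only; they are not taken from the PPS syntax presented in the figures.

```python
from typing import List


class BitReader:
    """Reads one-bit flags, most significant bit first (hypothetical helper)."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0  # current bit position

    def read_flag(self) -> bool:
        byte = self.data[self.pos // 8]
        bit = (byte >> (7 - self.pos % 8)) & 1
        self.pos += 1
        return bool(bit)


def parse_extension_flags(reader: BitReader, num_extensions: int = 2) -> List[bool]:
    # Assumed layout: one presence flag, then one flag per extension function.
    if not reader.read_flag():
        # Presence flag not set: no extension function flag is read at all.
        return [False] * num_extensions
    # Each extension function flag is read regardless of the values of the
    # flags read before it, so every extension can be controlled on its own.
    return [reader.read_flag() for _ in range(num_extensions)]
```

With the bits 1, 0, 1 the second flag is still read although the first flag is false, which is the independence the embodiment calls for.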

Brief description of the drawings

Referring now to the drawings, in which like reference numbers represent corresponding parts throughout:

Fig. 1 is a diagram depicting an exemplary embodiment of a video coding-decoding system that can be used for transmission and/or storage and retrieval of audio and/or video information;

Fig. 2A is a diagram of one embodiment of a codec system in which the encoded AV information is transmitted to and received at another location;

Fig. 2B is a diagram depicting an exemplary embodiment of a codec system in which the encoded information is stored and later retrieved for presentation, hereinafter referred to as a codec storage system;

Fig. 3 is a block diagram illustrating one embodiment of the source encoder;

Fig. 4 is a diagram depicting a picture of AV information, such as one of the pictures in the picture sequence;

Fig. 5 is a diagram showing an exemplary partition of a coding tree block into coding units;

Fig. 6 is a diagram illustrating a representative quadtree and data parameters for the coding tree block partitioning shown in Fig. 5;

Fig. 7 is a diagram illustrating the partition of a coding unit into one or more prediction units;

Fig. 8 is a diagram showing a coding unit partitioned into four prediction units and an associated set of transform units;

Fig. 9 is a diagram showing the RQT coding tree for the transform units associated with the coding unit in the example of Fig. 8;

Fig. 10 is a diagram illustrating spatial prediction of prediction units;

Fig. 11 is a diagram illustrating temporal prediction;

Fig. 12 is a diagram illustrating the use of motion vector predictors (MVPs);

Fig. 13 illustrates an example of the use of reference picture lists;

Fig. 14 is a diagram illustrating processes performed by the encoder according to the aforementioned standard;

Fig. 15 depicts the use of collocated_from_l0_flag by the decoder in decoding according to the emerging HEVC standard;

Figs. 16A and 16B are diagrams presenting a baseline PPS syntax;

Figs. 16C and 16D are diagrams presenting an improved PPS syntax;

Figs. 17A-17D present exemplary improved process flows and syntax for extension processing;

Fig. 18 is a diagram presenting an exemplary PPS syntax for HEVC range extensions;

Figs. 19A-19C present further alternative embodiments of the extension signaling syntax; and

Fig. 20 illustrates an exemplary processing system that could be used to implement the disclosed embodiments.

Detailed description of the invention

In the following description, reference is made to the accompanying drawings, which form a part hereof and which show, by way of illustration, several embodiments of the present invention. It is understood that other embodiments may be utilized and structural changes may be made without departing from the scope of the present invention.

Audio-visual information transception and storage

Fig. 1 is a diagram depicting an exemplary embodiment of a video coding-decoding (codec) system 100 that can be used for transmission and/or storage and retrieval of audio and/or video information. The codec system 100 comprises an encoding system 104, which accepts audio-visual (AV) information 102 and processes the AV information 102 to generate encoded (compressed) AV information 106, and a decoding system 112, which processes the encoded AV information 106 to produce recovered AV information 114. Since the encoding and decoding processes are not lossless, the recovered AV information 114 is not identical to the initial AV information 102, but with judicious selection of the encoding processes and parameters, the differences between the recovered AV information 114 and the unprocessed AV information 102 are acceptable to human perception.

The encoded AV information 106 is typically transmitted or stored and retrieved before decoding and presentation, as performed by the transception (transmission and reception) or storage/retrieval system 108. Transception losses may be significant, but storage/retrieval losses are typically minimal or non-existent; hence, the transcepted AV information 110 provided to the decoding system 112 is typically the same as, or substantially the same as, the encoded AV information 106.

Fig. 2A is a diagram of one embodiment of a codec system 200A in which the encoded AV information 106 is transmitted to and received at another location. A transmission segment 230 converts the input AV information 102 into a signal appropriate for transmission and transmits the converted signal over the transmission channel 212 to the reception segment 232. The reception segment 232 receives the transmitted signal and converts the received signal into the recovered AV information 114 for presentation. As described above, due to coding and transmission losses and errors, the recovered AV information 114 may be of lower quality than the AV information 102 that was provided to the transmission segment 230. However, error correction systems may be included to reduce or eliminate such errors. For example, the encoded AV information 106 may be forward error correction (FEC) encoded by adding redundant information, and such redundant information can be used to identify and eliminate errors in the reception segment 232.

The transmission segment 230 comprises one or more source encoders 202 to encode multiple sources of AV information 102. The source encoder 202 encodes the AV information 102 primarily for purposes of compression to produce the encoded AV information 106 and, as described further below, may include, for example, a processor and related memory storing instructions implementing a codec such as MPEG-1, MPEG-2, MPEG-4 AVC/H.264, HEVC, or a similar codec.

The codec system 200A may also include optional elements indicated by the dashed lines in Fig. 2A. These optional elements include a video multiplex encoder 204, an encoding controller 208, and a video demultiplexing decoder 218. The optional video multiplex encoder 204 multiplexes the encoded AV information 106 from an associated plurality of source encoders 202 according to one or more parameters supplied by the optional encoding controller 208. Such multiplexing is typically accomplished in the time domain and is data-packet based.

In one embodiment, the video multiplex encoder 204 comprises a statistical multiplexer, which combines the encoded AV information 106 from a plurality of source encoders 202 so as to minimize the bandwidth required for transmission. This is possible because the instantaneous bit rate of the coded AV information 106 from each source encoder 202 can vary greatly with time according to the content of the AV information 102. For example, scenes having a great deal of detail and motion (e.g., sporting events) are typically encoded at higher bit rates than scenes with little motion or detail (e.g., portrait dialog). Since one source encoder 202 may produce information with a high instantaneous bit rate while another source encoder 202 produces information with a low instantaneous bit rate, and since the encoding controller 208 can command the source encoders 202 to encode the AV information 106 according to certain performance parameters that affect the instantaneous bit rate, the signals from each of the source encoders 202 (each having a temporally varying instantaneous bit rate) can be combined in an optimal way to minimize the instantaneous bit rate of the multiplexed stream 205.
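The benefit of statistical multiplexing may be illustrated with a short Python calculation. The bit rates below are invented figures chosen only for this sketch, and the function names are likewise hypothetical; the sketch merely compares the peak rate of the combined stream with the capacity that would be needed if each source were provisioned for its own peak.

```python
def peak_of_sum(rates_per_source):
    """Peak instantaneous rate of the multiplexed stream."""
    return max(sum(sample) for sample in zip(*rates_per_source))


def sum_of_peaks(rates_per_source):
    """Capacity needed if every source is provisioned for its own peak."""
    return sum(max(rates) for rates in rates_per_source)


# Hypothetical instantaneous bit rates (Mbit/s) over four time intervals.
sports = [8, 9, 4, 3]     # detailed, high-motion content
dialogue = [2, 2, 5, 6]   # low-motion content

combined_peak = peak_of_sum([sports, dialogue])
separate_peaks = sum_of_peaks([sports, dialogue])
```

Because the two sources peak at different times, the combined stream in this example peaks at 11 Mbit/s, whereas separate provisioning would require 15 Mbit/s.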

As described above, the source encoder 202 and the video multiplex encoder 204 may optionally be controlled by the encoding controller 208 to minimize the instantaneous bit rate of the combined video signal. In one embodiment, this is accomplished using information from a transmission buffer 206, which temporarily stores the coded video signal and can indicate the fullness of the buffer 206. This allows the coding performed at the source encoder 202 or the video multiplex encoder 204 to be a function of the storage remaining in the transmission buffer 206.

The transmission segment 230 may also comprise a transmission encoder 210, which further encodes the video signal for transmission to the reception segment 232. Transmission encoding may include, for example, the aforementioned FEC coding and/or coding into a multiplexing scheme for the transmission medium of choice. For example, if the transmission is by satellite or terrestrial transmitters, the transmission encoder 210 may encode the signal into a signal constellation before transmission via quadrature amplitude modulation (QAM) or a similar modulation technique. Also, if the encoded video signal is to be streamed via an Internet protocol device and the Internet, the signal is transmission encoded according to the appropriate protocol. Further, if the encoded signal is to be transmitted via mobile telephony, the appropriate coding protocol is used, as further described below.

The reception segment 232 comprises a transmission decoder 214 to receive the signal that was coded by the transmission encoder 210, using a decoding scheme complementary to the coding scheme used in the transmission encoder 210. The decoded received signal may be temporarily stored by an optional reception buffer 216 and, if the received signal comprises multiple video signals, the received signal is multiplex decoded by the video multiplex decoder 218 to extract the video signal of interest from the video signals multiplexed by the video multiplex encoder 204. Finally, the video signal of interest is decoded by the source decoder 220 using a decoding scheme or codec complementary to the codec used by the source encoder 202 to encode the AV information 102.

In one embodiment, the transmitted data comprises a packetized video stream transmitted from a server (representing the transmission segment 230) to a client (representing the reception segment 232). In this case, the transmission encoder 210 may packetize the data and embed network abstraction layer (NAL) units in network packets. NAL units define a data container that has header and coded elements, and may correspond to a video frame or other slice of video data.

The compressed data to be transmitted may be packetized and transmitted via the transmission channel 212, which may include a wide area network (WAN) or a local area network (LAN). Such a network may comprise, for example, a wireless network such as WiFi, an Ethernet network, an Internet network, or a mixed network composed of several different networks. Such communication may be effected via a communication protocol, for example the Real-time Transport Protocol (RTP), the User Datagram Protocol (UDP), or any other type of communication protocol. Different packetization methods may be used for each network abstraction layer (NAL) unit of the bitstream. In one case, the size of one NAL unit is smaller than the size of the maximum transmission unit (MTU), which corresponds to the largest packet size that can be transmitted over the network without being fragmented. In this case, the NAL unit is embedded into a single network packet. In another case, multiple entire NAL units are included in a single network packet. In a third case, one NAL unit may be too large to be transmitted in a single network packet and is therefore split into several fragmented NAL units, each of which is transmitted in an individual network packet. Fragmented NAL units are typically sent consecutively for decoding purposes.
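The three packetization cases may be sketched in Python as follows. This is a simplified illustration under stated assumptions: packet and NAL unit header overhead is ignored, the MTU is treated as a pure payload limit, and the function name packetize and the (index, payload) tuple layout are hypothetical rather than part of RTP or of any other protocol.

```python
from typing import List, Tuple

# A packet is modeled as a list of (NAL unit index, payload bytes) entries.
Packet = List[Tuple[int, bytes]]


def packetize(nal_units: List[bytes], mtu: int) -> List[Packet]:
    packets: List[Packet] = []
    current: Packet = []
    used = 0
    for index, nal in enumerate(nal_units):
        if len(nal) > mtu:
            # Third case: the NAL unit is too large and is split into
            # fragments, each sent in its own packet, consecutively.
            if current:
                packets.append(current)
                current, used = [], 0
            for start in range(0, len(nal), mtu):
                packets.append([(index, nal[start:start + mtu])])
        elif used + len(nal) <= mtu:
            # First and second cases: one or several whole NAL units
            # share a single packet.
            current.append((index, nal))
            used += len(nal)
        else:
            # The current packet is full; start a new one.
            packets.append(current)
            current, used = [(index, nal)], len(nal)
    if current:
        packets.append(current)
    return packets
```

For example, with an MTU of 10 bytes, NAL units of 3, 4, 25 and 10 bytes yield five packets: one carrying the first two units, three carrying fragments of the third, and one carrying the fourth.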

The reception segment 232 receives the packetized data and reconstitutes the NAL units from the network packets. For fragmented NAL units, the client concatenates the data from the fragmented NAL units in order to reconstruct the original NAL unit. The client 232 decodes the received and reconstructed data stream, reproduces the video images on a display device, and reproduces the audio data through a loudspeaker.
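The concatenation step performed by the client may be sketched in Python as follows. The sketch assumes that each received fragment has already been labeled with the index of the NAL unit it belongs to and that fragments of one NAL unit arrive in order; both the labeling and the function name reassemble are assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def reassemble(fragments: List[Tuple[int, bytes]]) -> List[bytes]:
    """Concatenate fragments that carry the same NAL unit index."""
    nal_units: Dict[int, bytes] = {}
    for index, payload in fragments:
        # Append this fragment to whatever has been received for the unit.
        nal_units[index] = nal_units.get(index, b"") + payload
    # Return the reconstructed NAL units in index order.
    return [nal_units[i] for i in sorted(nal_units)]
```

A practical receiver would additionally have to handle lost or reordered packets, which this sketch does not attempt.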

Fig. 2B is a diagram depicting an exemplary embodiment of a codec system in which the encoded information is stored and later retrieved for presentation, hereinafter referred to as codec storage system 200B. This embodiment may be used, for example, to locally store information in a digital video recorder (DVR), a flash drive, a hard drive, or a similar device. In this embodiment, the AV information 102 is source encoded by the source encoder 202 and optionally buffered by a storage buffer 234 before storage in a storage device 236. The storage device 236 may store the video signal temporarily or for an extended period of time, and may comprise a hard drive, flash drive, RAM, or ROM. The stored AV information is then retrieved, optionally buffered by a retrieve buffer 238, and decoded by the source decoder 220.

Fig. 2C is another diagram depicting an exemplary content distribution system 200C, comprising a coding system or encoder 202 and a decoding system or decoder 220, that can be used to transmit and receive HEVC data. In some embodiments, the coding system 202 can comprise an input interface 256, a controller 241, a counter 242, a frame memory 243, an encoding unit 244, a transmitter buffer 267, and an output interface 257. The decoding system 220 can comprise a receiver buffer 259, a decoding unit 260, a frame memory 261, and a controller 267. The coding system 202 and the decoding system 220 can be coupled with each other via a transmission path that can carry a compressed bitstream. The controller 241 of the coding system 202 can control the amount of data to be transmitted on the basis of the capacity of the transmitter buffer 267 or the receiver buffer 259, and can take into account other parameters such as the amount of data per unit of time. The controller 241 can control the encoding unit 244 to prevent the occurrence of a failure of a received signal decoding operation of the decoding system 220. The controller 241 can be a processor or can include, by way of non-limiting example, a microcomputer having a processor, a random access memory, and a read only memory.

Source pictures 246 supplied from, by way of non-limiting example, a content provider can include a video sequence of frames including the source pictures in the video sequence. The source pictures 246 can be uncompressed or compressed. If the source pictures 246 are uncompressed, the coding system 202 can have an encoding function. If the source pictures 246 are compressed, the coding system 202 can have a transcoding function. Coding units can be derived from the source pictures utilizing the controller 241. The frame memory 243 can have a first area, which can be used for storing the incoming frames from the source pictures 246, and a second area, which can be used for reading out the frames and outputting them to the encoding unit 244. The controller 241 can output an area switching control signal 249 to the frame memory 243. The area switching control signal 249 can indicate whether the first area or the second area is to be utilized.

The controller 241 can output an encoding control signal 250 to the encoding unit 244. The encoding control signal 250 can cause the encoding unit 244 to start an encoding operation, such as preparing the coding units based on a source picture. In response to the encoding control signal 250 from the controller 241, the encoding unit 244 can begin to read out the prepared coding units to a high-efficiency encoding process, such as a prediction coding process or a transform coding process, which processes the prepared coding units to generate video compression data based on the source pictures associated with the coding units.

The encoding unit 244 can package the generated video compression data in a packetized elementary stream (PES) including video packets. The encoding unit 244 can map the video packets into an encoded video signal 248 using control information and a program time stamp (PTS), and the encoded video signal 248 can be transmitted to the transmitter buffer 267.

The encoded video signal 248, including the generated video compression data, can be stored in the transmitter buffer 267. The information amount counter 242 can be incremented to indicate the total amount of data in the transmitter buffer 267. As data is retrieved and removed from the buffer, the counter 242 can be decremented to reflect the amount of data in the transmitter buffer 267. The occupied area information signal 253 can be transmitted to the counter 242 to indicate whether data from the encoding unit 244 has been added to or removed from the transmitter buffer 267, so that the counter 242 can be incremented or decremented accordingly. The controller 241 can control the production of video packets by the encoding unit 244 on the basis of the occupied area information 253, which can be communicated in order to anticipate, avoid, prevent, and/or detect an overflow or underflow in the transmitter buffer 267.
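The bookkeeping performed by such a counter may be sketched in Python as follows. The class name, method names, and the simple comparison against a fixed capacity are assumptions made for this sketch; the text does not specify how overflow or underflow is actually detected.

```python
class BufferOccupancyCounter:
    """Tracks the amount of data held in a buffer (hypothetical sketch)."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.occupied = 0

    def add(self, amount: int) -> None:
        # Data written into the buffer increments the count.
        self.occupied += amount

    def remove(self, amount: int) -> None:
        # Data retrieved and removed from the buffer decrements the count.
        self.occupied -= amount

    def status(self) -> str:
        if self.occupied > self.capacity:
            return "overflow"
        if self.occupied < 0:
            return "underflow"
        return "ok"
```

A controller could poll status() after each addition or removal and slow down or speed up the production of video packets before the buffer limits are reached.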

The information amount counter 242 can be reset in response to a preset signal 254 generated and output by the controller 241. After the information amount counter 242 is reset, it can count the data output by the encoding unit 244 and obtain the amount of video compression data and/or video packets that have been generated. The information amount counter 242 can supply the controller 241 with an information amount signal 255 representative of the obtained amount of information. The controller 241 can control the encoding unit 244 so that there is no overflow at the transmitter buffer 267.

In some embodiments, the decoding system 220 can comprise an input interface 266, a receiver buffer 259, a controller 267, a frame memory 261, a decoding unit 260, and an output interface 267. The receiver buffer 259 of the decoding system 220 can temporarily store the compressed bitstream, including the received video compression data and video packets based on the source pictures 246. The decoding system 220 can read the control information and presentation time stamp information associated with video packets in the received data and output a frame number signal 263, which can be applied to the controller 267. The controller 267 can supervise the counted number of frames at a predetermined interval. By way of non-limiting example, the controller 267 can supervise the counted number of frames each time the decoding unit 260 completes a decoding operation.

In some embodiments, when the frame number signal 263 indicates that the receiver buffer 259 is at a predetermined capacity, the controller 267 can output a decoding start signal 264 to the decoding unit 260. When the frame number signal 263 indicates that the receiver buffer 259 is at less than the predetermined capacity, the controller 267 can wait for the occurrence of a situation in which the counted number of frames becomes equal to a predetermined amount. When that situation occurs, the controller 267 can output the decoding start signal 264. By way of non-limiting example, the controller 267 can output the decoding start signal 264 when the frame number signal 263 indicates that the receiver buffer 259 is at the predetermined capacity. The encoded video packets and video compression data can be decoded in a monotonic order (i.e., increasing or decreasing) based on the presentation time stamps associated with the encoded video packets.

In response to the decoding start signal 264, the decoding unit 260 can decode data amounting to one picture associated with a frame, together with the compressed video data associated with the picture that is associated with the video packets from the receiver buffer 259. The decoding unit 260 can write a decoded video signal 269 into the frame memory 261. The frame memory 261 can have a first area, into which the decoded video signal is written, and a second area, which is used for reading out decoded pictures 262 to the output interface 267.

In various embodiments, the coding system 202 can be incorporated or otherwise associated with a transcoder or an encoding apparatus at a headend, and the decoding system 220 can be incorporated or otherwise associated with a downstream device, such as a mobile device, a set-top box, or a transcoder.

Source encoding/decoding

As described above, the encoders 202 employ compression algorithms to generate bitstreams and/or files of smaller size than the original video sequences in the AV information 102. Such compression is made possible by reducing spatial and temporal redundancies in the original sequences.

Prior art encoders 202 include those compliant with the video compression standard H.264/MPEG-4 AVC ("Advanced Video Coding"), developed jointly by the "Video Coding Experts Group" (VCEG) of the ITU and the "Moving Picture Experts Group" (MPEG) of the ISO, in particular in the form of the publication "Advanced Video Coding for Generic Audiovisual Services" (March 2005), which is hereby incorporated by reference herein.

HEVC, "High Efficiency Video Coding" (sometimes known as H.265), is expected to replace H.264/MPEG-4 AVC. As further described below, HEVC introduces new coding tools and entities that are generalizations of the coding entities defined in H.264/AVC.

Fig. 3 is a block diagram illustrating one embodiment of the source encoder 202. The source encoder 202 accepts the AV information 102 and uses a sampler 302 to sample the AV information 102 to produce a sequence 303 of successive digital images or pictures, each having a plurality of pixels. A picture can comprise a frame or a field, wherein a frame is a complete image captured during a known time interval, and a field is the set of odd-numbered or even-numbered scanning lines composing a partial image.
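The relationship between a frame and its two fields may be illustrated with a short Python sketch. The frame is modeled as a list of scanning lines, and the function name split_into_fields is hypothetical; which field is treated as the first one is an assumption of this sketch.

```python
def split_into_fields(frame):
    """Split a frame (a list of scanning lines) into its two fields."""
    even_lines = frame[0::2]  # lines 0, 2, 4, ...
    odd_lines = frame[1::2]   # lines 1, 3, 5, ...
    return even_lines, odd_lines


# A toy four-line frame in which each line holds two pixel values.
frame = [[0, 0], [1, 1], [2, 2], [3, 3]]
even_field, odd_field = split_into_fields(frame)
```

Each field here holds half of the lines of the frame, which is why a field is described as composing only a partial image.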

The sampler 302 produces an uncompressed picture sequence 303. Each digital picture can be represented by one or more matrices having a plurality of coefficients that together represent information about the pixels that make up the picture. The value of a pixel can correspond to luminance or other information. Where several components are associated with each pixel (for example, red-green-blue components or luminance-chrominance components), each of these components can be processed separately.

Images can be segmented into "slices", which may comprise a portion of the picture or the entire picture. In the H.264 standard, these slices are divided into coding entities called macroblocks (generally blocks of 16 pixels × 16 pixels), and each macroblock may in turn be divided into data blocks 102 of different sizes, for example 4 × 4, 4 × 8, 8 × 4, 8 × 8, 8 × 16 and 16 × 8. HEVC expands and generalizes the notion of the coding entity beyond that of the macroblock.

HEVC coding entity: CTU, CU, PU and TU

Like other video coding standards, HEVC is a block-based hybrid spatial and temporal predictive coding scheme. However, HEVC introduces new coding entities that are not included in the H.264/AVC standard. These coding entities include coding tree units (CTUs), coding units (CUs), prediction units (PUs) and transform units (TUs), and are further described below.

Fig. 4 is a diagram depicting a picture 400 of the AV information 102, such as one of the pictures in the picture sequence 303. The picture 400 is spatially divided into non-overlapping square blocks known as coding tree units, or CTUs 402. Unlike H.264 and earlier video coding standards, in which the basic coding unit is a macroblock of 16x16 pixels, the CTU 402 is the basic coding unit of HEVC and can be as large as 128x128 pixels. As shown in Fig. 4, the CTUs 402 are typically referenced within the picture 400 in an order analogous to a progressive scan.

Each CTU 402 may in turn be iteratively divided into smaller, variable-size coding units, as described by the "quadtree" decomposition further described below. Coding units are regions formed in the image to which similar coding parameters are applied and which are transmitted in the bit stream 314.

Fig. 5 is a diagram showing an exemplary partition of a CTU 402 into coding units (CUs) such as coding units 502A and 502B (hereinafter alternatively referred to as coding units 502). A single CTU 402 can be divided into four CUs 502 such as CU 502A, each a quarter of the size of the CTU 402. Each such CU 502A can be further divided into four smaller CUs 502B, each a quarter of the size of the original CU 502A.

The division of a CTU 402 into CUs 502A and smaller CUs 502B is described by "quadtree" data parameters (for example, flags or bits) that are encoded into the output bit stream 314 along with the encoded data, as overhead known as syntax.

Fig. 6 is a diagram illustrating a representative quadtree 600 and the data parameters for the partitioning of the CTU 402 shown in Fig. 5. The quadtree 600 comprises a plurality of nodes, including a first node 602A at one hierarchical level and a second node 602B at a lower hierarchical level (hereinafter, quadtree nodes may alternatively be referred to as "nodes" 602). At each node 602 of the quadtree, a "split flag" or bit "1" is assigned if the node 602 is further split into sub-nodes; otherwise a bit "0" is assigned.

For example, the CTU 402 partition shown in Fig. 5 can be represented by the quadtree 600 presented in Fig. 6, which includes a split flag of "1" associated with the node 602A at the top CU 502 level (indicating that there are four additional nodes at the lower hierarchical level). The illustrated quadtree 600 also includes a split flag of "1" associated with the node 602B at the middle CU 502 level, indicating that this CU is also partitioned into four further CUs 502 at the next (bottom) CU level. The source encoder 202 may restrict the minimum and maximum CU 502 sizes, thereby changing the maximum possible depth of the CU 502 splitting.
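The split-flag representation can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that every node carries exactly one flag and that flags are listed in depth-first order; the function name `leaf_cu_sizes` and the 64-pixel CTU size are hypothetical and do not come from the HEVC syntax.

```python
def leaf_cu_sizes(split_flags, ctu_size):
    """Return the sizes of the leaf CUs described by depth-first split flags.

    Assumption: every node carries one flag (1 = split into four
    quarter-size children, 0 = leaf). This is a simplification.
    """
    flags = iter(split_flags)

    def walk(size):
        if next(flags) == 1:
            sizes = []
            for _ in range(4):
                # Each child covers a quarter of the area, so half the width.
                sizes.extend(walk(size // 2))
            return sizes
        return [size]

    return walk(ctu_size)


# Root is split; its second child is split again; all other nodes are leaves.
sizes = leaf_cu_sizes([1, 0, 1, 0, 0, 0, 0, 0, 0], 64)
print(sizes)
```

Restricting the minimum CU size, as the encoder may do, would correspond to stopping the recursion at a given size without reading a further flag.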

The encoder 202 generates encoded AV information 106 in the form of a bit stream 314 that includes a first portion having encoded data for the CUs 502 and a second portion that includes overhead known as syntax elements. The encoded data includes data corresponding to the encoded CUs 502 (that is, the encoded residuals together with their associated motion vectors, predictors or related residuals, as described further below). The second portion includes syntax elements that may represent coding parameters which do not directly correspond to the encoded data of the blocks. For example, the syntax elements may comprise the address and identification of the CU 502 in the image, a quantization parameter, an indication of the selected inter/intra coding mode, the quadtree 600 or other information.

CUs 502 correspond to elementary coding elements and include two related sub-units: prediction units (PUs) and transform units (TUs), both of which have a maximum size equal to the size of the corresponding CU 502.

Fig. 7 is a diagram illustrating the partition of a CU 502 into one or more PUs 702. A PU 702 corresponds to a partitioned CU 502 and is used to predict pixel values for intra-picture or inter-picture types. PUs 702 are an extension of the H.264/AVC partitioning for motion estimation, and are defined for each CU 502 that is not further subdivided into other CUs ("split flag" = 0). As shown in Fig. 7, at each leaf 604 of the quadtree 600, a final (bottom-level) CU 502 of 2Nx2N can have one of four possible PU patterns: 2Nx2N (702A), 2NxN (702B), Nx2N (702C) and NxN (702D).

A CU 502 can be either spatially or temporally predictively coded. If a CU 502 is coded in "intra" mode, each PU 702 of the CU 502 can have its own spatial prediction direction and image information, as further described below. Also, in "intra" mode, a PU 702 of the CU 502 may depend on another CU 502, because it may use a spatial neighbor that lies in another CU. If a CU 502 is coded in "inter" mode, each PU 702 of the CU 502 can have its own motion vector(s) and associated reference picture(s), as further described below.

Fig. 8 is a diagram showing a CU 502 partitioned into four PUs 702 and an associated set of transform units (TUs) 802. TUs 802 represent the elementary units that are spatially transformed by a DCT (discrete cosine transform). The size and location of each block transform TU 802 within a CU 502 is described by a "residual" quadtree (RQT), further illustrated below.

Fig. 9 is a diagram showing the RQT 900 for the TUs 802 of the CU 502 in the example of Fig. 8. Note that the "1" at the first node 902A of the RQT 900 indicates that there are four branches, and that the "1" at the second node 902B at the adjacent lower hierarchical level indicates that the indicated node further has four branches. The data describing the RQT 900 is also coded and transmitted as overhead in the bit stream 314.

The coding parameters of a video sequence may be stored in dedicated NAL units called parameter sets. Two types of parameter set NAL units may be employed. The first parameter set type is known as a sequence parameter set (SPS), and comprises a NAL unit that includes parameters that remain unchanged during the entire video sequence. Typically, an SPS handles the coding profile, the size of the video frames and other parameters. The second type of parameter set is known as a picture parameter set (PPS), and codes different values that may change from one image to another.

Spatial and temporal prediction

One of the techniques used to compress the bit stream 314 is to forgo storing the pixel values themselves and instead to predict the pixel values using a process that can be repeated at the decoder 220, and to store or transmit the difference between the predicted pixel values and the actual pixel values (known as the residual). As long as the decoder 220 can compute the same predicted pixel values from the information provided, the actual pixel values can be recovered by adding the residuals to the predicted values. The same technique can be used to compress other data as well.
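The predict-and-transmit-the-residual idea can be sketched in a few lines of Python. The pixel values and the function names `encode_residual` and `decode_from_residual` are illustrative assumptions, not part of any codec API; the only point is that the decoder recovers the actual values when it can reproduce the same prediction.

```python
def encode_residual(actual, predicted):
    """Encoder side: keep only the difference from the prediction."""
    return [a - p for a, p in zip(actual, predicted)]


def decode_from_residual(predicted, residual):
    """Decoder side: repeat the prediction, then add the residual back."""
    return [p + r for p, r in zip(predicted, residual)]


actual = [52, 55, 61, 59]      # hypothetical pixel values
predicted = [50, 50, 60, 60]   # prediction both sides can compute
residual = encode_residual(actual, predicted)
reconstructed = decode_from_residual(predicted, residual)
print(residual, reconstructed)
```

A good prediction yields residuals with small magnitudes, which generally cost fewer bits to code than the original values.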

Referring again to Fig. 3, each PU 702 of the CU 502 being processed is provided to a predictor module 307. The predictor module 307 predicts the values of the PUs 702 based on information in nearby PUs 702 in the same frame (intra-frame prediction, performed by the spatial predictor 324) and on information of PUs 702 in temporally proximate frames (inter-frame prediction, performed by the temporal predictor 330). Temporal prediction, however, may not always be based on a collocated PU, since collocated PUs are defined to be located in a reference/non-reference frame at the same x and y coordinates as the current PU 702. These techniques take advantage of the spatial and temporal dependencies between PUs 702.

Coded units can therefore be categorized into two types: (1) non-temporally predicted units and (2) temporally predicted units. Non-temporally predicted units are predicted using the current frame, including adjacent or nearby PUs 702 within the frame (for example, intra-frame prediction), and are generated by the spatial predictor 324. Temporally predicted units are predicted from one temporal picture (for example, P frames) or from at least two reference pictures that are temporally ahead and/or behind (that is, B frames).

Spatial prediction

Fig. 10 is a diagram illustrating spatial prediction of PUs 702. A picture may comprise a PU 702 and other spatially proximate PUs 1-4, including a nearby PU 702N. The spatial predictor 324 predicts the current block (for example, block C of Fig. 10) by means of an "intra-frame" prediction that uses PUs 702 of other, already-encoded blocks of pixels of the current image.

The spatial predictor 324 locates a nearby PU (for example, PU 1, 2, 3 or 4 of Fig. 10) that is appropriate for spatial coding and determines an angular prediction direction to that nearby PU. In HEVC, 35 directions can be considered, so each PU may have one of 35 directions associated with it, including horizontal, vertical, 45-degree diagonal, 135-degree diagonal, DC and so on. The spatial prediction direction of the PU is indicated in the syntax.

Referring back to the spatial predictor 324 of Fig. 3, the located nearby PU is used by element 305 to compute a residual PU 704 (e) as the difference between the pixels of the nearby PU 702N and the pixels of the current PU 702. The result is an intra-predicted PU element 1006 that comprises a prediction direction 1002 and an intra-predicted residual PU 1004. The prediction direction 1002 may be coded by inferring the direction from spatially proximate PUs and from the spatial dependencies of the picture, so that the coding rate of the intra prediction direction mode can be reduced.

Time prediction

Fig. 11 is a diagram illustrating temporal prediction. Temporal prediction considers information from temporally neighboring pictures or frames, such as the previous picture, picture i-1.

Generally, temporal prediction includes single prediction (P-type), which predicts the PU 702 by referring to one reference area from only one reference picture, and multiple prediction (B-type), which predicts the PU by referring to two reference areas from one or two reference pictures. Reference pictures are images in the video sequence that have already been coded and then reconstructed (by decoding).

The temporal predictor 330 identifies, in one or several of these reference areas (one for P-type, several for B-type), areas of pixels in a temporally nearby frame so that they can be used as predictors of the current PU 702. Where several area predictors are used (B-type), they may be merged to generate a single prediction. The reference area 1102 is identified in the reference frame by a motion vector (MV) 1104, which is defined as the displacement between the current PU 702 in the current frame (picture i) and the reference area 1102 (refIdx) in the reference frame (picture i-1). A PU in a B picture may have up to two MVs. Both the MV and the refIdx information are included in the syntax of the HEVC bit stream.

Referring again to Fig. 3, the difference in pixel values between the reference area 1102 and the current PU 702 may be computed by element 305, as selected by switch 306. This difference is referred to as the residual of the inter-predicted PU 1106. At the end of the temporal or inter-frame prediction process, the current PU 1006 is composed of one motion vector MV 1104 and a residual 1106.

As described above, however, one technique for compressing data is to generate predicted values for the data using means that the decoder 220 can repeat, to compute the difference between the predicted and actual values of the data (the residual), and to transmit the residual for decoding. As long as the decoder 220 can reproduce the predicted values, the residual values can be used to determine the actual values.

This technique can be applied to the MVs 1104 used in temporal prediction by generating a prediction of the MV 1104, computing the difference between the actual MV 1104 and the predicted MV 1104 (a residual), and transmitting the MV residual in the bit stream 314. As long as the decoder 220 can reproduce the predicted MV 1104, the actual MV 1104 can be computed from the residual. HEVC computes a predicted MV for each PU 702 using the spatial correlation of motion between neighboring PUs 702.

Fig. 12 is a diagram illustrating the use of motion vector predictors (MVPs) in HEVC. Motion vector predictors V₁, V₂ and V₃ are taken from the MVs 1104 of a plurality of blocks 1, 2 and 3 situated nearby or adjacent to the block to be encoded (C). Because these vectors are motion vectors of spatially neighboring blocks within the same temporal frame and can be used to predict the motion vector of the block to be encoded, they are known as spatial motion predictors.

Fig. 12 also illustrates the temporal motion vector predictor V_T, which is the motion vector of the co-located block C′ in a previously decoded picture (in decoding order) of the sequence (for example, the block of picture i-1 located at the same spatial position as the block being coded, block C of image i).

The components of the spatial motion vector predictors V₁, V₂ and V₃ and of the temporal motion vector predictor V_T can be used to generate a median motion vector predictor V_M. In HEVC, the three spatial motion vector predictors may be taken as shown in Fig. 12, according to a predetermined rule of availability: from the block situated to the left of the block to be encoded (V₁), from the block situated above it (V₃), and from one of the blocks situated at the respective corners of the block to be encoded (V₂). This MV predictor selection technique is known as advanced motion vector prediction (AMVP).

A plurality of (typically five) MV predictor (MVP) candidates, comprising spatial predictors (for example, V₁, V₂ and V₃) and the temporal predictor V_T, is therefore obtained. To reduce the overhead of signaling the motion vector predictor in the bit stream, the set of motion vector predictors may be reduced by eliminating the data for duplicated motion vectors (for example, MVs that have the same value as other MVs may be eliminated from the candidates).
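The elimination of duplicated candidates can be sketched as an order-preserving de-duplication. This is only an illustration: motion vectors are modeled as `(x, y)` tuples, unavailable candidates as `None`, and the name `prune_candidates` is hypothetical. The exact pruning rules of the standard are not reproduced here.

```python
def prune_candidates(candidates):
    """Drop unavailable (None) and duplicated motion vectors, keeping order."""
    seen = set()
    pruned = []
    for mv in candidates:
        if mv is not None and mv not in seen:
            seen.add(mv)
            pruned.append(mv)
    return pruned


# Hypothetical candidates: two spatial neighbors share a vector,
# one neighbor is unavailable, and the temporal candidate repeats another.
candidates = [(4, -2), (4, -2), (3, 0), None, (3, 0)]
pruned = prune_candidates(candidates)
print(pruned)
```

With fewer candidates left, the index that identifies the chosen predictor can be signaled with fewer bits.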

The encoder 202 may select a "best" motion vector predictor from among the candidates, compute a motion vector predictor residual as the difference between the selected motion vector predictor and the actual motion vector, and transmit the motion vector predictor residual in the bit stream 314. To perform this operation, the actual motion vector must be stored for later use by the decoder 220 (although the actual motion vector is not transmitted in the bit stream 314). Signaling bits or flags are included in the bit stream 314 to specify which MV residual was computed from the normalized motion vector predictor, and the decoder later uses these signaling bits or flags to recover the motion vector. These bits or flags are further described below.
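A minimal sketch of this select-and-signal scheme follows. It assumes that "best" means the candidate giving the smallest sum of absolute residual components, which is an illustrative criterion and not one stated in the text; the function names and values are likewise hypothetical.

```python
def choose_predictor(actual_mv, candidates):
    """Encoder side: pick a candidate and return (index, residual)."""
    def cost(mvp):
        # Assumed criterion: smallest sum of absolute differences.
        return abs(actual_mv[0] - mvp[0]) + abs(actual_mv[1] - mvp[1])

    index = min(range(len(candidates)), key=lambda i: cost(candidates[i]))
    mvp = candidates[index]
    residual = (actual_mv[0] - mvp[0], actual_mv[1] - mvp[1])
    return index, residual


def recover_mv(index, residual, candidates):
    """Decoder side: rebuild the same candidate list, then add the residual."""
    mvp = candidates[index]
    return (mvp[0] + residual[0], mvp[1] + residual[1])


candidates = [(4, -2), (3, 0)]
index, residual = choose_predictor((5, -1), candidates)
recovered = recover_mv(index, residual, candidates)
print(index, residual, recovered)
```

Only the index and the residual need to travel in the bit stream, provided the decoder derives the same candidate list.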

Referring back to Fig. 3, the intra-predicted residuals 1004 and the inter-predicted residuals 1106 obtained from the spatial (intra) or temporal (inter) prediction process are then transformed by the transform module 308 into the transform units (TUs) 802 described above. A TU 802 can be further split into smaller TUs using the RQT decomposition described above with respect to Fig. 9. In HEVC, 2 or 3 levels of decomposition are generally used, and the authorized transform sizes are 32 × 32, 16 × 16, 8 × 8 and 4 × 4. As described above, the transform is derived according to a discrete cosine transform (DCT) or a discrete sine transform (DST).

The transformed residual coefficients are then quantized by the quantizer 310. Quantization plays a very important role in data compression. In HEVC, quantization converts the high-precision transform coefficients into a finite number of possible values. Although quantization permits a great deal of compression, it is a lossy operation, and the loss introduced by quantization cannot be recovered.
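The lossy nature of quantization can be seen with a plain uniform scalar quantizer. This is a deliberate simplification: HEVC derives its step size from a quantization parameter and uses integer arithmetic, none of which is modeled here, and the coefficient values and step size are hypothetical.

```python
def quantize(coeffs, step):
    """Map each coefficient to the nearest integer multiple of the step."""
    return [int(round(c / step)) for c in coeffs]


def dequantize(levels, step):
    """Scale the integer levels back; the rounding error is not restored."""
    return [level * step for level in levels]


coeffs = [103.7, -41.2, 6.9, 0.4]   # hypothetical transform coefficients
step = 8                            # hypothetical quantization step
levels = quantize(coeffs, step)
restored = dequantize(levels, step)
print(levels, restored)
```

The restored values differ from the originals, and small coefficients collapse to zero, which is where much of the compression comes from.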

The coefficients of the quantized transformed residual are then coded by the entropy coder 312 and inserted into the compressed bit stream 310 as part of the useful data coding the images of the AV information. Syntax elements may also be coded using the spatial dependencies between syntax elements, to increase coding efficiency. HEVC offers context-adaptive binary arithmetic coding (CABAC). Other forms of entropy coding or arithmetic coding may also be used.

To calculate the predictors used above, the encoder 202 decodes already-encoded PUs 702 using a "decoding" loop 315, which includes elements 316, 318, 320, 322 and 328. This decoding loop 315 reconstructs the PUs and images from the quantized transformed residuals.

The quantized transform residual coefficients E are provided to the dequantizer 316, which applies the inverse of the operation of the quantizer 310 to produce dequantized transform coefficients of the residual PU (E′) 708. The dequantized data 708 is then provided to the inverse transformer 318, which applies the inverse of the transform applied by the transform module 308 to generate reconstructed residual coefficients of the PU (e′) 710.

The reconstructed residual coefficients 710 of the PU are then added to the corresponding coefficients of the corresponding predicted PU (x′) 702′, selected by the selector 306 from the intra-predicted PU 1004 and the inter-predicted PU 1106. For example, if the reconstructed residual comes from the "intra" coding process of the spatial predictor 324, the "intra" predictor (x′) is added to this residual in order to recover a reconstructed PU (x″) 712, which corresponds to the original PU 702 modified by the losses resulting from a transformation (in this case, for example, the quantization operations). If the residual 710 comes from the "inter" coding process of the temporal predictor 330, the areas pointed to by the current motion vectors (these areas belong to the reference images stored in the reference buffer 328 and referred to by the current image indices) are merged and then added to this decoded residual. In this way, the original PU 702 is modified by the losses resulting from the quantization operations.

To the extent that the encoder 202 uses motion vector prediction techniques analogous to the image prediction techniques described above, the motion vector may be stored in the motion vector buffer 329 for use in temporally subsequent frames. As further described below, a flag may be set and transmitted in the syntax to indicate that the motion vector for the currently decoded frame should be used for at least the subsequently coded frame, instead of replacing the contents of the MV buffer 329 with the MV for the current frame.

A loop filter 322 is applied to the reconstructed signal (x″) 712 in order to reduce the effects created by heavy quantization of the residuals obtained, and to improve the signal quality. The loop filter 322 may comprise, for example, a deblocking filter for smoothing the borders between PUs, to visually attenuate the high frequencies created by the coding process, and a linear filter that is applied after all of the PUs of an image have been decoded, to minimize the sum of squared differences (SSD) with the original image. The linear filtering process is performed frame by frame, uses several pixels around the pixel to be filtered, and also uses the spatial dependencies between the pixels of the frame. The linear filter coefficients may be coded and transmitted in one header of the bit stream, typically a picture or slice header.

The filtered images, also known as reconstructed images, are then stored as reference images in the reference image buffer 328, in order to allow subsequent "inter" predictions to take place during the compression of the subsequent images of the current video sequence.

Reference picture syntax

As described above, to reduce errors and improve compression, HEVC permits the use of several reference images for the estimation and motion compensation of the current image. Given a current PU 702 in a current picture, the collocated PU 1102 for a particular slice resides in an associated nearby reference/non-reference picture. For example, in Fig. 12, the collocated PU 1102 for the current PU 702 in picture (i) resides in the associated nearby reference picture (i-1). The best "inter" or temporal predictors of the current PU 702 are selected from some of the multiple reference/non-reference images, which may be based on pictures temporally prior to or after the current picture in display order (backward and forward prediction, respectively).

For HEVC, the index to reference pictures is defined by reference picture lists that are described in the slice syntax. Forward prediction is defined by list_0 (RefPicList0) and backward prediction is defined by list_1 (RefPicList1), and both list 0 and list 1 can contain multiple reference pictures prior to and/or later than the current picture in display order.

Fig. 13 illustrates an example of the use of the reference picture lists. Consider pictures 0, 2, 4, 5, 6, 8 and 10 shown in Fig. 13, where the number of each picture denotes its display order and the current picture is picture 5. In this case, the list_0 reference pictures, with ascending reference picture indices starting from index zero, are 4, 2, 0, 6, 8 and 10, and the list_1 reference pictures, with ascending reference picture indices starting from index zero, are 6, 8, 10, 4, 2 and 0. A slice whose motion-compensated prediction is restricted to list_0 prediction is called a predictive or P slice. Collocated pictures are indicated by using the collocated_ref_idx index in HEVC. A slice whose motion-compensated prediction includes more than one reference picture is a bi-predictive or B slice. For B slices, the motion-compensated prediction may include reference pictures from list_1 prediction as well as from list_0.
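The two orderings in this example can be reproduced with a short Python sketch. The ordering rule used here (nearest earlier pictures first for list_0, nearest later pictures first for list_1) is inferred from the example alone and is an assumption; the function name `build_reference_lists` is hypothetical.

```python
def build_reference_lists(display_orders, current):
    """Return (list_0, list_1) for the pictures around `current`."""
    # Earlier pictures, nearest first.
    before = sorted((p for p in display_orders if p < current), reverse=True)
    # Later pictures, nearest first.
    after = sorted(p for p in display_orders if p > current)
    return before + after, after + before


list_0, list_1 = build_reference_lists([0, 2, 4, 5, 6, 8, 10], current=5)
print(list_0)
print(list_1)
```

The position of a picture in each list is its reference picture index, so index 0 of list_0 is picture 4 and index 0 of list_1 is picture 6.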

Hence, a collocated PU 1102 is located in a reference picture specified in either list_0 or list_1. A flag (collocated_from_l0_flag) is used to specify, for a particular slice type, whether the collocated partition should be derived from list_0 or from list_1. Each reference picture is also associated with a motion vector.

The storage and retrieval of reference pictures and associated motion vectors for the emerging HEVC standard is described in section 8.4.1.2.9 of Benjamin Bross, Woo-Jin Han, Jens-Rainer Ohm, Gary J. Sullivan, Thomas Wiegand, "WD4: Working Draft 4 of High-Efficiency Video Coding," Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11, JCTVC-F803_d5, 6th Meeting: Torino, IT, 14-22 July, 2011 (hereby incorporated by reference herein).

According to the standard, if slice_type is equal to B and collocated_from_l0_flag is 0, the collocated_ref_idx variable specifies the reference picture that contains the collocated partition as specified by RefPicList1. Otherwise (slice_type is equal to B and collocated_from_l0_flag is equal to 1, or slice_type is equal to P), the collocated_ref_idx variable specifies the reference picture that contains the collocated partition as specified by RefPicList0.
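This selection rule can be written as a small Python function. The sketch assumes that slice types are given as the strings "I", "P" and "B" and that the two lists hold picture numbers; the function name `collocated_picture` and the sample lists are illustrative assumptions.

```python
def collocated_picture(slice_type, collocated_from_l0_flag,
                       collocated_ref_idx, ref_pic_list0, ref_pic_list1):
    """Return the picture holding the collocated partition, or None."""
    if slice_type == "I":
        # Intra slices use no temporally nearby reference picture.
        return None
    if slice_type == "B" and collocated_from_l0_flag == 0:
        return ref_pic_list1[collocated_ref_idx]
    # B slice with the flag equal to 1, or P slice.
    return ref_pic_list0[collocated_ref_idx]


ref_pic_list0 = [4, 2, 0, 6, 8, 10]   # sample lists, current picture 5
ref_pic_list1 = [6, 8, 10, 4, 2, 0]
from_list1 = collocated_picture("B", 0, 0, ref_pic_list0, ref_pic_list1)
from_list0 = collocated_picture("B", 1, 0, ref_pic_list0, ref_pic_list1)
p_slice = collocated_picture("P", 1, 1, ref_pic_list0, ref_pic_list1)
print(from_list1, from_list0, p_slice)
```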

Fig. 14 is a diagram illustrating the processes performed by the encoder 202 according to the aforementioned standard. Block 1402 determines whether the current picture is a reference picture for another picture. If not, there is no need to store reference picture or motion vector information. If the current picture is a reference picture for another picture, block 1504 determines whether the "another" picture is a P-type or a B-type picture. If the picture is a P-type picture, processing is passed to block 1410, which sets colloc_from_l0_flag to 1 and stores the reference picture and motion vector in list 0. If the "another" picture is a B-type picture, block 1406 nonetheless directs processing to blocks 1408 and 1410 when the desired reference picture is to be stored in list 0, and to blocks 1412 and 1414 when the desired reference picture and motion vector are to be stored in list 1. This decision may be based on whether it is desirable to select reference pictures from a temporally preceding or succeeding picture. Which of the multiple possible reference pictures is selected is determined according to the collocated_ref_idx index.

Fig. 15 depicts the use of collocated_from_l0_flag by the decoder 220 in decoding according to the earlier HEVC standard. Block 1502 determines whether the current slice type being computed is an intra or I type. Such slices do not use temporally nearby slices in the encoding/decoding process, and hence there is no need to find a temporally nearby reference picture. If the slice type is not I-type, block 1504 determines whether the slice is a B slice. If the slice is not B-type, it is a P-type slice, and the reference picture that contains the collocated partition is found in list 0, according to the value of collocated_ref_idx. If the slice is B-type, collocated_from_l0_flag determines whether the reference picture is found in list 0 or list 1. As the index indicates, the collocated picture is therefore defined as the reference picture having the indicated collocated_ref_idx in either list 0 or list 1, depending on the slice type (B-type or P-type) and the value of collocated_from_l0_flag. In one embodiment of HEVC, the first reference picture (the reference picture having index [0], as shown in Fig. 13) is selected as the collocated picture.

Baseline picture parameter set syntax

Figs. 16A and 16B are diagrams presenting a baseline PPS raw byte sequence payload (RBSP) syntax. The syntax for handling extensions in the PPS is shown in Fig. 16B. Logic 1602 determines whether the media is to be coded/decoded with a first extension, and reads the appropriate signaling and data. Logic 1602 comprises statements 1606-1616. Statement 1606 reads pps_extension1_flag, which indicates whether the first extension has been selected for the coding/decoding process. In one embodiment, a logical value of "1" indicates that the media is to be processed using the first extension, and a logical value of "0" indicates that the media is not to be processed using the first extension. Statement 1608 is a conditional statement that directs execution of statements 1612-1614 depending on the value of the (previously read) transform_skip_enabled_flag. In particular, the illustrated logic performs the operations shown in statements 1612-1614 if transform_skip_enabled_flag is a logical "1" or true. The transform_skip_enabled_flag 1601 of the PPS syntax is shown in Fig. 16A.

Transform skipping is an extension that allows the DCT transform of a TU to be skipped under certain circumstances. Essentially, the DCT transform is advantageous for media with highly correlated signals, for which it achieves outstanding energy compaction. However, for media with highly uncorrelated signals (for example, media with a large amount of detail), the compaction performance is much poorer. For some media, the DCT transform yields so little compression that the process is better skipped in favor of better processing performance. The transform_skip_enabled_flag indicates when skipping the DCT transform of a TU is permitted. This is described, for example, in "Early Termination of Transform Skip Mode for High Efficiency Video Coding," by Do Kyung Lee, Miso Park, Hyung-Do Kim and Je-Chang Jeong, in the Proceedings of the 2014 International Conference on Communications, Signal Processing and Computers, which is hereby incorporated by reference herein. If transform_skip_enabled_flag is a logical 1 (true), processing is passed to statements 1612 and 1614. Otherwise, processing is passed to statement 1618. Statement 1612 reads the value log2_transform_skip_max_size_minus2, which indicates the maximum TU size that may be skipped (if transform_skip_enabled_flag indicates that skipping the DCT transform of the TU is permitted). Statement 1614 reads the flag pps_extension2_flag, which indicates whether a further extension (extension2) is implemented.

Next, logic 1604 is performed. Logic 1604 includes statements 1618-1622. Statement 1618 is a conditional statement that passes processing to the logic of statements 1620 and 1622 if pps_extension2_flag is a logical 1. Statements 1620 and 1622 read additional pps_extension_data_flags while RBSP data remains.

In the foregoing PPS design of the HEVC range extension, pps_extension2_flag accounts for as-yet unidentified extension data. According to the logic described above, if pps_extension1_flag is true, pps_extension2_flag is present. If pps_extension1_flag is not true, pps_extension2_flag is not present. If pps_extension2_flag is not present, it is inferred to be equal to 0. If pps_extension2_flag is 0, there is no additional extension data.

This logical formulation always checks the value of pps_extension2_flag for possible additional extension syntax, regardless of the state of pps_extension1_flag. However, if pps_extension1_flag is 0, there is no need to check pps_extension2_flag: if pps_extension1_flag is 0, pps_extension2_flag will not be present, and if pps_extension2_flag is not present, it is inferred to be equal to 0, which indicates that there is no further extension data.
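The baseline control flow can be sketched in Python as follows. The sketch is a simplification made for illustration only: syntax element values are supplied as an already-decoded list in bit stream order rather than read from real RBSP bits, and the function name `parse_baseline_pps_extension` is hypothetical. It follows the description that pps_extension2_flag is present only when pps_extension1_flag is 1 and is otherwise inferred to be 0.

```python
def parse_baseline_pps_extension(values, transform_skip_enabled_flag):
    """Walk the baseline extension logic over pre-decoded element values."""
    it = iter(values)
    pps = {"pps_extension2_flag": 0,          # inferred 0 when absent
           "pps_extension_data_flags": []}
    pps["pps_extension1_flag"] = next(it)
    if pps["pps_extension1_flag"]:
        if transform_skip_enabled_flag:
            pps["log2_transform_skip_max_size_minus2"] = next(it)
        pps["pps_extension2_flag"] = next(it)
    # This test runs even when the flag was only inferred to be 0.
    if pps["pps_extension2_flag"]:
        # Stand-in for "while more RBSP data remains".
        pps["pps_extension_data_flags"] = list(it)
    return pps


no_ext = parse_baseline_pps_extension([0], True)
with_ext = parse_baseline_pps_extension([1, 3, 1, 1, 0, 1], True)
print(no_ext)
print(with_ext)
```

In the first call, the second test is redundant, since the flag it checks can only be 0 at that point.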

Related U.S. utility patent application Ser. No. 14/533,386, entitled "MODIFICATION OF PICTURE PARAMETER SET (PPS) FOR HEVC EXTENSIONS," describes a modification of the foregoing syntax in which logic 1604 (statements 1616-1620) of Fig. 16B is incorporated within the conditional statement 1608 and is executed only if pps_extension1_flag tests as a logical 1. This allows the logic of statements 1610-1620 to be skipped if pps_extension1_flag tests as a logical 0, thus saving execution time.

This design is useful when only one PPS extension (for example, the transform skip extension) is to be enabled, possibly together with a second PPS extension that reads additional data (for example, as signaled by pps_extension2_flag) and is executed only if the first PPS extension is executed. However, if there are additional PPS extensions, this design can be inefficient, because the syntax requires that later extensions parse all of the preceding extension syntax, even though the previously executed extensions and/or syntax may be independent of or unrelated to the later executed extensions and/or syntax.

Improved picture parameter set syntax

Figures 17A-17D illustrate a modified PPS raw byte sequence payload (RBSP) syntax. In summary, the modified RBSP syntax defines an extension presence signaling flag (pps_extension_present_flag) that signals whether the pictures in the sequence are to be processed at least in part according to at least one extension function. If pps_extension_present_flag tests false, it is known that no PPS extension follows, so the syntax logic that defines and processes such extensions is not needed, and the processing associated with executing that syntax logic is not performed, thus saving processing resources, memory resources, and processing time. The modified PPS RBSP syntax also includes one or more extension signaling flags, each signaling the presence of an associated PPS extension function. This increases the efficiency of parsing and executing the PPS syntax, because the one or more extension signaling flags and their associated data and logic instructions need not be stored in the syntax, nor read or executed by the processor.

In one embodiment, the PPS RBSP syntax is further modified so that the extension signaling flags are indexed and read iteratively. For example, n PPS extension signaling flags may be denoted pps_extension_flag[i], where i is an index running from 0 to n-1. In one embodiment, seven PPS extension signaling flags (n=7) may be defined. Each such individual PPS extension flag may control the parsing of the syntax of a particular extension function. For example, a first PPS extension flag may control the parsing of HEVC range extension related syntax, and a second PPS extension flag may control the parsing of MV-HEVC related syntax.
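The indexed, iterative reading described above might look like the following Python sketch. The helper names and the EXTENSION_NAMES table are hypothetical and exist only for illustration; the input is assumed to be a list of 0/1 values positioned at the first flag.

```python
# Hypothetical mapping from flag index to extension name (illustrative only).
EXTENSION_NAMES = {0: "range_extension", 1: "mv_hevc_extension"}

def read_indexed_extension_flags(bits, n=7):
    """Read pps_extension_flag[i] for i = 0..n-1 from a list of 0/1 values."""
    pps_extension_flag = []
    for i in range(n):
        pps_extension_flag.append(bits[i])
    return pps_extension_flag

def extensions_to_parse(pps_extension_flag):
    """Return the names of the extensions whose syntax should be parsed."""
    return [EXTENSION_NAMES.get(i, f"reserved_{i}")
            for i, flag in enumerate(pps_extension_flag) if flag]
```

Each flag is consulted independently, so an extension with a higher index can be selected without an extension with a lower index being selected.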

In another embodiment, the foregoing can be extended to accommodate more than n extensions (n >= 8) by use of an additional pps_extension_7bits syntax element. This additional syntax permits the signaling of further extensions that may be specified in the future and for which the seven PPS flags would be insufficient. In a preferred embodiment, the number of extension bits (and the maximum value of the above index) is set to a multiple of 8 bits (0-7) so that byte-by-byte parsing can be readily accomplished.
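To illustrate why a multiple of 8 bits is convenient, the sketch below splits one byte into a single extension flag and a seven-bit field, matching the n=1 case. It assumes the eight bits have already been gathered into an integer with the first bit read as the most significant bit; the function name is hypothetical.

```python
def split_extension_byte(byte):
    """Split 8 extension bits into pps_extension_flag[0] and pps_extension_7bits.

    Assumes the most significant bit is the flag (n = 1) and the low seven
    bits are pps_extension_7bits.
    """
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected a single byte")
    pps_extension_flag0 = byte >> 7
    pps_extension_7bits = byte & 0x7F
    return pps_extension_flag0, pps_extension_7bits
```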

Figure 17A is a flow chart illustrating exemplary operations that can be used to encode or decode a sequence of a plurality of pictures using one or more extension functions. In block 1700, an extension presence signaling flag is read. The extension presence signaling flag indicates whether the picture referenced by the PPS syntax is to be processed at least in part according to at least one extension function. In block 1702, a determination is made as to whether the extension presence signaling flag that was read indicates that the picture associated with the PPS syntax is to be processed at least in part using at least one extension function. In one embodiment, this is accomplished by determining whether the extension presence signaling flag has a first value. The "value" may be a logical value (e.g., true or false) or a numeric or alphanumeric value (e.g., 1 or 0) that can represent a logical value. If the extension presence signaling flag does not have the first value (indicating that no extension function will be used to process the picture associated with the PPS syntax), the operations shown in blocks 1704-1708 can be skipped. If the extension presence signaling flag is determined to indicate that the picture is to be decoded at least in part according to at least one extension function, a first extension function signaling flag is read, as shown in block 1704, and, regardless of the value of the first extension function signaling flag, a second extension function signaling flag is read, as shown in block 1706. Because the second extension function signaling flag is read regardless of the value or state of the previously read first extension function signaling flag, the reading of the second extension function signaling flag is independent of the value of the first extension function signaling flag that was read. This is in contrast with the syntax shown in Figure 16B, in which pps_extension2_flag is read by logic 1614 only if pps_extension1_flag tests true in logic 1608. Finally, as depicted in block 1708, the extension functions signaled by the flags read in blocks 1704-1706 are performed.
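The flow of blocks 1700-1706 can be sketched in Python as below. The function name, the dictionary keys, and the list-of-bits input are illustrative assumptions; the point is only that the second flag is read without consulting the first.

```python
def read_extension_signaling(bits):
    """Read the presence flag and two extension flags from a list of 0/1 values."""
    it = iter(bits)
    present = next(it)                    # block 1700
    if not present:                       # block 1702
        # No extension flags are read; both are treated as 0.
        return {"present": 0, "first": 0, "second": 0}
    first = next(it)                      # block 1704
    second = next(it)                     # block 1706: read regardless of `first`
    return {"present": 1, "first": first, "second": second}
```

For example, the input [1, 0, 1] yields a second flag of 1 even though the first flag is 0, which the nested syntax of Figure 16B would not permit.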

Figure 17B presents an exemplary PPS syntax that can be used to perform the operations shown in Figure 17A for decoding a sequence of a plurality of pictures using one or more extension functions. In the illustrated embodiment, the extension functions include the TU DCT transform skipping extension function described above.

Logic 1712 reads the extension presence signaling flag. In the illustrated embodiment, the extension presence signaling flag comprises pps_extension_present_flag. Logic 1714 tests whether pps_extension_present_flag is logically true, and logic 1716-1740 is executed only if it is. If pps_extension_present_flag is determined to be logically false, processing passes to logic 1740. Importantly, this means that no extension flags are read and no extension function processing is performed.

Figure 17B thus performs the operations described in blocks 1704 and 1706 of Figure 17A, with logic 1716-1720 reading the extension function signaling flags (here, pps_extension_flag[i] and/or pps_extension_7bits). In particular, logic 1716 and 1718 read pps_extension_flag[i] for i=0 to n-1 (in the exemplary embodiment n=1, so only one flag, pps_extension_flag[0], is read). Logic 1720 reads the value pps_extension_7bits, which is used to signal additional extension functions beyond the up to seven extension functions referenced by pps_extension_flag[0]-pps_extension_flag[6].

Logic 1722 and 1728 test whether the pps_extension_flag[0] that was read has a logical value indicating that the related extension function (TU DCT skipping) is required. If pps_extension_flag[0] has such a value (e.g., tests logically true), logic 1724-1730 is executed.

Logic 1724 determines whether transform skipping is enabled by testing transform_skip_enabled_flag. If it is enabled (e.g., transform_skip_enabled_flag tests true), logic 1726-1728 of the PPS syntax is executed. Logic 1726-1728 reads the value represented by log2_max_transform_skip_block_size_minus2, which specifies the maximum transform unit (TU) block size for which the DCT transform may be skipped.

Logic 1732 tests whether the value of pps_extension_7bits read by logic 1720 is true. If so, logic 1734-1738 reads such additional signaling bits.
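Taken together, logic 1712-1738 can be approximated by the following Python sketch for the n=1 case. It is not the normative syntax: the BitReader class, the dictionary output, and the treatment of "more data" as "bits remain in the list" are simplifying assumptions, as is the choice of an unsigned Exp-Golomb code for log2_max_transform_skip_block_size_minus2.

```python
class BitReader:
    """Minimal reader over a list of 0/1 values (hypothetical helper)."""

    def __init__(self, bits):
        self.bits = list(bits)
        self.pos = 0

    def u(self, n):
        """Read n bits as an unsigned integer, most significant bit first."""
        value = 0
        for _ in range(n):
            value = (value << 1) | self.bits[self.pos]
            self.pos += 1
        return value

    def ue(self):
        """Read an unsigned Exp-Golomb code (assumed coding for this sketch)."""
        zeros = 0
        while self.u(1) == 0:
            zeros += 1
        return (1 << zeros) - 1 + self.u(zeros)

    def more(self):
        """Simplified stand-in for a 'more data' test: any bits left?"""
        return self.pos < len(self.bits)


def parse_pps_extension(bits, transform_skip_enabled_flag):
    r = BitReader(bits)
    pps = {"pps_extension_flag": [0], "pps_extension_7bits": 0,
           "pps_extension_data_flag": []}
    pps["pps_extension_present_flag"] = r.u(1)           # logic 1712
    if pps["pps_extension_present_flag"]:                # logic 1714
        pps["pps_extension_flag"] = [r.u(1)]             # logic 1716-1718, n = 1
        pps["pps_extension_7bits"] = r.u(7)              # logic 1720
        if pps["pps_extension_flag"][0]:                 # logic 1722
            if transform_skip_enabled_flag:              # logic 1724
                pps["log2_max_transform_skip_block_size_minus2"] = r.ue()
        if pps["pps_extension_7bits"]:                   # logic 1732
            while r.more():                              # logic 1734-1738
                pps["pps_extension_data_flag"].append(r.u(1))
    return pps
```

When the presence flag is 0, none of the extension flags or extension data are read, which is the saving the modified syntax is intended to provide.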

Figure 17C is a flow chart further illustrating the exemplary PPS syntax presented in Figure 17B, in which, as described above, all of the extension function signaling flags (e.g., pps_extension_flag[i]-pps_extension_flag[n-1]) are read first, and each extension function is then performed one at a time.

Referring to Figure 17C, block 1750 reads the extension presence signaling flag. Block 1752 determines whether the extension presence signaling flag has a value indicating that at least one extension function is to be performed. If the extension presence signaling flag indicates that no extension function is to be performed, processing passes to after block 1758. If the extension presence signaling flag indicates that one or more extension functions are to be performed, processing passes to block 1753, which reads all of the extension function signaling flags (e.g., pps_extension_flag[i]-pps_extension_flag[n-1]). Processing then passes to block 1754, which tests whether the first extension function signaling flag has a value signaling that the first extension function is to be performed. Logic 1722 of Figure 17B illustrates exemplary syntax for performing this test.

If the extension function signaling flag indicates that the extension function is not to be performed, processing bypasses blocks 1756 and 1758. Syntax for performing these operations is exemplified by logic 1722 through logic 1730 of Figure 17B. If the extension function signaling flag indicates that the extension function is to be performed, processing passes to block 1756, and at least a portion of the extension function processing is performed. Syntax for performing these operations is illustrated by logic 1724-1728 of Figure 17B, which reads the size of the maximum transform unit block for which the DCT transform may be skipped, if so indicated by transform_skip_enabled_flag 1601.

Block 1758 tests whether all extension functions have been considered. If so, processing ends (analogous to logic 1740 in the syntax shown in Figure 17B). If not, processing passes to block 1760, which causes the next extension function signaling flag to be considered by block 1754.
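The read-all-then-perform ordering of Figure 17C can be sketched as follows. The function name and the representation of extension functions as Python callables are assumptions for illustration only.

```python
def read_all_then_execute(bits, extension_functions):
    """Read every extension flag first, then run the signaled functions in order.

    `bits` is a list of 0/1 values; `extension_functions` is a list of
    zero-argument callables standing in for the extension processing.
    """
    it = iter(bits)
    results = []
    if not next(it):                                   # blocks 1750-1752
        return results
    flags = [next(it) for _ in extension_functions]    # block 1753
    for flag, function in zip(flags, extension_functions):  # blocks 1754-1760
        if flag:
            results.append(function())                 # block 1756
    return results
```

A call such as read_all_then_execute([1, 0, 1], [f, g]) reads both flags before running anything and then runs only g.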

The foregoing figures illustrate processing logic in which all of the extension function signaling flags are read and each extension function is then performed one at a time. This embodiment is particularly useful where the extension function signaling flags are read using an incrementing index, as shown in logic 1716 and 1718, because it decouples the reading of the (indexed) flags from the performance of the extension functions themselves (which may or may not be indexed). For example, the processing loop represented by blocks 1754-1760 may be performed simply by including the syntax for performing each extension function one after the other (e.g., performing logic 1754-1756, followed by further logic, inserted as logical statements between logic 1756 and 1758, to perform the next extension function). Alternatively, the loop may be performed using an incrementing index, which may be the same index used to read the extension function signaling flags or a different index.

Figure 17D is a flow chart illustrating an alternative embodiment in which, instead of reading all of the extension function signaling flags before beginning to perform the extension functions themselves, each extension function signaling flag is read and its extension function is performed before the next extension function signaling flag is read. Block 1760 reads the first extension function signaling flag (which may be indexed), and block 1762 tests whether the first extension function signaling flag that was read indicates that the first extension function is to be performed. If the function is not to be performed, processing passes to block 1768 without performing the extension function. If, however, the first extension function signaling flag indicates that the first extension function is to be performed, processing passes to block 1764, where such processing is performed before processing passes to block 1768. Once this processing is complete, block 1768 determines whether all of the extension function signaling flags have been read. If so, processing exits; if not, the next extension function signaling flag is considered, as shown in block 1770. The second extension function signaling flag is read, and the operations of blocks 1760-1768 are repeated for this second extension function signaling flag and its associated second extension function. This can also be accomplished through the use of one or more incrementing indexes, and different indexes may be used for reading the extension function signaling flags and for performing the extension functions themselves.
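The interleaved ordering of Figure 17D can be sketched as below. To make the ordering visible, each extension is assumed to consume a fixed number of payload bits directly after its flag; that layout, the function name, and the event tuples are illustrative assumptions only.

```python
def process_interleaved(bits, payload_sizes):
    """Read a flag, perform its extension if signaled, then read the next flag.

    `payload_sizes[i]` is the (assumed) number of payload bits that
    extension i reads when its flag is set.
    """
    it = iter(bits)
    events = []
    for i, size in enumerate(payload_sizes):
        flag = next(it)                                # block 1760
        events.append(("flag", i, flag))
        if flag:                                       # block 1762
            payload = [next(it) for _ in range(size)]  # block 1764
            events.append(("ext", i, payload))
        # block 1768: the loop ends once every flag has been read
    return events
```

The recorded events show each extension's processing occurring between the reading of its own flag and the reading of the next flag.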

Figure 18 presents an embodiment of a PPS syntax for the HEVC range extension. As before, the pps_extension_present_flag read in logical statement 1712 indicates that at least one pps_extension_flag[i] is present in the PPS syntax. This pps_extension_present_flag is used in logical statement 1714 to indicate that logical statements 1716 and 1718 should be executed; these statements read pps_extension_flag[i] for i=0 to n-1. A pps_extension_flag[i] value of 1 indicates that the syntax structure for the associated pps_extension is present, and a pps_extension_flag[i] value of 0 indicates that the syntax structure for the pps_extension associated with the flag is not present.

In the exemplary syntax shown in Figure 18, a pps_extension_flag[0] value of 1 indicates that the following HEVC range extension related elements are present in the PPS RBSP syntax structure, as shown in logical statements 1724, 1726, and 1804-1820:

·log2_max_transform_skip_block_size_minus2

·luma_chroma_prediction_enabled_flag

·chroma_qp_adjustment_enabled_flag

·diff_cu_chroma_qp_adjustment_depth

·chroma_qp_adjustment_table_size_minus1

·cb_qp_adjustment

·cr_qp_adjustment

Accordingly, pps_extension_flag[0] equal to 0 specifies that these syntax elements are not present.

Further, a pps_extension_7bits value of 0 specifies that no further pps_extension_data_flag syntax elements are present in the PPS RBSP syntax structure, and logical statement 1822 skips logical statements 1824 and 1828. pps_extension_7bits shall have a value of 0 in bitstreams conforming to the legacy version of the specification, because pps_extension_7bits values not equal to 0 are reserved for future use by ITU-T/ISO/IEC. HEVC decoders shall allow the value of pps_extension_7bits to be not equal to 0 and shall ignore all pps_extension_data_flag syntax elements in the PPS NAL unit.
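The decoder behavior described above, tolerating a nonzero pps_extension_7bits while discarding the data that follows, might be sketched as below. The function name and the returned tuple are hypothetical and serve only to make the two rules visible.

```python
def handle_reserved_extension_bits(pps_extension_7bits, remaining_bits):
    """Return (conforms_to_legacy_version, number_of_ignored_bits).

    A nonzero pps_extension_7bits is accepted, but every
    pps_extension_data_flag that follows is read and discarded.
    """
    conforms = pps_extension_7bits == 0
    ignored = 0
    if pps_extension_7bits != 0:
        for _pps_extension_data_flag in remaining_bits:
            ignored += 1          # read and discard, no interpretation
    return conforms, ignored
```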

Figures 19A-19C show further alternative embodiments of the extension signaling syntax. Figure 19A illustrates a general syntax in which the extension presence signaling flag (pps_extension_present_flag) is used to signal whether further extension function syntax is present in the PPS. As before, logical statement 1712 reads pps_extension_present_flag. Logical statement 1714 directs the execution of logical statements 1716-1742 only if pps_extension_present_flag indicates that syntax for one or more extension functions is present in the PPS. Logical statement 1716 reads pps_extension_flag[i] for all values of i, and logical statement 1720 reads pps_extension_7bits. Logical statements 1732-1740 read pps_extension_data_flag and the associated data.

Figure 19B illustrates a PPS syntax in which the extension function signaling flags are read in separate statements rather than via an index incremented in a processing loop. Specifically, logical statements 1902-1906 read a first flag indicating that range extension processing is to be performed (pps_range_extension_flag), a second flag indicating that multilayer or multiview (MV-HEVC) extension processing is to be performed (pps_multilayer_extension_flag), and a third flag for reading further extension data (pps_extension_bits6). Logical statements 1910-1912 perform the pps_range_extension() processing indicated by the pps_range_extension_flag read by logical statement 1902 (this processing may be placed in a separate PPS range extension syntax referenced by the pps_range_extension() logical statement). Logical statements 1914-1916 perform pps_multilayer_extension() as indicated by pps_multilayer_extension_flag (which may likewise be specified in a different PPS syntax referenced by the pps_multilayer_extension() logical statement). Logical statements 1918-1926 read pps_extension_data_flag and the associated data.
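A sketch of the named-flag variant follows. It models only the three reads and the three conditional branches; the function name is hypothetical, the extension bodies are replaced by labels, and the six-bit width of pps_extension_bits6 is inferred from its name.

```python
def parse_named_extension_flags(bits):
    """Read the three named fields and list which branches would be taken."""
    it = iter(bits)

    def u(n):
        # Read n bits as an unsigned integer, most significant bit first.
        value = 0
        for _ in range(n):
            value = (value << 1) | next(it)
        return value

    pps_range_extension_flag = u(1)           # statement 1902
    pps_multilayer_extension_flag = u(1)      # statement 1904
    pps_extension_bits6 = u(6)                # statement 1906 (width assumed)

    taken = []
    if pps_range_extension_flag:              # statements 1910-1912
        taken.append("pps_range_extension()")
    if pps_multilayer_extension_flag:         # statements 1914-1916
        taken.append("pps_multilayer_extension()")
    if pps_extension_bits6:                   # statements 1918-1926
        taken.append("pps_extension_data_flag")
    return taken
```

Because each flag has its own name and its own test, no loop index is needed, at the cost of one statement per extension.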

Figure 19C illustrates a PPS syntax in which the extension function signaling flags are read using an incrementing index, rather than being tested and used to perform extension processing by separate, non-indexed statements. Specifically, logical statements 1930-1932 use an index i taking the values 0 and 1 to read two pps_extension_flag values, namely pps_extension_flag[0] and pps_extension_flag[1]. Logical statement 1934 reads the value of pps_extension_6bits. Logical statements 1938-1952 operate similarly to logical statements 1910-1926, except that pps_extension_flag is referenced and distinguished by an index of [0] or [1] rather than by different names.

Other embodiments of the foregoing syntax are also contemplated. For example, the extension signaling flags (e.g., pps_extension_flag) may be grouped by type or category. This permits extensions with similar data requirements to be signaled and processed together, thereby saving syntax statements and decoder processing.

As described above, the signaled extension functions may be independent or may be functionally related. For example, the second extension function may need to use the result of the first (previously processed or performed) extension function before the second extension function can be completed. Alternatively, the second extension function may be mutually exclusive with the first extension function (e.g., either the first extension function or the second extension function is performed, but not both). Alternatively, the second extension function may be a function that is not performed unless the first extension function is also performed, so that the second extension function is implied or performed in the processing sequence only when the first extension function is also performed. For example, a computation may require the output or result of both the first extension function and the second extension function, so that the presence of the first extension function necessarily implies the second extension function, and vice versa.

The foregoing operations have been described with respect to a decoding process, which may take place in source decoder 220 or, as part of the encoding process, in encoder 202. The encoding process may likewise be expressed as including determining, according to slice type data, whether a slice of one or more slices is an inter-predicted slice and, if the slice is an inter-predicted slice, configuring a first parameter in the slice header associated with the slice to a value signaling the enabled state of weighted prediction of the image data associated with the slice.
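On the encoding side, the corresponding signaling could be written as in the sketch below. The function name and the choice to always write both extension flags once the presence flag is set are assumptions made for illustration; an encoder would additionally write the syntax of each selected extension.

```python
def write_extension_signaling(use_first, use_second):
    """Return the signaling bits for two optional extension functions."""
    bits = []
    present = int(use_first or use_second)
    bits.append(present)                 # extension presence signaling flag
    if present:
        bits.append(int(use_first))      # first extension function flag
        bits.append(int(use_second))     # written regardless of the first flag
    return bits
```

With no extension in use, a single 0 bit is written and no per-extension flag is emitted.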

Hardware environment

Figure 20 illustrates an exemplary processing system 2000 that can be used to implement embodiments of the invention. Computer 2002 includes a processor 2004 and memory, such as random access memory (RAM) 2006. Computer 2002 is operatively coupled to a display 2022, which presents images such as windows to the user on a graphical user interface 2018B. Computer 2002 may be coupled to other devices, such as a keyboard 2014, a mouse 2016, a printer, and so on. Of course, those skilled in the art will recognize that any combination of the above components, or any number of different components, peripherals, and other devices, may be used with computer 2002.

Generally, computer 2002 operates under the control of an operating system 2008 stored in memory 2006, and interfaces with the user to accept inputs and commands and to present results through a graphical user interface (GUI) module 2018A. Although GUI module 2018A is depicted as a separate module, the instructions performing the GUI functions may be resident or distributed in operating system 2008 or computer program 2010, or implemented with special-purpose memory and processors. Computer 2002 also implements a compiler 2012, which allows an application program 2010 written in a programming language such as COBOL, C++, FORTRAN, or another language to be translated into code readable by processor 2004. After completion, application 2010 accesses and manipulates data stored in memory 2006 of computer 2002 using the relationships and logic generated using compiler 2012. Computer 2002 also optionally includes an external communication device, such as a modem, satellite link, Ethernet card, or other device for communicating with other computers.

In one embodiment, instructions implementing operating system 2008, computer program 2010, and compiler 2012 are tangibly embodied in a computer-readable medium, e.g., data storage device 2020, which may include one or more fixed or removable data storage devices, such as a zip drive, floppy disk drive 2024, hard disk drive, CD-ROM drive, tape drive, and so on. Further, operating system 2008 and computer program 2010 consist of instructions which, when read and executed by computer 2002, cause computer 2002 to perform the steps necessary to implement and/or use the invention. Computer program 2010 and/or operating instructions may also be tangibly embodied in memory 2006 and/or data communications device 2030, thereby making a computer program product or article of manufacture. As such, the terms "article of manufacture," "program storage device," and "computer program product" as used herein are intended to encompass a computer program accessible from any computer-readable device or medium.

Processing system 2000 may be embodied in a desktop computer, laptop computer, tablet, notebook computer, personal digital assistant (PDA), cellular phone, smartphone, or any device with suitable processing and memory capability. Further, processing system 2000 may use special-purpose hardware to perform some or all of the foregoing functionality. For example, the encoding and decoding processes described above may be performed by a special-purpose processor and associated memory.

Those skilled in the art will recognize that many modifications may be made to this configuration without departing from the scope of the present disclosure. For example, those skilled in the art will recognize that any combination of the above components, or any number of different components, peripherals, and other devices, may be used. For example, particular functions described herein may be performed by hardware modules, or by a processor executing instructions stored in the form of software or firmware. Further, the functionality described herein may be combined in single modules or expanded to be performed in multiple modules.

Conclusion

The foregoing description of the preferred embodiment has been presented for the purposes of illustration and description. It is not intended to be exhaustive or to limit the disclosure to the precise form disclosed. Many modifications and variations are possible in light of the above teaching. It is intended that the scope of rights be limited not by this detailed description, but rather by the claims appended hereto.

Claims

1. A method, in a processing device for decoding a sequence comprising a plurality of pictures, of processing the sequence, each picture being processed at least in part according to a picture parameter set, the method comprising:

(a) reading an extension presence signaling flag;

(b) determining whether the read extension presence signaling flag indicates that the picture is to be processed at least in part according to at least one extension function;

(c) only if the read extension presence signaling flag indicates that the picture is to be processed at least in part according to the at least one extension function:

reading a first extension function signaling flag signaling a first extension function;

reading a second extension function signaling flag signaling a second extension function, independently of the value of the read first extension function signaling flag.

2. The method of claim 1, wherein the second extension function is independent of the first extension function.

3. The method of claim 1, further comprising:

(d) determining whether the first extension function signaling flag indicates that the picture is to be processed at least in part according to the first extension function.

4. The method of claim 3, wherein (d) is performed only if the read extension presence signaling flag indicates that the picture is to be processed at least in part according to the at least one extension function.

5. The method of claim 3, wherein (d) is performed after (a)-(c).

6. The method of claim 4, further comprising:

(e) performing the first extension function only if the first extension function signaling flag indicates that the picture is to be processed at least in part according to the first extension function.

7. The method of claim 6, further comprising:

(f) determining whether the second extension function signaling flag indicates that the picture is to be processed at least in part according to the second extension function, and performing the second extension function only if the second extension function signaling flag indicates that the picture is to be processed at least in part according to the second extension function.

8. The method of claim 1, wherein the first extension function signaling flag is uniquely associated with a first value of an index, and the second extension function signaling flag is uniquely associated with a second value of the index, and wherein:

reading the first extension function signaling flag and the second extension function signaling flag comprises:

reading the value of the first extension function signaling flag according to the index;

incrementing the index; and

reading the value of the second extension function signaling flag according to the incremented index.

9. The method of claim 8, wherein:

the first extension function comprises a range extension function; and

the second extension function comprises a High Efficiency Video Coding (HEVC) multilayer or multiview extension function.

10. An apparatus for decoding a sequence comprising a plurality of pictures, each picture being processed at least in part according to a picture parameter set, the apparatus comprising:

a processor;

a memory communicatively coupled to the processor, the memory storing a plurality of instructions including instructions for:

(a) reading an extension presence signaling flag;

11. The apparatus of claim 10, wherein the second extension function is independent of the first extension function.

12. The apparatus of claim 10, wherein the instructions further include instructions for:

13. The apparatus of claim 12, wherein (d) is performed only if the read extension presence signaling flag indicates that the picture is to be processed at least in part according to the at least one extension function.

14. The apparatus of claim 12, wherein (d) is performed after (a)-(c).

15. The apparatus of claim 13, wherein the instructions further include instructions for:

16. The apparatus of claim 15, wherein the instructions further include instructions for:

17. The apparatus of claim 10, wherein the first extension function signaling flag is uniquely associated with a first value of an index, and the second extension function signaling flag is uniquely associated with a second value of the index, and wherein:

the instructions for reading the first extension function signaling flag and the second extension function signaling flag include instructions for:

incrementing the index; and

18. The apparatus of claim 17, wherein:

the first extension function comprises a range extension function; and

19. A method of encoding a sequence comprising a plurality of pictures, each picture being processed at least in part according to a picture parameter set, the method comprising:

(a) determining whether the picture is to be processed at least in part according to at least one extension function;

(b) writing an extension presence signaling flag to the picture parameter set, wherein the extension presence signaling flag has a first value if the picture is not to be processed at least in part according to the at least one extension function, and has a second value if the picture is to be processed at least in part according to the at least one extension function;

(c) only if the picture is to be processed at least in part according to the at least one extension function:

determining whether the picture is to be processed at least in part according to a first extension function;

if the picture is to be processed at least in part according to the first extension function, writing a first extension function signaling flag signaling the first extension function to the picture parameter set;

determining whether the picture is to be processed at least in part according to a second extension function; and

if the picture is to be processed at least in part according to the second extension function, writing a second extension function signaling flag signaling the second extension function to the picture parameter set, independently of the value of the written first extension function signaling flag.

20. The method of claim 19, wherein:

the first extension signaling flag has a first value if the picture is not to be processed according to the first extension function, and has a second value if the picture is to be processed according to the first extension function;

the second extension signaling flag has a first value if the picture is not to be processed according to the second extension function, and has a second value if the picture is to be processed according to the second extension function; and

the method further comprises:

if the picture is to be processed according to the first extension function, writing to the picture parameter set syntax for conditionally performing the first extension function according to the first extension function signaling flag; and

if the picture is to be processed according to the second extension function, writing to the picture parameter set syntax for conditionally performing the second extension function according to the second extension function signaling flag.