CN104685892A - Sub-bitstream applicability to nested SEI messages in video coding - Google Patents


Info

Publication number
CN104685892A
CN104685892A CN201380051435.2A
Authority
CN
China
Prior art keywords
sei message
nal unit
bit stream
sub
syntactic element
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201380051435.2A
Other languages
Chinese (zh)
Other versions
CN104685892B (en)
Inventor
王益魁 (Ye-Kui Wang)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qualcomm Inc
Original Assignee
Qualcomm Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qualcomm Inc filed Critical Qualcomm Inc
Publication of CN104685892A publication Critical patent/CN104685892A/en
Application granted granted Critical
Publication of CN104685892B publication Critical patent/CN104685892B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/46 Embedding additional information in the video signal during the compression process
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/188 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a video data packet, e.g. a network abstraction layer [NAL] unit
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/30 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/46 Embedding additional information in the video signal during the compression process
    • H04N19/463 Embedding additional information in the video signal during the compression process by compressing encoding parameters before transmission
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/70 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)
  • Error Detection And Correction (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

A device determines, based at least in part on a syntax element in a scalable nesting supplemental enhancement information (SEI) message encapsulated by an SEI Network Abstraction Layer (NAL) unit, whether a nested SEI message encapsulated by the scalable nesting SEI message is applicable to a default sub-bitstream. The default sub-bitstream is an operation point representation of an operation point defined by a layer identifier specified in a NAL unit header of the SEI NAL unit and a temporal identifier specified in the NAL unit header. When the nested SEI message is applicable to the default sub-bitstream, the device uses the nested SEI message in an operation on the default sub-bitstream.

Description

Sub-bitstream applicability of nested supplemental enhancement information messages in video coding
This application claims the benefit of U.S. Provisional Patent Application No. 61/711,098, filed October 8, 2012, the entire content of which is incorporated herein by reference.
Technical field
This disclosure relates to video encoding and video decoding.
Background
Digital video capabilities can be incorporated into a wide range of devices, including digital televisions, digital direct broadcast systems, wireless broadcast systems, personal digital assistants (PDAs), laptop or desktop computers, tablet computers, e-book readers, digital cameras, digital recording devices, digital media players, video gaming devices, video game consoles, cellular or satellite radio telephones, so-called "smartphones," video teleconferencing devices, video streaming devices, and the like. Digital video devices implement video compression techniques, such as those described in the standards defined by MPEG-2, MPEG-4, ITU-T H.263, ITU-T H.264/MPEG-4 Part 10 Advanced Video Coding (AVC), the High Efficiency Video Coding (HEVC) standard presently under development, and extensions of such standards. Video devices may transmit, receive, encode, decode, and/or store digital video information more efficiently by implementing such video compression techniques.
Video compression techniques perform spatial (intra-picture) prediction and/or temporal (inter-picture) prediction to reduce or remove redundancy inherent in video sequences. For block-based video coding, a video slice (i.e., a video frame or a portion of a video frame) may be partitioned into video blocks. Video blocks in an intra-coded (I) slice of a picture are encoded using spatial prediction with respect to reference samples in neighboring blocks in the same picture. Video blocks in an inter-coded (P or B) slice of a picture may use spatial prediction with respect to reference samples in neighboring blocks in the same picture or temporal prediction with respect to reference samples in other reference pictures. Pictures may be referred to as frames, and reference pictures may be referred to as reference frames.
Spatial or temporal prediction results in a predictive block for a block to be coded. Residual data represents pixel differences between the original block to be coded and the predictive block. An inter-coded block is encoded according to a motion vector that points to a block of reference samples forming the predictive block, and the residual data indicates the difference between the coded block and the predictive block. An intra-coded block is encoded according to an intra-coding mode and the residual data. For further compression, the residual data may be transformed from the pixel domain to a transform domain, resulting in residual coefficients, which then may be quantized. The quantized coefficients, initially arranged in a two-dimensional array, may be scanned to produce a one-dimensional vector of coefficients, and entropy coding may be applied to achieve even more compression.
A multiview coding bitstream may be generated by encoding views, e.g., from multiple perspectives. Some three-dimensional (3D) video standards have been developed that make use of multiview coding aspects. For example, different views may transmit left-eye and right-eye views to support 3D video. Alternatively, some 3D video coding processes may apply so-called multiview plus depth coding. In multiview plus depth coding, a 3D video bitstream may contain not only texture view components, but also depth view components. For example, each view may comprise one texture view component and one depth view component.
Summary of the invention
In general, this disclosure describes signaling of hypothetical reference decoder (HRD) parameters and nesting of supplemental enhancement information (SEI) messages in video coding. More specifically, a video encoder may include, in a scalable nesting SEI message encapsulated by an SEI network abstraction layer (NAL) unit, a syntax element that indicates whether a nested SEI message encapsulated by the scalable nesting SEI message is applicable to a default sub-bitstream. The default sub-bitstream is an operation point representation of an operation point defined by a layer identifier specified in the NAL unit header of the SEI NAL unit and a temporal identifier specified in the NAL unit header. Furthermore, a device may determine, based at least in part on the syntax element in the scalable nesting SEI message, whether the nested SEI message encapsulated by the scalable nesting SEI message is applicable to the default sub-bitstream. When the nested SEI message is applicable to the default sub-bitstream, the device may use the nested SEI message in an operation on the default sub-bitstream.
In one example, this disclosure describes a method of processing video data. The method comprises determining, based at least in part on a syntax element in a scalable nesting SEI message encapsulated by an SEI NAL unit, whether a nested SEI message encapsulated by the scalable nesting SEI message is applicable to a default sub-bitstream of an encoded video bitstream. The default sub-bitstream is an operation point representation of an operation point defined by a layer identifier specified in the NAL unit header of the SEI NAL unit and a temporal identifier specified in the NAL unit header. In addition, the method comprises, when the nested SEI message is applicable to the default sub-bitstream, using the nested SEI message in an operation on the default sub-bitstream.
In another example, this disclosure describes a device comprising one or more processors configured to determine, based at least in part on a syntax element in a scalable nesting SEI message encapsulated by an SEI NAL unit, whether a nested SEI message encapsulated by the scalable nesting SEI message is applicable to a default sub-bitstream of an encoded video bitstream. The default sub-bitstream is an operation point representation of an operation point defined by a layer identifier specified in the NAL unit header of the SEI NAL unit and a temporal identifier specified in the NAL unit header. The one or more processors are configured such that, when the nested SEI message is applicable to the default sub-bitstream, the one or more processors use the nested SEI message in an operation on the default sub-bitstream.
In another example, this disclosure describes a device comprising means for determining, based at least in part on a syntax element in a scalable nesting SEI message encapsulated by an SEI NAL unit, whether a nested SEI message encapsulated by the scalable nesting SEI message is applicable to a default sub-bitstream of an encoded video bitstream. The default sub-bitstream is an operation point representation of an operation point defined by a layer identifier specified in the NAL unit header of the SEI NAL unit and a temporal identifier specified in the NAL unit header. The device also comprises means for using the nested SEI message in an operation on the default sub-bitstream when the nested SEI message is applicable to the default sub-bitstream.
In another example, this disclosure describes a computer-readable storage medium that stores instructions that, when executed by one or more processors of a device, configure the device to determine, based at least in part on a syntax element in a scalable nesting SEI message encapsulated by an SEI NAL unit, whether a nested SEI message encapsulated by the scalable nesting SEI message is applicable to a default sub-bitstream of an encoded video bitstream. The default sub-bitstream is an operation point representation of an operation point defined by a layer identifier specified in the NAL unit header of the SEI NAL unit and a temporal identifier specified in the NAL unit header. When executed, the instructions further configure the device such that, when the nested SEI message is applicable to the default sub-bitstream, the device uses the nested SEI message in an operation on the default sub-bitstream.
In another example, this disclosure describes a method of encoding video data. The method comprises including, in a scalable nesting SEI message encapsulated by an SEI NAL unit, a syntax element that indicates whether a nested SEI message encapsulated by the scalable nesting SEI message is applicable to a default sub-bitstream of an encoded video bitstream. The default sub-bitstream is an operation point representation of an operation point defined by a layer identifier specified in the NAL unit header of the SEI NAL unit and a temporal identifier specified in the NAL unit header. The method also comprises signaling the scalable nesting SEI message in the encoded video bitstream.
In another example, this disclosure describes a video encoding device comprising one or more processors configured to include, in a scalable nesting SEI message encapsulated by an SEI NAL unit, a syntax element that indicates whether a nested SEI message encapsulated by the scalable nesting SEI message is applicable to a default sub-bitstream of an encoded video bitstream. The default sub-bitstream is an operation point representation of an operation point defined by a layer identifier specified in the NAL unit header of the SEI NAL unit and a temporal identifier specified in the NAL unit header. The one or more processors are also configured to signal the scalable nesting SEI message in the encoded video bitstream.
In another example, this disclosure describes a video encoding device comprising means for including, in a scalable nesting SEI message encapsulated by an SEI NAL unit, a syntax element that indicates whether a nested SEI message encapsulated by the scalable nesting SEI message is applicable to a default sub-bitstream of an encoded video bitstream. The default sub-bitstream is an operation point representation of an operation point defined by a layer identifier specified in the NAL unit header of the SEI NAL unit and a temporal identifier specified in the NAL unit header. The video encoding device also comprises means for signaling the scalable nesting SEI message in the encoded video bitstream.
In another example, this disclosure describes a computer-readable storage medium that stores instructions that, when executed by a video encoding device, configure the video encoding device to include, in a scalable nesting SEI message encapsulated by an SEI NAL unit, a syntax element that indicates whether a nested SEI message encapsulated by the scalable nesting SEI message is applicable to a default sub-bitstream of an encoded video bitstream. The default sub-bitstream is an operation point representation of an operation point defined by a layer identifier specified in the NAL unit header of the SEI NAL unit and a temporal identifier specified in the NAL unit header. When executed, the instructions also configure the video encoding device to signal the scalable nesting SEI message in the encoded video bitstream.
The details of one or more examples of this disclosure are set forth in the accompanying drawings and the description below. Other features, objects, and advantages will be apparent from the description, drawings, and claims.
Brief description of the drawings
FIG. 1 is a block diagram illustrating an example video coding system that may utilize the techniques described in this disclosure.
FIG. 2 is a block diagram illustrating an example video encoder that may implement the techniques described in this disclosure.
FIG. 3 is a block diagram illustrating an example video decoder that may implement the techniques described in this disclosure.
FIG. 4 is a flowchart illustrating an example operation of a video encoder, in accordance with one or more techniques of this disclosure.
FIG. 5 is a flowchart illustrating an example operation of a device, in accordance with one or more techniques of this disclosure.
FIG. 6 is a flowchart illustrating an example operation of a video encoder, in accordance with one or more techniques of this disclosure.
FIG. 7 is a flowchart illustrating an example operation of a device, in accordance with one or more techniques of this disclosure.
FIG. 8 is a flowchart illustrating an example operation of a video encoder, in accordance with one or more techniques of this disclosure.
FIG. 9 is a flowchart illustrating an example operation of a device, in accordance with one or more techniques of this disclosure.
Detailed description
A video encoder may generate a bitstream that includes encoded video data. Because the bitstream includes encoded video data, the bitstream may be referred to herein as an encoded video bitstream. The bitstream may comprise a series of network abstraction layer (NAL) units. The NAL units may include video coding layer (VCL) NAL units and non-VCL NAL units. A VCL NAL unit may include a coded slice of a picture. A non-VCL NAL unit may include data of a video parameter set (VPS), a sequence parameter set (SPS), a picture parameter set (PPS), supplemental enhancement information (SEI), or other types of data. A VPS is a syntax structure that may contain syntax elements that apply to zero or more entire coded video sequences. An SPS is also a syntax structure that may contain syntax elements that apply to zero or more entire coded video sequences. A single VPS may be applicable to multiple SPSs. A PPS is a syntax structure that may contain syntax elements that apply to zero or more entire coded pictures. A single SPS may be applicable to multiple PPSs.
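The VCL/non-VCL split described above can be illustrated with a short sketch. This is not part of the disclosure; the nal_unit_type values follow the HEVC convention (VCL types occupy 0 through 31, with VPS = 32, SPS = 33, PPS = 34, and prefix/suffix SEI = 39/40), and the function names are invented for illustration.

```python
# Hypothetical classification of HEVC NAL units as VCL vs. non-VCL.
VPS_NUT, SPS_NUT, PPS_NUT = 32, 33, 34
PREFIX_SEI_NUT, SUFFIX_SEI_NUT = 39, 40

def is_vcl(nal_unit_type: int) -> bool:
    # In HEVC, NAL unit types 0..31 are reserved for VCL NAL units.
    return 0 <= nal_unit_type <= 31

def describe(nal_unit_type: int) -> str:
    names = {VPS_NUT: "VPS", SPS_NUT: "SPS", PPS_NUT: "PPS",
             PREFIX_SEI_NUT: "SEI (prefix)", SUFFIX_SEI_NUT: "SEI (suffix)"}
    if is_vcl(nal_unit_type):
        return "VCL (coded slice)"
    return names.get(nal_unit_type, "other non-VCL")
```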
A device such as a content delivery network (CDN) device, a media-aware network element (MANE), a video encoder, or a video decoder may extract a sub-bitstream from the bitstream. The device may perform the sub-bitstream extraction process by removing certain NAL units from the bitstream. The resulting sub-bitstream includes the remaining, non-removed NAL units of the bitstream. In some examples, video data decoded from the sub-bitstream may have a lower frame rate and/or may represent fewer views than the original bitstream.
Video coding standards may include various features to support the sub-bitstream extraction process. For example, the video data of the bitstream may be divided into a set of layers. For each of the layers, data in a lower layer may be decoded without reference to data in any higher layer. An individual NAL unit only encapsulates data of a single layer. Thus, NAL units encapsulating data of the highest remaining layer of the bitstream may be removed from the bitstream without affecting the decodability of data in the remaining layers of the bitstream. In scalable video coding (SVC), higher layers may include enhancement data that improve the quality of pictures in lower layers (quality scalability), enlarge the spatial format of pictures in lower layers (spatial scalability), or increase the temporal rate of pictures in lower layers (temporal scalability). In multiview coding (MVC) and three-dimensional video (3DV) coding, higher layers may include additional views.
Each NAL unit may include a header and a payload. The NAL unit header may include a nuh_reserved_zero_6bits syntax element. If a NAL unit relates to a base layer in MVC, 3DV coding, or SVC, the nuh_reserved_zero_6bits syntax element of the NAL unit is equal to 0. Data in the base layer of a bitstream may be decoded without reference to data in any other layer of the bitstream. If the NAL unit does not relate to a base layer in MVC, 3DV, or SVC, the nuh_reserved_zero_6bits syntax element may have other, non-zero values. Specifically, if a NAL unit does not relate to a base layer in MVC, 3DV, or SVC, the nuh_reserved_zero_6bits syntax element of the NAL unit specifies a layer identifier that identifies the layer associated with the NAL unit.
Furthermore, some pictures within a layer may be decoded without reference to other pictures within the same layer. Thus, NAL units encapsulating data of certain pictures of a layer may be removed from the bitstream without affecting the decodability of other pictures in the layer. For example, pictures with even picture order count (POC) values may be decodable without reference to pictures with odd POC values. Removing NAL units encapsulating the data of such pictures may reduce the frame rate of the bitstream. A subset of pictures within a layer that may be decoded without reference to other pictures within the layer may be referred to herein as a "sub-layer" or a "temporal sub-layer."
NAL units may include a nuh_temporal_id_plus1 syntax element. The nuh_temporal_id_plus1 syntax element of a NAL unit may specify a temporal identifier of the NAL unit. If the temporal identifier of a first NAL unit is less than the temporal identifier of a second NAL unit, the data encapsulated by the first NAL unit may be decoded without reference to the data encapsulated by the second NAL unit.
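As a sketch of how the two header fields described above are carried, the following parses a two-byte HEVC-style NAL unit header (1-bit forbidden_zero_bit, 6-bit nal_unit_type, 6-bit layer identifier, 3-bit nuh_temporal_id_plus1). The helper name and the returned dictionary keys are invented for illustration; the bit layout follows the HEVC NAL unit header syntax.

```python
def parse_nal_unit_header(byte0: int, byte1: int) -> dict:
    # Layout: forbidden_zero_bit (1) | nal_unit_type (6) |
    #         layer identifier (6)   | nuh_temporal_id_plus1 (3)
    nal_unit_type = (byte0 >> 1) & 0x3F
    layer_id = ((byte0 & 0x01) << 5) | ((byte1 >> 3) & 0x1F)
    temporal_id_plus1 = byte1 & 0x07
    return {
        "nal_unit_type": nal_unit_type,
        "layer_id": layer_id,                 # nuh_reserved_zero_6bits value
        "temporal_id": temporal_id_plus1 - 1, # TemporalId derivation
    }
```

For example, a prefix SEI NAL unit in the base layer with nuh_temporal_id_plus1 equal to 1 has header bytes 0x4E 0x01 and parses to temporal identifier 0.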
Each operation point of a bitstream is associated with a set of layer identifiers (i.e., a set of nuh_reserved_zero_6bits values) and a temporal identifier. The set of layer identifiers may be denoted as OpLayerIdSet, and the temporal identifier may be denoted as TemporalID. A NAL unit is associated with an operation point if the NAL unit's layer identifier is in the operation point's set of layer identifiers and the NAL unit's temporal identifier is less than or equal to the operation point's temporal identifier. An operation point representation is the bitstream subset (i.e., sub-bitstream) that is associated with the operation point. The operation point representation of an operation point may include each NAL unit that is associated with the operation point. The operation point representation does not include VCL NAL units that are not associated with the operation point.
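The membership rule above (layer identifier in OpLayerIdSet and temporal identifier at most TemporalID) translates directly into a filtering sketch. Representing NAL units as dictionaries is an assumption of this sketch; note that, mirroring the text, only VCL NAL units that are not associated with the operation point are excluded.

```python
def extract_operation_point(nal_units, op_layer_id_set, op_temporal_id):
    # A NAL unit is associated with the operation point when its layer
    # identifier is in OpLayerIdSet and its temporal identifier is less
    # than or equal to the operation point's TemporalID.
    kept = []
    for nal in nal_units:
        associated = (nal["layer_id"] in op_layer_id_set
                      and nal["temporal_id"] <= op_temporal_id)
        # Only non-associated VCL NAL units are dropped from the
        # operation point representation.
        if associated or not nal["is_vcl"]:
            kept.append(nal)
    return kept
```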
An external source may specify a set of target layer identifiers for an operation point. For example, a content delivery network (CDN) device may specify the set of target layer identifiers. In this example, the CDN device may use the set of target layer identifiers to identify an operation point. The CDN device may then extract the operation point representation for the operation point and forward the operation point representation, instead of the original bitstream, to a client device. Extracting and forwarding the operation point representation to the client device may reduce the bit rate of the bitstream.
Furthermore, video coding standards specify video buffering models. A video buffering model may also be referred to as a "hypothetical reference decoder" or "HRD." The HRD describes how data are to be buffered for decoding and how decoded data are buffered for output. For example, the HRD describes the operation of a coded picture buffer ("CPB") and a decoded picture buffer ("DPB") in a video decoder. The CPB is a first-in first-out buffer containing access units in a decoding order specified by the HRD. The DPB is a buffer holding decoded pictures for reference, output reordering, or output delay specified by the HRD.
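A minimal model of the CPB's first-in first-out behavior might look as follows. This is only an illustration of the ordering property; a normative HRD also specifies arrival and removal times, which are omitted here.

```python
from collections import deque

class CodedPictureBuffer:
    # First-in first-out: access units are removed in the decoding order
    # in which they were inserted, as the CPB description above states.
    def __init__(self):
        self._fifo = deque()

    def insert(self, access_unit):
        self._fifo.append(access_unit)

    def remove(self):
        return self._fifo.popleft()
```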
A video encoder may signal a set of HRD parameters. The HRD parameters control various aspects of the HRD. The HRD parameters may include an initial CPB removal delay, a CPB size, a bit rate, an initial DPB output delay, and a DPB size. These HRD parameters may be coded in an hrd_parameters() syntax structure specified in a VPS and/or an SPS. The HRD parameters may also be specified in buffering period SEI messages or picture timing SEI messages.
As explained above, an operation point representation may have a different frame rate and/or bit rate than the original bitstream. This is because the operation point representation may not include some pictures and/or some of the data of the original bitstream. Hence, if a video decoder were to remove data from the CPB and/or DPB at a particular rate when processing the original bitstream, and were to remove data from the CPB and/or DPB at the same rate when processing an operation point representation, the video decoder may remove too much or too little data from the CPB and/or DPB. Accordingly, the video encoder may signal different sets of HRD parameters for different operation points. In the emerging High Efficiency Video Coding (HEVC) standard, the video encoder may signal sets of HRD parameters in a VPS or in an SPS.
Optionally, the set of HRD parameter comprises the set for the common information in all time sublayers.Time sublayer is the time scalability layer of the time scalable bit stream be made up of with the non-VCL NAL unit be associated the VCL NAL unit with special time identifier.Except the set of common information, the set of HRD parameter also can comprise the set of the syntactic element specific to respective time sublayer.Because the set of common information for HRD parameter multiple set be common, so the set of common information can be sent in the set of multiple HRD parameter with signal.For in the some of the recommendations of HEVC, when the set of HRD parameter is the set of a HRD parameter in VPS, common information can be present in the set of HRD parameter, or when the set of HRD parameter is associated with the first operating point, common information can be present in the set of HRD parameter.
However, when multiple sets of HRD parameters are present in a VPS, it may be desirable to have multiple different sets of common information for the sets of HRD parameters. This may be especially true when there are large numbers of HRD parameter syntax structures in a VPS. Thus, it may be desirable to be able to have sets of common information in HRD parameter syntax structures other than the first HRD parameter syntax structure.
The techniques of this disclosure provide a design that allows the common information of HRD parameter syntax structures to be explicitly signaled for any HRD parameter syntax structure. In other words, the techniques of this disclosure may allow the information that is common to all sub-layers to be explicitly signaled for any hrd_parameters() syntax structure. This may improve coding efficiency.
Accordingly, in accordance with one or more techniques of this disclosure, a device, such as a video decoder or another device, may determine, based at least in part on a syntax element in a VPS that includes a plurality of HRD parameter syntax structures, whether a particular HRD parameter syntax structure in the VPS includes a set of HRD parameters that is common to each sub-layer of the bitstream. The device may decode the syntax element from the VPS. One or more HRD parameter syntax structures may occur before the particular HRD parameter syntax structure in decoding order in the VPS. In response to determining that the particular HRD parameter syntax structure includes a set of HRD parameters common to each sub-layer of the bitstream, the device may perform an operation using the particular HRD parameter syntax structure, including the set of HRD parameters common to each sub-layer of the bitstream.
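One way to picture this design is a list of hrd_parameters() structures in which any structure may carry its own common-information set. In the sketch below, a structure without common information inherits the most recently signaled set; that inheritance rule, the field names, and the dataclass shape are assumptions made for illustration, not the normative syntax.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class HrdParams:
    common: Optional[dict]  # info common to all sub-layers; may be absent
    per_sublayer: list      # sub-layer-specific syntax element sets

def resolve_common_info(hrd_structs):
    # Each structure either carries its own common info explicitly, or
    # (in this sketch) reuses the most recently signaled common info.
    resolved, last_common = [], None
    for hrd in hrd_structs:
        if hrd.common is not None:
            last_common = hrd.common
        resolved.append(last_common)
    return resolved
```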
In addition, a video encoder may generate scalable nesting SEI messages. A scalable nesting SEI message contains one or more SEI messages. SEI messages nested in a scalable nesting SEI message may include HRD parameters or other information associated with an operation point. Some proposals for HEVC do not allow one SEI message to be applicable to multiple operation points. This may decrease bit rate efficiency, because it may cause a video encoder to signal multiple SEI messages with the same information. Accordingly, the techniques of this disclosure may allow one SEI message to be applicable to multiple operation points. For example, a scalable nesting SEI message may include syntax elements that specify multiple operation points applicable to the SEI messages nested in the scalable nesting SEI message.
Furthermore, like NAL units of other types, an SEI NAL unit includes a NAL unit header and a NAL unit body. The NAL unit body of an SEI NAL unit may include an SEI message, such as a scalable nesting SEI message or another type of SEI message. Like other NAL units, the NAL unit header of an SEI NAL unit may include a nuh_reserved_zero_6bits syntax element and a nuh_temporal_id_plus1 syntax element. However, in some proposals for HEVC, the nuh_reserved_zero_6bits syntax element and/or the nuh_temporal_id_plus1 syntax element of the NAL unit header of an SEI NAL unit are not used to determine an operation point applicable to the SEI message (or SEI messages) encapsulated by the SEI NAL unit. These syntax elements of the SEI NAL unit header could, however, be reused in order to reduce the number of signaled bits. Accordingly, in accordance with the techniques of this disclosure, a syntax element may be signaled in a scalable nesting SEI message to indicate whether an operation point applicable to the nested SEI messages in an SEI NAL unit is the operation point indicated by the layer identification information in the NAL unit header of the SEI NAL unit. The layer identification information in the NAL unit header of the SEI NAL unit may include the nuh_reserved_zero_6bits value and the nuh_temporal_id_plus1 value of the NAL unit header.
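A sketch of the resulting applicability check follows. The flag name default_op_applicable_flag and the dictionary-based message representation are invented for illustration; the point is that the default operation point is derived from the SEI NAL unit's own header fields rather than being signaled again inside the scalable nesting SEI message.

```python
def default_operation_point(sei_nal_header):
    # The default sub-bitstream is the operation point representation of
    # the operation point defined by the layer identifier and temporal
    # identifier in the SEI NAL unit's own header.
    return ({sei_nal_header["layer_id"]}, sei_nal_header["temporal_id"])

def applicable_operation_points(nesting_sei, sei_nal_header):
    # Explicitly signaled operation points, plus the default operation
    # point when the (hypothetically named) flag indicates applicability.
    ops = list(nesting_sei.get("explicit_ops", []))
    if nesting_sei.get("default_op_applicable_flag", False):
        ops.append(default_operation_point(sei_nal_header))
    return ops
```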
FIG. 1 is a block diagram illustrating an example video coding system 10 that may utilize the techniques of this disclosure. As used herein, the term "video coder" refers generically to both video encoders and video decoders. In this disclosure, the terms "video coding" or "coding" may refer generically to video encoding or video decoding.
As shown in FIG. 1, video coding system 10 includes a source device 12 and a destination device 14. Source device 12 generates encoded video data. Accordingly, source device 12 may be referred to as a video encoding device or a video encoding apparatus. Destination device 14 may decode the encoded video data generated by source device 12. Accordingly, destination device 14 may be referred to as a video decoding device or a video decoding apparatus. Source device 12 and destination device 14 may be examples of video coding devices or video coding apparatuses.
Source device 12 and destination device 14 may comprise a wide range of devices, including desktop computers, mobile computing devices, notebook (e.g., laptop) computers, tablet computers, set-top boxes, telephone handsets such as so-called "smart" phones, televisions, cameras, display devices, digital media players, video game consoles, in-car computers, or the like.
Destination device 14 may receive encoded video data from source device 12 via a channel 16. Channel 16 may comprise one or more media or devices capable of moving the encoded video data from source device 12 to destination device 14. In one example, channel 16 may comprise one or more communication media that enable source device 12 to transmit encoded video data directly to destination device 14 in real time. In this example, source device 12 may modulate the encoded video data according to a communication standard, such as a wireless communication protocol, and may transmit the modulated video data to destination device 14. The one or more communication media may include wireless and/or wired communication media, such as a radio frequency (RF) spectrum or one or more physical transmission lines. The one or more communication media may form part of a packet-based network, such as a local area network, a wide-area network, or a global network (e.g., the Internet). The one or more communication media may include routers, switches, base stations, or other equipment that facilitates communication from source device 12 to destination device 14.
In another example, channel 16 may include a storage medium that stores encoded video data generated by source device 12. In this example, destination device 14 may access the storage medium, e.g., via disk access or card access. The storage medium may include a variety of locally-accessed data storage media such as Blu-ray discs, DVDs, CD-ROMs, flash memory, or other suitable digital storage media for storing encoded video data.
In a further example, channel 16 may include a file server or another intermediate storage device that stores encoded video data generated by source device 12. In this example, destination device 14 may access encoded video data stored at the file server or other intermediate storage device via streaming or download. The file server may be a type of server capable of storing encoded video data and transmitting the encoded video data to destination device 14. Example file servers include web servers (e.g., for a website), file transfer protocol (FTP) servers, network attached storage (NAS) devices, and local disk drives.
Destination device 14 may access the encoded video data through a standard data connection, such as an Internet connection. Example types of data connections may include wireless channels (e.g., Wi-Fi connections), wired connections (e.g., DSL, cable modem, etc.), or combinations of both that are suitable for accessing encoded video data stored on a file server. The transmission of encoded video data from the file server may be a streaming transmission, a download transmission, or a combination of both.
The techniques of this disclosure are not limited to wireless applications or settings. The techniques may be applied to video coding in support of a variety of multimedia applications, such as over-the-air television broadcasts, cable television transmissions, satellite television transmissions, streaming video transmissions, e.g., via the Internet, encoding of video data for storage on a data storage medium, decoding of video data stored on a data storage medium, or other applications. In some examples, video coding system 10 may be configured to support one-way or two-way video transmission to support applications such as video streaming, video playback, video broadcasting, and/or video telephony.
FIG. 1 is merely an example and the techniques of this disclosure may apply to video coding settings (e.g., video encoding or video decoding) that do not necessarily include any data communication between the encoding and decoding devices. In other examples, data is retrieved from a local memory, streamed over a network, or the like. A video encoding device may encode and store data to memory, and/or a video decoding device may retrieve and decode data from memory. In many examples, the encoding and decoding is performed by devices that do not communicate with one another, but simply encode data to memory and/or retrieve and decode data from memory.
In the example of FIG. 1, source device 12 includes a video source 18, a video encoder 20, and an output interface 22. In some examples, output interface 22 may include a modulator/demodulator (modem) and/or a transmitter. Video source 18 may include a video capture device, e.g., a video camera, a video archive containing previously-captured video data, a video feed interface to receive video data from a video content provider, and/or a computer graphics system for generating video data, or a combination of such sources of video data.
Video encoder 20 may encode video data from video source 18. In some examples, source device 12 directly transmits the encoded video data to destination device 14 via output interface 22. In other examples, the encoded video data may also be stored onto a storage medium or a file server for later access by destination device 14 for decoding and/or playback.
In the example of FIG. 1, destination device 14 includes an input interface 28, a video decoder 30, and a display device 32. In some examples, input interface 28 includes a receiver and/or a modem. Input interface 28 may receive encoded video data over channel 16. Display device 32 may be integrated with, or may be external to, destination device 14. In general, display device 32 displays decoded video data. Display device 32 may comprise a variety of display devices, such as a liquid crystal display (LCD), a plasma display, an organic light emitting diode (OLED) display, or another type of display device.
Video encoder 20 and video decoder 30 each may be implemented as any of a variety of suitable circuitry, such as one or more microprocessors, digital signal processors (DSPs), application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), discrete logic, hardware, or any combinations thereof. If the techniques are implemented partially in software, a device may store instructions for the software in a suitable, non-transitory computer-readable storage medium and may execute the instructions in hardware using one or more processors to perform the techniques of this disclosure. Any of the foregoing (including hardware, software, a combination of hardware and software, etc.) may be considered to be one or more processors. Each of video encoder 20 and video decoder 30 may be included in one or more encoders or decoders, either of which may be integrated as part of a combined encoder/decoder (CODEC) in a respective device.
This disclosure may generally refer to video encoder 20 "signaling" certain information to another device, such as video decoder 30. The term "signaling" may generally refer to the communication of syntax elements and/or other data used to decode the compressed video data. Such communication may occur in real time or near real time. Alternately, such communication may occur over a span of time, such as might occur when storing syntax elements to a computer-readable storage medium in an encoded bitstream at the time of encoding, which then may be retrieved by a decoding device at any time after being stored to this medium.
In some examples, video encoder 20 and video decoder 30 operate according to a video compression standard, such as ISO/IEC MPEG-4 Visual and ITU-T H.264 (also known as ISO/IEC MPEG-4 AVC), including its Scalable Video Coding (SVC) extension, Multiview Video Coding (MVC) extension, and MVC-based 3DV extension. In some instances, any bitstream conforming to the MVC-based 3DV extension always contains a sub-bitstream that is compliant to a profile of MVC (e.g., the stereo high profile). Furthermore, there is an ongoing effort to generate a three-dimensional video (3DV) coding extension to H.264/AVC, namely AVC-based 3DV. In other examples, video encoder 20 and video decoder 30 may operate according to ITU-T H.261, ISO/IEC MPEG-1 Visual, ITU-T H.262 or ISO/IEC MPEG-2 Visual, and ITU-T H.264, ISO/IEC Visual.
In other examples, video encoder 20 and video decoder 30 may operate according to the High Efficiency Video Coding (HEVC) standard presently under development by the Joint Collaboration Team on Video Coding (JCT-VC) of ITU-T Video Coding Experts Group (VCEG) and ISO/IEC Motion Picture Experts Group (MPEG). A draft of the upcoming HEVC standard, referred to as "HEVC Working Draft 8," is described in Bross et al., "High Efficiency Video Coding (HEVC) text specification draft 8," Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11, 10th Meeting, Stockholm, Sweden, July 2012, which as of June 13, 2013, is available from http://phenix.int-evry.fr/jct/doc_end_user/documents/10_Stockholm/wg11/JCTVC-J1003-v8.zip. Another draft of the upcoming HEVC standard, referred to as "HEVC Working Draft 9," is described in Bross et al., "High Efficiency Video Coding (HEVC) text specification draft 9," Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11, 11th Meeting, Shanghai, China, October 2012, which as of June 13, 2013, is available from http://phenix.int-evry.fr/jct/doc_end_user/documents/11_Shanghai/wg11/JCTVC-K1003-v13.zip. Furthermore, there are ongoing efforts to produce SVC, MVC, and 3DV extensions for HEVC. The 3DV extension of HEVC may be referred to as HEVC-based 3DV or HEVC-3DV.
In HEVC and other video coding standards, a video sequence typically includes a series of pictures. Pictures may also be referred to as "frames." A picture may include three sample arrays, denoted SL, SCb, and SCr. SL is a two-dimensional array (i.e., a block) of luma samples. SCb is a two-dimensional array of Cb chrominance samples. SCr is a two-dimensional array of Cr chrominance samples. Chrominance samples may also be referred to herein as "chroma" samples. In other instances, a picture may be monochrome and may only include an array of luma samples.
To generate an encoded representation of a picture, video encoder 20 may generate a set of coding tree units (CTUs). Each of the CTUs may be a coding tree block of luma samples, two corresponding coding tree blocks of chroma samples, and syntax structures used to code the samples of the coding tree blocks. A coding tree block may be an NxN block of samples. A CTU may also be referred to as a "tree block" or a "largest coding unit" (LCU). The CTUs of HEVC may be broadly analogous to the macroblocks of other standards, such as H.264/AVC. However, a CTU is not necessarily limited to a particular size and may include one or more coding units (CUs). A slice may include an integer number of CTUs ordered consecutively in the raster scan.
To generate a coded CTU, video encoder 20 may recursively perform quad-tree partitioning on the coding tree blocks of a CTU to divide the coding tree blocks into coding blocks, hence the name "coding tree units." A coding block is an NxN block of samples. A CU may be a coding block of luma samples and two corresponding coding blocks of chroma samples of a picture that has a luma sample array, a Cb sample array, and a Cr sample array, together with syntax structures used to code the samples of the coding blocks. Video encoder 20 may partition a coding block of a CU into one or more prediction blocks. A prediction block may be a rectangular (i.e., square or non-square) block of samples on which the same prediction is applied. A prediction unit (PU) of a CU may be a prediction block of luma samples, two corresponding prediction blocks of chroma samples of a picture, and syntax structures used to predict the prediction block samples. Video encoder 20 may generate predictive luma, Cb, and Cr blocks for the luma, Cb, and Cr prediction blocks of each PU of the CU.
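The recursive quad-tree partitioning described above can be sketched as follows. This is a minimal illustration, not encoder code from the disclosure; the `should_split` callback stands in for the encoder's actual split decision logic (e.g., rate-distortion optimization) and is an assumption of this sketch.

```python
def quadtree_partition(x, y, size, min_size, should_split):
    """Recursively split an NxN coding tree block into coding blocks.
    Returns (x, y, size) tuples for the leaf blocks in z-scan order:
    top-left, top-right, bottom-left, bottom-right."""
    if size > min_size and should_split(x, y, size):
        half = size // 2
        leaves = []
        for dy in (0, half):          # top row of quadrants first
            for dx in (0, half):      # left quadrant before right
                leaves.extend(quadtree_partition(x + dx, y + dy, half,
                                                 min_size, should_split))
        return leaves
    return [(x, y, size)]
```

For example, splitting only at the top level turns a 64x64 coding tree block into four 32x32 coding blocks.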
Video encoder 20 may use intra prediction or inter prediction to generate the predictive blocks for a PU. If video encoder 20 uses intra prediction to generate the predictive blocks of a PU, video encoder 20 may generate the predictive blocks of the PU based on decoded samples of the picture associated with the PU.
If video encoder 20 uses inter prediction to generate the predictive blocks of a PU, video encoder 20 may generate the predictive blocks of the PU based on decoded samples of one or more pictures other than the picture associated with the PU. Video encoder 20 may use uni-prediction or bi-prediction to generate the predictive blocks of a PU. When video encoder 20 uses uni-prediction to generate the predictive blocks for a PU, the PU may have a single motion vector. When video encoder 20 uses bi-prediction to generate the predictive blocks for a PU, the PU may have two motion vectors.
After video encoder 20 generates predictive luma, Cb, and Cr blocks for one or more PUs of a CU, video encoder 20 may generate a luma residual block for the CU. Each sample in the CU's luma residual block indicates a difference between a luma sample in one of the CU's predictive luma blocks and a corresponding sample in the CU's original luma coding block. In addition, video encoder 20 may generate a Cb residual block for the CU. Each sample in the CU's Cb residual block may indicate a difference between a Cb sample in one of the CU's predictive Cb blocks and a corresponding sample in the CU's original Cb coding block. Video encoder 20 may also generate a Cr residual block for the CU. Each sample in the CU's Cr residual block may indicate a difference between a Cr sample in one of the CU's predictive Cr blocks and a corresponding sample in the CU's original Cr coding block.
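The per-sample residual computation described above reduces to a simple elementwise difference, sketched here with plain nested lists (the disclosure does not prescribe any particular data layout):

```python
def residual_block(original, predictive):
    """Each residual sample is the difference between the original coding
    block sample and the corresponding predictive block sample."""
    return [[orig - pred for orig, pred in zip(orig_row, pred_row)]
            for orig_row, pred_row in zip(original, predictive)]
```

The decoder performs the reciprocal step, adding the reconstructed residual back to the predictive block.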
Furthermore, video encoder 20 may use quad-tree partitioning to decompose the luma, Cb, and Cr residual blocks of a CU into one or more luma, Cb, and Cr transform blocks. A transform block may be a rectangular block of samples on which the same transform is applied. A transform unit (TU) of a CU may be a transform block of luma samples, two corresponding transform blocks of chroma samples, and syntax structures used to transform the transform block samples. Thus, each TU of a CU may be associated with a luma transform block, a Cb transform block, and a Cr transform block. The luma transform block associated with a TU may be a sub-block of the CU's luma residual block. The Cb transform block may be a sub-block of the CU's Cb residual block. The Cr transform block may be a sub-block of the CU's Cr residual block.
Video encoder 20 may apply one or more transforms to a luma transform block of a TU to generate a luma coefficient block for the TU. A coefficient block may be a two-dimensional array of transform coefficients. A transform coefficient may be a scalar quantity. Video encoder 20 may apply one or more transforms to a Cb transform block of a TU to generate a Cb coefficient block for the TU. Video encoder 20 may apply one or more transforms to a Cr transform block of a TU to generate a Cr coefficient block for the TU.
After generating a coefficient block (e.g., a luma coefficient block, a Cb coefficient block, or a Cr coefficient block), video encoder 20 may quantize the coefficient block. Quantization generally refers to a process in which transform coefficients are quantized to possibly reduce the amount of data used to represent the transform coefficients, providing further compression. After video encoder 20 quantizes a coefficient block, video encoder 20 may entropy encode syntax elements indicating the quantized transform coefficients. For example, video encoder 20 may perform Context-Adaptive Binary Arithmetic Coding (CABAC) on the syntax elements indicating the quantized transform coefficients. Video encoder 20 may output the entropy-encoded syntax elements in a bitstream.
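The lossy nature of quantization can be illustrated with a toy uniform scalar quantizer. This is a conceptual sketch only, assuming a simple truncating quantizer; it does not reproduce HEVC's actual quantization formulas (which involve QP-derived scaling and rounding offsets).

```python
def quantize(coefficient, step):
    """Map a transform coefficient to a quantization level by dividing
    its magnitude by the quantization step (truncating toward zero)."""
    sign = -1 if coefficient < 0 else 1
    return sign * (abs(coefficient) // step)

def dequantize(level, step):
    """Reconstruct an approximation of the coefficient from its level.
    The difference from the original coefficient is the quantization error."""
    return level * step
```

Round-tripping a coefficient through quantize/dequantize loses precision, which is exactly the data reduction the paragraph above describes.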
Video encoder 20 may output a bitstream that includes the entropy-encoded syntax elements. The bitstream may include a sequence of bits that forms a representation of coded pictures and associated data. The bitstream may comprise a sequence of network abstraction layer (NAL) units. Each of the NAL units includes a NAL unit header and encapsulates a raw byte sequence payload (RBSP). The NAL unit header may include a syntax element that indicates a NAL unit type code. The NAL unit type code specified by the NAL unit header of a NAL unit indicates the type of the NAL unit. A RBSP may be a syntax structure containing an integer number of bytes that is encapsulated within a NAL unit. In some instances, an RBSP includes zero bits.
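Under the HEVC Working Draft 8 layout referenced in this disclosure, the NAL unit header occupies two bytes: forbidden_zero_bit (1 bit), nal_unit_type (6 bits), nuh_reserved_zero_6bits (6 bits), and nuh_temporal_id_plus1 (3 bits). A minimal parser, offered as an illustrative sketch:

```python
def parse_nal_unit_header(data: bytes) -> dict:
    """Unpack the two-byte HEVC NAL unit header (Working Draft 8 layout):
    1 bit forbidden_zero_bit, 6 bits nal_unit_type,
    6 bits nuh_reserved_zero_6bits, 3 bits nuh_temporal_id_plus1."""
    if len(data) < 2:
        raise ValueError("a NAL unit header is two bytes")
    b0, b1 = data[0], data[1]
    return {
        "forbidden_zero_bit": b0 >> 7,
        "nal_unit_type": (b0 >> 1) & 0x3F,
        # the 6-bit layer field straddles the byte boundary
        "nuh_reserved_zero_6bits": ((b0 & 0x01) << 5) | (b1 >> 3),
        "nuh_temporal_id_plus1": b1 & 0x07,
    }
```

This is the layer identification information that, per the techniques of this disclosure, may also identify the operating point applicable to a nested SEI message.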
Different types of NAL units may encapsulate different types of RBSPs. For example, a first type of NAL unit may encapsulate an RBSP for a picture parameter set (PPS), a second type of NAL unit may encapsulate an RBSP for a coded slice, a third type of NAL unit may encapsulate an RBSP for SEI, and so on. NAL units that encapsulate RBSPs for video coding data (as opposed to RBSPs for parameter sets and SEI messages) may be referred to as video coding layer (VCL) NAL units.
Video decoder 30 may receive a bitstream generated by video encoder 20. In addition, video decoder 30 may parse the bitstream to decode syntax elements from the bitstream. Video decoder 30 may reconstruct the pictures of the video data based at least in part on the syntax elements decoded from the bitstream. The process to reconstruct the video data may be generally reciprocal to the process performed by video encoder 20. For instance, video decoder 30 may use motion vectors of PUs to determine predictive blocks for the PUs of a current CU. In addition, video decoder 30 may inverse quantize transform coefficient blocks associated with TUs of the current CU. Video decoder 30 may perform inverse transforms on the transform coefficient blocks to reconstruct transform blocks associated with the TUs of the current CU. Video decoder 30 may reconstruct the coding blocks of the current CU by adding the samples of the predictive blocks for PUs of the current CU to corresponding samples of the transform blocks of the TUs of the current CU. By reconstructing the coding blocks for each CU of a picture, video decoder 30 may reconstruct the picture.
In multi-view coding, there may be multiple views of the same scene from different viewpoints. The term "access unit" is used to refer to the set of pictures that correspond to the same time instance. Thus, video data may be conceptualized as a series of access units occurring over time. A "view component" may be a coded representation of a view in a single access unit. In this disclosure, a "view" may refer to a sequence of view components associated with the same view identifier.
Multi-view coding supports inter-view prediction. Inter-view prediction is similar to the inter prediction used in HEVC and may use the same syntax elements. However, when a video coder performs inter-view prediction on a current video unit (such as a PU), video encoder 20 may use, as a reference picture, a picture that is in the same access unit as the current video unit, but in a different view. In contrast, conventional inter prediction only uses pictures in different access units as reference pictures.
In multi-view coding, a view may be referred to as a "base view" if a video decoder (e.g., video decoder 30) can decode pictures in the view without reference to pictures in any other view. When coding a picture in one of the non-base views, a video coder (such as video encoder 20 or video decoder 30) may add a picture into a reference picture list if the picture is in a different view but within the same time instance (i.e., access unit) as the picture that the video coder is currently coding. Like other inter prediction reference pictures, the video coder may insert an inter-view prediction reference picture at any position of a reference picture list.
Video coding standards specify buffering models. In H.264/AVC and HEVC, the buffering model is referred to as a "hypothetical reference decoder" or "HRD." In HEVC Working Draft 8, the HRD is described in Annex C.
The HRD describes how data are to be buffered for decoding and how decoded data are buffered for output. For instance, the HRD describes the operation of a CPB, a decoded picture buffer ("DPB"), and a video decoding process. The CPB is a first-in first-out buffer containing access units in a decoding order specified by the HRD. The DPB is a buffer holding decoded pictures for reference, output reordering, or output delay, as specified by the HRD. The behaviors of the CPB and DPB may be mathematically specified. The HRD may directly impose constraints on timing, buffer sizes, and bit rates. Furthermore, the HRD may indirectly impose constraints on various bitstream characteristics and statistics.
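The first-in first-out behavior of the CPB, and the overflow/underflow conditions checked during conformance testing, can be modeled with a few lines of code. This is a deliberately simplified sketch (it tracks only fill level in bits and ignores arrival and removal timing, which the real HRD specifies mathematically).

```python
from collections import deque

class CodedPictureBuffer:
    """Minimal FIFO model of the CPB: access units enter in decoding order
    and are removed whole; exceeding capacity is an overflow, and removing
    from an empty buffer is an underflow."""
    def __init__(self, capacity_bits):
        self.capacity = capacity_bits
        self.fill = 0
        self._queue = deque()

    def insert(self, access_unit_bits):
        if self.fill + access_unit_bits > self.capacity:
            raise OverflowError("CPB overflow")
        self._queue.append(access_unit_bits)
        self.fill += access_unit_bits

    def remove(self):
        if not self._queue:
            raise RuntimeError("CPB underflow")
        access_unit_bits = self._queue.popleft()
        self.fill -= access_unit_bits
        return access_unit_bits
```

A conformance checker would drive such a buffer with the schedule implied by the signaled HRD parameters and flag any overflow or underflow.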
In H.264/AVC and HEVC, bitstream conformance and decoder conformance are specified as parts of the HRD specification. In other words, the HRD model specifies tests to determine whether a bitstream conforms to a standard and tests to determine whether a decoder conforms to the standard. Though the HRD is named as a kind of decoder, video encoders typically use the HRD to guarantee bitstream conformance, while video decoders typically do not need the HRD.
H.264/AVC and HEVC both specify two types of bitstream or HRD conformance, namely Type I and Type II. A Type I bitstream is a NAL unit stream containing only the VCL NAL units and filler data NAL units for all access units in the bitstream. A Type II bitstream is a NAL unit stream that contains, in addition to the VCL NAL units and filler data NAL units for all access units in the bitstream, at least one of the following: additional non-VCL NAL units other than filler data NAL units; and all leading_zero_8bits, zero_byte, start_coded_prefix_one_3bytes, and trailing_zero_8bits syntax elements that form a byte stream from the NAL unit stream.
When a device performs a bitstream conformance test that determines whether a bitstream conforms to a video coding standard, the device may select an operating point of the bitstream. The device may then determine a set of HRD parameters applicable to the selected operating point. The device may use the set of HRD parameters applicable to the selected operating point to configure the behavior of the HRD. More particularly, the device may use the applicable set of HRD parameters to configure the behaviors of particular components of the HRD, such as a hypothetical stream scheduler (HSS), the CPB, a decoding process, the DPB, and so on. Subsequently, the HSS may inject coded video data of the bitstream into the CPB of the HRD according to a particular schedule. Furthermore, the device may invoke a decoding process that decodes the coded video data in the CPB. The decoding process may output decoded pictures to the DPB. As the device moves data through the HRD, the device may determine whether a particular set of constraints remains satisfied. For example, the device may determine whether an overflow or underflow condition occurs in the CPB or DPB while the HRD is decoding the operating point representation of the selected operating point. The device may select and process each operating point of the bitstream in this manner. If no operating point of the bitstream causes the constraints to be violated, the device may determine that the bitstream conforms to the video coding standard.
H.264/AVC and HEVC both specify two types of decoder conformance, namely output timing decoder conformance and output order decoder conformance. A decoder claiming conformance to a specific profile, tier, and level is required to successfully decode all bitstreams that conform to the bitstream conformance requirements of a video coding standard, such as HEVC. In this disclosure, a "profile" may refer to a subset of the bitstream syntax. "Tiers" and "levels" may be specified within each profile. A level of a tier may be a specified set of constraints imposed on values of the syntax elements in the bitstream. These constraints may be simple limits on values. Alternatively, the constraints may take the form of constraints on arithmetic combinations of values (e.g., picture width multiplied by picture height multiplied by the number of pictures decoded per second). Typically, a level specified for a lower tier is more constrained than a level specified for a higher tier.
When a device performs a decoder conformance test to determine whether a decoder under test (DUT) conforms to a video coding standard, the device may provide, to both the HRD and the DUT, a bitstream that conforms to the video coding standard. The HRD may process the bitstream in the manner described above with regard to the bitstream conformance test. The device may determine that the DUT conforms to the video coding standard if the order of decoded pictures output by the DUT matches the order of decoded pictures output by the HRD. Moreover, the device may determine that the DUT conforms to the video coding standard if the timing with which the DUT outputs decoded pictures matches the timing with which the HRD outputs the decoded pictures.
In the H.264/AVC and HEVC HRD models, decoding or CPB removal may be access-unit based. That is, the HRD is assumed to decode complete access units at one time and to remove complete access units from the CPB. Furthermore, in the H.264/AVC and HEVC HRD models, it is assumed that picture decoding is instantaneous. Video encoder 20 may signal, in picture timing SEI messages, decoding times to start decoding of access units. In practical applications, if a conforming video decoder strictly follows the decoding times signaled to start decoding of access units, the earliest possible time to output a particular decoded picture is equal to the decoding time of that particular picture plus the time needed for decoding that particular picture. However, in the real world, the time needed for decoding a picture cannot be equal to zero.
HRD parameters may control various aspects of the HRD. In other words, the HRD may rely on the HRD parameters. The HRD parameters may include an initial CPB removal delay, a CPB size, a bit rate, an initial DPB output delay, and a DPB size. Video encoder 20 may signal these HRD parameters in a hrd_parameters( ) syntax structure specified in a video parameter set (VPS) and/or a sequence parameter set (SPS). Individual VPSs and/or SPSs may include multiple hrd_parameters( ) syntax structures for different sets of HRD parameters. In some examples, video encoder 20 may signal HRD parameters in buffering period SEI messages or picture timing SEI messages.
As explained above, an operating point of a bitstream is associated with a set of layer identifiers (i.e., a set of nuh_reserved_zero_6bits values) and a temporal identifier. An operating point representation may include each NAL unit that is associated with the operating point. An operating point representation may have a different frame rate and/or bit rate than the original bitstream. This is because the operating point representation may not include some pictures and/or some of the data of the original bitstream. Hence, if video decoder 30 were to remove data from the CPB and/or the DPB at a particular rate when processing the original bitstream, and if video decoder 30 were to remove data from the CPB and/or the DPB at the same rate when processing an operating point representation, video decoder 30 may remove too much or too little data from the CPB and/or the DPB. Accordingly, video encoder 20 may signal different sets of HRD parameters for different operating points. For instance, video encoder 20 may include, in a VPS, multiple hrd_parameters( ) syntax structures that include the HRD parameters for different operating points.
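Deriving an operating point representation from the original bitstream amounts to filtering NAL units by their layer identification information. The following is a simplified sketch of that sub-bitstream extraction idea (the full HEVC extraction process has additional conditions that are omitted here).

```python
def extract_operating_point(nal_units, layer_id_set, highest_temporal_id):
    """Keep the NAL units whose layer identifier (nuh_reserved_zero_6bits)
    belongs to the operating point's layer identifier set and whose temporal
    identifier (nuh_temporal_id_plus1 - 1) does not exceed the operating
    point's highest temporal identifier."""
    return [nal for nal in nal_units
            if nal["nuh_reserved_zero_6bits"] in layer_id_set
            and nal["nuh_temporal_id_plus1"] - 1 <= highest_temporal_id]
```

Because extraction drops NAL units, the resulting operating point representation has a lower bit rate (and possibly frame rate) than the original bitstream, which is why it warrants its own set of HRD parameters.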
In HEVC Working Draft 8, the set of HRD parameters optionally includes a set of information that is common for all temporal sub-layers. In other words, the set of HRD parameters may optionally include a set of common syntax elements that are applicable to operating points that include any temporal sub-layers. A temporal sub-layer may be a temporal scalable layer of a temporal scalable bitstream consisting of VCL NAL units with a particular value of TemporalId and the associated non-VCL NAL units. In addition to the set of common information, the sets of HRD parameters may include sets of syntax elements that are specific to individual temporal sub-layers. For instance, the hrd_parameters( ) syntax structure optionally includes a set of information that is common for all sub-layers and always includes sub-layer-specific information. Because the set of common information is common to multiple sets of HRD parameters, it may be unnecessary to signal the set of common information in multiple sets of HRD parameters. Rather, in HEVC Working Draft 8, the common information may be present in a set of HRD parameters when the set of HRD parameters is the first set of HRD parameters in a VPS, or when the set of HRD parameters is associated with a first operating point index. For instance, HEVC Working Draft 8 supports the presence of the common information when the hrd_parameters( ) syntax structure is the first hrd_parameters( ) syntax structure in the VPS or when the hrd_parameters( ) syntax structure is associated with a first operating point index.
Table 1, below, is an example syntax structure for the hrd_parameters( ) syntax structure in HEVC.
Table 1 - HRD parameters
In the example of Table 1, above, and in the other syntax tables of this disclosure, syntax elements with the type descriptor ue(v) may be variable-length unsigned integers encoded using 0th-order Exponential-Golomb (Exp-Golomb) coding with the left bit first. In the example of Table 1 and the following tables, syntax elements having descriptors of the form u(n), where n is a non-negative integer, are unsigned values of length n.
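A 0th-order Exp-Golomb ue(v) codeword consists of a run of leading zero bits, a one bit, and then as many information bits as there were leading zeros; the codeword read as a binary number equals the coded value plus one. A minimal decoder, sketched over a bit string for clarity:

```python
def decode_ue(bits: str):
    """Decode one 0th-order Exp-Golomb ue(v) codeword, left bit first.
    `bits` is a string of '0'/'1' characters; returns a tuple of
    (decoded value, number of bits consumed)."""
    leading_zero_bits = 0
    while bits[leading_zero_bits] == '0':
        leading_zero_bits += 1
    length = 2 * leading_zero_bits + 1
    # The whole codeword interpreted as binary equals value + 1,
    # e.g. '00100' -> 4 -> value 3.
    return int(bits[:length], 2) - 1, length
```

So the codewords '1', '010', '011', '00100', ... decode to 0, 1, 2, 3, ..., which is why small syntax element values cost the fewest bits.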
In the example syntax of Table 1, the syntax elements in the "if( commonInfPresentFlag ) { ... }" block are the common information of the HRD parameter syntax structure. In other words, the common information of a set of HRD parameters may include the syntax elements timing_info_present_flag, num_units_in_tick, time_scale, nal_hrd_parameters_present_flag, vcl_hrd_parameters_present_flag, sub_pic_cpb_params_present_flag, tick_divisor_minus2, du_cpb_removal_delay_length_minus1, bit_rate_scale, cpb_size_scale, initial_cpb_removal_delay_length_minus1, cpb_removal_delay_length_minus1, and dpb_output_delay_length_minus1.
Furthermore, in the example of Table 1, the syntax elements fixed_pic_rate_flag[i], pic_duration_in_tc_minus1[i], low_delay_hrd_flag[i], and cpb_cnt_minus1[i] may be a set of sub-layer-specific HRD parameters. In other words, these syntax elements of the hrd_parameters() syntax structure may only be applicable to operating points that include specific sub-layers. Thus, in addition to the optionally-included common information, the HRD parameters of a hrd_parameters() syntax structure may include a set of sub-layer-specific HRD parameters that is specific to a particular sub-layer of the bitstream.
The fixed_pic_rate_flag[i] syntax element may indicate that, when HighestTid is equal to i, the temporal distance between the HRD output times of any two consecutive pictures in output order is constrained in a specific way. HighestTid may be a variable that identifies the highest temporal sub-layer (e.g., of an operating point). The pic_duration_in_tc_minus1[i] syntax element may specify, when HighestTid is equal to i, the temporal distance, in clock ticks, between the HRD output times of any consecutive pictures in output order in the coded video sequence. The low_delay_hrd_flag[i] syntax element may specify the HRD operational mode, when HighestTid is equal to i, as specified in Annex C of HEVC Working Draft 8. The cpb_cnt_minus1[i] syntax element may specify the number of alternative CPB specifications in the bitstream of the coded video sequence when HighestTid is equal to i, where an alternative CPB specification refers to one particular CPB operation with one particular set of CPB parameters.
Video encoder 20 may use SEI messages to include, in the bitstream, metadata that is not required for correct decoding of the sample values of pictures. However, video decoder 30 or other devices may use the metadata included in SEI messages for various other purposes. For example, video decoder 30 or another device may use the metadata in SEI messages for picture output timing, picture displaying, loss detection, and error concealment.
Video encoder 20 may include one or more SEI NAL units in an access unit. In other words, any number of SEI NAL units may be associated with an access unit. Furthermore, each SEI NAL unit may contain one or more SEI messages. The HEVC standard describes the syntax and semantics for various types of SEI messages. However, the HEVC standard does not describe the handling of SEI messages, because the SEI messages do not affect the normative decoding process. One reason to have SEI messages in the HEVC standard is to enable supplemental data to be interpreted in the same manner by different systems using HEVC. Specifications and systems using HEVC may require video encoders to generate certain SEI messages or may define specific handling of particular types of received SEI messages. Table 2, below, lists SEI messages specified in HEVC and briefly describes their purposes.
Table 2 - Overview of SEI messages
U.S. Provisional Patent Application 61/705,102, filed September 24, 2012, describes various methods for the signaling and selection of HRD parameters, including the signaling and selection of delay and timing information in SEI messages. Hannuksela et al., "AHG9: Operation points in VPS and nesting SEI," Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, 11th Meeting, Shanghai, CN, 10-19 October 2012, document no. JCTVC-K0180v1 (available, as of June 13, 2013, from http://phenix.int-evry.fr/jct/doc_end_user/documents/11_Shanghai/wg11/JCTVC-K0180-v1.zip), provides another method for the signaling of HRD parameters and a mechanism for nesting SEI messages.
There are several problems or shortcomings with existing techniques for signaling HRD parameters. For example, the existing techniques do not allow a set of HRD parameters to be shared by multiple operating points. However, when the number of operating points is high, it may be a burden for video encoder 20, or another unit that attempts to ensure the conformance of a bitstream, to generate a different set of HRD parameters for each operating point. Rather, the conformance of the bitstream can be ensured by ensuring that each operating point is associated with a set of HRD parameters, while a particular set of HRD parameters may be shared by multiple operating points. One or more techniques of this disclosure may provide a design that allows a set of HRD parameters to be shared by multiple operating points. In other words, a single set of HRD parameters may be applicable to multiple operating points. This design may allow video encoder 20, or another unit that attempts to ensure the conformance of a bitstream, to trade off between complexity and performance.
In another example of a problem or shortcoming of the existing techniques for signaling HRD parameters, when there are multiple sets of HRD parameters in a VPS, it may be desirable to have multiple different sets of common information for the sets of HRD parameters. This may be especially true when there is a large number of HRD parameter syntax structures in the VPS. Thus, it may be desirable to have sets of common information in HRD parameter syntax structures other than the first HRD parameter syntax structure. For example, when there are multiple hrd_parameters() syntax structures in a VPS, and particularly when the total number of hrd_parameters() syntax structures is relatively high, it may be desirable, for improved performance, to have different common information for hrd_parameters() syntax structures other than the first hrd_parameters() syntax structure, or other than the one associated with the first operating point index.
One or more techniques of this disclosure provide a design that allows the common information of a set of HRD parameters to be explicitly signaled for any set of HRD parameters. For example, the techniques of this disclosure may allow the information that is common to all sub-layers to be explicitly signaled for any hrd_parameters() syntax structure.
In this way, video encoder 20 may signal, in a bitstream, a VPS that includes a plurality of HRD parameter syntax structures, each of which includes HRD parameters. For each respective HRD parameter syntax structure in the plurality of HRD parameter syntax structures, the VPS further includes a syntax element indicating whether the HRD parameters of the respective HRD parameter syntax structure include, in addition to a set of sub-layer-specific HRD parameter information specific to a particular sub-layer of the bitstream, a common set of HRD parameters. The common set of HRD parameters is common to all sub-layers of the bitstream.
Similarly, video decoder 30 or another device may decode, from a bitstream, a VPS that includes a plurality of HRD parameter syntax structures, each of which includes HRD parameters. For each respective HRD parameter syntax structure in the plurality of HRD parameter syntax structures, the VPS may further include a syntax element indicating whether the HRD parameters of the respective HRD parameter syntax structure include a common set of HRD parameters. Video decoder 30 or the other device may perform an operation using the HRD parameters of at least one of the HRD parameter syntax structures.
Furthermore, existing methods for nesting SEI messages may have several problems or shortcomings. For example, the existing techniques for nesting SEI messages may not allow one SEI message to be applicable to multiple operating points. The techniques of this disclosure may provide a design that allows an SEI message to be applicable to multiple operating points.
In particular, a scalable nesting SEI message may include syntax elements that specify multiple operating points applicable to the SEI messages nested within the scalable nesting SEI message. In other words, a scalable nesting SEI message may provide a mechanism for associating SEI messages with bitstream subsets (e.g., operating point representations) or with specific layers and sub-layers.
In this way, video encoder 20 may generate a scalable nesting SEI message that includes a plurality of syntax elements that identify multiple operating points to which a nested SEI message encapsulated by the scalable nesting SEI message is applicable. Furthermore, video encoder 20 may signal the scalable nesting SEI message in a bitstream.
Likewise, in a video coding process, video decoder 30 or another device may decode, from a scalable nesting SEI message, a plurality of syntax elements that identify multiple operating points to which a nested SEI message encapsulated by the scalable nesting SEI message is applicable. Furthermore, video decoder 30 or the other device may perform an operation based at least in part on one or more of the syntax elements of the nested SEI message.
Another example of a problem or shortcoming of the existing techniques for nesting SEI messages relates to the fact that those techniques do not use the value of the layer identifier syntax element (e.g., nuh_reserved_zero_6bits) in the current SEI NAL unit to determine an operating point applicable to the scalable nesting SEI message encapsulated by the current SEI NAL unit.
The techniques of this disclosure provide a design for signaling whether an operating point applicable to the nested SEI messages in an SEI NAL unit is the operating point indicated by the layer identification information in the NAL unit header of the SEI NAL unit. The layer identification information in the NAL unit header of an SEI NAL unit may include the value of nuh_reserved_zero_6bits and the value of nuh_temporal_id_plus1 of the NAL unit header. In other words, the techniques of this disclosure may provide a design for signaling whether the nested SEI messages are applicable to a default operating point identified by the layer identification information included in the NAL unit header of the current SEI NAL unit (i.e., the SEI NAL unit containing the scalable nesting SEI message), and for using the layer identification information (e.g., the value of nuh_reserved_zero_6bits and the value of nuh_temporal_id_plus1) in the NAL unit header of the current SEI NAL unit.
In this way, video encoder 20 may include, in a scalable nesting SEI message encapsulated by an SEI NAL unit, a syntax element that indicates whether a nested SEI message encapsulated by the scalable nesting SEI message is applicable to a default sub-bitstream. The default sub-bitstream is an operating point representation of an operating point defined by a layer identifier specified in the NAL unit header of the SEI NAL unit and a temporal identifier specified in the NAL unit header. Furthermore, video encoder 20 may output a bitstream that includes the scalable nesting SEI message.
Similarly, a device, such as video decoder 30 or another device, may determine, based at least in part on a syntax element in a scalable nesting SEI message encapsulated by an SEI NAL unit, whether a nested SEI message encapsulated by the scalable nesting SEI message is applicable to a default sub-bitstream. As indicated above, the default sub-bitstream is an operating point representation of an operating point defined by a layer identifier specified in the NAL unit header of the SEI NAL unit and a temporal identifier specified in the NAL unit header. When the nested SEI message is applicable to the default sub-bitstream, the device may use the nested SEI message in an operation on the default sub-bitstream. For example, the nested SEI message may include one or more HRD parameters. In this example, the device may use the one or more HRD parameters to perform a bitstream conformance test that determines whether the default sub-bitstream conforms to a video coding standard (e.g., HEVC). Alternatively, in this example, the device may use the one or more HRD parameters to determine whether video decoder 30 satisfies a decoder conformance test.
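The default-sub-bitstream determination described above can be sketched as follows. This is a hypothetical illustration under the simplifying assumption that an operating point can be summarized as a (layer identifier, temporal identifier) pair; the field names mirror the NAL unit header syntax elements named in the text, and the flag name is an assumption, not the normative syntax element name.

```python
from dataclasses import dataclass

@dataclass
class NalUnitHeader:
    nuh_reserved_zero_6bits: int   # layer identifier
    nuh_temporal_id_plus1: int     # temporal identifier + 1

def applicable_operating_points(sei_nal_header, default_op_flag, explicit_ops):
    """Return the operating points, as (layer_id, temporal_id) pairs, to
    which the nested SEI messages apply.  When the default-sub-bitstream
    flag is set, the operating point is derived from the encapsulating SEI
    NAL unit's own header; otherwise an explicitly signaled list is used."""
    if default_op_flag:
        return [(sei_nal_header.nuh_reserved_zero_6bits,
                 sei_nal_header.nuh_temporal_id_plus1 - 1)]
    return list(explicit_ops)
```

A decoder-side device could then run, e.g., a bitstream conformance test for each returned operating point.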
In another example of a problem or shortcoming of existing methods for nesting SEI messages, explicit coding of layer identifiers is inefficient. The techniques of this disclosure may increase the efficiency of explicit coding of layer identifiers through differential coding or through coding using flags.
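The differential-coding idea mentioned above can be illustrated with a small sketch. This is a generic example of delta coding a strictly increasing list of layer identifiers, under assumed conventions (absolute first value, then delta-minus-1 values); it is not the specific syntax of this disclosure's Table 4.

```python
def encode_layer_ids_differential(layer_ids):
    """Differentially code a strictly increasing list of layer identifiers:
    the first value is coded as-is, each later value as (delta - 1),
    which is never negative and is typically a smaller number to code."""
    coded = [layer_ids[0]]
    for prev, cur in zip(layer_ids, layer_ids[1:]):
        coded.append(cur - prev - 1)
    return coded

def decode_layer_ids_differential(coded):
    """Invert encode_layer_ids_differential."""
    ids = [coded[0]]
    for d in coded[1:]:
        ids.append(ids[-1] + d + 1)
    return ids
```

Smaller coded values matter because variable-length codes such as ue(v) spend fewer bits on small numbers.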
FIG. 2 is a block diagram illustrating an example video encoder 20 that may implement the techniques of this disclosure. FIG. 2 is provided for purposes of explanation and should not be considered limiting of the techniques as broadly exemplified and described in this disclosure. For purposes of explanation, this disclosure describes video encoder 20 in the context of HEVC coding. However, the techniques of this disclosure may be applicable to other coding standards or methods.
In the example of FIG. 2, video encoder 20 includes a prediction processing unit 100, a residual generation unit 102, a transform processing unit 104, a quantization unit 106, an inverse quantization unit 108, an inverse transform processing unit 110, a reconstruction unit 112, a filter unit 114, a decoded picture buffer 116, and an entropy encoding unit 118. Prediction processing unit 100 includes an inter-prediction processing unit 120 and an intra-prediction processing unit 126. Inter-prediction processing unit 120 includes a motion estimation unit 122 and a motion compensation unit 124. In other examples, video encoder 20 may include more, fewer, or different functional components.
Video encoder 20 may receive video data. Video encoder 20 may encode each CTU in a slice of a picture of the video data. Each of the CTUs may be associated with equally-sized luma coding tree blocks (CTBs) and corresponding CTBs of the picture. As part of encoding a CTU, prediction processing unit 100 may perform quad-tree partitioning to divide the CTBs of the CTU into progressively-smaller blocks. The smaller blocks may be coding blocks of CUs. For example, prediction processing unit 100 may partition a CTB associated with a CTU into four equally-sized sub-blocks, partition one or more of the sub-blocks into four equally-sized sub-sub-blocks, and so on.
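The recursive quad-tree partitioning described above can be sketched as follows. The `should_split` predicate is a hypothetical stand-in for the encoder's mode decision (e.g., a rate-distortion choice), and the sizes are illustrative.

```python
def quadtree_split(x, y, size, min_size, should_split):
    """Recursively partition a square CTB of `size` at position (x, y)
    into leaf coding blocks.  Returns a list of (x, y, size) leaves."""
    if size <= min_size or not should_split(x, y, size):
        return [(x, y, size)]          # leaf: becomes a coding block
    half = size // 2
    leaves = []
    for dy in (0, half):               # four equally-sized sub-blocks
        for dx in (0, half):
            leaves += quadtree_split(x + dx, y + dy, half,
                                     min_size, should_split)
    return leaves
```

For example, a 64x64 CTB whose top-left 32x32 quadrant is split once more yields seven coding blocks: four 16x16 blocks plus three 32x32 blocks.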
Video encoder 20 may encode the CUs of a CTU to generate encoded representations of the CUs (i.e., coded CUs). As part of encoding a CU, prediction processing unit 100 may partition the coding blocks associated with the CU among one or more PUs of the CU. Thus, each PU may be associated with a luma prediction block and corresponding chroma prediction blocks. Video encoder 20 and video decoder 30 may support PUs having various sizes. As indicated above, the size of a CU may refer to the size of the luma coding block of the CU, and the size of a PU may refer to the size of the luma prediction block of the PU. Assuming that the size of a particular CU is 2N×2N, video encoder 20 and video decoder 30 may support PU sizes of 2N×2N or N×N for intra prediction, and symmetric PU sizes of 2N×2N, 2N×N, N×2N, N×N, or similar for inter prediction. Video encoder 20 and video decoder 30 may also support asymmetric partitioning for PU sizes of 2N×nU, 2N×nD, nL×2N, and nR×2N for inter prediction.
Inter-prediction processing unit 120 may generate predictive data for a PU by performing inter prediction on each PU of a CU. The predictive data for the PU may include predictive blocks of the PU and motion information for the PU. Inter-prediction processing unit 120 may perform different operations for a PU of a CU depending on whether the PU is in an I slice, a P slice, or a B slice. In an I slice, all PUs are intra predicted. Hence, if the PU is in an I slice, inter-prediction processing unit 120 does not perform inter prediction on the PU. Thus, for blocks encoded in I-mode, the predictive block is formed using spatial prediction from previously-encoded neighboring blocks within the same frame.
If a PU is in a P slice, motion estimation unit 122 may search the reference pictures in a list of reference pictures (e.g., "RefPicList0") for a reference region for the PU. The reference region for the PU may be a region, within a reference picture, that contains sample blocks that most closely correspond to the sample blocks of the PU. Motion estimation unit 122 may generate a reference index that indicates a position in RefPicList0 of the reference picture containing the reference region for the PU. In addition, motion estimation unit 122 may generate a motion vector that indicates a spatial displacement between a coding block of the PU and a reference location associated with the reference region. For instance, the motion vector may be a two-dimensional vector that provides an offset from coordinates in the current picture to coordinates in a reference picture. Motion estimation unit 122 may output the reference index and the motion vector as the motion information of the PU. Motion compensation unit 124 may generate the predictive blocks of the PU based on actual or interpolated samples at the reference location indicated by the motion vector of the PU.
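The search for a closely-matching reference region can be illustrated with a minimal full-search sketch using the sum of absolute differences (SAD) as the matching cost. This is a generic block-matching illustration, not the search actually performed by motion estimation unit 122 (real encoders use fast searches and fractional-sample interpolation).

```python
def sad(a, b):
    """Sum of absolute differences between two equally-sized 2-D blocks."""
    return sum(abs(x - y) for row_a, row_b in zip(a, b)
                          for x, y in zip(row_a, row_b))

def full_search(cur, ref, px, py, rng):
    """Exhaustively search `ref` around block position (px, py) within
    +/-rng samples; return the motion vector (dx, dy) minimizing SAD."""
    n = len(cur)
    best = None
    for dy in range(-rng, rng + 1):
        for dx in range(-rng, rng + 1):
            y, x = py + dy, px + dx
            if 0 <= y <= len(ref) - n and 0 <= x <= len(ref[0]) - n:
                cand = [row[x:x + n] for row in ref[y:y + n]]
                cost = sad(cur, cand)
                if best is None or cost < best[0]:
                    best = (cost, dx, dy)
    return best[1], best[2]
```

The returned (dx, dy) corresponds to the two-dimensional offset described above, and the minimizing candidate block corresponds to the reference region.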
If a PU is in a B slice, motion estimation unit 122 may perform uni-prediction or bi-prediction for the PU. To perform uni-prediction for the PU, motion estimation unit 122 may search the reference pictures of RefPicList0, or a second reference picture list ("RefPicList1"), for a reference region for the PU. Motion estimation unit 122 may output, as the motion information of the PU, a reference index that indicates a position in RefPicList0 or RefPicList1 of the reference picture that contains the reference region, a motion vector that indicates a spatial displacement between a prediction block of the PU and a reference location associated with the reference region, and one or more prediction direction indicators that indicate whether the reference picture is in RefPicList0 or in RefPicList1. Motion compensation unit 124 may generate the predictive blocks of the PU based at least in part on actual or interpolated samples at the reference region indicated by the motion vector of the PU.
To perform bi-directional inter prediction for a PU, motion estimation unit 122 may search the reference pictures in RefPicList0 for a reference region for the PU and may also search the reference pictures in RefPicList1 for another reference region for the PU. Motion estimation unit 122 may generate reference indexes that indicate positions in RefPicList0 and RefPicList1 of the reference pictures that contain the reference regions. In addition, motion estimation unit 122 may generate motion vectors that indicate spatial displacements between the reference locations associated with the reference regions and a prediction block of the PU. The motion information of the PU may include the reference indexes and the motion vectors of the PU. Motion compensation unit 124 may generate the predictive blocks of the PU based at least in part on actual or interpolated samples at the reference regions indicated by the motion vectors of the PU.
Intra-prediction processing unit 126 may generate predictive data for a PU by performing intra prediction on the PU. The predictive data for the PU may include predictive blocks for the PU and various syntax elements. Intra-prediction processing unit 126 may perform intra prediction on PUs in I slices, P slices, and B slices.
To perform intra prediction on a PU, intra-prediction processing unit 126 may use multiple intra prediction modes to generate multiple sets of predictive data for the PU. Intra-prediction processing unit 126 may generate a predictive block for the PU based on samples of neighboring PUs. Assuming a left-to-right, top-to-bottom encoding order for PUs, CUs, and CTUs, the neighboring PUs may be above, above and to the right, above and to the left, or to the left of the PU. Intra-prediction processing unit 126 may use various numbers of intra prediction modes, e.g., 33 directional intra prediction modes. In some examples, the number of intra prediction modes may depend on the size of the prediction blocks of the PU.
Prediction processing unit 100 may select the predictive data for the PUs of a CU from among the predictive data generated by inter-prediction processing unit 120 for the PUs or the predictive data generated by intra-prediction processing unit 126 for the PUs. In some examples, prediction processing unit 100 selects the predictive data for the PUs of the CU based on rate/distortion metrics of the sets of predictive data. The predictive blocks of the selected predictive data may be referred to herein as the selected predictive blocks.
Residual generation unit 102 may generate, based on the luma, Cb, and Cr coding blocks of a CU and the selected predictive luma, Cb, and Cr blocks of the PUs of the CU, luma, Cb, and Cr residual blocks of the CU. For instance, residual generation unit 102 may generate the residual blocks of the CU such that each sample in the residual blocks has a value equal to the difference between a sample in a coding block of the CU and a corresponding sample in a corresponding selected predictive block of a PU of the CU.
Transform processing unit 104 may perform quad-tree partitioning to partition the residual blocks of a CU into transform blocks associated with the TUs of the CU. Thus, a TU may be associated with a luma transform block and two corresponding chroma transform blocks. The sizes and positions of the luma and chroma transform blocks of the TUs of a CU may or may not be based on the sizes and positions of the prediction blocks of the PUs of the CU. A quad-tree structure known as a "residual quad-tree" (RQT) may include nodes associated with each of the regions. The TUs of a CU may correspond to leaf nodes of the RQT.
Transform processing unit 104 may generate transform coefficient blocks for each TU of a CU by applying one or more transforms to the transform blocks of the TU. Transform processing unit 104 may apply various transforms to a transform block associated with a TU. For example, transform processing unit 104 may apply a discrete cosine transform (DCT), a directional transform, or a conceptually-similar transform to a transform block. In some examples, transform processing unit 104 does not apply transforms to a transform block. In such examples, the transform block may be treated as a transform coefficient block.
Quantization unit 106 may quantize the transform coefficients in a coefficient block. The quantization process may reduce the bit depth associated with some or all of the transform coefficients. For example, an n-bit transform coefficient may be rounded down to an m-bit transform coefficient during quantization, where n is greater than m. Quantization unit 106 may quantize a coefficient block associated with a TU of a CU based on a quantization parameter (QP) value associated with the CU. Video encoder 20 may adjust the degree of quantization applied to the coefficient blocks associated with a CU by adjusting the QP value associated with the CU. Quantization may introduce loss of information; thus, quantized transform coefficients may have lower precision than the original transform coefficients.
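The QP-controlled trade-off described above can be sketched with a simplified uniform scalar quantizer. This is only an illustration of the step-size relation (step size roughly doubling for every increase of QP by 6, as in HEVC's QP design); it is not the normative HEVC scaling and inverse-scaling process.

```python
def quantize(coeffs, qp):
    """Simplified uniform quantization: divide by a QP-derived step size
    and round to an integer level.  Larger QP -> larger step -> coarser
    levels and more information loss."""
    step = 2 ** (qp / 6.0)
    return [int(round(c / step)) for c in coeffs]

def dequantize(levels, qp):
    """Simplified inverse quantization: scale levels back by the step."""
    step = 2 ** (qp / 6.0)
    return [lvl * step for lvl in levels]
```

Round-tripping a coefficient through quantize/dequantize reproduces it only up to the step size, which is exactly the precision loss mentioned above.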
Inverse quantization unit 108 and inverse transform processing unit 110 may apply inverse quantization and inverse transforms, respectively, to a coefficient block to reconstruct a residual block from the coefficient block. Reconstruction unit 112 may add the reconstructed residual block to corresponding samples from one or more predictive blocks generated by prediction processing unit 100 to produce a reconstructed transform block associated with a TU. By reconstructing transform blocks for each TU of a CU in this way, video encoder 20 may reconstruct the coding blocks of the CU.
Filter unit 114 may perform one or more deblocking operations to reduce blocking artifacts in the coding blocks associated with a CU. Decoded picture buffer 116 may store the reconstructed coding blocks after filter unit 114 performs the one or more deblocking operations on the reconstructed coding blocks. Inter-prediction processing unit 120 may use a reference picture that contains the reconstructed coding blocks to perform inter prediction on PUs of other pictures. In addition, intra-prediction processing unit 126 may use reconstructed coding blocks in decoded picture buffer 116 to perform intra prediction on other PUs in the same picture as the CU.
Entropy encoding unit 118 may receive data from other functional components of video encoder 20. For example, entropy encoding unit 118 may receive coefficient blocks from quantization unit 106 and may receive syntax elements from prediction processing unit 100. Entropy encoding unit 118 may perform one or more entropy encoding operations on the data to generate entropy-encoded data. For example, entropy encoding unit 118 may perform a context-adaptive variable-length coding (CAVLC) operation, a CABAC operation, a variable-to-variable (V2V) length coding operation, a syntax-based context-adaptive binary arithmetic coding (SBAC) operation, a probability interval partitioning entropy (PIPE) coding operation, an Exponential-Golomb encoding operation, or another type of entropy encoding operation on the data. Video encoder 20 may output a bitstream that includes the entropy-encoded data generated by entropy encoding unit 118. For instance, the bitstream may include data that represents an RQT for a CU.
As indicated above, the techniques of this disclosure may provide a design that allows the common information of HRD parameter syntax structures to be explicitly signaled for any HRD parameter syntax structure in a VPS. To enable the common information of HRD parameter syntax structures to be explicitly signaled for any HRD parameter syntax structure in a VPS, video encoder 20 may generate VPS syntax structures that conform to the example syntax shown in Table 3, below.
Table 3 - VPS syntax structure
The italicized portions of Table 3 indicate differences between the syntax of Table 3 and the corresponding table of HEVC Working Draft 8. Furthermore, in the example syntax of Table 3, the num_ops_minus1 syntax element specifies the number of operation_point() syntax structures present in the VPS. The hrd_applicable_ops_minus1[i] syntax element specifies the number of operating points to which the i-th hrd_parameters() syntax structure applies. The hrd_op_idx[i][j] syntax element specifies the j-th operating point to which the i-th hrd_parameters() syntax structure in the VPS applies. As briefly mentioned above, the techniques of this disclosure may allow a set of HRD parameters to be shared by multiple operating points. The hrd_applicable_ops_minus1[i] and hrd_op_idx[i][j] syntax elements may be used to indicate the operating points to which a set of HRD parameters applies. In some examples where multiple operating points are not allowed to be applicable to a single set of HRD parameters, the hrd_applicable_ops_minus1[i] syntax elements and the hrd_op_idx[i][j] syntax elements are omitted from Table 3.
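The many-to-one mapping enabled by the hrd_applicable_ops_minus1[i] and hrd_op_idx[i][j] syntax elements can be sketched as a lookup. This is an illustrative inversion of the signaling, assuming the syntax elements have already been parsed into plain Python lists; the function name is hypothetical.

```python
def hrd_index_for_operating_point(op_idx, hrd_applicable_ops_minus1, hrd_op_idx):
    """Return the index of the hrd_parameters() syntax structure that
    applies to operating point `op_idx`, or None if none is signaled.
    Several operating points may map to the same structure, which is
    how one set of HRD parameters is shared."""
    for i, count_minus1 in enumerate(hrd_applicable_ops_minus1):
        for j in range(count_minus1 + 1):   # "minus1" semantics: j in [0, count]
            if hrd_op_idx[i][j] == op_idx:
                return i
    return None
```

For example, with hrd_applicable_ops_minus1 = [1, 0] and hrd_op_idx = [[0, 2], [1]], operating points 0 and 2 share the first hrd_parameters() structure while operating point 1 uses the second.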
In the example syntax of Table 3, a VPS may include a set of common-parameters-present flags (i.e., syntax elements), denoted cprms_present_flag[i] in Table 3. A cprms_present_flag[i] syntax element equal to 1 specifies that the HRD parameters that are common to all sub-layers are present in the i-th hrd_parameters() syntax structure in the VPS. A cprms_present_flag[i] syntax element equal to 0 specifies that the HRD parameters that are common to all sub-layers are not present in the i-th hrd_parameters() syntax structure in the VPS and are instead derived to be the same as those of the (i-1)-th hrd_parameters() syntax structure in the VPS.
cprms_present_flag[0] may be inferred to be equal to 1. That is, a device may automatically determine (i.e., infer) that the first (in coding order) hrd_parameters() syntax structure in the VPS includes HRD parameters that are common to all sub-layers. Thus, the first HRD parameter syntax structure signaled in the VPS includes a common set of HRD parameters. One or more subsequent HRD parameter syntax structures in the VPS may include different common sets of HRD parameters.
As briefly mentioned above, the techniques of this disclosure may allow the common information of HRD parameter syntax structures (i.e., the HRD parameters that are common to each of the sub-layers) to be explicitly signaled for any HRD parameter syntax structure. The cprms_present_flag[i] syntax elements of Table 3 may enable video decoder 30 or another device to determine which of the HRD parameter syntax structures include a set of HRD parameters that are common to each of the sub-layers. Thus, while the first HRD parameter syntax structure may always include the common set of HRD parameters, one or more HRD parameter syntax structures signaled in the VPS may not include the common set of HRD parameters. A device may use the cprms_present_flag[i] syntax elements to determine which of the HRD parameter syntax structures of the VPS include common sets of HRD parameters.
An HRD parameter syntax structure (e.g., a hrd_parameters() syntax structure) may include a set of sub-layer-specific HRD parameters regardless of whether the HRD parameter syntax structure includes HRD parameters that are common to all sub-layers. When video decoder 30 or another device determines that a particular HRD parameter syntax structure does not include a common set of HRD parameters, video decoder 30 or the other device may perform an operation using a common set of HRD parameters associated with a previous HRD parameter syntax structure together with the set of sub-layer-specific HRD parameters of the particular HRD parameter syntax structure. The previous HRD parameter syntax structure may be a set of HRD parameters signaled in the VPS before, in coding order, the particular HRD parameter syntax structure. If the previous HRD parameter syntax structure includes a common set of HRD parameters, the common set of HRD parameters associated with the previous HRD parameter syntax structure is the common set of HRD parameters included in the previous HRD parameter syntax structure. If the previous HRD parameter syntax structure does not include a common set of HRD parameters, the device may determine that the common set of HRD parameters associated with the previous HRD parameter syntax structure is the common set of HRD parameters associated with the HRD parameter syntax structure that precedes, in coding order, the previous HRD parameter syntax structure.
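The inheritance rule just described resolves to the common information of the closest preceding structure that carried one. A minimal sketch, assuming each parsed structure is represented as a (cprms_present_flag, common_info-or-None, sublayer_info) tuple:

```python
def resolve_common_info(hrd_structs):
    """Resolve the effective HRD parameters for each hrd_parameters()
    structure in coding order.  A structure whose flag is 0 inherits the
    common information of the closest preceding structure whose flag is 1
    (the first structure's flag is inferred to be 1)."""
    resolved = []
    current_common = None
    for flag, common, sublayer in hrd_structs:
        if flag:
            current_common = common   # explicitly signaled common set
        resolved.append((current_common, sublayer))
    return resolved
```

Walking the list with a single carried value implements the chained "same as (i-1)-th" derivation without revisiting earlier structures.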
As indicated above, a device may perform an operation using the common set of HRD parameters and the sub-layer-specific HRD parameters. During this operation, the device may manage the operation of a CPB according to one or more of the HRD parameters, decode video data, and manage decoded pictures in a DPB according to one or more of the HRD parameters. In another example, the common set of HRD parameters and the sub-layer-specific HRD parameters may be used to perform a bitstream conformance test or a decoder conformance test.
In addition, in some examples, a scalable nesting SEI message provides a mechanism for associating SEI messages with a bitstream subset (e.g., an operation point representation) or with specific layers and sub-layers. In some such examples, a scalable nesting SEI message may contain one or more SEI messages. An SEI message contained in a scalable nesting SEI message may be referred to as a nested SEI message. An SEI message not contained in a scalable nesting SEI message may be referred to as a non-nested SEI message. In some examples, a nested SEI message in a scalable nesting SEI message may include a set of HRD parameters.
In some examples, there are restrictions on which types of SEI messages can be nested together. For example, a buffering period SEI message and an SEI message of any other type may not be nested in the same scalable nesting SEI message. A buffering period SEI message may indicate initial delays for HRD operation. In another example, a picture timing SEI message and an SEI message of any other type may not be nested in the same scalable nesting SEI message. A picture timing SEI message may indicate a picture output time and a picture/sub-picture removal time for HRD operation. In other examples, a picture timing SEI message and a sub-picture timing SEI message may be nested in the same scalable nesting SEI message. A sub-picture timing SEI message may provide CPB removal delay information for the decoded unit associated with the SEI message.
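One way to picture these restrictions is as a validity check over the payload types collected in one scalable nesting SEI message. The sketch below is hypothetical; the payload-type numbers (0 for buffering period, 1 for picture timing, 130 for sub-picture/decoding-unit timing) are an assumption modeled on HEVC numbering and are not taken from this description.

```python
# Assumed payload-type numbers (illustrative only)
BUFFERING_PERIOD, PICTURE_TIMING, SUBPIC_TIMING = 0, 1, 130

def nesting_allowed(payload_types):
    """Return True if the given SEI payload types may share one scalable
    nesting SEI message under the restrictions described above."""
    types = set(payload_types)
    # A buffering period SEI message may not be nested with any other type.
    if BUFFERING_PERIOD in types and len(types) > 1:
        return False
    # A picture timing SEI message may only be nested alongside
    # sub-picture timing SEI messages.
    if PICTURE_TIMING in types:
        return types <= {PICTURE_TIMING, SUBPIC_TIMING}
    return True
```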
As indicated above, one or more techniques of this disclosure may allow a single SEI message to be applicable to multiple operation points. Furthermore, one or more techniques of this disclosure may enable video encoder 20 to signal whether the operation point applicable to a nested SEI message in an SEI NAL unit is the operation point indicated by the layer identification information in the NAL unit header of the SEI NAL unit. In addition, one or more techniques of this disclosure may increase the efficiency of explicit coding of layer identifiers through differential coding. These techniques may be implemented with the example syntax shown in Table 4 below and the accompanying semantics.
Table 4---Scalable nesting SEI message
In the example of Table 4, italicized portions may indicate differences from HEVC Working Draft 8. Specifically, in the example syntax of Table 4, a bitstream_subset_flag syntax element equal to 0 specifies that the SEI messages nested in the scalable nesting SEI message apply to specific layers and sub-layers. A bitstream_subset_flag syntax element equal to 1 specifies that the SEI messages nested in the scalable nesting SEI message apply to a sub-bitstream resulting from the sub-bitstream extraction process of sub-clause 10.1 of HEVC Working Draft 8, with inputs specified by the syntax elements of the scalable nesting SEI message as described below. Sub-clause 10.1 of HEVC Working Draft 8 describes an operation for extracting a sub-bitstream (i.e., an operation point representation) from a bitstream. Specifically, sub-clause 10.1 of HEVC Working Draft 8 provides that a sub-bitstream is derived by removing from the bitstream all NAL units with temporal identifiers (e.g., TemporalID) greater than tIdTarget or with layer identifiers (e.g., nuh_reserved_zero_6bits) whose values are not among those in targetDecLayerIdSet. tIdTarget and targetDecLayerIdSet are parameters of the bitstream extraction process. In some examples, if a nested SEI message is a buffering period SEI message, a picture timing SEI message, or a sub-picture timing SEI message, the bitstream_subset_flag syntax element is equal to 1. Otherwise, in these examples, the bitstream_subset_flag syntax element is equal to 0.
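The extraction rule of sub-clause 10.1 described above can be sketched compactly. This is a simplified model, not the normative process: NAL units are reduced to (layer identifier, temporal identifier) pairs, whereas the real process operates on complete NAL unit headers.

```python
def extract_sub_bitstream(nal_units, tid_target, target_dec_layer_id_set):
    """Keep a NAL unit only if its temporal identifier does not exceed
    tIdTarget and its layer identifier is in targetDecLayerIdSet; every
    other NAL unit is removed from the bitstream."""
    return [(lid, tid) for (lid, tid) in nal_units
            if tid <= tid_target and lid in target_dec_layer_id_set]
```

For example, extracting with tIdTarget = 0 and targetDecLayerIdSet = {0, 1} drops both the higher temporal sub-layers and any layer outside the set.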
In addition, in the example syntax of Table 4, if the bitstream_subset_flag syntax element is equal to 1, the scalable nesting SEI message includes a default_op_applicable_flag syntax element. A default_op_applicable_flag syntax element equal to 1 specifies that the nested SEI messages (i.e., the SEI messages nested in the scalable nesting SEI message) apply to a default sub-bitstream, the default sub-bitstream being the output of the sub-bitstream extraction process of sub-clause 10.1 of HEVC Working Draft 8 with a tIdTarget input equal to the temporal identifier (TemporalId) of the current SEI NAL unit and a targetDecLayerIdSet input consisting of all values of nuh_reserved_zero_6bits in the range of 0 to the nuh_reserved_zero_6bits of the current SEI NAL unit, inclusive. Thus, the default sub-bitstream may be the bitstream derived by removing from the bitstream all NAL units that have temporal identifiers greater than the temporal identifier of the current SEI NAL unit, or that have layer identifiers not in the range of 0 to the layer identifier (e.g., nuh_reserved_zero_6bits) of the current SEI NAL unit, inclusive. For example, the default sub-bitstream may be a subset of the bitstream, and the default sub-bitstream may not include VCL NAL units of the bitstream with layer identifiers greater than the layer identifier indicated by the layer identifier syntax element of the NAL unit header, or VCL NAL units of the bitstream with temporal identifiers greater than the temporal identifier indicated by the temporal layer identifier syntax element (e.g., nuh_temporal_id_plus1) of the NAL unit header. A default_op_applicable_flag syntax element equal to 0 specifies that the nested SEI messages do not apply to the default sub-bitstream.
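The derivation of the default sub-bitstream described above amounts to running the extraction process with inputs taken from the SEI NAL unit's own header. The sketch below is an assumption-level model (NAL units as (layer identifier, temporal identifier) pairs) rather than the normative process.

```python
def default_sub_bitstream(nal_units, sei_layer_id, sei_temporal_id):
    """Derive the default sub-bitstream: tIdTarget is the temporal
    identifier of the current SEI NAL unit, and targetDecLayerIdSet is
    {0, ..., sei_layer_id} taken from its NAL unit header."""
    tid_target = sei_temporal_id
    target_layers = set(range(sei_layer_id + 1))
    return [(lid, tid) for (lid, tid) in nal_units
            if tid <= tid_target and lid in target_layers]
```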
In the example syntax of Table 4, if the bitstream_subset_flag syntax element is equal to 1, the scalable nesting SEI message includes a nesting_num_ops_minus1 syntax element. The nesting_num_ops_minus1 syntax element, plus 1, specifies the number of nesting_op_idx[i] syntax elements in the scalable nesting SEI message. Thus, if the nesting_num_ops_minus1 syntax element plus 1 is greater than 0, the nesting_num_ops_minus1 syntax element may indicate whether the scalable nesting SEI message includes multiple syntax elements that identify multiple operation points to which the nested SEI messages apply. In this way, a device may decode, from the scalable nesting SEI message, a syntax element (nesting_num_ops_minus1) that indicates the number of operation points to which the nested SEI messages apply. When the nesting_num_ops_minus1 syntax element is not present, the value of nesting_num_ops_minus1 may be inferred to be equal to 0. Thus, if the bitstream_subset_flag syntax element is equal to 0, the scalable nesting SEI message does not include nesting_op_idx[i] syntax elements.
A nesting_op_flag syntax element equal to 0 specifies that the set nestingLayerIdSet[0] is specified by the all_layers_flag syntax element and, when present, the nesting_layer_id_delta[i] syntax elements for all values of i in the range of 0 to nesting_num_layers_minus1, inclusive. The nestingLayerIdSet[] syntax elements are arrays of layer identifiers. A nesting_op_flag syntax element equal to 1 specifies that nestingLayerIdSet[i] is specified by the nesting_op_idx[i] syntax elements. When not present, the value of nesting_op_flag is inferred to be equal to 1.
The nesting_max_temporal_id_plus1[i] syntax element specifies a variable maxTemporalId[i]. In the example syntax of Table 4, the value of the nesting_max_temporal_id_plus1[i] syntax element is greater than the value of the nuh_temporal_id_plus1 syntax element of the current SEI NAL unit (i.e., the NAL unit containing the scalable nesting SEI message). The variable maxTemporalId[i] is set equal to nesting_max_temporal_id_plus1[i] - 1.
The nesting_op_idx[i] syntax element is used to specify the set nestingLayerIdSet[i]. The set nestingLayerIdSet[i] may consist of op_layer_id[nesting_op_idx][i], with all values of i in the range of 0 to op_num_layer_id_values_minus1[nesting_op_idx], inclusive. The active VPS may specify the op_layer_id[][] and op_num_layer_id_values_minus1[] values.
Furthermore, in the example syntax of Table 4, an all_layers_flag syntax element equal to 0 specifies that the set nestingLayerIdSet[0] consists of nestingLayerId[i] for all values of i in the range of 0 to nesting_num_layers_minus1, inclusive. The variable nestingLayerId[i] is described below. An all_layers_flag syntax element equal to 1 specifies that the set nestingLayerIdSet consists of all values of nuh_reserved_zero_6bits, present in the current access unit, that are equal to or greater than the nuh_reserved_zero_6bits of the current SEI NAL unit.
The nesting_num_layers_minus1 syntax element, plus 1, specifies the number of nesting_layer_id_delta[i] syntax elements in the scalable nesting SEI message. When i is equal to 0, the nesting_layer_id_delta[i] syntax element specifies the difference between the first (i.e., the 0th) nuh_reserved_zero_6bits value included in the set nestingLayerIdSet[0] and the nuh_reserved_zero_6bits syntax element of the current SEI NAL unit. When i is greater than 0, the nesting_layer_id_delta[i] syntax element specifies the difference between the i-th and the (i-1)-th nuh_reserved_zero_6bits values included in the set nestingLayerIdSet[0].
The variable nestingLayerId[i] may be derived as follows, where nuh_reserved_zero_6bits is from the NAL unit header of the current SEI NAL unit.
nestingLayerId[0]=nuh_reserved_zero_6bits+nesting_layer_id_delta[0]
for(i=1;i<=nesting_num_layers_minus1;i++)
nestingLayerId[i]=nestingLayerId[i-1]+nesting_layer_id_delta[i]
The set nestingLayerIdSet[0] is set to consist of nestingLayerId[i] for all values of i in the range of 0 to nesting_num_layers_minus1, inclusive. When the bitstream_subset_flag syntax element is equal to 0, the nested SEI messages apply to NAL units with nuh_reserved_zero_6bits included in the set nestingLayerIdSet[0], or with nuh_reserved_zero_6bits equal to that of the current SEI NAL unit, and with nuh_temporal_id_plus1 in the range of the nuh_temporal_id_plus1 of the current SEI NAL unit to maxTemporalId[0]+1, inclusive. When the bitstream_subset_flag syntax element is equal to 1, the nested SEI messages apply to the outputs of the sub-bitstream extraction process of sub-clause 10.1 of HEVC Working Draft 8 with a tIdTarget input equal to maxTemporalId[i] and a targetDecLayerIdSet input equal to nestingLayerIdSet[i], for each value of i in the range of 0 to nesting_num_ops_minus1, inclusive, and, when the default_op_applicable_flag syntax element is equal to 1, the nested SEI messages also apply to the default sub-bitstream. An extracted sub-bitstream may be produced by removing all NAL units with temporal identifiers greater than maxTemporalId[i] or with layer identifiers not included in nestingLayerIdSet[i].
In this way, for at least one respective operation point of the multiple operation points to which a nested SEI message applies, a device (e.g., video encoder 20, video decoder 30, or another device such as a content delivery network device) may decode, from a scalable nesting SEI message, a first syntax element (e.g., nesting_max_temporal_id_plus1[i]) and a second syntax element (e.g., nesting_op_idx[i]). Furthermore, the device may determine, based at least in part on the first syntax element, a maximum temporal identifier of the respective operation point. The device may determine, based at least in part on the second syntax element, a set of layer identifiers of the respective operation point.
In the example of Table 4, the nesting_zero_bit syntax element is equal to 0. The nesting_zero_bit syntax element may serve to ensure that the scalable nesting SEI message is byte aligned. The scalable nesting SEI message may be byte aligned when the number of bits in the scalable nesting SEI message is divisible by 8.
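The byte-alignment rule above can be sketched as a simple padding step: nesting_zero_bit values, each equal to 0, are appended until the message size in bits is a multiple of 8. The representation of the payload as a list of bits is an illustrative assumption.

```python
def pad_to_byte_alignment(bits):
    """bits: list of 0/1 values making up the SEI payload so far.
    Returns a new list padded with nesting_zero_bit values (0) until
    its length is divisible by 8."""
    padded = list(bits)
    while len(padded) % 8 != 0:
        padded.append(0)  # nesting_zero_bit, always equal to 0
    return padded
```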
Furthermore, in the example of Table 4, the sei_message() syntax structures include SEI messages. Thus, a device may decode, from the scalable nesting SEI message, multiple nested SEI messages encapsulated by the scalable nesting SEI message. Each of the nested SEI messages may apply to all of the operation points identified by the multiple syntax elements (e.g., nesting_max_temporal_id_plus1[i], nesting_op_idx[i], etc.).
In an alternative example, a scalable nesting SEI message may follow the example syntax of Table 5 below. In the example syntax of Table 5, in accordance with one or more techniques of this disclosure, the scalable nesting SEI message increases the efficiency of explicit coding of layer identifiers through the use of coded flags.
Table 5---Scalable nesting SEI message
In the example of Table 5, italicized portions show differences from HEVC Working Draft 8. As shown in Table 5, the bitstream_subset_flag, default_op_applicable_flag, nesting_num_ops_minus1, nesting_max_temporal_id_plus1, nesting_op_idx[i], and nesting_zero_bit syntax elements may have the same semantics as those described above with respect to Table 4.
Furthermore, in the example of Table 5, a variable minLayerId is set equal to nuh_reserved_zero_6bits + 1, where nuh_reserved_zero_6bits is from the NAL unit header of the current SEI NAL unit. A nesting_op_flag syntax element equal to 0 specifies that the set nestingLayerIdSet[0] is specified by the all_layers_flag syntax element and, when present, the nesting_layer_id_included_flag[i] syntax elements for all values of i in the range of 0 to nesting_max_layer_id - minLayerId - 1, inclusive. A nesting_op_flag syntax element equal to 1 specifies that the set nestingLayerIdSet[i] is specified by the nesting_op_idx[i] syntax elements. When the nesting_op_flag syntax element is not present, the value of nesting_op_flag is inferred to be equal to 1.
In the example of Table 5, an all_layers_flag syntax element equal to 0 specifies that the set nestingLayerIdSet[0] consists of nestingLayerId[i] for all values of i in the range of 0 to nesting_max_layer_id - minLayerId, inclusive. The nestingLayerId[i] variable is described below. In the example of Table 5, an all_layers_flag equal to 1 specifies that the set nestingLayerIdSet consists of all values of nuh_reserved_zero_6bits, present in the current access unit, that are greater than or equal to the nuh_reserved_zero_6bits syntax element of the current SEI NAL unit.
Furthermore, in the example of Table 5, the nesting_max_layer_id syntax element specifies the maximum value of nuh_reserved_zero_6bits in the set nestingLayerIdSet[0]. A nesting_layer_id_included_flag[i] syntax element equal to 1 specifies that the value of nuh_reserved_zero_6bits equal to i + minLayerId is included in the set nestingLayerIdSet[0]. A nesting_layer_id_included_flag[i] syntax element equal to 0 specifies that the value of nuh_reserved_zero_6bits equal to i + minLayerId is not included in the set nestingLayerIdSet[0].
The variable nestingNumLayersMinus1 and the variables nestingLayerId[i], for i in the range of 0 to nestingNumLayersMinus1, inclusive, may be derived as follows:
for(i=0,j=0;i<nesting_max_layer_id;i++)
if(nesting_layer_id_included_flag[i])
nestingLayerId[j++]=i+minLayerId
nestingLayerId[j]=nesting_max_layer_id
nestingNumLayersMinus1=j
The set nestingLayerIdSet[0] may be set to consist of nestingLayerId[i] for all values of i in the range of 0 to nestingNumLayersMinus1, inclusive.
When the bitstream_subset_flag syntax element is equal to 0, the nested SEI messages apply to NAL units with nuh_reserved_zero_6bits included in the set nestingLayerIdSet[0], or with nuh_reserved_zero_6bits equal to the nuh_reserved_zero_6bits syntax element of the current SEI NAL unit, and with nuh_temporal_id_plus1 in the range of the nuh_temporal_id_plus1 syntax element of the current SEI NAL unit to maxTemporalId[0] + 1, inclusive.
When the bitstream_subset_flag syntax element of the scalable nesting SEI message is equal to 1, the nested SEI messages apply to the outputs of the sub-bitstream extraction process of sub-clause 10.1 with a tIdTarget input equal to maxTemporalId[i] and a targetDecLayerIdSet input equal to nestingLayerIdSet[i], for each value of i in the range of 0 to nesting_num_ops_minus1, inclusive, and, when default_op_applicable_flag is equal to 1, the nested SEI messages also apply to the default sub-bitstream.
Fig. 3 is a block diagram illustrating an example video decoder 30 that is configured to implement the techniques of this disclosure. Fig. 3 is provided for purposes of explanation and does not limit the techniques as broadly exemplified and described in this disclosure. For purposes of explanation, this disclosure describes video decoder 30 in the context of HEVC coding. However, the techniques of this disclosure may be applicable to other coding standards or methods.
In the example of Fig. 3, video decoder 30 includes an entropy decoding unit 150, a prediction processing unit 152, an inverse quantization unit 154, an inverse transform processing unit 156, a reconstruction unit 158, a filter unit 160, and a decoded picture buffer 162. Prediction processing unit 152 includes a motion compensation unit 164 and an intra-prediction processing unit 166. In other examples, video decoder 30 may include more, fewer, or different functional components.
A coded picture buffer (CPB) 151 may receive and store encoded video data (e.g., NAL units) of a bitstream. Entropy decoding unit 150 may receive NAL units from CPB 151 and parse the NAL units to decode syntax elements. Entropy decoding unit 150 may entropy decode entropy-encoded syntax elements in the NAL units. Prediction processing unit 152, inverse quantization unit 154, inverse transform processing unit 156, reconstruction unit 158, and filter unit 160 may generate decoded video data based on the syntax elements extracted from the bitstream.
The NAL units of the bitstream may include coded slice NAL units. As part of decoding the bitstream, entropy decoding unit 150 may extract and entropy decode syntax elements from the coded slice NAL units. Each of the coded slices may include a slice header and slice data. The slice header may contain syntax elements pertaining to a slice. The syntax elements in the slice header may include a syntax element that identifies the PPS associated with the picture that contains the slice.
In addition to decoding syntax elements from the bitstream, video decoder 30 may perform a reconstruction operation on a non-partitioned CU. To perform the reconstruction operation on a non-partitioned CU, video decoder 30 may perform a reconstruction operation on each TU of the CU. By performing the reconstruction operation for each TU of the CU, video decoder 30 may reconstruct the residual blocks of the CU.
As part of performing a reconstruction operation on a TU of a CU, inverse quantization unit 154 may inverse quantize (i.e., de-quantize) the coefficient blocks associated with the TU. Inverse quantization unit 154 may use a QP value associated with the CU of the TU to determine a degree of quantization and, likewise, a degree of inverse quantization for inverse quantization unit 154 to apply. That is, the compression ratio, i.e., the ratio of the number of bits used to represent the original sequence and the compressed one, may be controlled by adjusting the value of the QP used when quantizing transform coefficients. The compression ratio may also depend on the method of entropy coding employed.
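As a rough illustration of how QP controls the degree of inverse quantization, the sketch below uses the widely cited approximation that the quantization step size in HEVC roughly doubles for every increase of 6 in QP. This is an assumed floating-point model for intuition only; an actual inverse quantization unit uses integer arithmetic and scaling lists.

```python
def dequantize(levels, qp):
    """Scale decoded transform coefficient levels back toward coefficient
    values.  The step size 2^((QP - 4) / 6) is an approximation of the
    HEVC quantization step, so larger QP means coarser quantization."""
    step = 2.0 ** ((qp - 4) / 6.0)
    return [lvl * step for lvl in levels]
```

Under this model, raising QP by 6 doubles the reconstructed magnitude per level, which is why higher QP trades quality for a higher compression ratio.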
After inverse quantization unit 154 inverse quantizes a coefficient block, inverse transform processing unit 156 may apply one or more inverse transforms to the coefficient block in order to generate a residual block associated with the TU. For example, inverse transform processing unit 156 may apply an inverse DCT, an inverse integer transform, an inverse Karhunen-Loeve transform (KLT), an inverse rotational transform, an inverse directional transform, or another inverse transform to the coefficient block.
If a PU is encoded using intra prediction, intra-prediction processing unit 166 may perform intra prediction to generate predictive blocks for the PU. Intra-prediction processing unit 166 may use an intra prediction mode to generate the predictive luma, Cb, and Cr blocks for the PU based on the prediction blocks of spatially-neighboring PUs. Intra-prediction processing unit 166 may determine the intra prediction mode for the PU based on one or more syntax elements decoded from the bitstream.
Prediction processing unit 152 may construct a first reference picture list (RefPicList0) and a second reference picture list (RefPicList1) based on syntax elements extracted from the bitstream. Furthermore, if a PU is encoded using inter prediction, entropy decoding unit 150 may extract motion information for the PU. Motion compensation unit 164 may determine, based on the motion information of the PU, one or more reference regions for the PU. Motion compensation unit 164 may generate, based on sample blocks at the one or more reference blocks for the PU, predictive luma, Cb, and Cr blocks for the PU.
Reconstruction unit 158 may use the luma, Cb, and Cr transform blocks associated with TUs of a CU and the predictive luma, Cb, and Cr blocks of the PUs of the CU, i.e., either intra-prediction data or inter-prediction data, as applicable, to reconstruct the luma, Cb, and Cr coding blocks of the CU. For example, reconstruction unit 158 may add samples of the luma, Cb, and Cr transform blocks to corresponding samples of the predictive luma, Cb, and Cr blocks to reconstruct the luma, Cb, and Cr coding blocks of the CU.
Filter unit 160 may perform a deblocking operation to reduce blocking artifacts associated with the luma, Cb, and Cr coding blocks of the CU. Video decoder 30 may store the luma, Cb, and Cr coding blocks of the CU in decoded picture buffer 162. Decoded picture buffer 162 may provide reference pictures for subsequent motion compensation, intra prediction, and presentation on a display device, such as display device 32 of Fig. 1. For instance, video decoder 30 may perform, based on the luma, Cb, and Cr blocks in decoded picture buffer 162, intra prediction or inter prediction operations on the PUs of other CUs. In this way, video decoder 30 may extract, from the bitstream, transform coefficient levels of a significant luma coefficient block, inverse quantize the transform coefficient levels, apply a transform to the transform coefficient levels to generate a transform block, generate, based at least in part on the transform block, a coding block, and output the coding block for display.
Fig. 4 is a flowchart illustrating an example operation 200 of video encoder 20, in accordance with one or more techniques of this disclosure. In the example of Fig. 4, video encoder 20 may generate a VPS that includes multiple HRD parameter syntax structures that each include HRD parameters (202). For each respective HRD parameter syntax structure of the multiple HRD parameter syntax structures, the VPS further includes a syntax element indicating whether the HRD parameters of the respective HRD parameter syntax structure include a common set of HRD parameters in addition to a set of sub-layer-specific HRD parameter information specific to a particular sub-layer of the bitstream, where the common set of HRD parameters is common to all sub-layers of the bitstream. Furthermore, video encoder 20 may signal the VPS in the bitstream (204).
Fig. 5 is a flowchart illustrating an example operation 250 of a device, in accordance with one or more techniques of this disclosure. Operation 250 may be performed by video encoder 20, video decoder 30, or another device. As illustrated in the example of Fig. 5, the device may decode, from a bitstream, a VPS that includes multiple HRD parameter syntax structures that each include HRD parameters (252). For each respective HRD parameter syntax structure of the multiple HRD parameter syntax structures, the VPS further includes a syntax element indicating whether the HRD parameters of the respective HRD parameter syntax structure include a common set of HRD parameters.
Furthermore, the device may perform an operation using the HRD parameters of at least one of the HRD parameter syntax structures (254). In some examples, the bitstream may comprise an operation point representation of a particular operation point, a particular HRD parameter syntax structure may be applicable to the particular operation point, and the device may perform the operation using the HRD parameters of the particular HRD parameter syntax structure. For example, the device may use the HRD parameters to perform a bitstream conformance test that determines whether the operation point to which the HRD parameter syntax structure applies conforms to a video coding standard, such as HEVC. In another example, the device may use the HRD parameters to perform a decoder conformance test.
The common set of HRD parameters is common to all sub-layers of the bitstream. In some examples, the HRD parameters of each of the HRD parameter syntax structures include a set of sub-layer-specific HRD parameters that is specific to a particular sub-layer of the bitstream. In some examples, each of the sets of sub-layer-specific HRD parameters includes syntax elements such as a syntax element indicating a temporal distance between the HRD output times of any two consecutive pictures in output order, and a syntax element indicating a number of alternative coded picture buffer specifications in the bitstream of the coded video sequence. In some examples, when the device determines that a particular HRD parameter syntax structure does not include a common set of HRD parameters, the device may perform the operation using a common set of HRD parameters associated with a previous HRD parameter syntax structure and the set of sub-layer-specific HRD parameters of the particular HRD parameter syntax structure.
Fig. 6 is a flowchart illustrating an example operation 300 of video encoder 20, in accordance with one or more techniques of this disclosure. As illustrated in the example of Fig. 6, video encoder 20 may generate a scalable nesting SEI message that includes multiple syntax elements that identify multiple operation points to which a nested SEI message encapsulated by the scalable nesting SEI message applies (302). Furthermore, video encoder 20 may signal the scalable nesting SEI message in a bitstream (304).
Fig. 7 is a flowchart illustrating an example operation 350 of a device, in accordance with one or more techniques of this disclosure. Video encoder 20, video decoder 30, or another device may perform operation 350. As illustrated in the example of Fig. 7, the device may decode, from a scalable nesting SEI message, multiple syntax elements that identify multiple operation points to which a nested SEI message encapsulated by the scalable nesting SEI message applies (352). In some examples, the device may decode, from the scalable nesting SEI message, a syntax element (e.g., nesting_num_ops_minus1) that indicates whether the scalable nesting SEI message includes multiple syntax elements that identify operation points.
Furthermore, the device may use one or more syntax elements of the nested SEI message to perform an operation regarding any of the operation points to which the nested SEI message applies (354). For example, the device may use the syntax elements of the nested SEI message in a bitstream conformance test that determines whether any of the operation points to which the nested SEI message applies conforms to a video coding standard, such as HEVC. In another example, the device may use the syntax elements of the nested SEI message to perform a decoder conformance test.
Fig. 8 is a flowchart illustrating an example operation 400 of video encoder 20, in accordance with one or more techniques of this disclosure. As illustrated in the example of Fig. 8, video encoder 20 may include, in a scalable nesting SEI message encapsulated by an SEI NAL unit, a syntax element (e.g., default_op_applicable_flag) that indicates whether a nested SEI message encapsulated by the scalable nesting SEI message is applicable to a default sub-bitstream (402). The default sub-bitstream is an operation point representation of an operation point defined by a layer identifier specified in the NAL unit header of the SEI NAL unit and a temporal identifier specified in the NAL unit header. A first syntax element (e.g., nuh_reserved_zero_6bits) in the NAL unit header may indicate the layer identifier, and a second syntax element (e.g., nuh_temporal_id_plus1) in the NAL unit header may indicate the temporal identifier.
In the example of Fig. 8, video encoder 20 may include, in the scalable nesting SEI message, one or more additional syntax elements that identify a temporal identifier of an additional operation point and a maximum layer identifier of the additional operation point (404). Furthermore, video encoder 20 may signal the scalable nesting SEI message in a bitstream (406). In some examples, the syntax element that indicates whether the nested SEI message encapsulated by the scalable nesting SEI message is applicable to the default sub-bitstream may be referred to as a first syntax element, and video encoder 20 may include a second syntax element (e.g., bitstream_subset_flag) in the scalable nesting SEI message. The second syntax element may indicate whether the nested SEI message encapsulated by the scalable nesting SEI message applies to a sub-bitstream extracted from the bitstream, or whether the nested SEI message applies to specific layers and sub-layers of the bitstream. Video encoder 20 may include the first syntax element only when the second syntax element indicates that the nested SEI message applies to a sub-bitstream extracted from the bitstream.
Fig. 9 is a flowchart illustrating an example operation 450 of a device, in accordance with one or more techniques of this disclosure. Video encoder 20, video decoder 30, or another device may perform operation 450. As illustrated in the example of Fig. 9, the device may determine, based at least in part on a first syntax element (e.g., bitstream_subset_flag) of a scalable nesting SEI message, whether a nested SEI message encapsulated by the scalable nesting SEI message applies to a sub-bitstream extracted from a bitstream (452). In response to determining that the nested SEI message encapsulated by the scalable nesting SEI message applies to a sub-bitstream extracted from the bitstream ("YES" of 452), the device may decode a default operation point syntax element (e.g., default_op_applicable_flag) in the scalable nesting SEI message (454). The default operation point syntax element may indicate whether the nested SEI message encapsulated by the scalable nesting SEI message is applicable to a default sub-bitstream.
The default sub-bitstream may be an operation point representation of an operation point defined by a layer identifier specified in the NAL unit header of the SEI NAL unit and a temporal identifier specified in the NAL unit header. In some examples, a first syntax element in the NAL unit header (e.g., nuh_reserved_zero_6bits) indicates the layer identifier, and a second syntax element in the NAL unit header (e.g., nuh_reserved_temporal_id_plus1) indicates the temporal identifier. The default sub-bitstream may be a subset of the bitstream, and the default sub-bitstream does not include VCL NAL units of the bitstream that have layer identifiers greater than the layer identifier indicated by the first syntax element of the NAL unit header or temporal identifiers greater than the temporal identifier indicated by the second syntax element of the NAL unit header.
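The extraction rule just stated, drop VCL NAL units whose layer identifier or temporal identifier exceeds the values carried in the SEI NAL unit header, can be sketched as a simple filter. This is an illustration under assumed data structures (the NalUnit record is hypothetical), not the normative sub-bitstream extraction process:

```python
from collections import namedtuple

# Hypothetical NAL unit record; in a real bitstream these values come from
# the nuh_reserved_zero_6bits and nuh_reserved_temporal_id_plus1 header fields.
NalUnit = namedtuple("NalUnit", ["is_vcl", "layer_id", "temporal_id"])

def default_sub_bitstream(nal_units, sei_layer_id, sei_temporal_id):
    """Keep every non-VCL NAL unit; keep a VCL NAL unit only if its layer
    and temporal identifiers do not exceed those of the SEI NAL unit header."""
    return [n for n in nal_units
            if not n.is_vcl
            or (n.layer_id <= sei_layer_id and n.temporal_id <= sei_temporal_id)]

bitstream = [NalUnit(True, 0, 0),   # base-layer VCL unit, kept
             NalUnit(True, 1, 2),   # higher layer and temporal id, dropped
             NalUnit(False, 0, 3)]  # non-VCL unit, kept regardless
kept = default_sub_bitstream(bitstream, sei_layer_id=0, sei_temporal_id=1)
print(len(kept))  # 2
```

The filter keeps non-VCL NAL units unconditionally because the passage above constrains only the VCL NAL units of the default sub-bitstream.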
In addition, the device may determine, based at least in part on a syntax element (e.g., default_op_applicable_flag) in the scalable nesting SEI message encapsulated by the SEI NAL unit, whether the nested SEI message encapsulated by the scalable nesting SEI message is applicable to the default sub-bitstream of the bitstream (456). In some examples, the scalable nesting SEI message encapsulates multiple nested SEI messages. In such examples, the device may determine, based on the syntax element (e.g., default_op_applicable_flag), whether each of the nested SEI messages in the scalable nesting SEI message is applicable to the default sub-bitstream.
When the nested SEI message is applicable to the default sub-bitstream ("YES" of 456), the device may use the nested SEI message in an operation on the default sub-bitstream (458). For example, the nested SEI message may include a set of HRD parameters. In this example, the device may use the HRD parameters in the nested SEI message in an operation that tests whether the default sub-bitstream conforms to a video coding standard, such as HEVC. In another example, the device may use the HRD parameters in the nested SEI message in a decoder conformance test. In another example, the device may use the nested SEI message in a decoding operation on the default sub-bitstream. In yet another example, initial CPB removal delays may be used to guide a system to set up an appropriate initial end-to-end delay, and DPB output times may be used to derive RTP timestamps when video is transported over RTP.
Otherwise, when the nested SEI message is not applicable to the default sub-bitstream ("NO" of 456), or when the scalable nesting SEI message is not applicable to a sub-bitstream extracted from the bitstream ("NO" of 452), the device does not use the nested SEI message in an operation on the default sub-bitstream (460). For example, the device may determine, based on one or more additional syntax elements in the scalable nesting SEI message (e.g., nesting_max_temporal_id_plus1[i], nesting_op_idx[i], etc.), a temporal identifier of a second operation point and a maximum layer identifier of the second operation point. In this example, the device may use the nested SEI message in an operation on an additional sub-bitstream, the additional sub-bitstream being an operation point representation of the second operation point.
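The branching of operation 450 (steps 452 through 460) can be summarized in a short decision function. The flag names follow the syntax elements discussed above; the function itself is an explanatory sketch of the flowchart logic, not decoder code:

```python
def nested_sei_applies_to_default_sub_bitstream(bitstream_subset_flag,
                                                default_op_applicable_flag):
    # Step 452: if the scalable nesting SEI message does not target an
    # extracted sub-bitstream, the default sub-bitstream question never
    # arises ("NO" of 452 leads to step 460).
    if not bitstream_subset_flag:
        return False
    # Steps 454/456: the default operation point syntax element decides
    # applicability to the default sub-bitstream.
    return bool(default_op_applicable_flag)

# "YES" of 452 and "YES" of 456: use the nested SEI message (step 458).
print(nested_sei_applies_to_default_sub_bitstream(True, True))   # True
# "NO" of 452: do not use it on the default sub-bitstream (step 460).
print(nested_sei_applies_to_default_sub_bitstream(False, True))  # False
```

Note that a False result does not mean the nested SEI message is unused entirely; as the passage explains, it may still apply to an additional sub-bitstream identified by the nesting syntax elements.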
In one or more examples, the functions described may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software, the functions may be stored on or transmitted over, as one or more instructions or code, a computer-readable medium and executed by a hardware-based processing unit. Computer-readable media may include computer-readable storage media, which corresponds to a tangible medium such as data storage media, or communication media including any medium that facilitates transfer of a computer program from one place to another, e.g., according to a communication protocol. In this manner, computer-readable media generally may correspond to (1) tangible computer-readable storage media which is non-transitory, or (2) a communication medium such as a signal or carrier wave. Data storage media may be any available media that can be accessed by one or more computers or one or more processors to retrieve instructions, code and/or data structures for implementation of the techniques described in this disclosure. A computer program product may include a computer-readable medium.
By way of example, and not limitation, such computer-readable storage media can comprise RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, flash memory, or any other medium that can be used to store desired program code in the form of instructions or data structures and that can be accessed by a computer. Also, any connection is properly termed a computer-readable medium. For example, if instructions are transmitted from a website, server, or other remote source using a coaxial cable, fiber optic cable, twisted pair, digital subscriber line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of medium. It should be understood, however, that computer-readable storage media and data storage media do not include connections, carrier waves, signals, or other transitory media, but are instead directed to non-transitory, tangible storage media. Disk and disc, as used herein, include compact disc (CD), laser disc, optical disc, digital versatile disc (DVD), floppy disk and Blu-ray disc, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.
Instructions may be executed by one or more processors, such as one or more digital signal processors (DSPs), general purpose microprocessors, application specific integrated circuits (ASICs), field programmable gate arrays (FPGAs), or other equivalent integrated or discrete logic circuitry. Accordingly, the term "processor," as used herein, may refer to any of the foregoing structure or any other structure suitable for implementation of the techniques described herein. In addition, in some aspects, the functionality described herein may be provided within dedicated hardware and/or software modules configured for encoding and decoding, or incorporated in a combined codec. Also, the techniques could be fully implemented in one or more circuits or logic elements.
The techniques of this disclosure may be implemented in a wide variety of devices or apparatuses, including a wireless handset, an integrated circuit (IC) or a set of ICs (e.g., a chip set). Various components, modules, or units are described in this disclosure to emphasize functional aspects of devices configured to perform the disclosed techniques, but they do not necessarily require realization by different hardware units. Rather, as described above, various units may be combined in a codec hardware unit or provided by a collection of interoperative hardware units, including one or more processors as described above, in conjunction with suitable software and/or firmware.
Various examples have been described. These and other examples are within the scope of the following claims.

Claims (46)

1. A method of processing video data, the method comprising:
determining, based at least in part on a syntax element in a scalable nesting SEI message encapsulated by a supplemental enhancement information (SEI) network abstraction layer (NAL) unit, whether a nested SEI message encapsulated by the scalable nesting SEI message is applicable to a default sub-bitstream of a coded video bitstream, wherein the default sub-bitstream is an operation point representation of an operation point defined by a layer identifier specified in a NAL unit header of the SEI NAL unit and a temporal identifier specified in the NAL unit header; and
when the nested SEI message is applicable to the default sub-bitstream, using the nested SEI message in an operation on the default sub-bitstream.
2. The method of claim 1, wherein a first syntax element in the NAL unit header indicates the layer identifier and a second syntax element in the NAL unit header indicates the temporal identifier.
3. The method of claim 2, wherein the default sub-bitstream is a subset of the coded video bitstream, and the default sub-bitstream does not include video coding layer (VCL) NAL units of the coded video bitstream that have layer identifiers greater than the layer identifier indicated by the first syntax element of the NAL unit header or temporal identifiers greater than the temporal identifier indicated by the second syntax element of the NAL unit header.
4. The method of claim 1, wherein the nested SEI message comprises a set of hypothetical reference decoder (HRD) parameters.
5. The method of claim 4, wherein using the nested SEI message comprises using the HRD parameters in the nested SEI message in an operation that tests whether the default sub-bitstream conforms to a video coding standard.
6. The method of claim 1, wherein using the nested SEI message comprises using the nested SEI message in a decoding operation on the default sub-bitstream.
7. The method of claim 1, wherein:
the scalable nesting SEI message encapsulates a plurality of nested SEI messages, and
determining whether the nested SEI message is applicable to the default sub-bitstream comprises determining, based at least in part on the syntax element, whether each of the nested SEI messages is applicable to the default sub-bitstream.
8. The method of claim 1, wherein:
the syntax element in the scalable nesting SEI message is a first syntax element, and
the method further comprises determining that the scalable nesting SEI message includes the first syntax element based at least in part on a second syntax element in the scalable nesting SEI message indicating that nested SEI messages encapsulated by the scalable nesting SEI message are applicable to a sub-bitstream extracted from the coded video bitstream.
9. The method of claim 1, further comprising:
determining, based on one or more additional syntax elements in the scalable nesting SEI message, a temporal identifier of a second operation point and a maximum layer identifier of the second operation point; and
using the nested SEI message in an operation on an additional sub-bitstream, the additional sub-bitstream being an operation point representation of the second operation point.
10. A device comprising one or more processors configured to:
determine, based at least in part on a syntax element in a scalable nesting SEI message encapsulated by a supplemental enhancement information (SEI) network abstraction layer (NAL) unit, whether a nested SEI message encapsulated by the scalable nesting SEI message is applicable to a default sub-bitstream of a coded video bitstream, wherein the default sub-bitstream is an operation point representation of an operation point defined by a layer identifier specified in a NAL unit header of the SEI NAL unit and a temporal identifier specified in the NAL unit header; and
when the nested SEI message is applicable to the default sub-bitstream, use the nested SEI message in an operation on the default sub-bitstream.
11. The device of claim 10, wherein a first syntax element in the NAL unit header indicates the layer identifier and a second syntax element in the NAL unit header indicates the temporal identifier.
12. The device of claim 11, wherein the default sub-bitstream is a subset of the coded video bitstream, and the default sub-bitstream does not include video coding layer (VCL) NAL units of the coded video bitstream that have layer identifiers greater than the layer identifier indicated by the first syntax element of the NAL unit header or temporal identifiers greater than the temporal identifier indicated by the second syntax element of the NAL unit header.
13. The device of claim 10, wherein the nested SEI message comprises a set of hypothetical reference decoder (HRD) parameters.
14. The device of claim 13, wherein the one or more processors are configured to use the HRD parameters in the nested SEI message in an operation that tests whether the default sub-bitstream conforms to a video coding standard.
15. The device of claim 10, wherein the one or more processors are configured to use the nested SEI message in a decoding operation on the default sub-bitstream.
16. The device of claim 10, wherein:
the scalable nesting SEI message encapsulates a plurality of nested SEI messages, and
the one or more processors are configured to determine, based at least in part on the syntax element, whether each of the nested SEI messages is applicable to the default sub-bitstream.
17. The device of claim 10, wherein:
the syntax element in the scalable nesting SEI message is a first syntax element, and
the one or more processors are further configured to determine that the scalable nesting SEI message includes the first syntax element based at least in part on a second syntax element in the scalable nesting SEI message indicating that nested SEI messages encapsulated by the scalable nesting SEI message are applicable to a sub-bitstream extracted from the coded video bitstream.
18. The device of claim 10, wherein the one or more processors are further configured to:
determine, based on one or more additional syntax elements in the scalable nesting SEI message, a temporal identifier of a second operation point and a maximum layer identifier of the second operation point; and
use the nested SEI message in an operation on an additional sub-bitstream, the additional sub-bitstream being an operation point representation of the second operation point.
19. A device comprising:
means for determining, based at least in part on a syntax element in a scalable nesting SEI message encapsulated by a supplemental enhancement information (SEI) network abstraction layer (NAL) unit, whether a nested SEI message encapsulated by the scalable nesting SEI message is applicable to a default sub-bitstream of a coded video bitstream, wherein the default sub-bitstream is an operation point representation of an operation point defined by a layer identifier specified in a NAL unit header of the SEI NAL unit and a temporal identifier specified in the NAL unit header; and
means for using the nested SEI message in an operation on the default sub-bitstream when the nested SEI message is applicable to the default sub-bitstream.
20. The device of claim 19, wherein:
a first syntax element in the NAL unit header indicates the layer identifier and a second syntax element in the NAL unit header indicates the temporal identifier, and
the default sub-bitstream is a subset of the coded video bitstream, and the default sub-bitstream does not include video coding layer (VCL) NAL units of the coded video bitstream that have layer identifiers greater than the layer identifier indicated by the first syntax element of the NAL unit header or temporal identifiers greater than the temporal identifier indicated by the second syntax element of the NAL unit header.
21. The device of claim 19, wherein:
the nested SEI message comprises a set of hypothetical reference decoder (HRD) parameters, and
the device comprises means for using the HRD parameters in the nested SEI message in an operation that tests whether the default sub-bitstream conforms to a video coding standard.
22. A computer-readable storage medium storing instructions that, when executed by one or more processors of a device, configure the device to:
determine, based at least in part on a syntax element in a scalable nesting SEI message encapsulated by a supplemental enhancement information (SEI) network abstraction layer (NAL) unit, whether a nested SEI message encapsulated by the scalable nesting SEI message is applicable to a default sub-bitstream of a coded video bitstream, wherein the default sub-bitstream is an operation point representation of an operation point defined by a layer identifier specified in a NAL unit header of the SEI NAL unit and a temporal identifier specified in the NAL unit header; and
when the nested SEI message is applicable to the default sub-bitstream, use the nested SEI message in an operation on the default sub-bitstream.
23. The computer-readable storage medium of claim 22, wherein:
a first syntax element in the NAL unit header indicates the layer identifier and a second syntax element in the NAL unit header indicates the temporal identifier, and
the default sub-bitstream is a subset of the coded video bitstream, and the default sub-bitstream does not include video coding layer (VCL) NAL units of the coded video bitstream that have layer identifiers greater than the layer identifier indicated by the first syntax element of the NAL unit header or temporal identifiers greater than the temporal identifier indicated by the second syntax element of the NAL unit header.
24. The computer-readable storage medium of claim 22, wherein:
the nested SEI message comprises a set of hypothetical reference decoder (HRD) parameters, and
the instructions further configure the device to use the HRD parameters in the nested SEI message in an operation that tests whether the default sub-bitstream conforms to a video coding standard.
25. A method of encoding video data, the method comprising:
including, in a scalable nesting SEI message encapsulated by a supplemental enhancement information (SEI) network abstraction layer (NAL) unit, a syntax element that indicates whether a nested SEI message encapsulated by the scalable nesting SEI message is applicable to a default sub-bitstream of a coded video bitstream, wherein the default sub-bitstream is an operation point representation of an operation point defined by a layer identifier specified in a NAL unit header of the SEI NAL unit and a temporal identifier specified in the NAL unit header; and
signaling the scalable nesting SEI message in the coded video bitstream.
26. The method of claim 25, wherein a first syntax element in the NAL unit header indicates the layer identifier and a second syntax element in the NAL unit header indicates the temporal identifier.
27. The method of claim 26, wherein the default sub-bitstream is a subset of the coded video bitstream, and the default sub-bitstream does not include video coding layer (VCL) NAL units of the coded video bitstream that have layer identifiers greater than the layer identifier indicated by the first syntax element of the NAL unit header or temporal identifiers greater than the temporal identifier indicated by the second syntax element of the NAL unit header.
28. The method of claim 25, wherein the nested SEI message comprises a set of hypothetical reference decoder (HRD) parameters.
29. The method of claim 28, further comprising using the HRD parameters in the nested SEI message in an operation that tests whether the default sub-bitstream conforms to a video coding standard.
30. The method of claim 25, wherein:
the scalable nesting SEI message encapsulates a plurality of nested SEI messages, and
the syntax element indicates whether each of the nested SEI messages is applicable to the default sub-bitstream.
31. The method of claim 25, wherein:
the syntax element in the scalable nesting SEI message is a first syntax element in the scalable nesting SEI message,
the method further comprises including, in the scalable nesting SEI message, a second syntax element that indicates whether nested SEI messages encapsulated by the scalable nesting SEI message are applicable to a sub-bitstream extracted from the coded video bitstream or are applicable to specific layers and sub-layers of the coded video bitstream, and
the scalable nesting SEI message includes the first syntax element only when the second syntax element indicates that the nested SEI messages are applicable to the sub-bitstream extracted from the coded video bitstream.
32. The method of claim 25, wherein:
the operation point identified by one or more syntax elements of the NAL unit header of the SEI NAL unit is a first operation point, and
the method further comprises including, in the scalable nesting SEI message, one or more additional syntax elements that identify a temporal identifier of a second operation point and a maximum layer identifier of the second operation point.
33. A video encoding device comprising one or more processors configured to:
include, in a scalable nesting SEI message encapsulated by a supplemental enhancement information (SEI) network abstraction layer (NAL) unit, a syntax element that indicates whether a nested SEI message encapsulated by the scalable nesting SEI message is applicable to a default sub-bitstream of a coded video bitstream, wherein the default sub-bitstream is an operation point representation of an operation point defined by a layer identifier specified in a NAL unit header of the SEI NAL unit and a temporal identifier specified in the NAL unit header; and
signal the scalable nesting SEI message in the coded video bitstream.
34. The video encoding device of claim 33, wherein a first syntax element in the NAL unit header indicates the layer identifier and a second syntax element in the NAL unit header indicates the temporal identifier.
35. The video encoding device of claim 34, wherein the default sub-bitstream is a subset of the coded video bitstream, and the default sub-bitstream does not include video coding layer (VCL) NAL units of the coded video bitstream that have layer identifiers greater than the layer identifier indicated by the first syntax element of the NAL unit header or temporal identifiers greater than the temporal identifier indicated by the second syntax element of the NAL unit header.
36. The video encoding device of claim 33, wherein the nested SEI message comprises a set of hypothetical reference decoder (HRD) parameters.
37. The video encoding device of claim 36, wherein the one or more processors are further configured to use the HRD parameters in the nested SEI message in an operation that tests whether the default sub-bitstream conforms to a video coding standard.
38. The video encoding device of claim 33, wherein:
the scalable nesting SEI message encapsulates a plurality of nested SEI messages, and
the syntax element indicates whether each of the nested SEI messages is applicable to the default sub-bitstream.
39. The video encoding device of claim 33, wherein:
the syntax element in the scalable nesting SEI message is a first syntax element in the scalable nesting SEI message,
the one or more processors are further configured to include, in the scalable nesting SEI message, a second syntax element that indicates whether nested SEI messages encapsulated by the scalable nesting SEI message are applicable to a sub-bitstream extracted from the coded video bitstream or are applicable to specific layers and sub-layers of the coded video bitstream, and
the scalable nesting SEI message includes the first syntax element only when the second syntax element indicates that the nested SEI messages are applicable to the sub-bitstream extracted from the coded video bitstream.
40. The video encoding device of claim 33, wherein:
the operation point identified by one or more syntax elements of the NAL unit header of the SEI NAL unit is a first operation point, and
the one or more processors are further configured to include, in the scalable nesting SEI message, one or more additional syntax elements that identify a temporal identifier of a second operation point and a maximum layer identifier of the second operation point.
41. A video encoding device comprising:
means for including, in a scalable nesting SEI message encapsulated by a supplemental enhancement information (SEI) network abstraction layer (NAL) unit, a syntax element that indicates whether a nested SEI message encapsulated by the scalable nesting SEI message is applicable to a default sub-bitstream of a coded video bitstream, wherein the default sub-bitstream is an operation point representation of an operation point defined by a layer identifier specified in a NAL unit header of the SEI NAL unit and a temporal identifier specified in the NAL unit header; and
means for signaling the scalable nesting SEI message in the coded video bitstream.
42. The video encoding device of claim 41, wherein:
a first syntax element in the NAL unit header indicates the layer identifier and a second syntax element in the NAL unit header indicates the temporal identifier, and
the default sub-bitstream is a subset of the coded video bitstream, and the default sub-bitstream does not include video coding layer (VCL) NAL units of the coded video bitstream that have layer identifiers greater than the layer identifier indicated by the first syntax element of the NAL unit header or temporal identifiers greater than the temporal identifier indicated by the second syntax element of the NAL unit header.
43. The video encoding device of claim 41, wherein:
the nested SEI message comprises a set of hypothetical reference decoder (HRD) parameters, and
the video encoding device comprises means for using the HRD parameters in the nested SEI message in an operation that tests whether the default sub-bitstream conforms to a video coding standard.
44. A computer-readable storage medium storing instructions that, when executed by a video encoding device, configure the video encoding device to:
include, in a scalable nesting SEI message encapsulated by a supplemental enhancement information (SEI) network abstraction layer (NAL) unit, a syntax element that indicates whether a nested SEI message encapsulated by the scalable nesting SEI message is applicable to a default sub-bitstream of a coded video bitstream, wherein the default sub-bitstream is an operation point representation of an operation point defined by a layer identifier specified in a NAL unit header of the SEI NAL unit and a temporal identifier specified in the NAL unit header; and
signal the scalable nesting SEI message in the coded video bitstream.
45. The computer-readable storage medium of claim 44, wherein:
a first syntax element in the NAL unit header indicates the layer identifier and a second syntax element in the NAL unit header indicates the temporal identifier, and
the default sub-bitstream is a subset of the coded video bitstream, and the default sub-bitstream does not include video coding layer (VCL) NAL units of the coded video bitstream that have layer identifiers greater than the layer identifier indicated by the first syntax element of the NAL unit header or temporal identifiers greater than the temporal identifier indicated by the second syntax element of the NAL unit header.
46. The computer-readable storage medium of claim 44, wherein:
the nested SEI message comprises a set of hypothetical reference decoder (HRD) parameters, and
the instructions further configure the video encoding device to use the HRD parameters in the nested SEI message in an operation that tests whether the default sub-bitstream conforms to a video coding standard.
CN201380051435.2A 2012-10-08 2013-09-20 Method, apparatus and readable storage medium for processing video data Active CN104685892B (en)

Applications Claiming Priority (5)

Application Number Priority Date Filing Date Title
US201261711098P 2012-10-08 2012-10-08
US61/711,098 2012-10-08
US13/954,758 US9154785B2 (en) 2012-10-08 2013-07-30 Sub-bitstream applicability to nested SEI messages in video coding
US13/954,758 2013-07-30
PCT/US2013/060940 WO2014058600A1 (en) 2012-10-08 2013-09-20 Sub-bitstream applicability to nested sei messages in video coding

Publications (2)

Publication Number Publication Date
CN104685892A true CN104685892A (en) 2015-06-03
CN104685892B CN104685892B (en) 2019-02-22

Family

ID=50432659

Family Applications (3)

Application Number Title Priority Date Filing Date
CN201380051423.XA Active CN104685891B (en) 2012-10-08 2013-09-20 Identification in video coding suitable for the operating point of nido supplemental enhancement information message
CN201380051430.XA Active CN104704842B (en) 2012-10-08 2013-09-20 The syntactic structure of hypothetical reference decoder parameter
CN201380051435.2A Active CN104685892B (en) 2012-10-08 2013-09-20 Method, apparatus, and readable storage medium for processing video data

Family Applications Before (2)

Application Number Title Priority Date Filing Date
CN201380051423.XA Active CN104685891B (en) 2012-10-08 2013-09-20 Identification of operation points applicable to nested supplemental enhancement information messages in video coding
CN201380051430.XA Active CN104704842B (en) 2012-10-08 2013-09-20 Syntax structure for hypothetical reference decoder parameters

Country Status (30)

Country Link
US (3) US9154785B2 (en)
EP (3) EP2904782B1 (en)
JP (3) JP6013614B2 (en)
KR (3) KR101677867B1 (en)
CN (3) CN104685891B (en)
AP (1) AP3952A (en)
AR (2) AR093288A1 (en)
AU (2) AU2013330284B2 (en)
BR (2) BR112015007763B1 (en)
CA (2) CA2885807C (en)
CO (1) CO7350643A2 (en)
DK (1) DK2904782T3 (en)
EC (1) ECSP15018127A (en)
ES (2) ES2727814T3 (en)
HK (2) HK1207775A1 (en)
HU (2) HUE043479T2 (en)
IL (2) IL237948A (en)
MA (1) MA37971B1 (en)
MX (1) MX341437B (en)
MY (2) MY172252A (en)
PH (2) PH12015500625A1 (en)
PT (1) PT2904782T (en)
RU (2) RU2643463C2 (en)
SA (1) SA515360254B1 (en)
SG (2) SG11201501833PA (en)
SI (1) SI2904782T1 (en)
TW (2) TWI565299B (en)
UA (2) UA116363C2 (en)
WO (3) WO2014058599A1 (en)
ZA (1) ZA201502498B (en)

Families Citing this family (48)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104185992A (en) * 2012-02-08 2014-12-03 汤姆逊许可公司 Method and apparatus for using an ultra-low delay mode of a hypothetical reference decoder
US9912941B2 (en) 2012-07-02 2018-03-06 Sony Corporation Video coding system with temporal layers and method of operation thereof
US10110890B2 (en) 2012-07-02 2018-10-23 Sony Corporation Video coding system with low delay and method of operation thereof
US9154785B2 (en) 2012-10-08 2015-10-06 Qualcomm Incorporated Sub-bitstream applicability to nested SEI messages in video coding
US9992492B2 (en) * 2012-10-09 2018-06-05 Cisco Technology, Inc. Providing a common set of parameters for sub-layers of coded video
CN104704850A (en) 2012-10-09 2015-06-10 思科技术公司 Output management of prior decoded pictures at picture format transitions in bitstreams
US9374585B2 (en) * 2012-12-19 2016-06-21 Qualcomm Incorporated Low-delay buffering model in video coding
WO2015056158A1 (en) * 2013-10-14 2015-04-23 Nokia Technologies Oy Multi-layer hypothetical reference decoder
WO2015056179A1 (en) * 2013-10-15 2015-04-23 Nokia Technologies Oy Video encoding and decoding using syntax element
EP3051820B1 (en) * 2013-10-22 2020-12-02 Huawei Technologies Co. Ltd. Image decoding device and image decoding method
US10063867B2 (en) 2014-06-18 2018-08-28 Qualcomm Incorporated Signaling HRD parameters for bitstream partitions
US9918091B2 (en) 2014-06-20 2018-03-13 Qualcomm Incorporated Systems and methods for assigning a minimum value to a syntax structure in a parameter set
US10432951B2 (en) 2014-06-24 2019-10-01 Qualcomm Incorporated Conformance and inoperability improvements in multi-layer video coding
US9800898B2 (en) 2014-10-06 2017-10-24 Microsoft Technology Licensing, Llc Syntax structures indicating completion of coded regions
WO2016180486A1 (en) * 2015-05-12 2016-11-17 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Composite scalable video streaming
US10129558B2 (en) * 2015-09-21 2018-11-13 Qualcomm Incorporated Supplement enhancement information (SEI) messages for high dynamic range and wide color gamut video coding
US10244249B2 (en) 2015-09-21 2019-03-26 Qualcomm Incorporated Fixed point implementation of range adjustment of components in video coding
US10349067B2 (en) * 2016-02-17 2019-07-09 Qualcomm Incorporated Handling of end of bitstream NAL units in L-HEVC file format and improvements to HEVC and L-HEVC tile tracks
US10397443B2 (en) * 2016-03-01 2019-08-27 Qualcomm Incorporated Methods and systems for generating color remapping information supplemental enhancement information messages for video
US12011298B2 (en) * 2016-04-13 2024-06-18 Cryos Technologes Inc. Membrane-based foot imaging apparatus including a camera for monitoring foot positioning
CA3112712C (en) 2016-10-05 2023-09-12 Dolby Laboratories Licensing Corporation Source color volume information messaging
US10728559B2 (en) * 2017-07-07 2020-07-28 Qualcomm Incorporated Precision of computation and signaling of dynamic range adjustment and color remapping information
WO2019065587A1 (en) * 2017-09-29 2019-04-04 Sharp Kabushiki Kaisha Systems and methods for signaling information associated with a constituent picture
CN111699652B (en) * 2018-02-07 2023-10-10 Lg电子株式会社 Method for transmitting signal in wireless communication system and apparatus supporting the same
CN109905715B (en) * 2019-02-26 2021-07-06 北京世纪好未来教育科技有限公司 Code stream conversion method and system for inserting SEI data
US11856231B2 (en) * 2019-05-30 2023-12-26 Sharp Kabushiki Kaisha Systems and methods for signaling hypothetical reference decoder parameters in video coding
JP7403245B2 (en) * 2019-06-21 2023-12-22 キヤノン株式会社 Image decoding device, image decoding method
JP7480791B2 (en) * 2019-06-27 2024-05-10 ホアウェイ・テクノロジーズ・カンパニー・リミテッド Hypothetical Reference Decoder for V-PCC
EP3977746A4 (en) 2019-07-05 2022-08-03 Huawei Technologies Co., Ltd. Video coding bitstream extraction with identifier signaling
EP4022777A4 (en) * 2019-09-24 2022-11-23 Huawei Technologies Co., Ltd. Decoded picture buffer operation for resolution changes
MX2022003550A (en) * 2019-09-24 2022-06-02 Huawei Tech Co Ltd Hrd conformance tests on ols.
JP7419507B2 (en) * 2019-09-24 2024-01-22 ホアウェイ・テクノロジーズ・カンパニー・リミテッド Scalable nesting SEI messages for OLS
CN114424527B (en) * 2019-09-24 2023-07-11 华为技术有限公司 Method and device for setting scalable nesting SEI (solid-state imaging device) messages of all layers
CN117478900A (en) * 2019-09-24 2024-01-30 华为技术有限公司 Image timing and decoding unit information for implementing temporal adaptability
JP7431330B2 (en) 2019-12-26 2024-02-14 バイトダンス インコーポレイテッド Profiles, layers and layer instructions in video coding
CN114868158A (en) 2019-12-26 2022-08-05 字节跳动有限公司 Signaling of decoded picture buffer parameters in hierarchical video
KR20220121804A (en) 2019-12-27 2022-09-01 바이트댄스 아이엔씨 Subpicture signaling in parameter sets
WO2021142363A1 (en) 2020-01-09 2021-07-15 Bytedance Inc. Decoding order of different sei messages
IL304023A (en) * 2020-05-22 2023-08-01 Ge Video Compression Llc Video encoder, video decoder, methods for encoding and decoding and video data stream for realizing advanced video coding concepts
WO2021237123A1 (en) 2020-05-22 2021-11-25 Bytedance Inc. Sei message handling in video sub-bitstream extraction process
CN115699765A (en) 2020-05-22 2023-02-03 字节跳动有限公司 Signaling of picture information in access units
AU2022271427B2 (en) * 2020-05-22 2024-01-18 Ge Video Compression, Llc Video encoder, video decoder, methods for encoding and decoding and video data stream for realizing advanced video coding concepts
KR20230020426A (en) 2020-06-09 2023-02-10 바이트댄스 아이엔씨 Signaling sub-picture level information in video coding
CN117528004A (en) 2020-06-09 2024-02-06 字节跳动有限公司 Sub-bitstream extraction of multi-layer video bitstreams
CA3182204A1 (en) * 2020-06-09 2021-12-16 Ye-Kui Wang Signaling constraints in non-scalable nested video syntax elements
US11962936B2 (en) 2020-09-29 2024-04-16 Lemon Inc. Syntax for dependent random access point indication in video bitstreams
CN116868575A (en) * 2020-12-17 2023-10-10 Lg电子株式会社 Method and apparatus for generating/receiving media file including NAL unit information and method for transmitting the media file
WO2022220724A1 (en) * 2021-04-12 2022-10-20 Telefonaktiebolaget Lm Ericsson (Publ) Message referencing

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101536527A (en) * 2006-07-11 2009-09-16 诺基亚公司 Scalable video coding and decoding
CN101548548A (en) * 2006-10-20 2009-09-30 诺基亚公司 System and method for providing picture output indications in video coding

Family Cites Families (41)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TW262619B (en) 1995-06-06 1995-11-11 United Microelectronics Corp Shrinking device for code table of variable length decoder
EP1346568A1 (en) * 2000-12-04 2003-09-24 Koninklijke Philips Electronics N.V. Recording arrangement for the error-tolerant recording of an information signal
TWI259378B (en) 2004-02-26 2006-08-01 Cablesoft Tech Inc An automatic addressing control method for MSO CATV system
US8615038B2 (en) 2004-12-06 2013-12-24 Nokia Corporation Video coding, decoding and hypothetical reference decoder
KR100878371B1 (en) 2005-04-19 2009-01-15 돌비 스웨덴 에이비 Energy dependent quantization for efficient coding of spatial audio parameters
RU2407217C2 (en) 2005-10-11 2010-12-20 Нокиа Корпорейшн System and method for efficient adaptation of scalable flows
JP2009512306A (en) * 2005-10-11 2009-03-19 ノキア コーポレイション Efficient buffer management of decoded pictures for scalable video coding
EP1977604B1 (en) * 2006-01-11 2014-03-19 Nokia Corporation Method for a backward -compatible encapsulation of a scalable coded video signal into a sequence of aggregate data units
US20070230564A1 (en) 2006-03-29 2007-10-04 Qualcomm Incorporated Video processing with scalability
WO2008005124A2 (en) * 2006-07-05 2008-01-10 Thomson Licensing Methods and apparatus for multi-view video encoding and decoding
US7860147B2 (en) 2006-08-16 2010-12-28 Harris Corporation Method of communicating and associated transmitter using coded orthogonal frequency division multiplexing (COFDM)
US20080089411A1 (en) 2006-10-16 2008-04-17 Nokia Corporation Multiple-hypothesis cross-layer prediction
WO2008046243A1 (en) 2006-10-16 2008-04-24 Thomson Licensing Method and device for encoding a data stream, method and device for decoding a data stream, video indexing system and image retrieval system
AU2007342468B2 (en) * 2007-01-05 2011-11-24 Interdigital Vc Holdings, Inc. Hypothetical reference decoder for scalable video coding
KR101455161B1 (en) 2007-01-08 2014-10-28 톰슨 라이센싱 Methods and apparatus for video stream splicing
GB0700381D0 (en) * 2007-01-09 2007-02-14 Mitsubishi Electric Inf Tech Generalised Hypothetical Reference Decoder for Scalable Video Coding with Bitstream Rewriting
WO2008130528A2 (en) * 2007-04-17 2008-10-30 Thomson Licensing Hypothetical reference decoder for multiview video coding
WO2008126059A2 (en) * 2007-04-17 2008-10-23 Nokia Corporation Feedback based scalable video coding
JP5026584B2 (en) * 2007-04-18 2012-09-12 トムソン ライセンシング Encoding system
US20090003431A1 (en) * 2007-06-28 2009-01-01 Lihua Zhu Method for encoding video data in a scalable manner
US20100142613A1 (en) 2007-04-18 2010-06-10 Lihua Zhu Method for encoding video data in a scalable manner
US9712833B2 (en) 2007-06-26 2017-07-18 Nokia Technologies Oy System and method for indicating temporal layer switching points
JP5153674B2 (en) * 2008-02-26 2013-02-27 キヤノン株式会社 Moving picture coding apparatus and moving picture coding method
US8369415B2 (en) * 2008-03-06 2013-02-05 General Instrument Corporation Method and apparatus for decoding an enhanced video stream
FR2932050B1 (en) 2008-06-03 2010-05-21 Canon Kk METHOD AND DEVICE FOR TRANSMITTING VIDEO DATA
JP5462259B2 (en) 2008-07-16 2014-04-02 シズベル インターナショナル エス.アー. Method and apparatus for track and track subset grouping
WO2010021665A1 (en) 2008-08-20 2010-02-25 Thomson Licensing Hypothetical reference decoder
JP5072893B2 (en) * 2009-03-25 2012-11-14 株式会社東芝 Image encoding method and image decoding method
JP5267886B2 (en) 2009-04-08 2013-08-21 ソニー株式会社 REPRODUCTION DEVICE, RECORDING MEDIUM, AND INFORMATION PROCESSING METHOD
KR20120081022A (en) 2009-05-01 2012-07-18 톰슨 라이센싱 3d video coding formats
US8948241B2 (en) 2009-08-07 2015-02-03 Qualcomm Incorporated Signaling characteristics of an MVC operation point
CN103119934B (en) * 2010-07-20 2017-02-22 诺基亚技术有限公司 A media streaming apparatus
US9131033B2 (en) 2010-07-20 2015-09-08 Qualcomm Incoporated Providing sequence data sets for streaming video data
US9716920B2 (en) * 2010-08-05 2017-07-25 Qualcomm Incorporated Signaling attributes for network-streamed video data
US9635355B2 (en) 2011-07-28 2017-04-25 Qualcomm Incorporated Multiview video coding
WO2013030458A1 (en) 2011-08-31 2013-03-07 Nokia Corporation Multiview video coding and decoding
US9591361B2 (en) 2011-09-07 2017-03-07 Qualcomm Incorporated Streaming of multimedia data from multiple sources
US9998726B2 (en) 2012-06-20 2018-06-12 Nokia Technologies Oy Apparatus, a method and a computer program for video coding and decoding
US9241158B2 (en) 2012-09-24 2016-01-19 Qualcomm Incorporated Hypothetical reference decoder parameters in video coding
US8989508B2 (en) * 2012-09-28 2015-03-24 Sharp Kabushiki Kaisha Electronic device for signaling a sub-picture buffer parameter
US9154785B2 (en) 2012-10-08 2015-10-06 Qualcomm Incorporated Sub-bitstream applicability to nested SEI messages in video coding

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101536527A (en) * 2006-07-11 2009-09-16 诺基亚公司 Scalable video coding and decoding
CN101548548A (en) * 2006-10-20 2009-09-30 诺基亚公司 System and method for providing picture output indications in video coding

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
MISKA M. HANNUKSELA等: "AHG9: Operation points in VPS and nesting SEI", 《JCT-VC MEETING》 *
MISKA M. HANNUKSELA等: "Scope of Supplemental Enhancement Information Messages", 《JVT MEETING》 *
YE-KUI WANG: "AHG9:On HRD and related general issues", 《JCT-VC MEETING》 *

Also Published As

Publication number Publication date
AP2015008363A0 (en) 2015-04-30
TW201436536A (en) 2014-09-16
DK2904782T3 (en) 2018-04-30
SI2904782T1 (en) 2018-05-31
CO7350643A2 (en) 2015-08-10
CA2885807C (en) 2018-04-03
SG11201501832UA (en) 2015-05-28
RU2643463C2 (en) 2018-02-01
SG11201501833PA (en) 2015-05-28
HK1207775A1 (en) 2016-02-05
US20140098896A1 (en) 2014-04-10
WO2014058598A1 (en) 2014-04-17
UA116998C2 (en) 2018-06-11
CN104704842B (en) 2018-04-20
ES2727814T3 (en) 2019-10-18
KR20150067319A (en) 2015-06-17
KR20150056877A (en) 2015-05-27
MA37971B1 (en) 2016-10-31
BR112015007763B1 (en) 2022-08-02
UA116363C2 (en) 2018-03-12
EP2904782B1 (en) 2018-01-17
KR101697886B1 (en) 2017-01-18
CA2885670A1 (en) 2014-04-17
PT2904782T (en) 2018-04-23
US20140098894A1 (en) 2014-04-10
RU2015117520A (en) 2016-11-27
CA2885807A1 (en) 2014-04-17
CN104685891A (en) 2015-06-03
US9380317B2 (en) 2016-06-28
AR094449A1 (en) 2015-08-05
HUE038492T2 (en) 2018-10-29
HK1209550A1 (en) 2016-04-01
MX341437B (en) 2016-08-18
MY172252A (en) 2019-11-20
EP2904787B1 (en) 2019-02-27
AU2013330284B2 (en) 2017-06-08
US9319703B2 (en) 2016-04-19
BR112015007763A2 (en) 2017-07-04
IL237948A (en) 2016-11-30
MA37971A1 (en) 2016-03-31
WO2014058599A1 (en) 2014-04-17
ECSP15018127A (en) 2015-12-31
CN104685891B (en) 2018-07-20
AU2013330284A1 (en) 2015-04-23
IL237949B (en) 2019-10-31
CN104685892B (en) 2019-02-22
JP2015532551A (en) 2015-11-09
CN104704842A (en) 2015-06-10
JP6077124B2 (en) 2017-02-08
EP2904784B1 (en) 2018-12-19
PH12015500625B1 (en) 2015-05-11
EP2904784A1 (en) 2015-08-12
ZA201502498B (en) 2017-09-27
RU2633100C2 (en) 2017-10-11
HUE043479T2 (en) 2019-08-28
KR101677867B1 (en) 2016-11-18
JP2015536114A (en) 2015-12-17
CA2885670C (en) 2017-07-18
JP2015537420A (en) 2015-12-24
MX2015004383A (en) 2015-06-10
ES2663692T3 (en) 2018-04-16
PH12015500742A1 (en) 2015-05-25
SA515360254B1 (en) 2016-08-15
PH12015500742B1 (en) 2015-05-25
TW201429256A (en) 2014-07-16
WO2014058600A1 (en) 2014-04-17
US20140098895A1 (en) 2014-04-10
KR20150067318A (en) 2015-06-17
AP3952A (en) 2016-12-21
RU2015117436A (en) 2016-11-27
IL237949A0 (en) 2015-05-31
JP6013614B2 (en) 2016-10-25
AU2013330372A1 (en) 2015-04-16
TWI533674B (en) 2016-05-11
JP6062558B2 (en) 2017-01-18
AU2013330372B2 (en) 2017-07-06
TWI565299B (en) 2017-01-01
MY168739A (en) 2018-11-29
EP2904787A1 (en) 2015-08-12
BR112015007761B1 (en) 2022-08-09
BR112015007761A2 (en) 2017-07-04
PH12015500625A1 (en) 2015-05-11
KR101719935B1 (en) 2017-03-24
AR093288A1 (en) 2015-05-27
US9154785B2 (en) 2015-10-06
EP2904782A1 (en) 2015-08-12

Similar Documents

Publication Publication Date Title
CN104704842B (en) Syntax structure for hypothetical reference decoder parameters
CN105612752B (en) Support for multi-mode extraction for multi-layer video codecs
CN104054347B (en) Indication of use of wavefront parallel processing in video coding
TWI559743B (en) Video coding using sample prediction among color components
CN104365105B (en) External pictures in video coding
CN104471943B (en) Parameter set in video coding
CN104919802A (en) Non-nested sei messages in video coding
CN104704829A (en) File format for video data
CN105103561A (en) Parameter set designs for video coding extensions
CN106464936A (en) Method and device for decoding multi-layer video data by determining the processing core of the decoder based on partition containing one or more layers
CN104813671A (en) Bitstream properties in video coding
TW201515440A (en) Tiles and wavefront processing in multi-layer context
CN104471942A (en) Reusing Parameter Sets For Video Coding
CN104137551B (en) Network abstraction layer unit header design for 3D video coding
CN105308969A (en) View synthesis in 3d video

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 1207775

Country of ref document: HK

GR01 Patent grant
GR01 Patent grant