WO2015096540A1 - Method, device and system for generating and processing a code stream - Google Patents

Method, device and system for generating and processing a code stream

Info

Publication number
WO2015096540A1
Authority
WO
WIPO (PCT)
Prior art keywords
code stream
image
poc
control information
alignment operation
Prior art date
Application number
PCT/CN2014/088677
Other languages
English (en)
French (fr)
Inventor
李明
吴平
尚国强
谢玉堂
Original Assignee
中兴通讯股份有限公司
Priority date
Filing date
Publication date
Application filed by 中兴通讯股份有限公司 (ZTE Corporation)
Priority to EP14875097.9A priority Critical patent/EP3089454A4/en
Priority to US15/107,730 priority patent/US10638141B2/en
Priority to EP20172525.6A priority patent/EP3713242A1/en
Priority to JP2016542931A priority patent/JP6285034B2/ja
Priority to BR112016015000A priority patent/BR112016015000A2/pt
Priority to KR1020167019738A priority patent/KR101882596B1/ko
Publication of WO2015096540A1

Links

Images

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/10 using adaptive coding
    • H04N 19/189 characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding
    • H04N 19/196 being specially adapted for the computation of encoding parameters, e.g. by averaging previously computed encoding parameters
    • H04N 19/50 using predictive coding
    • H04N 19/597 specially adapted for multi-view video sequence encoding
    • H04N 19/102 characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N 19/103 Selection of coding mode or of prediction mode
    • H04N 19/134 characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N 19/136 Incoming video signal characteristics or properties
    • H04N 19/169 characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N 19/17 the unit being an image region, e.g. an object
    • H04N 19/172 the region being a picture, frame or field
    • H04N 19/30 using hierarchical techniques, e.g. scalability
    • H04N 19/42 characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
    • H04N 19/43 Hardware specially adapted for motion estimation or compensation
    • H04N 19/433 characterised by techniques for memory access
    • H04N 19/70 characterised by syntax aspects related to video coding, e.g. related to compression standards

Definitions

  • the present invention relates to the field of image transmission technologies, and in particular, to a method, device and system for generating and processing a code stream.
  • the Multi-view video coding extension framework of HEVC (MV-HEVC) and the HEVC-compatible three-dimensional video (3DV) coding standard (3D-HEVC) are currently being developed based on the H.265/High Efficiency Video Coding (HEVC) standard.
  • HEVC High Efficiency Video Coding
  • 3D-HEVC 3D High Efficiency Video Coding
  • In MV-HEVC, 3D-HEVC and Scalable Video Coding, a unified high-level structure design is adopted. This unified design is based on the concept of "multi-layer video coding": the texture component and the depth component (Depth Component) of MV-HEVC and 3D-HEVC, and the different scalable layers of scalable coding, are all abstracted as "layers", and a Layer Id is used to identify different viewpoints and scalable layers.
  • the currently published H.265/HEVC standard is called the "H.265/HEVC Version 1" standard.
  • AU Access Unit
  • different layers of images can use different encoding methods.
  • the image of a certain layer may be an Intra Random Access Point (IRAP) image that can be used as a random access point,
  • IRAP Intra Random Access Point
  • while the images of one or more other layers are ordinary inter-frame and inter-layer predictive coded images.
  • different layers may select respective IRAP image insertion strategies according to network transmission conditions, video content conversion conditions, and the like.
  • a higher frequency IRAP image insertion strategy may be adopted for the H.265/HEVC compatible base layer video image, and a lower frequency IRAP image insertion strategy may be employed for the enhancement layer video image.
  • With such a layer-wise access multi-layer video coding structure, the random access performance of the multi-layer video coded stream can be guaranteed without a large bit-rate increase.
  • BL base layer
  • EL Enhancement Layer, e.g. an Enhancement View or a Dependent View
  • In this way, a code stream usable for traditional two-dimensional television broadcasting can still be obtained by extraction from the multi-layer video encoded code stream.
  • There are three types of IRAP images, namely IDR (Instantaneous Decoding Refresh) images, BLA (Broken Link Access) images, and CRA (Clean Random Access) images. All three types are encoded using intra coding, and their decoding is independent of other images. The three image types differ in the operations performed on the Picture Order Count (POC) and the Decoded Picture Buffer (DPB).
  • POC Picture Order Count
  • DPB Decoded Picture Buffer
  • the POC is a sequence number used to identify the order in which images are played in H.265/HEVC Version 1.
  • the POC value of an image consists of two parts.
  • PicOrderCntVal = PicOrderCntMsb + PicOrderCntLsb
  • PicOrderCntMsb is the MSB (Most Significant Bit) value of the image POC value
  • PicOrderCntLsb is the LSB (Least Significant Bit) of the image POC value.
  • the value of PicOrderCntMsb is derived from the PicOrderCntMsb of the image that precedes the current image in decoding order and whose TemporalId is equal to 0,
  • the value of PicOrderCntLsb is equal to the value of the slice_pic_order_cnt_lsb field in the slice header information.
  • the number of bits of the slice_pic_order_cnt_lsb field is determined by log2_max_pic_order_cnt_lsb_minus4 in the Sequence Parameter Set (SPS); the number of bits used is equal to log2_max_pic_order_cnt_lsb_minus4 + 4.
  • If the current image is an IDR image, the value of PicOrderCntMsb is set to 0, the slice header information does not include the slice_pic_order_cnt_lsb field, and the value of PicOrderCntLsb defaults to 0. If the current image is a BLA image, the value of PicOrderCntMsb is set to 0, and the slice header information includes a slice_pic_order_cnt_lsb field for determining the value of PicOrderCntLsb.
  • If the current image is a CRA image, the POC is calculated using the usual method; if the current image is a CRA image and the value of the flag bit HandleCraAsBlaFlag is equal to 1, the POC value of the CRA image is calculated using the BLA image method.
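  • For illustration, the following Python sketch follows the POC derivation just described: PicOrderCntVal is the sum of PicOrderCntMsb and PicOrderCntLsb, the MSB part is carried over (with wrap-around adjustment) from the previous TemporalId-equal-to-0 image in decoding order, and IDR, BLA and CRA images are treated as described above. The function and parameter names are illustrative, not normative.

```python
# Sketch of the H.265/HEVC Version 1 POC derivation described above.
# Names (pic_type, prev_msb, prev_lsb, ...) are illustrative, not normative.

def derive_poc(pic_type, slice_pic_order_cnt_lsb,
               log2_max_pic_order_cnt_lsb_minus4,
               prev_msb, prev_lsb, handle_cra_as_bla=False):
    """Return PicOrderCntVal = PicOrderCntMsb + PicOrderCntLsb.

    prev_msb / prev_lsb are the PicOrderCntMsb and slice_pic_order_cnt_lsb of
    the previous image in decoding order whose TemporalId is equal to 0.
    """
    max_lsb = 1 << (log2_max_pic_order_cnt_lsb_minus4 + 4)

    if pic_type == "IDR":
        # IDR slice headers carry no slice_pic_order_cnt_lsb; both parts default to 0.
        msb, lsb = 0, 0
    elif pic_type == "BLA" or (pic_type == "CRA" and handle_cra_as_bla):
        # BLA (and CRA handled as BLA): MSB forced to 0, LSB taken from the slice header.
        msb, lsb = 0, slice_pic_order_cnt_lsb
    else:
        # Usual derivation: MSB follows the previous TemporalId == 0 image,
        # adjusted when the LSB counter wraps around.
        lsb = slice_pic_order_cnt_lsb
        if lsb < prev_lsb and (prev_lsb - lsb) >= max_lsb // 2:
            msb = prev_msb + max_lsb
        elif lsb > prev_lsb and (lsb - prev_lsb) > max_lsb // 2:
            msb = prev_msb - max_lsb
        else:
            msb = prev_msb

    return msb + lsb  # PicOrderCntVal


# Example: log2_max_pic_order_cnt_lsb_minus4 = 4 gives an 8-bit LSB (0..255).
assert derive_poc("TRAIL", 2, 4, prev_msb=256, prev_lsb=254) == 512 + 2  # LSB wrapped
```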
  • the slice header information of the enhancement layer always includes the slice_pic_order_cnt_lsb field.
  • The decoder determines the start and end positions of the AUs in the code stream by using the POC value, and all the images in an AU are required to have the same POC value.
  • the AU may contain both IRAP images and non-IRAP images.
  • When the IRAP image is an IDR image or a BLA image, the POC values of the images contained in the AU will differ. Therefore, a POC alignment function needs to be designed for a multi-layer video coding standard so that every image in an AU can have the same POC value under a layer-wise structure.
  • a POC alignment method was proposed in the JCT-VC standard conference proposal JCTVC-N0244.
  • the method uses a reserved bit in the slice header information to add a poc_reset_flag field of length 1 bit.
  • When the value of poc_reset_flag is equal to 1, the image POC value is first decoded according to the usual method, then the POC values of the images of the same layer (including the BL) in the DPB are reduced by the POC value just calculated (i.e., a POC translation operation), and finally the POC value of the image containing the slice is set to 0.
  • The main disadvantage of this method is that its BL code stream is not compatible with the H.265/HEVC Version 1 standard; that is, it cannot be guaranteed that a decoder conforming to the H.265/HEVC Version 1 standard can decode the BL code stream extracted from the multi-layer video coded stream.
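  • A minimal sketch of the poc_reset_flag behaviour described above for JCTVC-N0244, under the assumption of a simple dictionary-based DPB model; the function and variable names are hypothetical.

```python
# Sketch of the poc_reset_flag == 1 behaviour described for JCTVC-N0244.
# The DPB is modelled as a dict {picture_id: poc}; names are hypothetical.

def apply_poc_reset(current_poc, same_layer_dpb_pocs):
    """current_poc: POC of the current image, first decoded the usual way.
    same_layer_dpb_pocs: POC values of the images of the same layer
    (including the BL) that are still held in the DPB."""
    # POC translation operation: shift every stored POC down by current_poc.
    shifted = {pid: poc - current_poc for pid, poc in same_layer_dpb_pocs.items()}
    # Finally the POC of the image containing the slice is set to 0.
    return 0, shifted


new_poc, dpb = apply_poc_reset(96, {"pic_a": 88, "pic_b": 92})
assert new_poc == 0 and dpb == {"pic_a": -8, "pic_b": -4}
```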
  • JCT-VC conference proposals JCTVC-O0140 and JCTVC-O0213 propose, based on JCTVC-N0244, that when POC alignment is required only the MSB of the POC be set to zero. Further, a delayed POC alignment option was added in JCTVC-O0213 to cope with applications involving slice loss and frame-rate adaptation when flag bits that reset the POC value are used.
  • JCTVC-O0176 proposes to perform POC alignment directly on IDR images instead of using an explicit slice header flag, and adds a reserved bit in the IDR image slice header of the BL code stream for calculating the POC value the image would have if it were a non-IDR image (for example a CRA image); this POC value is used for the POC translation operation on the images of the EL layer stored in the DPB.
  • JCTVC-O0275 proposes a concept of layer POC for maintaining two different sets of POCs for EL layer images.
  • The layer POC is the POC value obtained without POC alignment; this value is used for decoding operations such as the Reference Picture Set (RPS). The other is the POC after POC alignment processing; this POC is consistent with the POC value of the BL image in the same AU and is used to control the output and playback process of the image.
  • RPS Reference Picture Set
  • JCTVC-O0275 uses information of the BL during the POC alignment process, and the POC alignment process is triggered by a variable flag maintained internally by the codec, whose value depends on the BL layer image type.
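  • The dual bookkeeping described for JCTVC-O0275 can be pictured with the following sketch; the class and field names are illustrative and not taken from the proposal.

```python
from dataclasses import dataclass

# Illustrative record of the two POC values kept per EL image under JCTVC-O0275.
@dataclass
class ElPictureOrderCounts:
    layer_poc: int    # derived without POC alignment; drives decoding steps such as the RPS
    aligned_poc: int  # POC after alignment; equals the BL image's POC in the same AU,
                      # and is used to control output and playback

el_pic = ElPictureOrderCounts(layer_poc=8, aligned_poc=64)  # hypothetical values
```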
  • POC alignment is required in most cases, so that the images of the layers included in the AU have the same POC value to facilitate operations such as image output control, AU boundary detection, and the like.
  • In some applications, however, POC alignment is not required, for example when the BL and EL are encoded using different video coding standards.
  • Yet whether the explicit POC alignment operation flag of JCTVC-N0244 is used, or the POC alignment operation is implicitly derived from BL information or prediction structure information, the POC alignment operation is always performed.
  • That is, the POC alignment operation cannot be partially and/or completely turned off when no POC alignment operation needs to be performed.
  • the present invention provides a method, an apparatus and a system for generating and processing a code stream to at least solve the above problems.
  • A method for generating a code stream is provided, including: determining, according to application requirements, whether a picture order count (POC) alignment operation needs to be performed on the whole and/or a part of a code stream; and writing identification and control information into the code stream according to the judgment result, wherein the identification and control information includes indication information as to whether to perform the POC alignment operation on the whole and/or part of the code stream.
  • The identification and control information is located in a field where a parameter set in the code stream is located, and is used to indicate whether all and/or part of the images in the code stream that use the parameter set in which the identification and control information is located perform a POC alignment operation.
  • the parameter set includes at least one of the following: a video parameter set VPS, a sequence parameter set SPS, and an image parameter set PPS.
  • When a plurality of the parameter sets include identification and control information indicating whether to perform a POC alignment operation, then, according to the reference relationship between the parameter sets, the identification and control information indicating whether to perform the POC alignment operation in the current parameter set overrides the identification and control information indicating whether to perform the POC alignment operation in the parameter sets that the current parameter set directly and/or indirectly references.
  • The identification and control information is located in a field of a data structure, other than the parameter sets, that acts on at least the image layer in the code stream, and is used to indicate whether all and/or part of the images within the valid range of that data structure in the code stream perform a POC alignment operation.
  • The identification and control information is located in a supplemental enhancement information (SEI) field in the code stream, and is used to indicate whether all and/or part of the images within the valid range of the SEI information in the code stream perform a POC alignment operation.
  • The identification and control information is located in a field of the system layer of the code stream used to describe video media attributes, and is used to indicate whether all and/or part of the images of the code stream contained in the system stream perform a POC alignment operation.
  • The identification and control information is located in a media file of the code stream used to describe video media attributes, and is used to indicate whether all and/or part of the images of the code stream contained in the media file perform a POC alignment operation.
  • The identification and control information further includes: start image position information for turning on or off the POC alignment operation and/or end image position information for turning on or off the POC alignment operation. Before the identification and control information is written into the code stream, the method further includes: determining, according to the prediction structure and application requirements, the start and/or end position of a segment of images, continuous in image playback order or image decoding order, for which the POC alignment operation is turned on and/or off.
  • A code stream generating apparatus is provided, comprising: a determining module, configured to determine, according to application requirements, whether a POC alignment operation needs to be performed on the whole and/or a part of a code stream; and a writing module, configured to write identification and control information into the code stream according to the judgment result, wherein the identification and control information includes indication information of whether to perform the POC alignment operation on the whole and/or part of the code stream.
  • The identification and control information further includes: start image position information for turning on or off the POC alignment operation and/or end image position information for turning on or off the POC alignment operation. The apparatus further comprises: a determining module, configured to determine, based on the prediction structure and application requirements, the start and/or end position of a segment of images, continuous in image playback order or image decoding order, for which the POC alignment operation is turned on and/or off.
  • A method for processing a code stream is provided, including: obtaining identification and control information from a code stream, wherein the identification and control information includes indication information of whether a POC alignment operation is performed on the whole and/or part of the code stream; and performing, according to the indication of the identification and control information, the POC alignment operation on all and/or part of the images in the code stream that need to perform the POC alignment operation.
  • The identification and control information is located in a field where a parameter set in the code stream is located, and is used to indicate whether all and/or part of the images in the code stream that use the parameter set in which the identification and control information is located perform a POC alignment operation.
  • the parameter set includes at least one of the following: a video parameter set VPS, a sequence parameter set SPS, and an image parameter set PPS.
  • When a plurality of the parameter sets include identification and control information indicating whether to perform a POC alignment operation, then, according to the reference relationship between the parameter sets, the identification and control information indicating whether to perform the POC alignment operation in the current parameter set overrides the identification and control information indicating whether to perform the POC alignment operation in the parameter sets that the current parameter set directly and/or indirectly references.
  • The identification and control information is located in a field of a data structure, other than the parameter sets, that acts on at least the image layer in the code stream, and is used to indicate whether all and/or part of the images within the valid range of that data structure in the code stream perform a POC alignment operation.
  • The identification and control information is located in a supplemental enhancement information (SEI) field in the code stream, and is used to indicate whether all and/or part of the images within the valid range of the SEI information in the code stream perform a POC alignment operation.
  • The identification and control information is located in a field of the system layer of the code stream used to describe video media attributes, and is used to indicate whether all and/or part of the images of the code stream contained in the system stream perform a POC alignment operation.
  • The identification and control information is located in a media file of the multi-layer video code stream used to describe video media attributes, and is used to indicate whether all and/or part of the images of the code stream contained in the media file perform a POC alignment operation.
  • Performing the POC alignment operation on the images in the code stream that need to perform the POC alignment operation according to the indication of the identification and control information includes: determining, according to the valid range of the field in which the identification and control information is located and the value of the identification and control information, the images in the code stream for which the POC alignment operation is turned on and/or off; and performing the POC alignment operation on the images for which the POC alignment operation is turned on.
  • A processing device for a code stream is provided, comprising: an obtaining module, configured to obtain identification and control information from a code stream, wherein the identification and control information includes indication information of whether a POC alignment operation is performed on the whole and/or part of the code stream; and an executing module, configured to perform, according to the indication of the identification and control information, the POC alignment operation on all and/or part of the images in the code stream that need to perform the POC alignment operation.
  • The executing module includes: a determining module, configured to determine, according to the valid range of the field in which the identification and control information is located and the value of the identification and control information, the images in the code stream for which the POC alignment operation is turned on and/or off; and a control module, configured to perform the POC alignment operation on the images for which the POC alignment operation is turned on.
  • a communication system using a code stream comprising: a source device including the above-described code stream generating means; and a sink device including the above-described code stream processing means.
  • With the present invention, when a code stream is generated, identification and control information indicating whether the POC alignment operation needs to be performed on the whole and/or part of the code stream is written into the code stream, so that the POC alignment operation can be partially and/or entirely turned off, increasing the flexibility of code stream control.
  • FIG. 1 is a flow chart of a method 100 of generating a code stream in accordance with an embodiment of the present invention
  • FIG. 2 is a schematic structural diagram of a code stream generating apparatus 200 according to an embodiment of the present invention.
  • FIG. 3 is a flowchart of a method 300 for processing a code stream according to an embodiment of the present invention
  • FIG. 4 is a schematic structural diagram of a processing device 400 for a code stream according to an embodiment of the present invention.
  • FIG. 5 is a block diagram showing the structure of a communication system 500 using a code stream according to an embodiment of the present invention.
  • the encoder can generate a multi-layer video encoded code stream using the existing POC alignment method, and the decoder obtains the POC value of the current image using the decoding method corresponding to the prior art POC alignment method.
  • In some applications, however, there is no need to use POC alignment, for example in applications such as uncoordinated simulcasting, hybrid scalable video encoding, and the like.
  • Therefore, in the embodiments of the present invention, identification and control information for the POC alignment operation, together with the corresponding operations, is added.
  • The source device refers to a device that includes an encoder, can generate a multi-layer video encoded code stream, and performs the necessary file and system layer processing; the sink device refers to a device that includes a decoder, performs the necessary file and system layer processing, and can decode the multi-layer video encoded code stream.
  • a method for generating a code stream is provided.
  • A method for generating a code stream according to an embodiment of the present invention includes steps S102 and S104. In step S102 it is judged whether a POC alignment operation needs to be performed on the whole and/or part of the code stream according to application requirements (for example, whether the images of all layers or of several layers are decoded and output simultaneously). Then, in step S104, identification and control information is written into the code stream according to the judgment result, wherein the identification and control information includes indication information of whether to perform the POC alignment operation on the whole and/or part of the code stream.
  • the source device may perform the foregoing steps S102 and S104 when the code stream needs to be transmitted, and write the POC alignment operation identifier and the control information in the transmitted code stream.
  • the code stream in the embodiment of the present invention may be a multi-layer video code stream, or may be another code stream on the system layer, which is not limited in the embodiment of the present invention.
  • In this way, the identification and control information for the POC alignment operation, and the corresponding operations, can be added to the high-level data of the multi-layer video coding according to application requirements, thereby increasing the flexibility of multi-layer video encoding/decoding and code stream control.
  • the scope of the identifier and the control information may be represented by writing the identifier and control information to different positions of the code stream.
  • The identification and control information may be located in at least one of the following positions of the code stream: an existing parameter set field in the video code stream; a field of a data structure, other than the parameter sets, that acts on at least the image layer in the video code stream; a supplemental enhancement information (SEI) field; a system layer field describing video media attributes; or a media file field describing video media attributes.
  • SEI Supplemental Enhancement Information
  • The foregoing identification and control information may indicate, in the following manners (but not limited to these), the images on which the POC alignment operation is to be performed:
  • The above identification and control information is located in a field of a parameter set in the code stream, and is used to indicate whether the image layers in the code stream that use the parameter set in which the identification and control information is located perform a POC alignment operation; wherein the parameter set includes at least one of the following: a video parameter set, a sequence parameter set, and a picture parameter set.
  • When the identification and control information is located in the Video Parameter Set (VPS), its control range is all layers of the multi-layer video.
  • When the identification and control information is located in the Sequence Parameter Set (SPS), its control range is the image layer of the multi-layer video that uses the SPS.
  • When the identification and control information is located in the Picture Parameter Set (PPS), its control range is the image layer of the multi-layer video that uses the PPS.
  • When a plurality of the parameter sets include identification and control information indicating whether to perform a POC alignment operation, then, according to the reference relationship between the parameter sets, the identification and control information indicating whether to perform the POC alignment operation in the current parameter set overrides the identification and control information indicating whether to perform the POC alignment operation in the parameter sets that the current parameter set directly and/or indirectly references.
  • Alternatively, the identification and control information is located in a field of a data structure, other than the parameter sets, that acts on at least the image layer in the code stream, and is used to indicate whether the images within the valid range of that data structure in the code stream perform a POC alignment operation.
  • the scope of the data structure includes at least one image (frame image and/or field image).
  • Alternatively, the identification and control information is located in Supplemental Enhancement Information (SEI) in the code stream, and is used to indicate whether the images within the valid range of the SEI information in the code stream perform a POC alignment operation;
  • Alternatively, the identification and control information is located in a field of the system layer of the code stream used to describe video media attributes, i.e., a system layer descriptor (Descriptor), and is used to indicate whether the whole and/or part of the images of the code stream contained in the system stream perform a POC alignment operation.
  • Alternatively, the identification and control information is located in a field of a media file of the multi-layer video code stream used to describe video media attributes, i.e., a file descriptor (Descriptor), and is used to indicate whether the whole and/or part of the images of the code stream contained in the media file perform a POC alignment operation.
  • the above identification and control information may also be located in the plurality of locations at the same time, while indicating whether the images in the plurality of ranges perform POC alignment operations.
  • Optionally, the foregoing identification and control information further includes: start image position information for turning on or off the POC alignment operation and/or end image position information for turning on or off the POC alignment operation.
  • In this case, the method 100 may further include: step S103, determining, according to the prediction structure and application requirements, the start and/or end position of a segment of images, continuous in image playback order or image decoding order, for which the POC alignment operation is turned on and/or off.
  • the code stream may be transmitted, and the receiving side (which may be referred to as a sink device) receives the multi-layer video code stream.
  • The receiving side obtains the identification and control information from the multi-layer video code stream and, according to the indication of the identification and control information, performs a decoding operation and/or a playback operation on the multi-layer video code stream. For example, the code stream for which the POC alignment operation is turned on and/or off is determined according to the above identification and control information, and POC alignment is performed during decoding and/or playback on the code stream for which the POC alignment operation is turned on.
  • a code stream generating apparatus is provided, which is used to implement the method provided in Embodiment 1.
  • FIG. 2 is a schematic structural diagram of a code stream generating apparatus 200 according to an embodiment of the present invention.
  • the generating apparatus 200 may include: a determining module 202 and a writing module 204. It should be understood that the connection relationship of each module represented in FIG. 2 is only an example, and those skilled in the art can fully adopt other connection relationships, as long as each module can implement the functions of the present invention under such a connection relationship.
  • the functions of the respective modules can be realized by using dedicated hardware or hardware capable of performing processing in combination with appropriate software.
  • Such hardware may include, for example, application specific integrated circuits (ASICs), various other circuits, various processors, and the like.
  • ASICs application specific integrated circuits
  • this functionality may be provided by a single dedicated processor, a single shared processor, or multiple independent processors, some of which may be shared.
  • The term "processor" should not be understood to refer exclusively to hardware capable of executing software, and may implicitly include, without limitation, digital signal processor (DSP) hardware, read-only memory (ROM) for storing software, random access memory (RAM), and non-volatile storage devices.
  • DSP digital signal processor
  • ROM read-only memory
  • RAM random Access memory
  • The determining module 202 is configured to determine, according to application requirements, whether a POC alignment operation needs to be performed on the whole and/or a part of the code stream; the writing module 204 is configured to write into the code stream, according to the judgment result of the determining module 202, identification and control information that includes indication information of whether to perform the POC alignment operation on the whole and/or part of the multi-layer video code stream.
  • The writing module 204 can indicate whether the POC alignment operation is turned on or off for the whole and/or part of the multi-layer video code stream by writing the above identification and control information at different positions of the multi-layer video code stream. The details are not described again here.
  • Optionally, the foregoing identification and control information further includes: start image position information for turning on or off the POC alignment operation and/or end image position information for turning on or off the POC alignment operation; as shown in FIG. 2, the device 200 may also include: a determining module 206, configured to determine, based on the prediction structure and application requirements, the start and/or end position of a segment of images, continuous in image playback order or image decoding order, for which the POC alignment operation is turned on and/or off.
  • the apparatus may further include an output module configured to output the code stream in which the identifier and control information are written.
  • the apparatus of this embodiment may be a related stream generating device in a video communication application, such as a mobile phone, a computer, a server, a set top box, a portable mobile terminal, a digital video camera, a television broadcasting system device, or the like.
  • the device can be located in a source device that can process at least one of the following multi-layer video signals: scalable video, multi-view video, multi-view depth, multi-view video + multi-view depth.
  • stereoscopic video is a special form in which the number of viewpoints of multi-view video is equal to 2.
  • the embodiment of the present invention further provides a method for processing a code stream, which is used for processing the generated code stream.
  • FIG. 3 is a flowchart of a method 300 for processing a code stream according to an embodiment of the present invention.
  • a method 300 for processing a multi-layer video stream according to an embodiment of the present invention mainly includes steps S302 and S304.
  • In step S302, identification and control information, including indication information of whether the POC alignment operation is performed on the whole and/or part of the code stream, is obtained from the code stream.
  • In step S304, a POC alignment operation is performed, according to the indication of the identification and control information, on the part of the code stream that needs to perform the POC alignment operation.
  • the identifier and the control information may be located in multiple fields of the code stream to indicate the effective range of the identifier and the control information.
  • the identifier and the control information are written to different positions of the code stream to indicate whether different parts of the code stream perform a POC alignment operation
  • Specifically, step S304 may include: determining, according to the valid range of the field in which the identification and control information is located and the value of the identification and control information, the images in the code stream for which the POC alignment operation is turned on and/or off; and performing the POC alignment operation on the images for which the POC alignment operation is turned on.
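  • A minimal sketch of this step, assuming the parsed identification and control information is available as a flag value plus optional start and end positions expressed as POC values, and that the actual alignment is performed elsewhere by the decoder; all names below are illustrative.

```python
# Sketch of step S304: pick the images covered by the identification and
# control information, then run the POC alignment operation on them.
# Names (poc_alignment_enable_flag, start_info, end_info) follow the text;
# the image model and perform_poc_alignment are illustrative placeholders.

def select_alignment_range(images, poc_alignment_enable_flag,
                           start_info=None, end_info=None):
    """images: the images within the valid range of the carrying field
    (e.g. the images using a given VPS/SPS/PPS, or covered by an SEI)."""
    if not poc_alignment_enable_flag:
        return []                               # alignment turned off
    if start_info is None or end_info is None:
        return list(images)                     # default: the whole valid range (CVS)
    return [im for im in images if start_info <= im["poc"] <= end_info]


def perform_poc_alignment(image):
    # Placeholder: the actual alignment (e.g. matching the BL POC in the same AU)
    # is carried out by the decoder; here we only mark the image.
    image["poc_aligned"] = True


cvs = [{"poc": p} for p in range(0, 32, 4)]
for im in select_alignment_range(cvs, poc_alignment_enable_flag=1,
                                 start_info=8, end_info=20):
    perform_poc_alignment(im)
```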
  • a processing device for a code stream is further provided, which is used to implement the method provided in Embodiment 3.
  • FIG. 4 is a schematic structural diagram of a processing device 400 for a code stream according to an embodiment of the present invention.
  • the processing device 400 mainly includes an obtaining module 402 and an executing module 404.
  • the connection relationship of each module represented in FIG. 4 is only an example, and those skilled in the art can fully adopt other connection relationships, as long as each module can implement the functions of the present invention under such a connection relationship.
  • the functions of the respective modules can be realized by using dedicated hardware or hardware capable of performing processing in combination with appropriate software.
  • Such hardware may include, for example, application specific integrated circuits (ASICs), various other circuits, various processors, and the like.
  • ASICs application specific integrated circuits
  • this functionality may be provided by a single dedicated processor, a single shared processor, or multiple independent processors, some of which may be shared.
  • The term "processor" should not be understood to refer exclusively to hardware capable of executing software, and may implicitly include, without limitation, digital signal processor (DSP) hardware, read-only memory (ROM) for storing software, random access memory (RAM), and non-volatile storage devices.
  • DSP digital signal processor
  • ROM read-only memory
  • RAM random Access memory
  • the obtaining module 402 is configured to obtain, from the code stream, identification and control information including indication information of whether to perform a video image sequence POC alignment operation on the whole and/or part of the multi-layer video code stream.
  • The executing module 404 is configured to perform, according to the indication of the identification and control information, POC alignment on the part of the code stream that needs to perform the POC alignment operation.
  • Specifically, the executing module 404 may include: a determining module, configured to determine, according to the valid range of the field in which the identification and control information is located and the value of the identification and control information, the images in the multi-layer video code stream for which the POC alignment operation is turned on and/or off; and a control module, configured to perform the POC alignment operation on the images for which the POC alignment operation is turned on.
  • the device of this embodiment may be a related stream receiving and playing device in a video communication application, for example, a mobile phone, a computer, a server, a set top box, a portable mobile terminal, a digital video camera, a television broadcasting system device, or the like.
  • the apparatus can be located in a sink device that can process at least one of the following multi-layer video signals: scalable video, multi-view video, multi-view depth, multi-view video + multi-view depth.
  • stereoscopic video is a special form in which the number of viewpoints of multi-view video is equal to 2.
  • In this embodiment, there is also provided a communication system using a code stream.
  • FIG. 5 is a schematic structural diagram of a communication system 500 using a code stream according to an embodiment of the present invention.
  • a transmission system 500 for a multi-layer video code stream according to an embodiment of the present invention includes a source device 502 and a sink device 504.
  • the source device 502 includes the code stream generating apparatus 200 described in the second embodiment
  • the sink device 504 includes the code stream processing apparatus 400 described in the fourth embodiment.
  • the source device 502 may generate and output a code stream according to the method described in Embodiment 1, and the sink device 504 receives the code stream, and processes the code stream according to the method described in Embodiment 3.
  • the high-level code stream organization method for the POC alignment operation used in the following examples employs the structures as shown in Tables 1 and 2.
  • The code streams shown in Tables 1 and 2 include identification information indicating whether the whole and/or part of the code stream uses the POC alignment operation.
  • Correspondingly, the code stream carries one or more bit fields of the information indicating whether the whole and/or part of the code stream uses the POC alignment operation.
  • When Table 3 is used, the code stream includes start and/or end position information for turning the POC alignment operation on or off; correspondingly, the code stream carries one or more bit fields of this start and/or end position information.
  • the value of the corresponding variable poc_alignment_enable_flag defaults to 1.
  • start_info indicates the start image position at which the POC alignment operation is turned on or off; start_info uses the codec method corresponding to se(v).
  • end_info indicates the end image position at which the POC alignment operation is turned on or off; end_info uses the codec method corresponding to se(v).
  • the fields in Table 3 can be used in combination with the fields in Tables 1 and 2.
  • When the fields in Table 3 are used in combination with the fields in Table 1, they identify the start image position and the end image position at which the POC alignment operation is turned on; when the fields in Table 3 are used in combination with the fields in Table 2, they identify the start image position and the end image position at which the POC alignment operation is turned off.
  • The start image position and the end image position may be identified by one or more of the following: the POC value of the image, the least significant bits (LSB) of the POC, the most significant bits (MSB) of the POC, image time stamp information, image decoding order information, and image playback order information.
  • The syntax elements in Table 1, Table 2, and Table 3 can be used in one or more of the following data structures, with the bit fields corresponding to these syntax elements placed in the code stream portion corresponding to the data structure:
  • VPS Video Parameter Set
  • SEI Supplemental Enhancement Information
  • a data structure including at least one image (frame image and/or field image) in addition to the above data structure;
  • the code stream fields of Table 1 and Table 2 are not located in the same data structure.
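  • Since Tables 1, 2 and 3 are not reproduced in this text, the following Python sketch only illustrates how such fields could plausibly be written, assuming poc_alignment_enable_flag and poc_alignment_disable_flag are coded with u(1) and start_info and end_info with se(v) (signed Exp-Golomb), as stated above; the BitWriter class and the write_table* functions are illustrative.

```python
# Hedged sketch of writing the Table 1/2/3 fields; the exact table layouts are
# not reproduced in this text, so this only illustrates the u(1)/se(v) coding.

class BitWriter:
    def __init__(self):
        self.bits = []

    def write_u(self, value, n):            # u(n): fixed-length unsigned
        for i in range(n - 1, -1, -1):
            self.bits.append((value >> i) & 1)

    def write_ue(self, value):              # ue(v): unsigned Exp-Golomb
        code_num = value + 1
        n = code_num.bit_length()
        self.write_u(0, n - 1)              # prefix zeros
        self.write_u(code_num, n)           # info bits

    def write_se(self, value):              # se(v): signed Exp-Golomb
        code_num = 2 * abs(value) - (1 if value > 0 else 0)
        self.write_ue(code_num)


def write_table1(bw, poc_alignment_enable_flag):    # Table 1 (sketch)
    bw.write_u(poc_alignment_enable_flag, 1)          # u(1)

def write_table2(bw, poc_alignment_disable_flag):    # Table 2 (sketch)
    bw.write_u(poc_alignment_disable_flag, 1)         # u(1)

def write_table3(bw, start_info, end_info):           # Table 3 (sketch)
    bw.write_se(start_info)                            # se(v)
    bw.write_se(end_info)                              # se(v)


bw = BitWriter()
write_table1(bw, 1)          # turn the POC alignment operation on
write_table3(bw, 0, 16)      # hypothetical start/end positions
```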
  • the following implementation method is described by taking the code stream field poc_alignment_enable_flag of Table 1 as an example.
  • the corresponding implementation method of the code stream field poc_alignment_disable_flag of Table 2 is the same, except that the value of the poc_alignment_disable_flag is opposite to the value of the poc_alignment_enable_flag in the same case.
  • the poc_alignment_enable_flag is located in the VPS, and its control range is all layers of the multi-layer video.
  • When the value is equal to 1, it means that the images contained in the same AU have the same POC value.
  • When the value is equal to 0, it means that the images contained in the same AU may have the same or different POC values.
  • Optionally, the bit fields of Table 3 can be used to further identify the start image position and the end image position of the operation corresponding to poc_alignment_enable_flag.
  • In that case, the effective range of the operation defined by poc_alignment_enable_flag is all the images from the start image to the end image indicated by the bit fields of Table 3, inclusive (in image playback order or image decoding order).
  • the transmission of the multi-layer video stream mainly includes the following steps:
  • Step 1 The source device determines whether it is necessary to perform a POC alignment operation on the multi-layer video according to application requirements (e.g., whether the images of all layers or of several layers are decoded and output simultaneously).
  • If so, the source device sets the value of poc_alignment_enable_flag to 1; otherwise, it sets the value to 0.
  • Step 2 The source device writes the value of the poc_alignment_enable_flag to the VPS code stream by using an encoding method corresponding to u(1).
  • Step 3 The source device determines, according to the prediction structure and application requirements, whether the operation indicated by the value of poc_alignment_enable_flag is to be applied only to a segment of images that is continuous in image playback order or image decoding order. If so, the source device determines the values of start_info and end_info according to the prediction structure and application requirements, and writes both values into the code stream using the encoding method corresponding to se(v).
  • Step 4 After receiving the code stream, the sink device obtains the value of the poc_alignment_enable_flag from the VPS code stream by using a decoding method corresponding to u(1).
  • When the fields shown in Table 3 exist in the code stream, the sink device obtains the values of start_info and end_info from the code stream using the decoding method corresponding to se(v). The sink device determines, according to the values of start_info and end_info, the effective image range of the operation corresponding to the value of poc_alignment_enable_flag. If the fields shown in Table 3 are not included in the code stream, the sink device sets the effective range of the operation corresponding to the value of poc_alignment_enable_flag to be the entire Coded Video Sequence (CVS).
  • CVS Coded Video Sequence
  • When the sink device determines that the value of poc_alignment_enable_flag is 1, the sink device may perform AU boundary division on the video code stream using the POC condition.
  • In this case, the sink device sets "all output images of the same AU have equal POC values" as its code stream error detection and playback operation control condition. If the decoded code stream does not comply with this control condition, the sink device invokes an error control mechanism, performing error concealment and/or reporting an error to the source device through feedback information.
  • the sink device can directly perform image output and playback control based on the POC value.
  • When the sink device determines that the value of poc_alignment_enable_flag is 0, the sink device performs AU boundary division on the video code stream using a non-POC condition. For images belonging to the same AU, if the time stamp information in the corresponding system layer or file is the same, the sink device performs output and playback control on the images at the time indicated by that time stamp information.
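  • The sink-side handling described in this embodiment can be summarized by the following sketch, which assumes an AU is available as a list of decoded images carrying "poc" and "timestamp" fields; the function names and the error-handling hook are illustrative, not part of any standard.

```python
# Sketch of the sink device behaviour for the VPS embodiment described above.
# An AU is modelled as a list of decoded images with "poc" and "timestamp"
# fields; names and the error-handling hook are illustrative.

def report_stream_error(au_images):
    # Placeholder for the error control mechanism (concealment and/or feedback).
    pass


def control_playback(au_images, poc_alignment_enable_flag):
    if poc_alignment_enable_flag == 1:
        # AU boundaries were detected with the POC condition, so all output
        # images of the AU are expected to carry the same POC value.
        pocs = {im["poc"] for im in au_images}
        if len(pocs) > 1:
            report_stream_error(au_images)               # condition violated
        output_time = min(im["poc"] for im in au_images)  # output driven by POC
    else:
        # AU boundaries were detected with a non-POC condition; images of the
        # same AU share system-layer or file time stamps, which drive output.
        output_time = min(im["timestamp"] for im in au_images)
    return output_time


au = [{"poc": 8, "timestamp": 400}, {"poc": 8, "timestamp": 400}]
assert control_playback(au, poc_alignment_enable_flag=1) == 8
```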
  • the poc_alignment_enable_flag is located in the SPS, and its control range is the layer of the multi-layer video using the SPS.
  • When the value is equal to 1, it indicates that the images of this layer in an AU have the same POC value as the BL layer image (present in the AU or assumed to be present).
  • When the value is equal to 0, it indicates that the images of this layer in an AU may have the same or different POC values from the BL layer image (present in the AU or assumed to be present).
  • Optionally, the bit fields of Table 3 can be used to further identify the start image position and the end image position of the operation corresponding to poc_alignment_enable_flag.
  • In that case, the effective range of the operation defined by poc_alignment_enable_flag is all the images from the start image to the end image indicated by the bit fields of Table 3, inclusive (in image playback order or image decoding order).
  • the transmission of the multi-layer video stream mainly includes the following steps:
  • Step 1 The source device determines whether it is necessary to perform a POC alignment operation between the images of a layer and the BL layer images in the video code stream according to application requirements (e.g., whether the images of all layers or of several layers are decoded and output simultaneously).
  • If so, the source device sets the value of poc_alignment_enable_flag in the SPS used by that layer to 1; otherwise, it sets the value to 0.
  • Step 2 The source device writes the value of the poc_alignment_enable_flag to the SPS code stream by using an encoding method corresponding to u(1).
  • Step 3 The source device determines, according to the prediction structure and application requirements, whether the operation indicated by the value of poc_alignment_enable_flag is to be applied only to a segment of images that is continuous in image playback order or image decoding order. If so, the source device determines the values of start_info and end_info according to the prediction structure and application requirements, and writes both values into the code stream using the encoding method corresponding to se(v).
  • Step 4 After receiving the code stream, the sink device obtains the value of the poc_alignment_enable_flag from the SPS code stream by using a decoding method corresponding to u(1).
  • When the fields shown in Table 3 exist in the code stream, the sink device obtains the values of start_info and end_info from the code stream using the decoding method corresponding to se(v). The sink device determines, according to the values of start_info and end_info, the effective image range of the operation corresponding to the value of poc_alignment_enable_flag. If the fields shown in Table 3 are not included in the code stream, the sink device sets the effective range of the operation corresponding to the value of poc_alignment_enable_flag to be the entire CVS.
  • When the sink device determines that the value of poc_alignment_enable_flag is 1, the sink device may perform AU boundary division on the video code stream using the POC condition.
  • In this case, the sink device sets "the POC value of the output image of this layer in an AU is equal to that of the (present or assumed to be present) BL layer image" as its code stream error detection and playback operation control condition. If the decoded code stream does not comply with this control condition, the sink device invokes an error control mechanism, performing error concealment and/or reporting an error to the source device through feedback information.
  • In this case, the sink device can directly perform output and playback control between the images of this layer and the BL images according to the POC value. For other images located in the same AU but with different POC values, the sink device performs output and playback control on the images at the time indicated by the time stamp information.
  • When the sink device determines that the value of poc_alignment_enable_flag is 0, the sink device performs AU boundary division on the video code stream using a non-POC condition. For images belonging to the same AU, if the time stamp information in the corresponding system layer or file is the same, the sink device performs output and playback control on the images at the time indicated by that time stamp information.
  • the poc_alignment_enable_flag is located in the PPS, and its control range is one or more images in the layer of the multi-layer video using the PPS.
  • When the value is equal to 1, it indicates that the images of this layer in an AU have the same POC value as the BL layer image (present in the AU or assumed to be present).
  • When the value is equal to 0, it indicates that the images of this layer in an AU may have the same or different POC values from the BL layer image (present in the AU or assumed to be present).
  • Optionally, the bit fields of Table 3 can be used to further identify the start image position and the end image position of the operation corresponding to poc_alignment_enable_flag.
  • In that case, the effective range of the operation defined by poc_alignment_enable_flag is all the images from the start image to the end image indicated by the bit fields of Table 3, inclusive (in image playback order or image decoding order).
  • the transmission of the multi-layer video stream mainly includes the following steps:
  • Step 1 The source device determines whether it is necessary to perform a POC alignment operation between a certain layer, or a certain segment of images of that layer, and the BL layer images according to application requirements (e.g., whether the images of all layers or of several layers are decoded and output simultaneously).
  • If so, the source device sets the value of poc_alignment_enable_flag in the PPS used by those images to 1; otherwise, it sets the value to 0.
  • Step 2 The source device writes the value of the poc_alignment_enable_flag to the PPS code stream by using an encoding method corresponding to u(1).
  • Step 3 The source device determines, according to the prediction structure and application requirements, whether the operation indicated by the value of poc_alignment_enable_flag is to be applied only to a segment of images that is continuous in image playback order or image decoding order. If so, the source device determines the values of start_info and end_info according to the prediction structure and application requirements, and writes both values into the code stream using the encoding method corresponding to se(v).
  • Step 4 After receiving the code stream, the sink device obtains the value of the poc_alignment_enable_flag from the PPS code stream by using a decoding method corresponding to u(1).
  • When the fields shown in Table 3 exist in the code stream, the sink device obtains the values of start_info and end_info from the code stream using the decoding method corresponding to se(v). The sink device determines, according to the values of start_info and end_info, the effective image range of the operation corresponding to the value of poc_alignment_enable_flag. If the fields shown in Table 3 are not included in the code stream, the sink device sets the effective range of the operation corresponding to the value of poc_alignment_enable_flag to be the entire CVS.
  • When the sink device determines that the value of poc_alignment_enable_flag is 1, the sink device may perform AU boundary division on the video code stream using the POC condition.
  • In this case, the sink device sets "the POC value of the output image of this layer in an AU is equal to that of the (present or assumed to be present) BL layer image" as its code stream error detection and playback operation control condition. If the decoded code stream does not comply with this control condition, the sink device invokes an error control mechanism, performing error concealment and/or reporting an error to the source device through feedback information.
  • the sink device can directly perform output and playback control between the layer image and the BL image according to the POC value. For other images located in the same AU but with different POC values, the sink device performs output and playback control on the image at the time indicated by the event tag information.
  • When the sink device determines that the value of poc_alignment_enable_flag is 0, it performs AU boundary division on the video code stream using non-POC conditions. For images obtained from the same AU, if the time marker information in the corresponding system layer or file is the same, the sink device performs output and playback control on the images at the time indicated by that time marker information.
  • In this example, poc_alignment_enable_flag is located in an SEI message, and its indication range is one or more images of the layers of the multi-layer video that use this SEI.
  • When its value is equal to 1, it indicates that all images in the AU have the same POC value (for an SEI associated with the entire AU), or that the image of a given EL layer has the same POC value as the BL-layer image present, or assumed to be present, in the AU (for an SEI associated with that EL layer). When its value is equal to 0, it indicates that the images in the AU (for an SEI associated with the entire AU), or the image of that EL layer and the BL-layer image (for an SEI associated with an EL layer), may or may not have the same POC value.
  • After the poc_alignment_enable_flag field, the bit fields of Table 3 can optionally be used to further identify the starting image position and the ending image position of the operation indicated by poc_alignment_enable_flag.
  • When the bit fields of Table 3 are present, the effective range of the operation defined by poc_alignment_enable_flag is all images from the start image to the end image indicated by the Table 3 bit fields, inclusive, in image playback order or image decoding order.
  • In this example, the transmission of the multi-layer video code stream mainly includes the following steps:
  • Step 1: The source device determines, according to the generated multi-layer video code stream, whether the POC alignment operation is used for one image or for a segment of images.
  • If the POC alignment operation is used, the source device sets the value of poc_alignment_enable_flag in the corresponding SEI to 1; otherwise it sets the value to 0.
  • The corresponding SEI refers to an SEI associated with the entire AU, or an SEI associated with a certain EL layer.
  • Step 2: The source device writes the value of poc_alignment_enable_flag into the SEI code stream using the encoding method corresponding to u(1), and inserts the SEI fields at the associated position in the video code stream.
  • Step 3: The source device determines, according to the prediction structure and the application requirements, whether to designate a continuous segment of images (in image playback order or image decoding order) on which the operation indicated by the value of poc_alignment_enable_flag is performed. If so, the source device determines the values of start_info and end_info according to the prediction structure and the application requirements, and writes both values into the code stream using the encoding method corresponding to se(v).
  • Step 4: After receiving the code stream, the sink device obtains the value of poc_alignment_enable_flag from the SEI code stream using the decoding method corresponding to u(1).
  • When the fields shown in Table 3 are present in the code stream, the sink device obtains the values of start_info and end_info from the code stream using the decoding method corresponding to se(v), and determines from them the effective image range of the operation indicated by the value of poc_alignment_enable_flag. If the fields shown in Table 3 are not present in the code stream, the sink device sets the effective range of that operation to the entire coded video sequence (CVS).
  • When the sink device determines that the value of poc_alignment_enable_flag is 1, and the SEI is associated with the entire AU, it may perform AU boundary division on the video code stream using the POC condition.
  • The sink device sets "the output images of the same AU have equal POC values" as its code stream error-detection and playback control condition. If the decoded code stream does not satisfy this condition, the sink device executes its error control mechanism, performing error concealment and/or reporting an error to the source device through feedback information.
  • The sink device can directly perform image output and playback control according to the POC values.
  • When the sink device determines that the value of poc_alignment_enable_flag is 1, and the SEI is associated with a certain EL, it may perform AU boundary division on the video code stream using non-POC conditions.
  • The sink device sets "the POC value of the output image of this layer in the AU is equal to that of the (assumed present) BL-layer image" as its code stream error-detection and playback control condition. If the decoded code stream does not satisfy this condition, the sink device executes its error control mechanism, performing error concealment and/or reporting an error to the source device through feedback information.
  • The sink device can directly perform output and playback control between the images of this layer and the BL images according to the POC values. For other images located in the same AU but with unequal POC values, the sink device performs output and playback control at the time indicated by the time marker information.
  • When the sink device determines that the value of poc_alignment_enable_flag is 0, it performs AU boundary division on the video code stream using non-POC conditions. For images obtained from the same AU, if the time marker information in the corresponding system layer or file is the same, the sink device performs output and playback control on the images at the time indicated by that time marker information.
  • In this example, the identification information is located in another data structure whose scope covers at least one image (frame image and/or field image).
  • For such a data structure, if it contains other data information that must be used in the decoding process, the data structure is one that is necessary for the decoding process. In that case, if the scope of the data structure is the entire multi-layer video, the source device and the sink device operate on poc_alignment_enable_flag in a manner similar to Example 1; if the scope of the data structure is a certain EL video within the multi-layer video, they operate in a manner similar to Example 2; and if the scope of the data structure is one or more images of a certain EL within the entire multi-layer video, they operate in a manner similar to Example 3.
  • For such a data structure whose scope covers at least one image (frame image and/or field image), if it does not contain other data information that must be used in the decoding process, the data structure is not one that is necessary for the decoding process.
  • In that case, the source device and the sink device operate on poc_alignment_enable_flag in a manner similar to Example 4.
  • After the poc_alignment_enable_flag field, the bit fields of Table 3 can optionally be used to further identify the starting image position and the ending image position of the operation indicated by poc_alignment_enable_flag.
  • When the bit fields of Table 3 are present, the effective range of the operation defined by poc_alignment_enable_flag is all images from the start image to the end image indicated by the Table 3 bit fields, inclusive, in image playback order or image decoding order.
  • The difference between Example 5 and Examples 1 to 4 is that the source device writes the value of poc_alignment_enable_flag, using the encoding method corresponding to u(1), into the code stream corresponding to the data structure whose scope covers at least one image (frame image and/or field image).
  • Correspondingly, the sink device parses the field corresponding to poc_alignment_enable_flag from the code stream of that data structure using the decoding method corresponding to u(1), and obtains the value of poc_alignment_enable_flag.
  • In this example, the foregoing identification information is located in a system descriptor.
  • When the scope of the descriptor containing poc_alignment_enable_flag is the entire multi-layer video code stream carried in the system code stream, the source device and the sink device operate on poc_alignment_enable_flag in a manner similar to the "SEI associated with the entire AU" case of Example 4.
  • When the scope of the descriptor containing poc_alignment_enable_flag is a certain EL code stream within the multi-layer video code stream carried in the system code stream, the source device and the sink device operate on poc_alignment_enable_flag in a manner similar to the "SEI associated with a certain EL layer" case of Example 4.
  • The difference between Example 6 and Example 4 is that the source device writes the value of poc_alignment_enable_flag into the system code stream corresponding to the descriptor using the encoding method corresponding to u(1), or an encoding method identical to u(1); if needed, the source device also writes the values of start_info and end_info into that system code stream using the encoding method corresponding to se(v), or an encoding method identical to se(v).
  • Correspondingly, the sink device parses the field corresponding to poc_alignment_enable_flag from the system code stream of the descriptor data structure using the decoding method corresponding to u(1), or a decoding method identical to u(1), and obtains the value of poc_alignment_enable_flag. If the Table 3 fields are present in the code stream, the sink device obtains the values of start_info and end_info from the code stream using the decoding method corresponding to se(v), or a decoding method identical to se(v).
  • In this example, the identification information is located in a file descriptor.
  • When the scope of the descriptor containing poc_alignment_enable_flag is the entire multi-layer video code stream carried in the media file stream, the source device and the sink device operate on poc_alignment_enable_flag in a manner similar to the "SEI associated with the entire AU" case of Example 4.
  • When the scope of the descriptor containing poc_alignment_enable_flag is a certain EL code stream within the multi-layer video code stream carried in the media file stream, the source device and the sink device operate on poc_alignment_enable_flag in a manner similar to the "SEI associated with a certain EL layer" case of Example 4.
  • The difference between Example 7 and Example 4 is that the source device writes the value of poc_alignment_enable_flag into the system code stream corresponding to the descriptor using the encoding method corresponding to u(1), or an encoding method identical to u(1); if needed, the source device also writes the values of start_info and end_info into that system code stream using the encoding method corresponding to se(v), or an encoding method identical to se(v).
  • Correspondingly, the sink device parses the field corresponding to poc_alignment_enable_flag from the system code stream of the descriptor data structure using the decoding method corresponding to u(1), or a decoding method identical to u(1), and obtains the value of poc_alignment_enable_flag. If the Table 3 fields are present in the code stream, the sink device obtains the values of start_info and end_info from the code stream using the decoding method corresponding to se(v), or a decoding method identical to se(v).
  • In this example, the above identification information is carried using a combination of the above methods.
  • For the multi-layer video code stream structure, during decoding the PPS references the SPS and the SPS references the VPS.
  • Here, the VPS is referred to as a higher-layer data structure than the SPS and the PPS, and the SPS as a higher-layer data structure than the PPS.
  • poc_alignment_enable_flag can therefore be encoded in data structures at different levels.
  • When the values of poc_alignment_enable_flag in a higher-layer data structure and a lower-layer data structure differ, the poc_alignment_enable_flag in the lower-layer data structure overrides the one in the higher-layer data structure. If the values of start_info and end_info also differ, the effective range of the value of poc_alignment_enable_flag is the minimum intersection of the image range defined by start_info and end_info in the higher-layer data structure and the image range defined by start_info and end_info in the lower-layer data structure.
  • When the values of poc_alignment_enable_flag in the higher-layer and lower-layer data structures are the same, but the values of start_info and end_info in Table 3 differ, the effective range of the value of poc_alignment_enable_flag is the maximum union of the image range defined by start_info and end_info in the higher-layer data structure and the image range defined by start_info and end_info in the lower-layer data structure.
  • In the combined method, the source device first determines, based on the input video, the coding prediction structure, and the application requirements, how POC alignment is to be used, including which layers use POC alignment and the start and end positions of the images that use POC alignment.
  • Based on the information determined above, the source device sets the values of poc_alignment_enable_flag in the VPS, the SPS, and the PPS, together with the required start_info and end_info, using the methods of Example 1, Example 2, and Example 3, and writes them into the code stream using the corresponding encoding methods.
  • At the same time, according to the application requirements, the source device uses the methods described in Example 4, Example 6, and Example 7 to set the auxiliary information required for the video code stream and the corresponding field information in the system-layer and media-file descriptors, and writes them into the code stream using the corresponding encoding methods.
  • The sink device processes the received code stream, uses the methods of Example 1, Example 2, and Example 3 to obtain the values of poc_alignment_enable_flag and the required start_info and end_info from the VPS, the SPS, and the PPS, sets the POC alignment usage control for different image segments and layers, and decodes the received code stream.
  • During decoding, when the sink device can obtain the POC alignment description information from the video auxiliary information, the system-layer information, and the media-file information, it uses the above methods to configure the error-control and playback-control modules of the receiving and decoding process.
  • The sink device may perform AU boundary division on the video code stream using non-POC conditions.
  • The sink device sets "the POC value of the output image of this layer in the AU is equal to that of the (assumed present) BL-layer image" as its code stream error-detection and playback control condition. If the decoded code stream does not satisfy this condition, the sink device executes its error control mechanism, performing error concealment and/or reporting an error to the source device through feedback information.
  • The sink device can directly perform output and playback control between the images of this layer and the BL images according to the POC values. For other images located in the same AU but with unequal POC values, the sink device performs output and playback control at the time indicated by the time marker information.
  • When the sink device determines that the value of poc_alignment_enable_flag is 0, it performs AU boundary division on the video code stream using non-POC conditions. For images obtained from the same AU, if the time marker information in the corresponding system layer or file is the same, the sink device performs output and playback control on the images at the time indicated by that time marker information.
  • From the above description it can be seen that, with the technical solution provided by one of the above embodiments, for services such as instant messaging, a user can obtain information about each of the other party's terminals (for example, the terminal name, which may also be called a nickname) before communication, and can therefore actively select the terminal of the communication recipient for communication; by obtaining the names (nicknames) of the recipient's multiple terminals before communicating and selecting one of them to initiate communication (a voice call, a video call, or a message), the user experience is improved.
  • In summary, with the method provided by the embodiments of the present invention, descriptions of POC alignment can be added to the high-level part of the code stream, to the auxiliary information, and to the system-layer descriptions.
  • At the same time, a layered description mechanism is adopted in the high-level structure of the code stream, which facilitates flexible control during code stream generation.
  • Obviously, those skilled in the art should understand that the modules or steps of the present invention described above can be implemented by a general-purpose computing device; they can be concentrated on a single computing device or distributed over a network formed by multiple computing devices. Optionally, they may be implemented by program code executable by the computing device, so that they can be stored in a storage device and executed by the computing device, and in some cases the steps shown or described may be performed in an order different from that described herein; alternatively, they may be made into individual integrated-circuit modules, or multiple modules or steps among them may be made into a single integrated-circuit module.
  • In this way, the present invention is not limited to any specific combination of hardware and software.
  • With the above technical solution provided by the embodiments of the present invention, when the code stream is generated, indication and control information on whether a POC alignment operation needs to be performed on the whole and/or a part of the code stream is written into the code stream, so that the POC alignment operation can be disabled partially and/or globally, which increases the flexibility of code stream control.
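
For illustration only, the following non-normative Python sketch shows one possible way a sink device could represent the parsed identification and control information and derive the behaviour described in the examples above: the effective image range (the entire CVS when the Table 3 fields are absent), the AU-boundary condition (POC-based when the flag equals 1, otherwise non-POC), and the error-detection check that output images of one AU share the same POC value. The class and function names are hypothetical, and the bit-level parsing with u(1)/se(v) is assumed to have been done already; this is a sketch under those assumptions, not part of the claimed method.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

ENTIRE_CVS = None  # sentinel meaning "the operation applies to the entire coded video sequence"

@dataclass
class PocAlignmentControl:
    """Identification and control information recovered from the code stream."""
    poc_alignment_enable_flag: int           # value parsed with the u(1) decoding method
    start_info: Optional[int] = None         # values parsed with the se(v) decoding method
    end_info: Optional[int] = None           # when the Table 3 fields are present

    def effective_range(self) -> Optional[Tuple[int, int]]:
        # Table 3 fields absent: the operation applies to the entire CVS.
        if self.start_info is None or self.end_info is None:
            return ENTIRE_CVS
        return (self.start_info, self.end_info)

def au_boundary_condition(ctrl: PocAlignmentControl) -> str:
    # flag == 1: AU boundaries may be located with the POC condition;
    # flag == 0: non-POC conditions (e.g. system-layer time markers) are used instead.
    return "POC" if ctrl.poc_alignment_enable_flag == 1 else "non-POC"

def au_passes_error_check(ctrl: PocAlignmentControl, poc_values_in_au: List[int]) -> bool:
    # With alignment enabled, "the output images of the same AU have equal POC values"
    # is used as the error-detection and playback control condition.
    if ctrl.poc_alignment_enable_flag == 1:
        return len(set(poc_values_in_au)) <= 1
    return True  # no POC-based check; playback relies on time marker information

if __name__ == "__main__":
    ctrl = PocAlignmentControl(poc_alignment_enable_flag=1, start_info=0, end_info=15)
    print(au_boundary_condition(ctrl))          # "POC"
    print(ctrl.effective_range())               # (0, 15)
    print(au_passes_error_check(ctrl, [7, 7]))  # True: error concealment not triggered
```

When au_passes_error_check returns False, the sketch corresponds to the point at which the sink device would invoke its error control mechanism (error concealment and/or feedback to the source device).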

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computing Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

A code stream generation and processing method, apparatus, and system. The code stream generation method includes: judging, according to application requirements, whether a picture order count (POC) alignment operation needs to be performed on the whole and/or a part of a code stream; and, according to the judgment result, writing identification and control information into the code stream, where the identification and control information includes indication information on whether to perform the whole and/or partial POC alignment operation on the code stream. The method can improve the flexibility of code stream control.

Description

码流的生成和处理方法、装置及系统 技术领域
本发明涉及图像传输技术领域,具体而言,涉及一种码流的生成和处理方法、装置及系统。
背景技术
目前正在制定的基于高性能视频编码(H.265/High Efficiency Video Coding,HEVC)标准兼容的三维视频(Three-Dimensional Video,3DV)编码标准MV-HEVC(HEVC Multi-view video coding extension framework)、3D-HEVC(3D High Efficiency Video Coding)和可伸缩视频编码(Scalable video coding)中,采用了统一的高层结构设计。这个统一的设计结构基于“多层视频编码”的概念,将MV-HEVC和3D-HEVC的纹理分量(Texture Component)和深度分量(Depth Component)、可伸缩编码的不同可伸缩层均抽象为“层(Layer)”,并使用层表示序号(Layer Id)来标识不同的视点和可伸缩层。目前已发布的H.265/HEVC标准称为“H.265/HEVC Version 1”标准。
在多层视频编码中,同时刻获得的视频图像及其对应的编码比特组成一个接入单元(Access Unit,AU)。在同一个AU中,各层图像可使用不同的编码方法。这样,在同一个AU中,某层的图像可以是能够作为随机点的帧内编码随机接入(Intra Random Access Point,IRAP)图像,而其他某一个或多个层是普通的帧间、层间预测编码图像。在实际应用中,不同层可以根据网络传输状况、视频内容变换情况等选择各自的IRAP图像插入策略。例如,对于兼容H.265/HEVC的基本层视频图像可采用较高频次的IRAP图像插入策略,对增强层视频图像可采用频次较低的IRAP图像插入策略。这样,使用这种逐层(layer-wise)接入的多层视频编码结构,可以在不出现大的码率激增的情况下,保证多层视频编码码流的随机接入性能。
对于多层视频编码码流,其基本层(Base Layer,BL)码流必须符合H.265/HEVC Version 1标准的规范。即,多层视频编码码流必须保证根据H.265/HEVC Version 1标准设计的解码器能够正确解码从多层视频编码码流中提取出的BL码流。特别地,对于MV-HEVC和3D-HEVC,BL对应于基本视点(Base View)或独立视点(Independent View),EL对应于增强视点(Enhancement View)或非独立视点(Dependent View)。实际应用中,可通过提取多层视频编码码流的方法,获得仅用于传统二维电视播放的 基本视点码流、支持三维立体显示的双视点码流以及支持更加丰富三维立体显示的多视点码流。
在H.265/HEVC Version 1标准中,IRAP图像的类型有三种,分别是IDR(Instantaneous Decoding Refresh)图像、BLA(Broken Link Access)图像和CRA(Clean Random Access)图像。这三种图像均使用帧内(Intra)编码方式进行编码,其解码不依赖于其他图像。这三种图像类型的不同之处在于对视频图像序号(Picture Order Count,POC)和解码图像缓冲区(Decoded Picture Buffer,DPB)的操作。
POC是H.265/HEVC Version 1中用于标识图像播放顺序的序号。根据H.265/HEVC Version 1标准,图像的POC值由两部分组成。使用PicOrderCntVal表示图像的POC值,则PicOrderCntVal=PicOrderCntMsb+PicOrderCntLsb。其中,PicOrderCntMsb是图像POC值的MSB(Most Significant Bit)取值,PicOrderCntLsb是图像POC值的LSB(Least Significant Bit)。通常情况下,PicOrderCntMsb的取值等于按解码顺序当前图像的前一个TemporalId等于0的图像的PicOrderCntMsb的取值,PicOrderCntLsb的取值等于分片(slice)头信息中的slice_pic_order_cnt_lsb字段的取值。slice_pic_order_cnt_lsb字段的比特数由序列参数集(Sequance Parameter Set,SPS)中的log2_max_pic_order_cnt_lsb_minus4确定,所需比特数等于log2_max_pic_order_cnt_lsb_minus4+4。
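
The following is a non-normative illustration (in Python, for explanation only) of the POC derivation summarised in the preceding paragraph: the number of bits of slice_pic_order_cnt_lsb is log2_max_pic_order_cnt_lsb_minus4 + 4, and PicOrderCntVal = PicOrderCntMsb + PicOrderCntLsb, with PicOrderCntMsb taken from the previous picture with TemporalId equal to 0 in decoding order. The function name is hypothetical, and the MSB wrap-around adjustment and the IDR/BLA/CRA special cases defined in H.265/HEVC Version 1 are deliberately omitted.

```python
def pic_order_cnt_val(prev_tid0_pic_order_cnt_msb: int,
                      slice_pic_order_cnt_lsb: int,
                      log2_max_pic_order_cnt_lsb_minus4: int) -> int:
    """Simplified PicOrderCntVal derivation (wrap-around handling omitted)."""
    # slice_pic_order_cnt_lsb is coded with log2_max_pic_order_cnt_lsb_minus4 + 4 bits
    num_lsb_bits = log2_max_pic_order_cnt_lsb_minus4 + 4
    max_pic_order_cnt_lsb = 1 << num_lsb_bits
    assert 0 <= slice_pic_order_cnt_lsb < max_pic_order_cnt_lsb

    pic_order_cnt_msb = prev_tid0_pic_order_cnt_msb   # "usual" case described in the text
    pic_order_cnt_lsb = slice_pic_order_cnt_lsb
    return pic_order_cnt_msb + pic_order_cnt_lsb

# Example: with 8 LSB bits (log2_max_pic_order_cnt_lsb_minus4 = 4),
# a previous MSB of 256 and slice_pic_order_cnt_lsb of 3 give PicOrderCntVal = 259.
print(pic_order_cnt_val(256, 3, 4))
```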
在H.265/HEVC Version 1中,若当前图像是IDR图像,则PicOrderCntMsb的取值将被置为0,分片头信息中不包含slice_pic_order_cnt_lsb字段,PicOrderCntLsb的取值默认为0。若当前图像是BLA图像,则PicOrderCntMsb的取值将被置为0,分片头信息中包含slice_pic_order_cnt_lsb字段用于确定PicOrderCntLsb的取值。若当前图像是CRA图像且标志位HandleCraAsBlaFlag的取值等于0,则使用通常方法计算POC;若当前图像是CRA图像且标志位HandleCraAsBlaFlag的取值等于1,则使用BLA图像的方法计算该CRA图像的POC值。
需要说明的是,在多层视频编码标准中,不论图像类型,增强层(Enhancement Layer,EL)的分片头信息中始终包含slice_pic_order_cnt_lsb字段。
在此基础上,对于多层视频编码码流,为保证在DPB控制过程中检测同时刻的图像,同时为方便解码器使用POC值在码流中确定各AU的起止位置,要求AU中所有图像均具有相同的POC值。对于layer-wise的编码结构,AU中可能同时包含有IRAP图像和非IRAP图像。这样,如果IRAP图像是IDR图像和BLA图像,则该AU中包 含图像的POC值将不同。因此,需要为多层视频编码标准设计POC对齐(POC Alignment)功能以满足在layer-wise结构时AU中各图像可以具有相同的POC。
为解决这个问题,JCT-VC标准会议提案JCTVC-N0244中提出了一种POC对齐方法。该方法是用分片头信息中的预留比特,增加长度为1比特的poc_reset_flag字段。当该字段的取值等于1时,首先按照通常方法解码图像POC值,然后将DPB中同层(包括BL)中图像的POC值减少之前计算得到的POC值(即POC平移操作),最后将该分片所在图像的POC值设置为0。
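
For illustration only, the following non-normative Python sketch mirrors the poc_reset_flag behaviour described in the preceding paragraph (JCTVC-N0244): when the flag equals 1, the picture's POC is first decoded in the usual way, the POC values of the same-layer pictures stored in the DPB are then reduced by that decoded value (the POC shift), and finally the POC of the current picture is set to 0. The function name and data representation are hypothetical.

```python
from typing import List, Tuple

def apply_poc_reset(decoded_poc: int, dpb_poc_values: List[int],
                    poc_reset_flag: int) -> Tuple[int, List[int]]:
    """Return (current picture POC, updated DPB POC values)."""
    if poc_reset_flag == 0:
        return decoded_poc, dpb_poc_values
    # POC shift: pictures of the same layer already in the DPB are shifted down
    shifted = [poc - decoded_poc for poc in dpb_poc_values]
    return 0, shifted   # the current picture's POC becomes 0

# Example: decoded POC 32, DPB holding same-layer pictures with POC 24..31
print(apply_poc_reset(32, list(range(24, 32)), poc_reset_flag=1))
# -> (0, [-8, -7, -6, -5, -4, -3, -2, -1])
```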
该方法的最主要缺点是其BL码流无法兼容H.265/HEVC Version 1标准,即不能保证符合H.265/HEVC Version 1标准的解码器能够解码从多层视频编码码流中抽取得到的BL码流。
为解决该兼容性问题,JCT-VC会议提案JCTVC-O0140和JCTVC-O0213提出在JCTVC-N0244的基础上,在需要进行POC对齐时,仅将POC中的MSB置为0。进一步,JCTVC-O0213中增加了POC对齐的延迟操作选项,以应对携带有重置POC值的标志位的分片丢失和帧率不同的应用情况。JCTVC-O0176提出在IDR图像时直接进行POC对齐,而不是用显式的分片头标志位,并且在BL码流的IDR图像分片头中增加预留比特,用于计算若该图像是CRA图像而非IDR图像时的POC值,该计算得到的POC值用于EL层DPB中存储图像的POC平移操作。JCTVC-O0275提出了一种layer POC的概念,对于EL层图像,维护两套不同的POC。其中,Layer POC为不使用POC对齐条件下得到的POC值,该值用于参考图像集(Reference Picture Set,RPS)等解码算法的相关操作;另外一套是经过POC对齐处理的POC,该POC与同AU中BL图像的POC值一致,该POC值用于控制图像的输出、播放过程。JCTVC-O0275提出的方法在进行POC对齐过程中使用BL的信息,且POC对齐过程的触发使用编解码器内部维护的变量标志位,该标志位的取值与BL层图像类型相关。
对于多层视频,多数情况下需要进行POC对齐,使得同AU中包含的层的图像具有相同的POC值,以便利图像输出控制、AU边界检测等操作。尽管如此,对于某些应用,并不需要进行POC对齐。例如,非协调的联播(uncoordinated simulcast)中,由于某段时间内仅适用BL或单独某个EL层的视频码流,这种情况下,在这段码流中不需要使用POC对齐;如果对联播码流进行抽取、编辑和重组时,产生码流的过程中也不需要使用POC对齐。另外,对于混合可伸缩视频编码(hybrid scalable video coding),其BL和EL使用不同的视频编码标准进行编码,由于不同编码标准使用不同的POC系统和基于POC的图像输出控制操作方式,因此,在混合可伸缩视频编码下,也可以不需要使用POC对齐操作。另外,对于多层视频编码码流,可以通过系统 层或媒体文件打包时增加的时间标记信息来实现将同时刻采集得到的图像的在播放时间上进行对齐,此时,不需要对视频码流进行POC对齐。
由此可见,由于JCTVC-N0244以BL信息或预测结构信息隐含推导执行POC对齐操作的方法中不使用显式的POC对齐操作标志位。当预测结构满足一定条件时,即执行POC对齐操作。在不需要执行POC对齐操作时,无法对POC对齐操作进行局部和/或整体地关闭。
发明内容
针对相关技术中的上述问题,本发明提供了一种码流的生成和处理方法、装置及系统,以至少解决上述问题。
根据本发明的一个实施例提供了一种码流的生成方法,包括:根据应用需求判断是否需要对码流的整体和/或部分执行视频图像序号POC对齐操作;根据判断结果,将标识及控制信息写入所述码流,其中,所述标识及控制信息包括:是否对所述码流进行整体和/或部分POC对齐操作的指示信息。
优选地,所述标识及控制信息位于所述码流中的参数集所在的字段,用于指示所述码流中使用所述标识及控制信息所在的参数集的全部和/或部分图像是否执行POC对齐操作。
优选地,所述参数集包括以下至少之一:视频参数集VPS、序列参数集SPS、图像参数集PPS。
优选地,如果多个所述参数集中都包含有指示是否执行POC对齐操作的标识及控制信息,则根据参数集之间的引用关系,当前参数集中的指示是否执行POC对齐操作的标识及控制信息覆盖被其直接和/或间接引用的参数集中对应的指示是否执行POC对齐操作的标识及控制信息。
优选地,所述标识及控制信息位于所述码流中除参数集字段之外的其他至少作用于图像层的数据结构对应的字段,用于指示所述码流的所述数据结构有效作用范围内的全部和/或部分图像是否执行POC对齐操作。
优选地,所述标识及控制信息位于所述码流中补充增强辅助信息SEI所在字段,用于指示该SEI信息有效范围内的所述码流中的全部和/或部分图像是否执行POC对齐操作。
优选地,所述标识及控制信息位于所述码流的系统层用于描述视频媒体属性的字段,用于指示包含在系统码流中的所述码流的全部和/或部分图像是否执行POC对齐操作。
优选地,所述标识及控制信息位于所述码流的媒体文件中用于描述视频媒体属性的字段,用于指示包含在所述媒体文件中的所述码流的全部和/或部分图像是否执行POC对齐操作。
优选地,所述标识及控制信息还包括:开启或关闭POC对齐操作的起始图像位置信息和/或开启或关闭POC对齐操作的终止图像位置信息;在将所述标识及控制信息写入所述码流之前,所述方法还包括:根据预测结构和应用需求,确定开启和/或关闭POC对齐操作的一段按照图像播放顺序或图像解码顺序连续的图像的起始和/或终止位置。
根据本发明的另一个实施例,提供了一种码流的生成装置,包括:判断模块,设置为根据应用需求判断是否需要对码流的整体和/或部分执行视频图像序号POC对齐操作;写入模块,设置为根据判断结果,将标识及控制信息写入所述码流,其中,所述标识及控制信息包括:是否对所述码流的整体和/或部分执行POC对齐操作的指示信息。
优选地,所述标识及控制信息还包括:开启或关闭POC对齐操作的起始图像位置信息和/或开启或关闭POC对齐操作的终止图像位置信息;所述装置还包括:确定模块,设置为根据预测结构和应用需求,确定开启和/或关闭POC对齐操作的一段按照图像播放顺序或图像解码顺序连续的图像的起始和/或终端位置。
根据本发明的再一个实施例,提供了一种码流的处理方法,包括:从码流中获取标识及控制信息,其中,所述标识及控制信息包括:是否对所述码流的整体和/或部分执行视频图像序号POC对齐操作的指示信息;根据所述标识及控制信息的指示,对所述码流中需要执行POC对齐操作的全部和/或部分图像执行POC对齐操作。
优选地,所述标识及控制信息位于所述码流中的参数集所在的字段,用于指示所述码流中使用所述标识及信息所在的参数集的全部和/或部分图像是否执行POC对齐操作。
优选地,所述参数集包括以下至少之一:视频参数集VPS、序列参数集SPS、图像参数集PPS。
优选地,如果多个所述参数集中都包含有指示执行POC对齐操作的标识及控制信息,则根据参数集之间的引用关系,当前参数集中的指示是否执行POC对齐操作的标识及控制信息覆盖被其直接和/或间接引用的参数集中对应的指示执行是否POC对齐操作的标识及控制信息。
优选地,所述标识及控制信息位于所述码流中除参数集字段之外的其他至少作用于图像层的数据结构对应的字段,用于指示所述码流的所述数据结构有效作用范围内的全部和/或部分所述图像是否执行POC对齐操作。
优选地,所述标识及控制信息位于所述码流中增强辅助信息SEI所在字段,用于指示该SEI信息有效范围内的所述码流中的全部和/或部分图像是否执行POC对齐操作。
优选地,所述标识及控制信息位于所述码流的系统层用于描述视频媒体属性的字段,用于指示包含在系统码流中的所述码流的全部和/或部分图像是否执行POC对齐操作。
优选地,所述标识及控制信息位于所述多层视频码流的媒体文件中用于描述视频媒体属性的字段,用于指示包含在所述媒体文件中的所述码流的全部和/或部分图像是否执行POC对齐操作。
优选地,根据所述标识及控制信息的指示,对所述码流中需要执行POC对齐操作的图像执行POC对齐操作,包括:根据所述标识及控制信息所在字段的有效范围以及所述标识及控制信息的取值,确定所述码流中开启和/或关闭POC对齐操作的图像;对开启POC对齐操作的图像执行POC对齐操作。
根据本发明的又一个实施例,提供了一种码流的处理装置,包括:获取模块,设置为从码流中获取标识及控制信息,其中,所述标识及控制信息包括:是否对所述码流的整体和/或部分执行视频图像序号POC对齐操作的指示信息;执行模块,设置为根据所述标识及控制信息的指示,对所述码流中需要执行POC对齐操作的全部和/或部分图像执行POC对齐操作。
优选地,所述执行模块包括:确定模块,设置为根据所述标识及控制信息所在字段的有效范围以及所述标识及控制信息的取值,确定所述码流中开启和/或关闭POC对齐操作的图像;控制模块,设置为对开启POC对齐操作的图像执行POC对齐操作。
根据本发明的又一个实施例,提供了一种使用码流的通信系统,包括:源设备,包括上述的码流的生成装置;以及宿设备,包括上述的码流的处理装置。
通过本发明,在生成码流时,在码流中写入是否需要对码流的整体和/或部分执行POC对齐操作的指示及控制信息,从而可以对POC对齐操作进行局部和/或整体地关闭,提高了码流控制的灵活性。
附图说明
此处所说明的附图用来提供对本发明的进一步理解,构成本申请的一部分,本发明的示意性实施例及其说明用于解释本发明,并不构成对本发明的不当限定。在附图中:
图1是根据本发明实施例的码流的生成方法100的流程图;
图2是根据本发明实施例的码流的生成装置200的结构示意图;
图3是根据本发明实施例的码流的处理方法300的流程图;
图4是根据本发明实施例的码流的处理装置400的结构示意图;以及
图5是根据本发明实施例的使用码流的通信系统500的结构示意图。
具体实施方式
下文中将参考附图并结合实施例来详细说明本发明。需要说明的是,在不冲突的情况下,本申请中的实施例及实施例中的特征可以相互组合。
当需要使用POC对齐的情况下,编码器可使用现有POC对齐方法生成多层视频编码码流,解码器使用现有技术POC对齐方法对应的解码方法获得当前图像的POC值。对于某些应用,不需要使用POC对齐。例如,非协调的联播、混合可伸缩视频编码等应用。
因此,在本发明实施例中,为增加多层视频编解码器和码流的灵活性,适应多种应用需求,在多层视频编码的高层数据流程上,增加针对POC对齐操作的标识信息和对应操作。
需要说明的是,在本发明实施例中,源设备是指包含有编码器、可产生多层视频编码码流并进行必要的文件、系统层处理的设备;而宿设备指包含有解码器、进行必要的文件、系统层处理和可解码多层视频编码码流的设备。
实施例一
根据本发明实施例,提供了一种码流的生成方法。
图1为根据本发明实施例的码流的生成方法100的流程图。如图1所示,根据本发明实施例的码流的生成方法包括步骤S102和S104。在步骤S102中根据应用需求(例如,是否同时解码和输出全部层或多个层图像)判断是否需要对码流的整体和/或部分执行POC对齐操作。然后,在步骤S104中,根据判断结果,将标识及控制信息写入码流中,其中,该标识及控制信息中包括:对码流的整体和/或部分执行POC对齐操作的指示信息。
在具体实施过程中,源设备可以在需要传输码流时,执行上述步骤S102和步骤S104,在传输的码流中写入POC对齐操作标识及控制信息。
本发明实施例中所述的码流可以是多层视频码流,也可以是系统层上其他码流,具体本发明实施例不作限定。
通过上述步骤S102至S104,可以根据应用需求,在多层视频编码的高层数据流程上,增加针对POC对齐操作的标识及控制信息和对应操作,从而可以增加多层视频编解码器和码流的灵活性。
在本发明实施例中,可以通过将上述标识及控制信息写入码流的不同位置来表示该标识及控制信息的作用范围。在本发明实施例中,上述标识及控制信息可以位于码流的以下位置的至少之一:视频码流中的现有参数集字段,视频码流中除参数集外其他至少作用于图像层的数据结构对应的字段,视频码流中增强辅助信息(SEI)所在字段、系统层用于描述视频媒体属性的字段,媒体文件中用于描述视频媒体属性的字段。
可选地,在本发明实施例中,上述标识及控制信息包括但不限于以下方式来指示是否执行POC对齐操作的图像:
(1)上述标识及控制信息位于码流中的参数集所在的字段,用于指示码流中使用标识及控制信息所在的参数集的图像层是否执行POC对齐操作;其中,参数集包括以下至少之一:视频参数集、序列参数集及图像参数集。例如,当标识及控制信息位于视频参数集(Video Parameter Set,VPS),则该标识及控制信息的控制范围是多层视频 的全部图像层;而当标识及控制信息位于序列参数集(Sequence Parameter Set,SPS),则该标识及控制信息的控制范围是使用该SPS的多层视频的图像层;当标识及控制信息位于图像参数集(Picture Parameter Set,PPS),则该标识及控制信息的控制范围是使用该PPS的多层视频的图像层。
在这种情况下,如果多个所述参数集中都包含有指示是否执行POC对齐操作的标识及控制信息,则根据参数集之间的引用关系,当前参数集中的指示是否执行POC对齐操作的标识及控制信息覆盖被其直接和/或间接引用的参数集中对应的指示是否执行POC对齐操作的标识及控制信息。
(2)上述标识及控制信息位于所述码流中除参数集字段之外的其他至少作用于图像层的数据结构对应的字段,用于指示所述码流的所述数据结构所作用的图像层是否执行POC对齐操作。其中,该数据结构的作用范围至少包含1个图像(帧图像和/或场图像)。
(3)上述标识及控制信息位于码流中的补充增强信息(Supplemental Enhancement Information,SEI),用于指示该SEI信息有效范围内的所述码流中的图像是否执行POC对齐操作;
(4)上述标识及控制信息位于所述码流的系统层用于描述视频媒体属性的字段,即系统层描述子(Descriptor),用于指示包含在系统码流中的所述码流的整体和/或部分图像是否执行POC对齐操作。
(5)上述标识及控制信息位于所述多层视频码流的媒体文件中用于描述视频媒体属性的字段,文件描述子(Descriptor),用于指示包含在所述媒体文件中的码流的整体和是否执行POC对齐操作。
上述标识及控制信息也可以同时位于上述多个位置中,同时指示多个范围内的图像是否执行POC对齐操作。
可选的,在本发明实施列中,上述标识及控制信息还包括:开启或关闭POC对齐操作的起始图像位置信息和/或开启或关闭POC对齐操作的终止图像位置信息。则如图1所示,在步骤S104之前,方法100还可以包括:步骤S103,根据预测结构和应用需求,确定开启和/或关闭POC对齐操作的一段按照图像播放顺序或图像解码顺序连续的图像的起始和/或终止位置。
可选的,在本发明实施例中,在将上述标识及控制信息写入码流之后,可以传输该码流,而接收侧(可以称为宿设备)接收该多层视频码流,从所述多层视频码流中获取所述标识及控制信息,并所述标识及控制信息的指示对多层视频码流执行解码操作和/或播放操作。例如,根据上述标识及控制信息,确定开启和/或关闭POC对齐操作的码流,在解码和/或播放对开启POC对齐操作的码流执行POC对齐.
实施例二
根据本发明实施例,提供了一种码流的生成装置,该装置用于实现实施例一所提供的方法。
图2为根据本发明实施例的码流的生成装置200的结构示意图,如图2所示,生成装置200可以包括:判断模块202和写入模块204。应当理解,图2中所表示的各个模块的连接关系仅为示例,本领域技术人员完全可以采用其它的连接关系,只要在这样的连接关系下各个模块也能够实现本发明的功能即可。
在本说明书中,各个模块的功能可以通过使用专用硬件、或者能够与适当的软件相结合来执行处理的硬件来实现。这样的硬件或专用硬件可以包括专用集成电路(ASIC)、各种其它电路、各种处理器等。当由处理器实现时,该功能可以由单个专用处理器、单个共享处理器、或者多个独立的处理器(其中某些可能被共享)来提供。另外,处理器不应该被理解为专指能够执行软件的硬件,而是可以隐含地包括、而不限于数字信号处理器(DSP)硬件、用来存储软件的只读存储器(ROM)、随机存取存储器(RAM)、以及非易失存储设备。
在本发明实施例中,判断模块202设置为根据应用需求判断是否需要对码流的整体和/或部分执行视频图像序号POC对齐操作;写入模块204设置为根据判断模块202的判断结果,将包括是否对所述多层视频码流的整体和/或部分执行POC对齐操作的指示信息的标识及控制信息写入码流。
与上述实施例一对应,写入模块204可以通过在多层视频码流的不同位置写入上述标识及控制信息,来指示多层视频码流的整体和/或部分是否开启或关闭POC对齐操作。具体不再赘述。
可选的,上述标识及控制信息还包括:开启或关闭POC对齐操作的起始图像位置信息和/或开启或关闭POC对齐操作的终止图像位置信息;则如图2所示,装置200还可以包括:确定模块206设置为根据预测结构和应用需求,确定开启和/或关闭POC对齐操作的一段按照图像播放顺序或图像解码顺序连续的图像的起始和/或终端位置。
可选的,该装置还可以包括一个输出模块,设置为输出写入所述标识及控制信息的所述码流。
本实施例的装置可以是视频通信应用中相关码流生成设备,例如,手机、计算机、服务器、机顶盒、便携式移动终端、数字摄像机,电视广播系统设备等。该装置可以位于源设备中,该输出装置可以处理以下至少一种多层视频信号:可伸缩视频,多视点视频,多视点深度,多视点视频+多视点深度。其中,立体视频是多视点视频的一种视点数等于2的特殊形式。
实施例三
与上述实施例一提供的方法对应,本发明实施例还提供了一种码流的处理方法,用于上述生成的码流进行处理。
图3为根据本发明实施例的码流的处理方法300的流程图,如图3所示,根据本发明实施例的多层视频码流的处理方法300主要包括步骤S302和步骤S304。在步骤S302中,从码流中获取包括是否对码流的整体和/或部分执行视频图像序号POC对齐操作的指示信息的标识及控制信息。然后,在步骤S304中,根据上述标识及控制信息的指示,对码流中需要执行POC对齐操作的部分执行POC操作。
在本实施例中,与上述实施例一对应,上述标识及控制信息可以位于码流的多个字段以指示标识及控制信息的有效范围,具体参见实施例一,在本实施例中不再赘述。
可选地,对应于上述实施例一中,通过将标识及控制信息写入到码流的不同位置以指示码流的不同部分是否执行POC对齐操作,步骤S304可以包括:根据所述标识及控制信息所在字段的有效范围以及所述标识及控制信息的取值,确定所述码流中开启和/或关闭POC对齐操作的图像;对开启POC对齐操作的图像进行POC对齐。
实施例四
根据本发明实施例,还提供了一种码流的处理装置,该装置用于实现实施例三所提供的方法。
图4是根据本发明实施例的码流的处理装置400的结构示意图,如图4所示,处理装置400主要包括:获取模块402和执行模块404。应当理解,图4中所表示的各个模块的连接关系仅为示例,本领域技术人员完全可以采用其它的连接关系,只要在这样的连接关系下各个模块也能够实现本发明的功能即可。
在本说明书中,各个模块的功能可以通过使用专用硬件、或者能够与适当的软件相结合来执行处理的硬件来实现。这样的硬件或专用硬件可以包括专用集成电路(ASIC)、各种其它电路、各种处理器等。当由处理器实现时,该功能可以由单个专用处理器、单个共享处理器、或者多个独立的处理器(其中某些可能被共享)来提供。另外,处理器不应该被理解为专指能够执行软件的硬件,而是可以隐含地包括、而不限于数字信号处理器(DSP)硬件、用来存储软件的只读存储器(ROM)、随机存取存储器(RAM)、以及非易失存储设备。
在本发明实施例中,获取模块402,设置为从码流中获取包括是否对多层视频码流的整体和/或部分执行视频图像序号POC对齐操作的指示信息的标识及控制信息。执行模块406,设置为根据所述标识及控制信息的指示,对所述码流中需要执行POC对齐操作的部分执行POC对齐。
可选的,执行模块404可以包括:确定模块,设置为根据所述标识及控制信息所在字段的有效范围以及所述标识及控制信息的取值,确定所述多层视频码流中开启和/或关闭POC对齐操作的图像;控制模块,设置为对开启POC对齐操作的图像执行POC对齐操作。
本实施例的装置可以是视频通信应用中相关码流接收播放设备,例如,手机、计算机、服务器、机顶盒、便携式移动终端、数字摄像机,电视广播系统设备等。该装置可以位于宿设备中,该处理装置可以处理以下至少一种多层视频信号:可伸缩视频,多视点视频,多视点深度,多视点视频+多视点深度。其中,立体视频是多视点视频的一种视点数等于2的特殊形式。
实施例五
根据本发明实施例,还提供了一种使用码流的通信系统。
图5为根据本发明实施例的使用码流的通信系统500的结构示意图,如图5所示,根据本发明实施例的多层视频码流的传输系统500包括源设备502和宿设备504。其中,源设备502包括上述实施例二中所述的码流的生成装置200,宿设备504包括上述实施例四中所述的码流的处理装置400。在本实施例中,源设备502可以按照实施例一中所述的方法生成码流并输出,宿设备504接收该码流,并按照实施例三中所述的方法对该码流进行处理,具体参见上述明实施例,本实施列中不再赘述。
为了进一步说明本发明实施例所提供的技术方案,下面通过具体实例对本发明实施例所提供的技术方案进行描述。
以下实例所使用的针对POC对齐操作的高层码流组织方法采用如表1和表2所示的结构。
如表1和表2所示码流中包含:标识整体和/或部分码流是否使用POC对齐操作的标识信息。在对应的码流中,携带有如下比特字段:标识整体和/或部分码流是否使用POC对齐操作的信息的一个或多个比特字段。
如表3所示码流中包含:启用或关闭POC对齐操作起始和/或结束位置信息。对应的码流中,携带有如下比特字段:启用或关闭POC对齐操作起始和/或结束位置信息的一个或多个比特字段。
表1.启用POC对齐的码流组织方法
Figure PCTCN2014088677-appb-000001
表2.关闭POC对齐的码流组织方法
Figure PCTCN2014088677-appb-000002
表3.POC对齐相关起止位置的码流组织方法
Figure PCTCN2014088677-appb-000003
Figure PCTCN2014088677-appb-000004
其中,表1中各字段的语义(对应的控制操作)如下:poc_alignment_enable_flag取值等于1时表示解码码流过程中需要使用POC对齐操作。poc_alignment_enable_flag取值等于0是表示解码码流过程中不使用POC对齐操作,当然,对于本领域技术人员来说poc_alignment_enable_flag取值及其所表示的含义也可以采用其它方式,具体本发明实施例中不作限定。poc_alignment_enable_flag使用u(1)对应的编解码方法。
可选地,如果码流中不存在poc_alignment_enable_flag对应的比特字段,则其对应变量poc_alignment_enable_flag的取值默认为1。
表2中各字段的语义(对应的控制操作)如下:poc_alignment_disable_flag取值等于1时表示解码码流过程中不使用POC对齐操作。poc_alignment_disable_flag取值等于0是表示解码码流过程中使用POC对齐操作。poc_alignment_disable_flag使用u(1)对应的编解码方法。可选地,如果码流中不存在poc_alignment_disable_flag对应的比特字段,则其对应变量poc_alignment_disable_flag的取值默认为0。当然,对于本领域技术人员来说poc_alignment_disable_flag取值及其所表示的含义也可以采用其它方式,具体本发明实施例中不作限定。
表3中各字段的语义(对应的控制操作)如下:start_info指示启用或关闭POC对齐操作的起始图像位置。start_info使用se(v)对应的编解码方法。end_info指示启用或关闭POC对齐操作的终止图像位置。end_info使用se(v)对应的编解码方法。
表3中字段可以与表1和表2中的字段组合使用。表3中字段与表1中字段组合使用时,指示的是开启POC对齐操作的起始图像位置和终止图像位置;表3中字段与表2中字段组合使用时,指示的是关闭POC对齐操作的起始图像位置和终止图像位置。
表3中,起始图像位置和终止图像位置可以使用以下所列信息中的一种或多种标识,包括:图像的POC值,POC的低比特位(LSB)的取值,POC的高比特位(MSB)信息,图像时间戳信息,图像解码顺序信息,图像播放顺序信息。
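
Since the fields in Tables 1 to 3 use the u(1) and se(v) descriptors, the following non-normative Python sketch illustrates, for explanation only, one possible bit-level encoder and decoder for these descriptors under the usual 0-th order Exp-Golomb interpretation of ue(v)/se(v). The class names are hypothetical and the sketch is not a statement about any particular codec implementation.

```python
class BitWriter:
    """Minimal bit writer for the u(1), ue(v) and se(v) descriptors."""
    def __init__(self):
        self.bits = []

    def u1(self, flag: int) -> None:            # u(1): a single bit
        self.bits.append(flag & 1)

    def ue(self, code_num: int) -> None:        # ue(v): 0-th order Exp-Golomb, unsigned
        x = code_num + 1
        n = x.bit_length()
        self.bits.extend([0] * (n - 1))         # n-1 leading zeros
        self.bits.extend((x >> (n - 1 - i)) & 1 for i in range(n))

    def se(self, value: int) -> None:           # se(v): signed mapping onto ue(v)
        code_num = 2 * value - 1 if value > 0 else -2 * value
        self.ue(code_num)


class BitReader:
    def __init__(self, bits):
        self.bits, self.pos = list(bits), 0

    def u1(self) -> int:
        bit = self.bits[self.pos]
        self.pos += 1
        return bit

    def ue(self) -> int:
        leading_zeros = 0
        while self.u1() == 0:
            leading_zeros += 1
        x = 1
        for _ in range(leading_zeros):
            x = (x << 1) | self.u1()
        return x - 1

    def se(self) -> int:
        code_num = self.ue()
        return (code_num + 1) // 2 if code_num % 2 else -(code_num // 2)


# Round trip: poc_alignment_enable_flag = 1, start_info = 0, end_info = 15
w = BitWriter()
w.u1(1); w.se(0); w.se(15)
r = BitReader(w.bits)
print(r.u1(), r.se(), r.se())   # 1 0 15
```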
表1、表2和表3中的语法元素可以使用在以下一个或多个数据结构中,表1、表2和表3中的语法元素对应的比特字段使用在对应于该数据结构的码流中:
(1)视频参数集(Video Parameter Set,VPS);
(2)序列参数集(Sequence Parameter Set,SPS);
(3)图像参数集(Picture Parameter Set,PPS);
(4)补充增强信息(Supplemental Enhancement Information,SEI);
(5)除上述数据结构外其他作用范围至少包含1个图像(帧图像和/或场图像)的数据结构;
(6)系统层描述子(Descriptor);
(7)文件描述子(Descriptor);
(8)混合使用。
以下对上述各种方法的描述仅为各对应方法的具体实例。当同时使用多种方法时,可将下述各方法的实例进行简单组合和串联,即可得到对应的实例。
以下实施方法中,表1和表2的码流字段不位于同一个数据结构中。以下实施方法以表1的码流字段poc_alignment_enable_flag为例进行说明。表2的码流字段poc_alignment_disable_flag的对应的实施方法之相同,区别在于同一情况下poc_alignment_disable_flag的取值与poc_alignment_enable_flag的取值相反。
实例1
在本实例中,poc_alignment_enable_flag位于VPS,其控制范围是多层视频的全部层。其值等于1时,表示同一个AU中包含的图像具有相同的POC值。其值等于0时,表示同一个AU中包含的图像可能具有相同、也可能具有不相同的POC值。
在poc_alignment_enable_flag字段后,可进一步选择使用表3的比特字段,进一步标识poc_alignment_enable_flag对应操作的起始图像位置和终止图像位置。当表3的比特字段存在时,poc_alignment_enable_flag所限定的操作的有效范围为包含表3比特字段指示的起始图像和终止图像及其之间(按照图像播放顺序或图像解码顺序)的全部图像。
在本实例中,多层视频码流的传输主要包括以下步骤:
步骤1,源设备根据应用需求(如,是否同时解码和输出全部层或多个层图像),判断是否需要对多层视频执行POC对齐操作。
若需要使用POC对齐操作,源设备将poc_alignment_enable_flag的取值设置为1,否则设置为0。
步骤2,源设备使用u(1)对应的编码方法将poc_alignment_enable_flag的取值写入VPS码流。
步骤3,源设备根据预测结构和应用需求,确定是否制定(按照图像播放顺序或图像解码顺序)连续的一段图像执行poc_alignment_enable_flag取值所指示的操作。若是,则源设备根据预测结构和应用需求,确定start_info和end_info的取值,并使用se(v)对应的编码方法将二者的取值写入码流。
步骤4,宿设备接收到码流后,使用u(1)对应的解码方法从VPS码流中获得poc_alignment_enable_flag的取值。
当码流中存在表3所示字段时,宿设备使用se(v)对应的解码方法从码流中获得start_info和end_info的取值。宿设备根据start_info和end_info的取值的取值确定poc_alignment_enable_flag取值对应操作的有效图像范围。若码流中不包含表3所示字段,宿设备设定poc_alignment_enable_flag取值对应操作的有效范围是整个视频编码序列(Coded Video Sequence,CVS)。
宿设备判断poc_alignment_enable_flag的取值为1时,宿设备可使用POC条件对视频码流进行AU边界划分。宿设备设置“同AU的输出图像的POC值相等”作为其码流检错和播放操作控制条件。若解码码流不符合该控制条件时,宿设备执行差错控制机制,进行误码掩盖和/或通过反馈信息向源设备报告错误。宿设备可直接根据POC值进行图像输出和播放控制。
宿设备判断poc_alignment_enable_flag的取值为0时,宿设备使用非POC条件对视频码流进行AU边界划分。对于同一个AU获得的图像,若其对应的系统层或文件中的时间标记信息相同,宿设备在该事件标记信息所指示的时刻对图像进行输出和播放控制。
实例2
本实例中,poc_alignment_enable_flag位于SPS,其控制范围是使用该SPS的多层视频的层。其值等于1时,表示AU中该层图像与BL层图像(在该AU中存在或假设存在)具有相同的POC值。其值等于0时,表示AU中该层图像与BL层图像(在该AU中存在或假设存在)可能具有相同、也可能具有不相同的POC值。
在poc_alignment_enable_flag字段后,可进一步选择使用表3的比特字段,进一步标识poc_alignment_enable_flag对应操作的起始图像位置和终止图像位置。当表3的比特字段存在时,poc_alignment_enable_flag所限定的操作的有效范围为包含表3比特字段指示的起始图像和终止图像及其之间(按照图像播放顺序或图像解码顺序)的全部图像。
在本实例中,多层视频码流的传输主要包括以下步骤:
步骤1,源设备根据应用需求(如,是否同时解码和输出全部层或多个层图像),判断是否需要将视频码流中该层图像与BL层图像执行POC对齐操作。
若需要使用POC对齐操作,源设备将该层所使用的SPS中poc_alignment_enable_flag的取值设置为1,否则设置为0。
步骤2,源设备使用u(1)对应的编码方法将poc_alignment_enable_flag的取值写入SPS码流。
步骤3,源设备根据预测结构和应用需求,确定是否制定(按照图像播放顺序或图像解码顺序)连续的一段图像执行poc_alignment_enable_flag取值所指示的操作。若是,则源设备根据预测结构和应用需求,确定start_info和end_info的取值,并使用se(v)对应的编码方法将二者的取值写入码流。
步骤4,宿设备接收到码流后,使用u(1)对应的解码方法从SPS码流中获得poc_alignment_enable_flag的取值。
当码流中存在表3所示字段时,宿设备使用se(v)对应的解码方法从码流中获得start_info和end_info的取值。宿设备根据start_info和end_info的取值的取值确定poc_alignment_enable_flag取值对应操作的有效图像范围。若码流中不包含表3所示字段,宿设备设定poc_alignment_enable_flag取值对应操作的有效范围是整个CVS。
宿设备判断poc_alignment_enable_flag的取值为1时,宿设备可使用POC条件对视频码流进行AU边界划分。宿设备设置“AU中该层的输出图像的POC值与(假设存在)BL层图像相等”作为其码流检错和播放操作控制条件。若解码码流不符合该控制条件时,宿设备执行差错控制机制,进行误码掩盖和/或通过反馈信息向源设备报告错误。宿设备可直接根据POC值进行该层图像与BL图像间进行输出和播放控制。对其他位于同AU但POC值不相等的图像,宿设备在该事件标记信息所指示的时刻对图像进行输出和播放控制。
宿设备判断poc_alignment_enable_flag的取值为0时,宿设备使用非POC条件对视频码流进行AU边界划分。对于同一个AU获得的图像,若其对应的系统层或文件中的时间标记信息相同,宿设备在该事件标记信息所指示的时刻对图像进行输出和播放控制。
实例3
本实例中,poc_alignment_enable_flag位于PPS,其控制范围是使用该PPS的多层视频的层中的一个或多个图像。其值等于1时,表示AU中该层图像与BL层图像(在该AU中存在或假设存在)具有相同的POC值。其值等于0时,表示AU中该层图像与BL层图像(在该AU中存在或假设存在)可能具有相同、也可能具有不相同的POC值。
在poc_alignment_enable_flag字段后,可进一步选择使用表3的比特字段,进一步标识poc_alignment_enable_flag对应操作的起始图像位置和终止图像位置。当表3的比特字段存在时,poc_alignment_enable_flag所限定的操作的有效范围为包含表3比特字段指示的起始图像和终止图像及其之间(按照图像播放顺序或图像解码顺序)的全部图像。
在本实例中,多层视频码流的传输主要包括以下步骤:
步骤1,源设备根据应用需求(如,是否同时解码和输出全部层或多个层图像),判断是否需要将某一个或某一段该层图像与BL层图像执行POC对齐操作。
若需要使用POC对齐操作,源设备将该层所使用的SPS中poc_alignment_enable_flag的取值设置为1,否则设置为0。
步骤2,源设备使用u(1)对应的编码方法将poc_alignment_enable_flag的取值写入PPS码流。
步骤3,源设备根据预测结构和应用需求,确定是否制定(按照图像播放顺序或图像解码顺序)连续的一段图像执行poc_alignment_enable_flag取值所指示的操作。若是,则源设备根据预测结构和应用需求,确定start_info和end_info的取值,并使用se(v)对应的编码方法将二者的取值写入码流。
步骤4,宿设备接收到码流后,使用u(1)对应的解码方法从PPS码流中获得poc_alignment_enable_flag的取值。
当码流中存在表3所示字段时,宿设备使用se(v)对应的解码方法从码流中获得start_info和end_info的取值。宿设备根据start_info和end_info的取值的取值确定poc_alignment_enable_flag取值对应操作的有效图像范围。若码流中不包含表3所示字段,宿设备设定poc_alignment_enable_flag取值对应操作的有效范围是整个CVS。
宿设备判断poc_alignment_enable_flag的取值为1时,宿设备可使用POC条件对视频码流进行AU边界划分。宿设备设置“AU中该层的输出图像的POC值与(假设存在)BL层图像相等”作为其码流检错和播放操作控制条件。若解码码流不符合该控制条件时,宿设备执行差错控制机制,进行误码掩盖和/或通过反馈信息向源设备报告错误。宿设备可直接根据POC值进行该层图像与BL图像间进行输出和播放控制。对其他位于同AU但POC值不相等的图像,宿设备在该事件标记信息所指示的时刻对图像进行输出和播放控制。
宿设备判断poc_alignment_enable_flag的取值为0时,宿设备使用非POC条件对视频码流进行AU边界划分。对于同一个AU获得的图像,若其对应的系统层或文件中的时间标记信息相同,宿设备在该事件标记信息所指示的时刻对图像进行输出和播放控制。
实例4
本实例中,poc_alignment_enable_flag位于SEI,其指示范围是使用该SEI的多层视频的层中的一个或多个图像。其值等于1时,表示AU中的全部图像具有相同的POC值(与整个AU关联的SEI),或指示某EL层图像具有与BL层图像(在该AU中存在或假设存在)相同的POC值(与某个EL层关联的SEI)。其值等于0时,表示AU中的全部图像可能具有、也可能不具有相同的POC值(与整个AU关联的SEI),或指示某EL层图像具有与BL层图像(在该AU中存在或假设存在)可能具有、也可能不具有相同的POC值(与某个EL层关联的SEI)。
在poc_alignment_enable_flag字段后,可进一步选择使用表3的比特字段,进一步标识poc_alignment_enable_flag对应操作的起始图像位置和终止图像位置。当表3的比特字段存在时,poc_alignment_enable_flag所限定的操作的有效范围为包含表3比特字段指示的起始图像和终止图像及其之间(按照图像播放顺序或图像解码顺序)的全部图像。
在本实例中,多层视频码流的传输主要包括以下步骤:
步骤1,源设备根据所产生的多层视频编码码流,判断是否需要将某一个或某一段图像所使用的POC对齐操作。
若使用POC对齐操作,源设备将对应的SEI中poc_alignment_enable_flag的取值设置为1,否则设置为0。所述对应的SEI指:与整个AU关联的SEI,或与某个EL层关联的SEI。
步骤2,源设备使用u(1)对应的编码方法将poc_alignment_enable_flag的取值写入SEI码流,并将SEI的字段插入到视频码流中关联位置。
步骤3,源设备根据预测结构和应用需求,确定是否制定(按照图像播放顺序或图像解码顺序)连续的一段图像执行poc_alignment_enable_flag取值所指示的操作。若是,则源设备根据预测结构和应用需求,确定start_info和end_info的取值,并使用se(v)对应的编码方法将二者的取值写入码流。
步骤4,宿设备接收到码流后,使用u(1)对应的解码方法从SEI码流中获得poc_alignment_enable_flag的取值。
当码流中存在表3所示字段时,宿设备使用se(v)对应的解码方法从码流中获得start_info和end_info的取值。宿设备根据start_info和end_info的取值的取值确定poc_alignment_enable_flag取值对应操作的有效图像范围。若码流中不包含表3所示字段,宿设备设定poc_alignment_enable_flag取值对应操作的有效范围是整个视频编码序列(Coded Video Sequence,CVS)。
宿设备判断poc_alignment_enable_flag的取值为1时,若该SEI是与整个AU关联的SEI,宿设备可使用POC条件对视频码流进行AU边界划分。宿设备设置“同AU的输出图像的POC值相等”作为其码流检错和播放操作控制条件。若解码码流不符合该控制条件时,宿设备执行差错控制机制,进行误码掩盖和/或通过反馈信息向源设备报告错误。宿设备可直接根据POC值进行图像输出和播放控制。
宿设备判断poc_alignment_enable_flag的取值为1时,若该SEI是与某EL关联的SEI,宿设备可使用非POC条件对视频码流进行AU边界划分。宿设备设置“AU中该层的输出图像的POC值与(假设存在)BL层图像相等”作为其码流检错和播放操作控制条件。若解码码流不符合该控制条件时,宿设备执行差错控制机制,进行误码掩盖和/或通过反馈信息向源设备报告错误。宿设备可直接根据POC值进行该层图像与BL图像间进行输出和播放控制。对其他位于同AU但POC值不相等的图像,宿设备在该事件标记信息所指示的时刻对图像进行输出和播放控制。
宿设备判断poc_alignment_enable_flag的取值为0时,宿设备使用非POC条件对视频码流进行AU边界划分。对于同一个AU获得的图像,若其对应的系统层或文件中的时间标记信息相同,宿设备在该事件标记信息所指示的时刻对图像进行输出和播放控制。
实例5
本实例中,上述标识信息位于其他作用范围至少包含1个图像(帧图像和/或场图像)的数据结构,对于所述其他作用范围至少包含1个图像(帧图像和/或场图像)的数据结构,若该数据结构中包含有其他解码过程中必须使用的数据信息,则该数据结构是解码过程中所必需的数据结构。此时,如果该数据结构的作用范围是整个多层视频,则源设备和宿设备对poc_alignment_enable_flag的操作方法与实例1相似。如果该数据结构的作用范围是多层视频中的某EL视频,则源设备和宿设备对poc_alignment_enable_flag的操作方法与实例2相似。如果该数据结构的作用范围是整个多层视频中的某EL上的一个或多个图像,则源设备和宿设备对poc_alignment_enable_flag的操作方法与实例3相似。
对于所述其他作用范围至少包含1个图像(帧图像和/或场图像)的数据结构,若该数据结构中没有包含其他解码过程中必须使用的数据信息,则该数据结构不是解码过程中所必需的数据结构。此时,源设备和宿设备对poc_alignment_enable_flag的操作方法与实例4相似。
在poc_alignment_enable_flag字段后,可进一步选择使用表3的比特字段,进一步标识poc_alignment_enable_flag对应操作的起始图像位置和终止图像位置。当表3的比特字段存在时,poc_alignment_enable_flag所限定的操作的有效范围为包含表3比特字段指示的起始图像和终止图像及其之间(按照图像播放顺序或图像解码顺序)的全部图像。
实例5与实例1至实例4的不同之处在于源设备使用u(1)对应的编码方法将poc_alignment_enable_flag的取值写入所述至少包含1个图像(帧图像和/或场图像)的数据结构的对应码流中,宿设备使用使用u(1)对应的解码方法从所述至少包含1个图像(帧图像和/或场图像)的数据结构的对应码流中解析poc_alignment_enable_flag对应的字段,获得poc_alignment_enable_flag的取值。
实例6
本实例中,上述标识信息位于系统描述子中,当包含有poc_alignment_enable_flag的描述子的作用范围是系统码流中的整个多层视频编码码流时,源设备和宿设备对poc_alignment_enable_flag的操作方法与实例4中“与整个AU关联的SEI”情况下的操作方法相似。
当包含有poc_alignment_enable_flag的描述子的作用范围是系统码流中的多层视频编码码流中某个EL码流时,源设备和宿设备对poc_alignment_enable_flag的操作方法与实例4中“与某个EL层关联的SEI”情况下的操作方法相似。
实例6与实例4的不同之处在于源设备使用u(1)对应的或与u(1)相同的编码方法将poc_alignment_enable_flag的取值写入所述描述子的对应系统码流中,如需要,源设备使用se(v)对应的或与se(v)相同的编码方法将start_info和end_info的取值写入所述描述子的对应系统码流中;宿设备使用使用u(1)对应的或与u(1)相同的解码方法从所述描述子数据结构的对应的系统码流中解析poc_alignment_enable_flag对应的字段,获得poc_alignment_enable_flag的取值,若表3字段在码流中存在,宿设备使用se(v)对应和或与se(v)相同的解码方法从码流中获取将start_info和end_info的取值。
实例7
本实例中,本实例中,上述标识信息位于文件描述子中,在实例中,当包含有poc_alignment_enable_flag的描述子的作用范围是媒体文件码流中的整个多层视频编码码流时,源设备和宿设备对poc_alignment_enable_flag的操作方法与实例4中“与整个AU关联的SEI”情况下的操作方法相似。
当包含有poc_alignment_enable_flag的描述子的作用范围是媒体文件码流中的多层视频编码码流中某个EL码流时,源设备和宿设备对poc_alignment_enable_flag的操作方法与实例4中“与某个EL层关联的SEI”情况下的操作方法相似。
实例7与实例4的的不同之处在于源设备使用u(1)对应的或与u(1)相同的编码方法将poc_alignment_enable_flag的取值写入所述描述子的对应系统码流中,如需要,源设备使用se(v)对应的或与se(v)相同的编码方法将start_info和end_info的取值写入所述描述子的对应系统码流中;宿设备使用使用u(1)对应的或与u(1)相同的解码方法从所述描述子数据结构的对应的系统码流中解析poc_alignment_enable_flag对应的字段,获得poc_alignment_enable_flag的取值,若表3字段在码流中存在,宿设备使用se(v)对应和或与se(v)相同的解码方法从码流中获取将start_info和end_info的取值。
实例8
本实例中,采用混和使用的方法携带上述标识信息。对于多层视频编码码流结构,在解码过程中,PPS引用SPS、SPS引用VPS。这里,将称为VPS是比SPS和PPS更高层的数据结构,SPS是比PPS更高的数据结构。
在实例中,poc_alignment_enable_flag可在不同层次的数据结构中进行编码。当高层数据结构与低层数据结构中的poc_alignment_enable_flag的取值不同时,低层数据结构中的poc_alignment_enable_flag覆盖高层数据结构中的poc_alignment_enable_flag。若表3中start_info和end_info的取值不同时,poc_alignment_enable_flag取值的作用范围为高层数据结构中start_info和end_info限定图像范围和低层数据结构中start_info和end_info限定图像范围的最小交集。
当高层数据结构与低层数据结构中的poc_alignment_enable_flag取值相同、但表3中start_info和end_info的取值不同时,poc_alignment_enable_flag取值的作用范围为高层数据结构中start_info和end_info限定图像范围和低层数据结构中start_info和end_info限定图像范围的最大并集。
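
For illustration only, the following non-normative Python sketch expresses the override and range-combination rules of this example: when the flag values differ, the lower-layer data structure overrides the higher-layer one and the scope is the minimum intersection of the two start_info/end_info ranges; when the flag values are the same but the ranges differ, the scope is the maximum union of the two ranges (represented here, as an assumption, by the smallest interval covering both). The function name and the (flag, (start, end)) representation are hypothetical.

```python
from typing import Optional, Tuple

Setting = Tuple[int, Tuple[int, int]]   # (poc_alignment_enable_flag, (start_info, end_info))

def combine(higher: Setting, lower: Setting) -> Tuple[int, Optional[Tuple[int, int]]]:
    """Combine the settings carried in a higher-layer data structure (e.g. VPS/SPS)
    and a lower-layer one (e.g. PPS)."""
    hi_flag, hi_range = higher
    lo_flag, lo_range = lower
    if hi_flag != lo_flag:
        # lower-layer flag overrides; scope = minimum intersection of the two ranges
        start, end = max(hi_range[0], lo_range[0]), min(hi_range[1], lo_range[1])
        return lo_flag, (start, end) if start <= end else None
    if hi_range != lo_range:
        # same flag value, different ranges; scope = maximum union of the two ranges
        return hi_flag, (min(hi_range[0], lo_range[0]), max(hi_range[1], lo_range[1]))
    return hi_flag, hi_range

print(combine((1, (0, 31)), (0, (8, 15))))   # (0, (8, 15)): PPS overrides, intersection
print(combine((1, (0, 15)), (1, (8, 31))))   # (1, (0, 31)): same flag, union of ranges
```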
在混合使用方法中,源设备首先根据输入视频、编码预测结构和应用需求,确定POC对齐的使用方式,包括:使用POC对齐的层、使用POC对齐的图像的起止位置。源设备根据以上所确定的信息,使用实例1、实例2、实例3的所述方法,设置VPS、SPS、PPS中的poc_alignment_enable_flag以及所需要的start_info和end_info的取值,并使用相应的编码方法将其写入码流。同时,源设备根据应用需求,使用实例4、实例6和实例7所述的方法,设置视频码流所需的辅助信息、以及系统层、媒体文件相关描述子中对应字段信息,并使用相应的编码方法将其写入码流。
宿设备对接收码流进行处理,使用实例1、实例2、实例3的所述方法,从VPS、SPS、PPS中获得poc_alignment_enable_flag以及所需要的start_info和end_info的取值,设置不同图像段和层的POC对齐使用控制,对并对接收码流进行解码。在解码过程中,当宿设备可以获得视频辅助信息、系统层信息、媒体文件信息中的POC对齐描述信息,宿设备使用上述各方法设置接收、解码过程中的差错控制、播放控制模块。
宿设备可使用非POC条件对视频码流进行AU边界划分。宿设备设置“AU中该层的输出图像的POC值与(假设存在)BL层图像相等”作为其码流检错和播放操作控制条件。若解码码流不符合该控制条件时,宿设备执行差错控制机制,进行误码掩盖和/或通过反馈信息向源设备报告错误。宿设备可直接根据POC值进行该层图像与BL图像间进行输出和播放控制。对其他位于同AU但POC值不相等的图像,宿设备在该事件标记信息所指示的时刻对图像进行输出和播放控制。
宿设备判断poc_alignment_enable_flag的取值为0时,宿设备使用非POC条件对视频码流进行AU边界划分。对于同一个AU获得的图像,若其对应的系统层或文件中的时间标记信息相同,宿设备在该事件标记信息所指示的时刻对图像进行输出和播放控制。
从以上的描述中,可以看出,通过上述实施例之一提供的技术方案,对于即时通讯等业务,用户能够在通信之前获取到对方的各个终端的信息(例如,终端的名称,也可以称为昵称),因而可以主动的选择通信接收方的终端进行通信,在与对方通信之前获取通信接受方的多个终端的名称(昵称),有针对性地选择一个发起通信(语音电话、可视电话或者消息),提升了用户体验。
综上所述,通过本发明实施例提供的方法,可以在码流高层、辅助信息、系统层描述等增加对POC对齐的描述。同时,在码流高层结构上采用分层描述机制,有利于码流生成过程中的灵活控制。
显然,本领域的技术人员应该明白,上述的本发明的各模块或各步骤可以用通用的计算装置来实现,它们可以集中在单个的计算装置上,或者分布在多个计算装置所组成的网络上,可选地,它们可以用计算装置可执行的程序代码来实现,从而,可以将它们存储在存储装置中由计算装置来执行,并且在某些情况下,可以以不同于此处的顺序执行所示出或描述的步骤,或者将它们分别制作成各个集成电路模块,或者将它们中的多个模块或步骤制作成单个集成电路模块来实现。这样,本发明不限制于任何特定的硬件和软件结合。
以上所述仅为本发明的优选实施例而已,并不用于限制本发明,对于本领域的技术人员来说,本发明可以有各种更改和变化。凡在本发明的精神和原则之内,所作的任何修改、等同替换、改进等,均应包含在本发明的保护范围之内。
工业实用性
基于本发明实施例提供的上述技术方案,采用在生成码流时,在码流中写入是否需要对码流的整体和/或部分执行POC对齐操作的指示及控制信息,从而可以对POC对齐操作进行局部和/或整体地关闭,提高了码流控制的灵活性。

Claims (23)

  1. 一种码流的生成方法,包括:
    根据应用需求判断是否需要对码流的整体和/或部分执行视频图像序号POC对齐操作;
    根据判断结果,将标识及控制信息写入所述码流,其中,所述标识及控制信息包括:是否对所述码流进行整体和/或部分POC对齐操作的指示信息。
  2. 根据权利要求1所述的方法,其中,所述标识及控制信息位于所述码流中的参数集所在的字段,用于指示所述码流中使用所述标识及控制信息所在的参数集的全部和/或部分图像是否执行POC对齐操作。
  3. 根据权利要求2所述的方法,其中,所述参数集包括以下至少之一:视频参数集VPS、序列参数集SPS、图像参数集PPS。
  4. 根据权利要求2所述的方法,其中,如果多个所述参数集中都包含有指示是否执行POC对齐操作的标识及控制信息,则根据参数集之间的引用关系,当前参数集中的指示是否执行POC对齐操作的标识及控制信息覆盖被其直接和/或间接引用的参数集中对应的指示是否执行POC对齐操作的标识及控制信息。
  5. 根据权利要求1所述的方法,其中,所述标识及控制信息位于所述码流中除参数集字段之外的其他至少作用于图像层的数据结构对应的字段,用于指示所述码流的所述数据结构有效作用范围内的全部和/或部分图像是否执行POC对齐操作。
  6. 根据权利要求1所述的方法,其中,所述标识及控制信息位于所述码流中补充增强辅助信息SEI所在字段,用于指示该SEI信息有效范围内的所述码流中的全部和/或部分图像是否执行POC对齐操作。
  7. 根据权利要求1所述的方法,其中,所述标识及控制信息位于所述码流的系统层用于描述视频媒体属性的字段,用于指示包含在系统码流中的所述码流的全部和/或部分图像是否执行POC对齐操作。
  8. 根据权利要求1所述的方法,其中,所述标识及控制信息位于所述码流的媒体文件中用于描述视频媒体属性的字段,用于指示包含在所述媒体文件中的所述码流的全部和/或部分图像是否执行POC对齐操作。
  9. 根据权利要求1至8中任一项所述的方法,其中,
    所述标识及控制信息还包括:开启或关闭POC对齐操作的起始图像位置信息和/或开启或关闭POC对齐操作的终止图像位置信息;
    在将所述标识及控制信息写入所述码流之前,所述方法还包括:
    根据预测结构和应用需求,确定开启和/或关闭POC对齐操作的一段按照图像播放顺序或图像解码顺序连续的图像的起始和/或终止位置。
  10. 一种码流的生成装置,包括:
    判断模块,设置为根据应用需求判断是否需要对码流的整体和/或部分执行视频图像序号POC对齐操作;
    写入模块,设置为根据判断结果,将标识及控制信息写入所述码流,其中,所述标识及控制信息包括:是否对所述码流的整体和/或部分执行POC对齐操作的指示信息。
  11. 根据权利要求10所述的装置,其中,
    所述标识及控制信息还包括:开启或关闭POC对齐操作的起始图像位置信息和/或开启或关闭POC对齐操作的终止图像位置信息;
    所述装置还包括:确定模块,设置为根据预测结构和应用需求,确定开启和/或关闭POC对齐操作的一段按照图像播放顺序或图像解码顺序连续的图像的起始和/或终端位置。
  12. 一种码流的处理方法,包括:
    从码流中获取标识及控制信息,其中,所述标识及控制信息包括:是否对所述码流的整体和/或部分执行视频图像序号POC对齐操作的指示信息;
    根据所述标识及控制信息的指示,对所述码流中需要执行POC对齐操作的全部和/或部分图像执行POC对齐操作。
  13. 根据权利要求12所述的方法,其中,所述标识及控制信息位于所述码流中的参数集所在的字段,用于指示所述码流中使用所述标识及信息所在的参数集的全部和/或部分图像是否执行POC对齐操作。
  14. 根据权利要求13所述的方法,其中,所述参数集包括以下至少之一:视频参数集VPS、序列参数集SPS、图像参数集PPS。
  15. 根据权利要求13所述的方法,其中,如果多个所述参数集中都包含有指示执行POC对齐操作的标识及控制信息,则根据参数集之间的引用关系,当前参数集中的指示是否执行POC对齐操作的标识及控制信息覆盖被其直接和/或间接引用的参数集中对应的指示执行是否POC对齐操作的标识及控制信息。
  16. 根据权利要求12所述的方法,其中,所述标识及控制信息位于所述码流中除参数集字段之外的其他至少作用于图像层的数据结构对应的字段,用于指示所述码流的所述数据结构有效作用范围内的全部和/或部分所述图像是否执行POC对齐操作。
  17. 根据权利要求12所述的方法,其中,所述标识及控制信息位于所述码流中增强辅助信息SEI所在字段,用于指示该SEI信息有效范围内的所述码流中的全部和/或部分图像是否执行POC对齐操作。
  18. 根据权利要求12所述的方法,其中,所述标识及控制信息位于所述码流的系统层用于描述视频媒体属性的字段,用于指示包含在系统码流中的所述码流的全部和/或部分图像是否执行POC对齐操作。
  19. 根据权利要求12所述的方法,其中,所述标识及控制信息位于所述多层视频码流的媒体文件中用于描述视频媒体属性的字段,用于指示包含在所述媒体文件中的所述码流的全部和/或部分图像是否执行POC对齐操作。
  20. 根据权利要求12所述的方法,其中,根据所述标识及控制信息的指示,对所述码流中需要执行POC对齐操作的图像执行POC对齐操作,包括:
    根据所述标识及控制信息所在字段的有效范围以及所述标识及控制信息的取值,确定所述码流中开启和/或关闭POC对齐操作的图像;
    对开启POC对齐操作的图像执行POC对齐操作。
  21. 一种码流的处理装置,包括:
    获取模块,设置为从码流中获取标识及控制信息,其中,所述标识及控制信息包括:是否对所述码流的整体和/或部分执行视频图像序号POC对齐操作的指示信息;
    执行模块,设置为根据所述标识及控制信息的指示,对所述码流中需要执行POC对齐操作的全部和/或部分图像执行POC对齐操作。
  22. 根据权利要求21所述的装置,其中,所述执行模块包括:
    确定模块,设置为根据所述标识及控制信息所在字段的有效范围以及所述标识及控制信息的取值,确定所述码流中开启和/或关闭POC对齐操作的图像;
    控制模块,设置为对开启POC对齐操作的图像执行POC对齐操作。
  23. 一种使用码流的通信系统,包括:
    源设备,包括权利要求10或11所述的装置;以及
    宿设备,包括权利要求21或22所述的装置。
PCT/CN2014/088677 2013-12-27 2014-10-15 码流的生成和处理方法、装置及系统 WO2015096540A1 (zh)

Priority Applications (6)

Application Number Priority Date Filing Date Title
EP14875097.9A EP3089454A4 (en) 2013-12-27 2014-10-15 METHODS AND DEVICES FOR GENERATING AND PROCESSING BINARY STREAMS AND SYSTEM
US15/107,730 US10638141B2 (en) 2013-12-27 2014-10-15 Bitstream generation method and apparatus, bitstream processing method and apparatus, and system
EP20172525.6A EP3713242A1 (en) 2013-12-27 2014-10-15 Bitstream generation method and apparatus, bitstream processing method and apparatus, and system
JP2016542931A JP6285034B2 (ja) 2013-12-27 2014-10-15 コードストリームの生成と処理方法、装置及びシステム
BR112016015000A BR112016015000A2 (pt) 2013-12-27 2014-10-15 Métodos, dispositivos e sistema de geração e processamento de fluxo de bits
KR1020167019738A KR101882596B1 (ko) 2013-12-27 2014-10-15 비트스트림의 생성과 처리 방법, 장치 및 시스템

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201310739934.2 2013-12-27
CN201310739934.2A CN104754358B (zh) 2013-12-27 2013-12-27 码流的生成和处理方法、装置及系统

Publications (1)

Publication Number Publication Date
WO2015096540A1 true WO2015096540A1 (zh) 2015-07-02

Family

ID=53477493

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2014/088677 WO2015096540A1 (zh) 2013-12-27 2014-10-15 码流的生成和处理方法、装置及系统

Country Status (7)

Country Link
US (1) US10638141B2 (zh)
EP (2) EP3089454A4 (zh)
JP (1) JP6285034B2 (zh)
KR (1) KR101882596B1 (zh)
CN (1) CN104754358B (zh)
BR (1) BR112016015000A2 (zh)
WO (1) WO2015096540A1 (zh)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9826232B2 (en) 2014-01-08 2017-11-21 Qualcomm Incorporated Support of non-HEVC base layer in HEVC multi-layer extensions
US10136152B2 (en) * 2014-03-24 2018-11-20 Qualcomm Incorporated Use of specific HEVC SEI messages for multi-layer video codecs
US9807419B2 (en) * 2014-06-25 2017-10-31 Qualcomm Incorporated Recovery point SEI message in multi-layer video codecs
US9729887B2 (en) 2014-06-25 2017-08-08 Qualcomm Incorporated Multi-layer video coding
CN106303673B (zh) 2015-06-04 2021-01-22 中兴通讯股份有限公司 码流对齐、同步处理方法及发送、接收终端和通信系统
BR112018073203B1 (pt) 2016-05-13 2020-10-13 Telefonaktiebolaget Lm Ericsson (Publ) método para operar um equipamento de usuário e equipamento de usuário
CN109889996B (zh) * 2019-03-07 2021-04-20 南京文卓星辉科技有限公司 避免tts与语音业务冲突的方法、公网对讲系统及介质
CN112423108B (zh) * 2019-08-20 2023-06-30 中兴通讯股份有限公司 码流的处理方法、装置、第一终端、第二终端及存储介质
US11356698B2 (en) 2019-12-30 2022-06-07 Tencent America LLC Method for parameter set reference constraints in coded video stream
EP4297418A1 (en) * 2022-06-24 2023-12-27 Beijing Xiaomi Mobile Software Co., Ltd. Signaling encapsulated data representing primary video sequence and associated auxiliary video sequence

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101170697A (zh) * 2006-10-24 2008-04-30 华为技术有限公司 多视图象编解码方法及编解码器
CN101299649A (zh) * 2008-06-19 2008-11-05 中兴通讯股份有限公司 基于通用成帧规程的多业务混合汇聚方法和装置
CN102685469A (zh) * 2012-05-04 2012-09-19 北京航空航天大学 一种基于mpeg-2 aac及h.264音视频传输码流的组帧方法
CN103379320A (zh) * 2012-04-16 2013-10-30 华为技术有限公司 视频图像码流处理方法和设备
CN103379333A (zh) * 2012-04-25 2013-10-30 浙江大学 编解码方法、视频序列码流的编解码方法及其对应的装置

Family Cites Families (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050201471A1 (en) * 2004-02-13 2005-09-15 Nokia Corporation Picture decoding method
JP4935746B2 (ja) * 2008-04-07 2012-05-23 富士通株式会社 動画像符号化装置、動画像復号化装置及びその符号化、復号化方法
US9264717B2 (en) * 2011-10-31 2016-02-16 Qualcomm Incorporated Random access with advanced decoded picture buffer (DPB) management in video coding
US9451252B2 (en) * 2012-01-14 2016-09-20 Qualcomm Incorporated Coding parameter sets and NAL unit headers for video coding
US9532052B2 (en) * 2013-04-08 2016-12-27 Qualcomm Incorporated Cross-layer POC alignment for multi-layer bitstreams that may include non-aligned IRAP pictures
US10104362B2 (en) * 2013-10-08 2018-10-16 Sharp Kabushiki Kaisha Image decoding device, image coding device, and coded data
WO2015052939A1 (en) * 2013-10-10 2015-04-16 Sharp Kabushiki Kaisha Alignment of picture order count
US20150103912A1 (en) * 2013-10-11 2015-04-16 Electronics And Telecommunications Research Institute Method and apparatus for video encoding/decoding based on multi-layer
MY178305A (en) * 2013-10-11 2020-10-07 Vid Scale Inc High level syntax for hevc extensions
EP3058733B1 (en) * 2013-10-14 2017-01-04 Telefonaktiebolaget LM Ericsson (publ) Picture order count alignment in scalble video
US9628820B2 (en) * 2013-11-19 2017-04-18 Qualcomm Incorporated POC value design for multi-layer video coding
US9654774B2 (en) * 2013-12-12 2017-05-16 Qualcomm Incorporated POC value design for multi-layer video coding
CN104754347B (zh) * 2013-12-26 2019-05-17 中兴通讯股份有限公司 视频图像序号的编码、解码方法及装置、电子设备

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101170697A (zh) * 2006-10-24 2008-04-30 华为技术有限公司 多视图象编解码方法及编解码器
CN101299649A (zh) * 2008-06-19 2008-11-05 中兴通讯股份有限公司 基于通用成帧规程的多业务混合汇聚方法和装置
CN103379320A (zh) * 2012-04-16 2013-10-30 华为技术有限公司 视频图像码流处理方法和设备
CN103379333A (zh) * 2012-04-25 2013-10-30 浙江大学 编解码方法、视频序列码流的编解码方法及其对应的装置
CN102685469A (zh) * 2012-05-04 2012-09-19 北京航空航天大学 一种基于mpeg-2 aac及h.264音视频传输码流的组帧方法

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP3089454A4 *

Also Published As

Publication number Publication date
JP2017507522A (ja) 2017-03-16
KR20160102490A (ko) 2016-08-30
JP6285034B2 (ja) 2018-02-28
EP3089454A1 (en) 2016-11-02
KR101882596B1 (ko) 2018-08-24
BR112016015000A2 (pt) 2017-08-08
CN104754358A (zh) 2015-07-01
US10638141B2 (en) 2020-04-28
CN104754358B (zh) 2019-02-19
EP3089454A4 (en) 2016-11-02
EP3713242A1 (en) 2020-09-23
US20160323590A1 (en) 2016-11-03

Similar Documents

Publication Publication Date Title
WO2015096540A1 (zh) 码流的生成和处理方法、装置及系统
US10708608B2 (en) Layer based HRD buffer management for scalable HEVC
US10397595B2 (en) Provision of supplemental processing information
ES2903112T3 (es) Atributos de señalización para datos de vídeo transmitidos por red
US10827170B2 (en) Method and device for coding POC, method and device for decoding POC, and electronic equipment
TW200822758A (en) Scalable video coding and decoding
JP2014197848A (ja) メディア・コンテナ・ファイル管理
US20170150160A1 (en) Bitstream partitions operation
US20150264099A1 (en) Systems and methods for constraining a bitstream
KR20090099547A (ko) 멀티뷰 코딩 비디오에서 비디오 에러 정정을 위한 방법 및 장치
KR101584111B1 (ko) 클라우드 컴퓨팅을 이용한 멀티미디어 서비스 품질 향상 방법 및 이를 위한 기기
WO2015136945A1 (en) Systems and methods for constraining a bitstream

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 14875097; Country of ref document: EP; Kind code of ref document: A1)
ENP Entry into the national phase (Ref document number: 2016542931; Country of ref document: JP; Kind code of ref document: A)
NENP Non-entry into the national phase (Ref country code: DE)
WWE Wipo information: entry into national phase (Ref document number: 15107730; Country of ref document: US)
REEP Request for entry into the european phase (Ref document number: 2014875097; Country of ref document: EP)
WWE Wipo information: entry into national phase (Ref document number: 2014875097; Country of ref document: EP)
REG Reference to national code (Ref country code: BR; Ref legal event code: B01A; Ref document number: 112016015000; Country of ref document: BR)
ENP Entry into the national phase (Ref document number: 20167019738; Country of ref document: KR; Kind code of ref document: A)
ENP Entry into the national phase (Ref document number: 112016015000; Country of ref document: BR; Kind code of ref document: A2; Effective date: 20160624)