WO2014047943A1 - Video encoding and decoding method, apparatus and system - Google Patents

Video encoding and decoding method, apparatus and system

Info

Publication number
WO2014047943A1
Authority
WO
WIPO (PCT)
Prior art keywords
independently decodable
area
picture
video
tile
Prior art date
Application number
PCT/CN2012/082494
Other languages
English (en)
French (fr)
Inventor
杨晓峰
张园园
石腾
Original Assignee
Huawei Technologies Co., Ltd. (华为技术有限公司)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority to KR1020157010905A priority Critical patent/KR101661436B1/ko
Priority to CN201810107420.8A priority patent/CN108419076B/zh
Priority to PCT/CN2012/082494 priority patent/WO2014047943A1/zh
Priority to CN201810107231.0A priority patent/CN108429917B/zh
Priority to JP2015533407A priority patent/JP6074509B2/ja
Priority to CN201280001898.3A priority patent/CN103907350B/zh
Application filed by Huawei Technologies Co., Ltd.
Priority to EP12885466.8A priority patent/EP2887663B1/en
Priority to AU2012391251A priority patent/AU2012391251B2/en
Publication of WO2014047943A1 publication Critical patent/WO2014047943A1/zh
Priority to US14/631,658 priority patent/US11089319B2/en
Priority to US14/631,645 priority patent/US20150172692A1/en
Priority to US17/375,936 priority patent/US11533501B2/en

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/46Embedding additional information in the video signal during the compression process
    • H04N19/463Embedding additional information in the video signal during the compression process by compressing encoding parameters before transmission
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/119Adaptive subdivision aspects, e.g. subdivision of a picture into rectangular or non-rectangular coding blocks
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/162User input
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/167Position within a video image, e.g. region of interest [ROI]
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/172Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/176Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/597Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/70Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/157Assigned coding mode, i.e. the coding mode being predefined or preselected to be further used for selection of another element or parameter
    • H04N19/16Assigned coding mode, i.e. the coding mode being predefined or preselected to be further used for selection of another element or parameter for a given display mode, e.g. for interlaced or progressive display mode

Definitions

  • the present invention relates to image processing techniques, and more particularly to a video encoding and decoding method, apparatus and system.
  • 3DTV Three-Dimensional Television
  • 2DTV Two-Dimensional Television
  • Frame-packing 3DTV technology combines the left and right views into one frame, and then encodes and transmits the frame with a 2D encoder and transmission equipment; a message describing the packing of the two views, or a message directly indicating the position information of the two views within the frame, is added to the encoded code stream, and after decoding the two views are output according to this message.
  • Figure 1 exemplarily shows two stitching types: side-by-side (left-right) and top-bottom.
  • The flip type indicates whether the arrangement order of the left and right views, or of the top and bottom views, is reversed.
  • Figure 1 shows pictures of different stitching types and flip types.
  • The various decoders available today consist of two main components: a decoding module and a local memory.
  • The local memory stores encoded pictures before decoding as well as decoded pictures; the decoded pictures are kept either as reference frames for decoding subsequent pictures or because they have not yet reached their output time.
  • The decoder needs to allocate sufficient storage resources for the local memory, and the decoding module consumes the decoder's computing resources.
  • The video to be transmitted is encoded to form a code stream, and profile and level information is carried in each code stream.
  • Profile indicates which encoding tools the encoder uses in video encoding (for example, in the main profile the pixel bit depth can only be 8 bits, the picture parameter set identifier (PPS id) cannot exceed 63, and tile encoding is not enabled, whereas the high profile has no such restrictions); a decoder that does not support one of the encoding tools used cannot decode the stream.
  • Level represents the computing power and storage resources the decoder requires for decoding. For example, the current HEVC draft defines level 4 and level 4.1.
  • Decoders that meet these two levels can achieve 32 frames/s and 64 frames/s respectively when decoding a high-definition stream with a resolution of 1920x1080, while a decoder that meets only a level below level 4 cannot decode a 1920x1080 high-definition stream.
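As a toy illustration of the level constraint just described, the check a decoder might perform can be sketched as follows. The level names and the per-level sample budgets are illustrative assumptions, not values taken from the HEVC specification:

```python
# Hypothetical per-level throughput budgets, in luma samples per second.
# These numbers are illustrative only; real limits come from the codec spec.
LEVEL_MAX_SAMPLES_PER_SEC = {
    "4": 1920 * 1080 * 32,    # covers 1080p at up to 32 frames/s
    "4.1": 1920 * 1080 * 64,  # covers 1080p at up to 64 frames/s
}

def can_decode(decoder_level, width, height, fps):
    """Return True if the decoder's level budget covers the stream's throughput."""
    required = width * height * fps
    return required <= LEVEL_MAX_SAMPLES_PER_SEC[decoder_level]
```

With these assumed budgets, a level 4 decoder handles 1920x1080 at 32 frames/s but not at 64 frames/s, matching the behaviour described above.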
  • When the decoder receives a 3D video code stream encoded with frame-packing 3DTV technology and is connected to a 2D display device, the decoder decodes the picture, takes out only one of the two views, and outputs it to the 2D display device (see Figure 2).
  • Because the profile and level requirements for encoding or decoding the 3D video stream are higher than those for an equivalent 2D video stream, a decoder of a higher level is required to decode the 3D video stream before output to the 2D display device.
  • As a result, the computing and storage resources of the decoder are wasted.
  • A video encoding method, the video consisting of a sequence of pictures, the method comprising:
  • A video decoding method, comprising:
  • receiving a video code stream, where the video code stream includes a video to be decoded and an auxiliary message, and the video to be decoded is composed of a sequence of pictures to be decoded;
  • obtaining an independently decodable region location identifier of the independently decodable region of the to-be-decoded picture, where the independently decodable region location identifier is composed of one or more tile identifiers (tile id); and
  • A video encoder, the video consisting of a sequence of pictures, including:
  • An independently decodable view confirmation unit configured to determine an independently decodable view in the picture to be encoded according to the configuration file corresponding to the video;
  • a block dividing unit configured to divide the picture into at least two tiles, where an area corresponding to one or more tiles covering the independently decodable view is an independently decodable area;
  • An auxiliary message generating unit configured to generate an auxiliary message corresponding to the picture, where the auxiliary message includes the independently decodable area location identifier, and the independently decodable area location identifier is composed of one or more tile identifiers (tile id); and
  • An encoding unit is configured to encode all the tiles included in the picture, so as to form an encoded video code stream, where the encoded video code stream includes the auxiliary message.
  • The encoding unit further includes: a determining unit, configured to determine whether the current tile to be encoded is a tile in the independently decodable region; if yes, the independently decodable region of the encoded picture is set as the inter-frame reference candidate region of the current tile; if not, the entire picture region of the encoded picture is set as the inter-frame reference candidate region of the current tile.
  • When the inter-frame algorithm is used, the optimal reference region is selected from the inter-frame reference candidate region corresponding to the tile to be encoded.
  • a receiving unit configured to receive a video code stream, where the video code stream includes a video to be decoded and an auxiliary message, where the video to be decoded is composed of a sequence of pictures to be decoded;
  • A decoding unit configured to: obtain a picture to be decoded; obtain, according to the auxiliary message, an independently decodable region location identifier of the independently decodable region of the picture to be decoded, where the independently decodable region location identifier is composed of one or more tile identifiers (tile id); and obtain the independently decodable region of the to-be-decoded picture according to the independently decodable region location identifier, and decode the independently decodable region.
  • One or more memories; and
  • One or more programs, wherein the one or more programs are stored in the one or more memories and are for execution by the one or more processors, the one or more programs including:
  • An instruction configured to determine an independently decodable view in the picture to be encoded according to the configuration file corresponding to the video
  • An instruction configured to divide the picture into at least two tiles, where an area corresponding to one or more tiles covering the independently decodable view is an independently decodable area;
  • An instruction configured to generate an auxiliary message corresponding to the picture, where the auxiliary message includes the independently decodable area location identifier, and the independently decodable area location identifier is composed of one or more tile identifiers (tile id); and
  • An instruction configured to encode all the tiles included in the picture so as to form an encoded video code stream, where the auxiliary message is included in the encoded video code stream.
  • The auxiliary message further comprises one of: an independently decodable region identifier, cropping information for decoding the independently decodable region, profile information for decoding the independently decodable region, and level information for decoding the independently decodable region.
  • a decoder comprising:
  • One or more processors;
  • One or more memories; and
  • One or more programs, wherein the one or more programs are stored in the one or more memories and are for execution by the one or more processors, the one or more programs including:
  • An instruction configured to receive a video code stream, where the video code stream includes a video to be decoded and an auxiliary message, where the video to be decoded is composed of a sequence of pictures to be decoded;
  • An instruction configured to obtain a picture to be decoded;
  • An instruction configured to obtain, according to the auxiliary message, an independently decodable region location identifier of the independently decodable region of the to-be-decoded picture, where the independently decodable region location identifier is composed of one or more tile identifiers (tile id);
  • An encoder, disposed in a source device for processing video and used to encode a video, the video consisting of a sequence of pictures, including: one or more circuits configured to determine an independently decodable view in the picture to be encoded according to the configuration file corresponding to the video; divide the picture into at least two tiles, where the area corresponding to the one or more tiles covering the independently decodable view is an independently decodable area; generate an auxiliary message corresponding to the picture, where the auxiliary message includes the independently decodable area location identifier, the identifier being composed of one or more tile identifiers (tile id); and encode all the tiles included in the picture so as to form an encoded video code stream, where the auxiliary message is included in the encoded video code stream.
  • A decoder, disposed in a receiving device for processing video, comprising: one or more circuits configured to receive a video code stream, where the video code stream includes a video to be decoded and an auxiliary message, the video to be decoded being composed of a sequence of pictures to be decoded; obtain a picture to be decoded; obtain, according to the auxiliary message, an independently decodable region location identifier of the independently decodable region of the to-be-decoded picture, the identifier being composed of one or more tile identifiers (tile id); obtain the independently decodable region of the picture to be decoded according to the identifier; and decode the independently decodable region.
  • A computer readable storage medium storing a number of instructions that, when executed by a device, trigger the device to perform the following operations: determining an independently decodable view in the picture to be encoded according to the configuration file corresponding to the video; dividing the picture into at least two tiles; generating an auxiliary message corresponding to the picture, where the auxiliary message includes the independently decodable area location identifier, and the independently decodable area location identifier is composed of one or more tile identifiers (tile id); and encoding all tiles included in the picture such that an encoded video code stream is formed, the auxiliary message being included in the encoded video code stream.
  • A computer readable storage medium storing a number of instructions that, when executed by a device, trigger the device to perform the following operations: receiving a video code stream, where the video code stream includes a video to be decoded and an auxiliary message, and the video to be decoded is composed of a sequence of pictures to be decoded; obtaining an independently decodable region location identifier of the independently decodable region of the to-be-decoded picture, where the identifier is composed of one or more tile identifiers (tile id); and obtaining and decoding the independently decodable region according to the identifier.
  • the auxiliary message may further include an independently decodable area identifier, where the independently decodable area identifier is used to identify whether the picture includes an independently decodable area.
  • The auxiliary message further includes cropping information for decoding the independently decodable region, the cropping information being the abscissas or ordinates of the top, bottom, left, and right boundaries of the independently decodable view relative to the independently decodable region.
  • the auxiliary message further includes profile information for decoding the independently decodable region, the profile information being used to identify a set of coding tools in the independently decodable region.
  • The auxiliary message further includes level information for decoding the independently decodable region, where the level information identifies the level that the decoder needs to meet, and the level information is calculated according to the ratio of the independently decodable area to the whole picture.
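The idea of deriving the sub-region's level from its share of the picture area can be sketched as follows. This is a hypothetical helper: the level table and the linear scaling by area ratio are assumptions for illustration, not the patent's exact formula:

```python
def level_for_region(full_samples_per_sec, region_area, picture_area, level_limits):
    """Pick the lowest level whose throughput budget covers the region's share.

    The required throughput is scaled by the ratio of the independently
    decodable area to the whole picture, as described in the text above.
    """
    required = full_samples_per_sec * region_area / picture_area
    for name, limit in sorted(level_limits.items(), key=lambda kv: kv[1]):
        if required <= limit:
            return name
    return None  # no listed level is sufficient

# Illustrative limits (luma samples/s); real values would come from the spec.
limits = {"4": 1920 * 1080 * 32, "4.1": 1920 * 1080 * 64}
full = 1920 * 1080 * 64          # the full stream needs level 4.1
half_region = level_for_region(full, 1920 * 540, 1920 * 1080, limits)
```

Under these assumptions, a region covering half the picture (e.g. one view of a side-by-side frame) only needs level 4, while the full stream needs level 4.1.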
  • The step of encoding all the tiles included in the picture to be encoded further includes: determining whether the current tile to be encoded is a tile in the independently decodable region; if yes, setting the independently decodable region of the encoded picture as the inter-frame reference candidate region of the current tile; if not, setting the entire picture region of the encoded picture as the inter-frame reference candidate region of the current tile; and, when the inter-frame algorithm is used, selecting the optimal reference region from the inter-frame reference candidate region corresponding to the tile to be encoded.
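The reference-region rule above can be sketched as follows. The data structures are hypothetical: a reference region is represented simply as a set of tile ids in the previously encoded picture:

```python
def inter_reference_candidates(tile_id, decodable_tile_ids, all_tile_ids):
    """Choose the inter-frame reference candidate region for one tile.

    A tile inside the independently decodable region may only reference that
    region in previously encoded pictures, so its region stays decodable on
    its own; any other tile may reference the entire picture.
    """
    if tile_id in decodable_tile_ids:
        return set(decodable_tile_ids)
    return set(all_tile_ids)

# Picture divided into 1 row x 3 columns; tile 1 forms the decodable region.
all_tiles = {0, 1, 2}
region = {1}
```

With this division, tile 1 may only reference tile 1 of earlier pictures, while tiles 0 and 2 may reference the whole picture.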
  • The picture sequence includes pictures of different stitching types and flip types; the configuration file stores the stitching type and flip type of each frame in the picture sequence, and the independently decodable view for pictures of different stitching types and flip types.
  • The auxiliary message further includes independently decodable area location identifiers corresponding to pictures of the different stitching types and flip types.
  • The auxiliary message further includes cropping information for decoding the independently decodable area, corresponding to pictures of the different stitching types and flip types.
  • The auxiliary message further includes profile information for decoding the independently decodable area, corresponding to pictures of the different stitching types and flip types.
  • The auxiliary message further includes level information for decoding the independently decodable area, corresponding to pictures of the different stitching types and flip types.
  • The auxiliary message is optionally carried by a supplemental enhancement information (SEI) message.
  • auxiliary message is optionally carried by a Sequence Parameter Set (SPS).
  • A video code stream, comprising: a video to be decoded and an auxiliary message, wherein the video to be decoded is composed of a sequence of pictures to be decoded, and the auxiliary message includes an independently decodable area location identifier indicating the independently decodable area of the sequence of pictures, the independently decodable area location identifier consisting of one or more tile identifiers (tile id).
  • An auxiliary message is added to the coded stream, and the profile and level information in the auxiliary message applies only to the sub-code stream formed by the independently decodable region, which reduces the performance requirement on the decoder.
  • According to the auxiliary message, the decoder can decode only the independently decodable regions in the picture; decoding only these regions reduces the performance requirement on the decoder and saves its computing and storage resources.
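A minimal sketch of this decoder-side behaviour, assuming the coded picture is available as a mapping from tile id to tile payload, with a stand-in `decode_tile` routine (both names are hypothetical, not from the patent):

```python
def decode_tile(payload):
    """Stand-in for a real tile decoder; here it just echoes the payload."""
    return f"decoded:{payload}"

def decode_independent_region(picture_tiles, aux_message):
    """Decode only the tiles named by the auxiliary message's location identifier."""
    wanted = set(aux_message["tile_ids"])
    return {tid: decode_tile(data)
            for tid, data in picture_tiles.items() if tid in wanted}

tiles = {0: "left-view", 1: "right-view"}
aux = {"tile_ids": [0]}          # only the left view is independently decodable
out = decode_independent_region(tiles, aux)
```

Tiles outside the identified region are skipped entirely, which is where the computing and storage savings come from.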
  • Because the profile and level requirements of the corresponding independently decodable region sub-stream are always lower than those of the full stream, the performance and storage requirements on the decoder are reduced, so decoding time and power consumption can be saved after the decoder is initialized, and the decoder's storage requirements are reduced. If a decoder does not meet the profile and level requirements of the original video code stream but does meet those of the corresponding independently decodable region sub-stream, support for 3D video streams with high resolution or high bit-rate requirements on compatible display devices is improved.

DRAWINGS
  • FIG. 1 is a schematic diagram of a view splicing type of a video code stream using frame-packaging 3DTV technology in the prior art
  • FIG. 2 is a schematic diagram of a process of outputting a video code stream using a frame-packaged 3DTV technology to a 2D display device in the prior art
  • FIG. 3 is a schematic structural diagram of a picture to be encoded in the prior art
  • FIG. 4 is a system architecture diagram of video encoding and decoding according to an embodiment of the present invention.
  • FIG. 5 is a schematic structural diagram of a hardware of an encoder according to an embodiment of the present disclosure.
  • FIG. 6 is a hardware structural diagram of a decoder according to an embodiment of the present invention.
  • FIG. 7 is a functional block diagram of an encoder according to an embodiment of the present invention.
  • FIG. 8 is a functional block diagram of a decoder according to an embodiment of the present invention.
  • FIG. 9 is a schematic structural diagram of a hardware of an encoder according to an embodiment of the present disclosure.
  • FIG. 10 is a flowchart of a video encoding method according to an embodiment of the present invention.
  • FIG. 11 is a flowchart of a method for specifically encoding a frame of pictures in the method flow shown in FIG. 10;
  • FIG. 12 is a flowchart of a video decoding method according to an embodiment of the present invention.
  • FIG. 13 is a flowchart of still another video encoding method according to an embodiment of the present invention
  • 14 is a flowchart of a method for specifically encoding a frame of a picture in the method flow shown in FIG. 13
  • FIG. 15 is a flowchart of still another decoding method according to an embodiment of the present invention
  • 16 is a flowchart of still another video encoding method according to an embodiment of the present invention.
  • FIG. 17 is a flowchart of a method for specifically encoding a frame of a picture in the method flow shown in FIG. 16.
  • FIG. 18 is a flowchart of still another decoding method according to an embodiment of the present invention.

DETAILED DESCRIPTION
  • the largest coding unit (LCU): The smallest picture division unit in high efficiency video coding (HEVC) technology, such as the small square in Figure 3.
  • the LCU can be a 64 x 64 pixel square.
  • Before encoding a frame of picture, the HEVC encoder divides the picture into a grid of LCUs.
  • Coding unit (CU): the encoder determines the optimal coding-unit division according to the richness of the picture's texture details.
  • An LCU can be divided into one or more coding units, and encoding and decoding are performed separately on each CU.
  • Tile: a higher-level partitioning of a picture, dividing the picture into m rows and n columns; each resulting region is called a tile. As shown in Figure 3, the picture is divided into 1 row and 3 columns. The rows and columns of tiles are aligned to whole LCUs, that is, one LCU cannot belong to two tiles at the same time. After the tile division is determined, the encoder assigns tile identifiers (tile id) in order from left to right and then from top to bottom. The size of a tile is generally determined according to the configuration file corresponding to the picture.
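The raster-order tile id assignment described above can be sketched as:

```python
def assign_tile_ids(rows, cols):
    """Assign tile ids left to right, then top to bottom (raster order)."""
    return {(r, c): r * cols + c for r in range(rows) for c in range(cols)}

# The 1-row, 3-column division of Figure 3 yields ids 0, 1, 2 across the row.
ids = assign_tile_ids(1, 3)
```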
  • the configuration file generally stores the input parameters that need to be determined in the encoding process of the encoder, such as coding tools, coding restrictions, and pictures to be encoded. Attributes and more.
  • Independent tile: a type of tile. In intra prediction, CUs in an independent tile cannot reference CUs in other tiles.
  • Dependent tile A type of tile. In intra prediction, the CU in the dependent tile can refer to the CU in the independent tile.
  • The HEVC encoder encodes with the LCU as the smallest unit. If the width and height of a picture are not integer multiples of the LCU size, the picture needs to be padded before encoding.
  • In Figure 3, the shaded LCUs are the picture area and the blank LCUs are the padded part; this padding is called LCU alignment.
  • After decoding, the previously padded LCU portion needs to be cropped before output, which is called cropping.
  • Each LCU is assigned an address in order from left to right and then from top to bottom, called the LCU address. From the tile division, the tile to which any LCU address belongs can be calculated; that is, a lookup table from LCU address to tile id can be established.
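A sketch of building that lookup table, assuming the tile grid is described by its column widths and row heights in LCUs (a hypothetical representation; the patent does not prescribe one):

```python
import bisect

def lcu_to_tile_table(col_widths, row_heights):
    """Map every LCU address (raster order) to the tile id that contains it."""
    # Boundaries where a new tile column / tile row starts, in LCUs.
    col_starts = [sum(col_widths[:i]) for i in range(1, len(col_widths))]
    row_starts = [sum(row_heights[:i]) for i in range(1, len(row_heights))]
    pic_w, pic_h = sum(col_widths), sum(row_heights)
    table = {}
    addr = 0
    for y in range(pic_h):
        tile_row = bisect.bisect_right(row_starts, y)
        for x in range(pic_w):
            tile_col = bisect.bisect_right(col_starts, x)
            table[addr] = tile_row * len(col_widths) + tile_col
            addr += 1
    return table

# 1 row x 3 columns of tiles, each 2 LCUs wide; picture 1 LCU high.
table = lcu_to_tile_table([2, 2, 2], [1])
```

For this division, LCU addresses 0-1 map to tile 0, 2-3 to tile 1, and 4-5 to tile 2.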
  • a video to be encoded can be regarded as a sequence of pictures, which is encoded to form a video code stream, and the video code stream includes a sequence of encoded pictures and a parameter set required for decoding the picture.
  • An access unit (AU) includes a frame of coded pictures and a set of parameters required for decoding pictures, or an AU contains only one frame of coded pictures.
  • A 1-bit identifier tile_splittable_flag is defined in the video usability information (VUI) parameter structure, indicating that the tiles in the code stream satisfy the following characteristics:
  • 1. The tile division remains the same for each picture in the sequence of pictures;
  • 2. Prediction reference can be made between tiles with the same tile id in different frames of the sequence;
  • 3. Each tile is loop-filtered separately.
  • a complete picture is reconstructed by decoding several CUs.
  • the CU in the picture may be predicted by different parts of different reference frames.
  • the picture obtained by this prediction decoding may be the same as the original picture.
  • The source device 100 is a network-side video head-end device and includes a video memory 101 for storing video (i.e., a sequence of pictures) before and after encoding, an encoder 102 for encoding the sequence of pictures, and a transmitter 103 for sending the encoded code stream to another device.
  • Source device 100 may also include a video capture device, such as a camera, to capture video and store the captured video in video memory 101, and may also include other components such as: intra encoder components, various filters, and the like.
  • Video memory 101 typically includes a large amount of storage space.
  • video memory 101 can include dynamic random access memory (DRAM) or FLASH memory.
  • video memory 101 can include non-volatile memory or any other data storage device.
  • Encoder 102 may be part of a device that performs video encoding.
  • the encoder can include a chipset for video encoding and decoding, comprising some combination of hardware, software, firmware, a processor, or a digital signal processor (DSP).
  • the transmitter 103 modulates the video code stream and sends it to the receiving end through a wired or wireless network.
  • the receiving device 200 is a user-side terminal device, and includes a receiver 203 for receiving the encoded video code stream, a decoder 202 for decoding the video code stream, and a display device 201, for example an LED television, for outputting the decoded video to the end user.
  • Receiving device 200 may also include other components such as: a modem, a signal amplifier, a memory, and the like.
  • Decoder 202 may be part of a device that performs video decoding.
  • the decoder may include a chipset for video encoding and decoding, comprising some combination of hardware, software, firmware, a processor, or a digital signal processor (DSP).
  • the display device 201 can be a 2D display device, or a display device compatible with both 2D and 3D, such as a monitor, television, or projector.
  • the encoder 102 includes a buffer 1021 and a processor 1022.
  • the buffer 1021 provides a smaller but faster storage space than the video memory 101.
  • buffer 1021 can include synchronous random access memory (SRAM).
  • the buffer 1021 may include an "on-chip" memory integrated with other components of the encoder 102 to provide very fast data access during processor-intensive encoding by the processor 1022.
  • the sequence of pictures to be encoded may be loaded from video memory 101 to buffer 1021 in sequence.
  • the buffer 1021 is also used to store a configuration file of the video to be encoded, software programs for executing specific encoding algorithms, and the like. In some cases, the buffer 1021 also stores pictures that have been encoded but have not yet reached their transmission time, or that provide a reference for encoding the next frame. In other embodiments, any memory device with a storage function can serve as the buffer 1021.
  • the processor 1022 obtains the picture to be encoded from the buffer 1021 and encodes the picture until the picture sequence contained in the video is encoded.
  • the decoder 202 includes a buffer 2021 and a processor 2022.
  • the buffer 2021 is a smaller and faster storage space.
  • buffer 2021 can include synchronous random access memory (SRAM).
  • the buffer 2021 may include an "on-chip" memory integrated with other components of the decoder 202 to provide very fast data access during processor-intensive decoding by the processor 2022.
  • the sequence of pictures to be decoded may be loaded into a buffer 2021.
  • the buffer 2021 also stores software programs or the like for executing a particular decoding algorithm.
  • the buffer 2021 can also be used to store pictures that have been decoded but not yet displayed, or decoded pictures needed as reference frames for subsequent pictures.
  • any memory device with a storage function can serve as the buffer 2021.
  • the processor 2022 obtains the picture to be decoded from the buffer 2021 and decodes the picture until the picture sequence contained in the video code stream is decoded.
  • an auxiliary message is added to the encoded code stream, for example a Supplemental Enhancement Information (SEI) message or a Sequence Parameter Set (SPS), to help the decoder 202 decode.
  • an independently decodable area identifier may be added to the auxiliary message for identifying that there is an independently decodable area in the encoded code stream.
  • the view portion of each frame of the 3D video stream that is finally displayed by a 2D display device is an independently decodable view; in the corresponding tile partition, the area covering the independently decodable view is an independently decodable region.
  • the presence of the auxiliary message itself can also serve as the independently decodable identifier: if the encoded video stream contains the auxiliary message, the code stream is considered to contain an independently decodable area; otherwise, it does not.
  • the decoder 202 takes out the independently decodable area of each picture in the encoded video stream (i.e., the picture sequence) and then performs normal decoding.
  • the encoder 102 of the embodiment of the present invention needs to ensure the following characteristics of the independently decodable region:
  • the position and size of the independently decodable region remain unchanged among all frames of the same splicing type and flip type;
  • the CUs in the independently decodable region select inter-prediction reference picture blocks only from the independently decodable regions of pictures of the same splicing type and flip type;
  • loop filtering is performed independently within the independently decodable region.
  • the auxiliary message may further include the following information: an independently decodable area location identifier, cropping information for output display of the independently decodable area, and the profile and level information of the sub-picture sequence (i.e., sub-stream) composed of the sub-pictures in the independently decodable area.
  • the buffer 1021 of the encoder 102 shown in FIG. 5 can be implemented by the storage unit 1023 shown in FIG. 7; further, the processor 1022 can be implemented by the independently decodable view confirmation unit 1024, the block division unit 1025, the auxiliary message generation unit 1026, and the execution encoding unit 1027 shown in FIG. 7.
  • the encoder 102 shown in FIG. 5 can perform the encoding methods shown in FIGS. 10, 11, 13, 14, 16, and 17, as alternative embodiments.
  • the buffer 2021 of the decoder 202 shown in FIG. 6 can be implemented by the storage unit 2023 shown in FIG. 8; further, the processor 2022 can be implemented by the receiving unit 2024 and the execution decoding unit 2025 shown in FIG. 8.
  • the decoder 202 shown in FIG. 6 can perform the decoding methods shown in FIGS. 12, 15, and 18.
  • FIG. 7 is a schematic structural diagram of a functional module of an encoder according to an embodiment of the present invention.
  • the encoder 102 includes: a storage unit 1023, an independently decodable view confirmation unit 1024, a block division unit 1025, an auxiliary message generation unit 1026, and an execution coding unit 1027.
  • a storage unit 1023, configured to store a configuration file of the video;
  • an independently decodable view confirmation unit 1024, configured to determine the independently decodable view in the picture to be encoded according to the configuration file corresponding to the video;
  • a block division unit 1025, configured to divide the picture into at least two tiles, where the area corresponding to the one or more tiles covering the independently decodable view is an independently decodable area;
  • an auxiliary message generating unit 1026, configured to generate an auxiliary message corresponding to the picture, where the auxiliary message includes the independently decodable area location identifier, which is composed of one or more tile ids;
  • an execution encoding unit 1027, configured to encode all the tiles included in the picture to form an encoded video code stream, where the encoded video code stream includes the auxiliary message.
  • the execution encoding unit further includes a determining unit, configured to determine whether the current tile to be encoded is a tile in the independently decodable region; if yes, the independently decodable region of an encoded picture is set as the inter-frame reference candidate area of the current tile; otherwise, the entire picture area of the encoded picture is set as the inter-frame reference candidate area of the current tile.
  • the optimal reference area is then selected from the inter-frame reference candidate area corresponding to the tile to be encoded.
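The determining unit's selection rule can be sketched as follows; this is a hypothetical Python illustration, and the data layout and names are assumptions rather than the patent's actual structures:

```python
def reference_candidate_area(tile_id, indep_tile_ids, encoded_picture):
    """Return the inter-frame reference candidate area for the tile being
    encoded: a tile inside the independently decodable region may reference
    only that region in a previously encoded picture; any other tile may
    reference the encoded picture's whole area."""
    # encoded_picture: dict with 'indep_area' and 'full_area' rectangles
    # given as (left, top, right, bottom) in pixels.
    if tile_id in indep_tile_ids:
        return encoded_picture["indep_area"]
    return encoded_picture["full_area"]
```

The motion search for the current tile is then confined to the returned rectangle when selecting the optimal reference area.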
  • the storage unit 1023 is further configured to load the picture to be encoded, and to store pictures for which the execution encoding unit 1027 has completed encoding.
  • the independently decodable view confirming unit 1024 can also perform step S301 and step S302 as shown in FIG. 10 and step S401 shown in FIG. 11; the blocking unit 1025 can also perform step S402 shown in FIG. 11; the auxiliary message generating unit 1026 It is also possible to perform steps S403 and S404 as shown in FIG. 11; and, the execution encoding unit 1027 can also perform steps S405 to S409 as shown in FIG.
  • the storage unit 1023 is further configured to load the picture to be encoded, and to store pictures for which the execution encoding unit 1027 has completed encoding.
  • the independently decodable view confirmation unit 1024 can also perform steps S601 to S603 shown in FIG. 13; the block division unit 1025 can also perform steps S701 and S702 shown in FIG. 14; the auxiliary message generation unit 1026 can also perform step S604 shown in FIG. 13; and the execution encoding unit 1027 can also perform steps S703 to S707 shown in FIG. 14.
  • the storage unit 1023 is further configured to load the picture to be encoded, and to store pictures for which the execution encoding unit 1027 has completed encoding.
  • the independently decodable view confirmation unit 1024 can also perform steps S901 and S902 shown in FIG. 16; the block division unit 1025 can also perform steps S903 and S904 shown in FIG. 16; the auxiliary message generation unit 1026 can also perform step S905 shown in FIG. 16; and the execution encoding unit 1027 can also perform steps S1001 to S1006c shown in FIG. 17.
  • FIG. 8 is a schematic diagram showing the structure of a functional module of a decoder according to an embodiment of the present invention.
  • the decoder 202 includes a storage unit 2023, a receiving unit 2024, and an execution decoding unit 2025.
  • the storage unit 2023 is configured to store the picture to be decoded, and to store pictures that have been decoded by the execution decoding unit 2025 but not yet displayed.
  • the receiving unit 2024 is configured to receive a video code stream, where the video code stream includes the video to be decoded and an auxiliary message, and the video to be decoded is composed of a sequence of pictures to be decoded; the execution decoding unit 2025 is configured to obtain the picture to be decoded; obtain, according to the auxiliary message, the independently decodable region location identifier of the independently decodable region of the picture to be decoded, where the identifier is composed of one or more tile ids; obtain the independently decodable region of the picture to be decoded according to the identifier; and decode the independently decodable region.
  • the receiving unit 2024 is further configured to perform step S501 as shown in FIG. 12; and, the execution decoding unit 2025 is further configured to perform steps S502 to S515 as shown in FIG.
  • the receiving unit 2024 is further configured to perform step S801 shown in FIG. 15; and the execution decoding unit 2025 is further configured to perform step S802 and the subsequent steps shown in FIG. 15.
  • the receiving unit 2024 is further configured to perform step S1101 shown in FIG. 18; and the execution decoding unit 2025 is further configured to perform steps S1102 to S1114 shown in FIG. 18.
  • FIG. 9 is a structural diagram of a specific implementation of an encoder according to an embodiment of the present invention.
  • Fn (current) 1001 is the current frame picture to be encoded in the video to be encoded.
  • F'n-1 (reference) 1002 is an already-encoded frame picture in the video to be encoded, which provides an encoding reference for the picture to be encoded.
  • the input frame Fn (current) is processed by intra-frame or inter-frame predictive coding. If inter prediction coding is used, its predicted value PRED (denoted P in the figure) is obtained by motion compensation 1006 (MC) from one or more previously encoded reference pictures, where the reference picture is denoted F'n-1 1002. To improve prediction accuracy and thereby increase the compression ratio, the actual reference image can be selected from frames that have already been encoded, decoded, reconstructed, and filtered.
  • the predicted value P is subtracted from the current block to generate a residual block Dn, which is block-transformed and quantized to produce a set of quantized transform coefficients X; these are then entropy encoded 1014 and, together with side information required for decoding (such as prediction modes, quantization parameters, and motion vectors), form a compressed code stream that is transmitted and stored via the NAL (Network Adaptive Layer).
  • in order to provide reference images for further prediction, the encoder must be able to reconstruct the image. Therefore, the residual image obtained by inverse quantization and inverse transformation of X is added (reconstruction 1003) to the predicted value P to obtain uF'n (the unfiltered frame).
  • a loop filter is provided, and the filtered output F'n (reconstruction 1003), i.e., the reconstructed image, can be used as a reference image.
  • in the embodiment of the present invention, the restriction on the reference range of coding units outside the independently decodable region is removed (refer to the descriptions of FIG. 10 to FIG. 18), so that a more similar reference unit can be selected, improving prediction accuracy and thereby increasing the compression ratio.
  • a video encoding method is provided in the embodiment of the present invention.
  • the auxiliary message is carried in an SEI message.
  • the view portion of each frame of the 3D video stream that is finally displayed by a 2D display device is an independently decodable view; the area of the tile partition covering the independently decodable view is an independently decodable region.
  • the SEI message includes an independently decodable area identifier.
  • the independently decodable region corresponds to one tile; that is, the tile partition is chosen so that the independently decodable view is contained within a single tile.
  • the video to be encoded, that is, the sequence of pictures to be encoded, may include pictures of different splicing types and different flip types.
  • the encoding method shown in FIG. 10 is a process of encoding a video by an encoder.
  • the video is a picture sequence of M frames, where each frame is exemplified by two views, one of which is the portion finally displayed by a 2D display device, i.e., the above-described independently decodable view.
  • the encoding of pictures spliced from more than two views can be derived from this method by simple extension.
  • Step S301 Read a configuration file of the video to be encoded, where the configuration file stores the predetermined input parameters required in the encoder's encoding process, such as encoding tools, encoding restrictions, and attributes of the pictures to be encoded.
  • Step S302 Determine, according to the configuration file, a view in the picture sequence that needs to be independently decoded.
  • an independently decodable view corresponding to each splicing type and flip type picture in the entire picture sequence is preset in the configuration file, for example, a left view or a right view in a frame picture.
  • pictures of the same splicing type and flip type correspond to the same independently decodable view, while pictures of different splicing types or flip types may correspond to different independently decodable views.
  • Step S304 Encoding the picture of the i-th frame, and the specific encoding process will be described in detail in FIG. 11 below.
  • Step S305 The AU corresponding to the current i-th frame picture is output and saved in a local or external storage device, for example the video memory 101 of the source device 100 or the buffer 1021 of the encoder 102, or is directly transmitted to the remote receiving device 200 through the network.
  • FIG. 11 is a schematic flowchart of a method for specifically encoding a frame of picture in step S304 in the video encoding method shown in FIG.
  • Step S401 Obtain an independently decodable view corresponding to the splicing type and the flip type of the current i-th frame picture according to the configuration file.
  • Step S402 Determine a tile partition according to a minimum tile covering an independently decodable view.
  • the independently decodable view is included within the scope of one tile, and the tile's top, bottom, left, and right boundaries must meet the LCU alignment requirement.
  • the area corresponding to the tile is an independently decodable area.
  • the i-th frame picture is divided into the minimum tile covering the independently decodable view and the tile formed by the area outside it, two tiles in total; thus the number of tiles N equals 2.
  • the number of tiles is not limited to two.
  • tile ids are allocated in order from left to right and then from top to bottom, and the id of the minimum tile covering the independently decodable view is denoted s.
  • the tile division is not necessarily based on the minimum tile covering the independently decodable view; it suffices that a tile covers the independently decodable view and that the tile boundaries satisfy the LCU alignment requirement.
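The LCU alignment requirement above can be illustrated with a small sketch that snaps a view rectangle outward to the LCU grid; a 64-pixel LCU size and the function name are assumptions for illustration only:

```python
def min_tile_covering_view(view_rect, lcu=64):
    # view_rect = (left, top, right, bottom) in pixels; returns the smallest
    # LCU-aligned rectangle containing it, with each boundary snapped to
    # the lcu-pixel grid (left/top rounded down, right/bottom rounded up).
    left, top, right, bottom = view_rect
    return (left // lcu * lcu,
            top // lcu * lcu,
            -(-right // lcu) * lcu,      # ceiling division to the next LCU boundary
            -(-bottom // lcu) * lcu)
```

For example, a 960x1080 left view in a 1920x1080 side-by-side frame snaps to a 960x1088 LCU-aligned tile, so the extra 8 rows must later be removed by cropping on output.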
  • Step S403 Determine whether an SEI message needs to be generated. The conditions are: if the current i-th frame picture is the first frame in the video stream to be encoded, or is not the first frame but differs from the previous frame in splicing type or flip type, the SEI message needs to be generated and step S404 is performed; otherwise, i.e., the current i-th frame is not the first frame and has the same splicing type and flip type as the previous frame, step S405 is performed. That is, in this embodiment, consecutive pictures of the same splicing type and flip type correspond to one SEI message; if two frames differ in splicing type or flip type, a new SEI message must be generated.
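The decision rule of step S403 can be sketched as follows (a hypothetical helper, assuming per-frame splicing-type and flip-type lists rather than the patent's actual data structures):

```python
def needs_new_sei(index, splice_types, flip_types):
    # A new SEI message is generated for the first picture, or whenever the
    # splicing type or flip type differs from the previous picture.
    if index == 0:
        return True
    return (splice_types[index] != splice_types[index - 1]
            or flip_types[index] != flip_types[index - 1])
```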
  • Step S404 Create an SEI message, temporarily named INDEC_RGN_SEI, and set each field of the SEI message as defined in the following table.
  • In the table, ue(v) indicates that the length of the field is variable, u(n) indicates that the length of the field is n bits, and u(1) indicates that the length of the field is 1 bit.
  • Independently decodable area location identification information:
  • Tile_id The id of the smallest tile covering the independently decodable view, in this case s.
  • Cropping_enable_flag If the width of the independently decodable view equals the width of tile(s) and the height of the independently decodable view equals the height of tile(s), cropping_enable_flag is set to false; otherwise it is set to true.
  • Pic_crop_left_offset The abscissa, in pixels, of the independently decodable view relative to the left edge of tile(s).
  • Pic_crop_right_offset The abscissa, in pixels, of the independently decodable view relative to the right edge of tile(s).
  • Pic_crop_top_offset The ordinate, in pixels, of the independently decodable view relative to the top edge of tile(s).
  • Pic_crop_bottom_offset The ordinate, in pixels, of the independently decodable view relative to the bottom edge of tile(s).
  • New_profile_flag Indicates whether the profile of the independently decodable area sub-stream is the same as the profile of the entire code stream. A value of 0 means the same; a value of 1 means different.
  • New_level_flag Indicates whether the level of the independently decodable sub-stream is the same as the level of the entire code stream. A value of 0 means the same; a value of 1 means different.
  • Profile_idc The profile id matching the encoding toolset used in the independently decodable region.
  • Level_idc The minimum level id that the decoder needs to satisfy. The code rate and maximum buffer for decoding tile(s) are calculated according to the ratio of the tile(s) area to the entire picture area. For example, if the code rate for decoding the entire picture is x, the maximum buffer is y, and tile(s) occupies a fraction r of the picture area, then the code rate of tile(s) is x*r and the maximum buffer is y*r. Given profile_idc, the code rate x*r, and the maximum buffer y*r, the minimum level matching this decoding performance is found, and level_idc is set to that minimum level.
  • Step S405 If k equals s, that is, the current tile(k) to be encoded is the tile covering the independently decodable view, step S406 is performed; otherwise step S407 is performed.
  • Step S406 Set the tile(s) of encoded pictures having the same splicing type and flip type as the current i-th frame picture as the inter-frame reference candidate area of the current frame's tile(s).
  • Step S407 Set the entire picture area of encoded frame pictures as the inter-frame reference candidate area of tile(k).
  • Step S408 Select the intra prediction or inter prediction algorithm to encode tile(k); the inter prediction algorithm selects the optimal reference region for encoding from the inter-frame reference candidate area obtained in step S406 or S407.
  • the encoding methods provided in FIG. 10 and FIG. 11 above add an auxiliary message, carried in an SEI message, to the encoded code stream; the profile and level information in the SEI message applies only to the sub-stream formed by the independently decodable area, which reduces the performance requirements on the decoder.
  • in steps S406 and S407, different inter-frame reference candidate areas are set for tiles inside and outside the independently decodable region. This ensures that coding blocks inside the region can be decoded independently, while expanding the reference range of coding blocks outside it, making it possible to reference coding blocks more similar to the current block, which improves coding efficiency and reduces the amount of transmitted data.
  • a video decoding method is provided in the embodiment of the present invention.
  • a decoder decodes a video code stream encoded by the process shown in FIG. 10 and FIG. 11, i.e., decodes a picture sequence of length M; the process is as follows:
  • Step S501 Receive a video code stream to be decoded, where the video code stream includes a plurality of AUs, and each AU corresponds to one frame of the encoded picture.
  • Step S502 Obtain an AU from the code stream.
  • Step S503 Determine whether the current AU contains a picture in frame-packing format. The judging conditions are: 1) the current AU contains an FPA (frame packing arrangement) message whose cancel flag is 0; or 2) the current AU contains no FPA message, but the most recent FPA message received in the video stream to be decoded has a cancel flag of 0. If either condition is satisfied, step S504 is performed; otherwise step S515 is performed.
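The two judging conditions of step S503 can be sketched as follows; the dictionary-based representation of the AU and the FPA state is a hypothetical illustration, not the bitstream syntax:

```python
def is_frame_packed(current_au, last_fpa_cancel_flag):
    """current_au: dict that may contain an 'fpa_cancel_flag' entry when the
    AU carries an FPA message; last_fpa_cancel_flag: cancel flag of the most
    recent FPA message seen in the stream, or None if none was received."""
    if "fpa_cancel_flag" in current_au:      # condition 1: AU carries an FPA message
        return current_au["fpa_cancel_flag"] == 0
    return last_fpa_cancel_flag == 0         # condition 2: fall back to the last FPA
```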
  • Step S504 Determine whether the current AU includes an SEI message, if yes, go to step S506, otherwise go to step S505.
  • Step S505 Determine whether a previously received AU contains the SEI message; if so, the parameters in that message are used to decode and output the picture in the current AU and step S509 is performed; otherwise step S515 is performed.
  • Step S506 Determine whether the performance of the decoder meets the profile and level requirements in the SEI message; if not, decoding cannot be performed and the process ends; if yes, step S507 is performed. Step S507 Initialize the decoder according to the profile and level information in the SEI message.
  • Step S508 Obtain a tile id corresponding to the independently decodable region from the SEI message.
  • the tile id corresponding to the independently decodable region is s.
  • Step S509 The picture information in the AU is taken out, and the picture information is encoded picture information, which is to be decoded by the decoder.
  • Step S510 The picture in the tile(s) in the picture is extracted according to the tile id corresponding to the independently decodable area obtained in the SEI message.
  • Step S511 Decoding the picture in the tile(s), and the decoding method is determined according to the corresponding encoding method in the encoding process.
  • Step S512 The picture in tile(s) is cropped according to the cropping information in the SEI message. If cropping_enable_flag is false, no cropping is required; otherwise, the area identified by pic_crop_left_offset, pic_crop_right_offset, pic_crop_top_offset, and pic_crop_bottom_offset is extracted from tile(s), that is, the independently decodable view within tile(s).
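The cropping of step S512 can be illustrated with a minimal sketch operating on a 2D array of pixels; the function and parameter names are illustrative, not from the patent:

```python
def crop_view(tile_pixels, left_off, right_off, top_off, bottom_off):
    """tile_pixels: 2D list of pixel rows for the decoded tile(s); the four
    offsets are the pic_crop_* values in pixels, measured inward from the
    corresponding tile edges. Returns the cropped view."""
    height, width = len(tile_pixels), len(tile_pixels[0])
    return [row[left_off:width - right_off]
            for row in tile_pixels[top_off:height - bottom_off]]
```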
  • Step S513 Output the independently decodable view in the independently decodable area.
  • Step S514 If the current AU is the last AU in the code stream, decoding ends; otherwise, return to step S502.
  • Step S515 The normal decoding process.
  • when the decoder receives a 3D video code stream encoded with the frame-packing 3DTV technique and a 2D display device is connected, the decoder can, according to the SEI message, extract and decode only one of the two views, i.e., decode only the independently decodable region, which reduces decoder performance requirements and saves the decoder's computational and memory resources.
  • the profile and level corresponding to the independently decodable region sub-stream generally impose lower performance and storage requirements on the decoder; therefore, initializing the decoder accordingly saves decoding time and power consumption, and reduces the decoder's storage requirements.
  • when the decoder does not meet the profile and level requirements of the original 3D video stream but does meet those of the sub-stream corresponding to the independently decodable region, support for 3D video streams with high resolution or bit-rate requirements on 2D displays is improved.
  • FIG. 13 another video encoding method is provided in the embodiment of the present invention.
  • the auxiliary message is also carried in an SEI message, but this SEI message differs from that in FIG. 10 to FIG. 12: a single SEI message includes the independently decodable area identifiers, cropping information, and profile and level information corresponding to the various splicing types and flip types.
  • the view that each frame of the picture in the 3D video stream is finally displayed by the 2D display device is an independently decodable view.
  • the area covering the independently decodable view is represented by a rectangular area composed of a plurality of tiles, and each tile needs to meet the LCU alignment requirement, and the rectangular area composed of the plurality of tiles is an independently decodable area.
  • the video to be encoded, that is, the sequence of pictures to be encoded, may include pictures of different splicing types and different flip types.
  • the encoding method shown in FIG. 13 is a process of encoding a video by an encoder.
  • the video is a picture sequence of length M, where each frame is exemplified by two views, one of which is the portion finally displayed by a 2D display device, i.e., the above-described independently decodable view; the encoding of pictures spliced from more than two views can be derived from this method by simple extension.
  • Step S601 Read a configuration file of the video to be encoded, where the configuration file stores predetermined input parameters, such as an encoding tool, an encoding restriction, an attribute of a picture to be encoded, and the like, which are required in the encoding process of the encoder.
  • an independently decodable view corresponding to each splicing type and flip type picture in the entire picture sequence is preset in the configuration file, for example, a left view or a right view in a frame picture.
  • pictures of the same splicing type and flip type correspond to the same independently decodable view; pictures of different splicing types or flip types correspond to different independently decodable views.
  • Step S602 Acquire a combination of a stitch type and a flip type of each frame of the picture sequence according to the configuration file of the video to be encoded.
  • Step S603 Determine the independently decodable view of each frame in the picture sequence according to the splicing type and flip type of each frame acquired from the configuration file and the independently decodable view preset in the configuration file for each splicing type and flip type.
  • Step S604 Create an SEI message.
  • the SEI message is sent only once in a picture sequence, and includes the independently decodable area identifiers, cropping information, and profile and level information corresponding to the various splicing types and flip types.
  • for example, the parameters corresponding to left-right spliced, non-flipped pictures are stored in the if (arrange_leftright_no_flip) { ... } branch, and the parameters in this branch are used when decoding left-right spliced, non-flipped pictures; the other splicing and flipping combinations are handled similarly.
  • fields such as arrange_topdown_no_flip u(1) and arrange_topdown_flip u(1) are defined in the same way, followed by the branch if (arrange_leftright_no_flip) { ... }. In the table, ue(v) indicates that the length of the field is variable, u(n) indicates that the length of the field is n bits, and u(1) indicates that the length of the field is 1 bit.
  • Arrange_leftright_flip The picture is spliced left-right and the left and right views are flipped.
  • Arrange_topdown_no_flip The picture is spliced top-bottom without flipping.
  • Arrange_topdown_flip The picture is spliced top-bottom and the views are flipped.
  • Independently decodable area location identification information:
  • Tile_num The number of tiles covering the independently decodable view.
  • Tile_ids The array of the ids of the tiles covering the independently decodable view, indicating the set of ids of the tiles covering the independently decodable region.
  • Cropping_enable_flag If the width of the independently decodable view equals the width of the independently decodable region and the height of the independently decodable view equals the height of the independently decodable region, cropping_enable_flag is set to false; otherwise it is set to true.
  • Pic_crop_left_offset Contains the abscissa of the independently decodable view relative to the leftmost edge of the tile that covers the independently decodable view, in pixels.
  • Pic_crop_right_offset Contains the abscissa of the independently decodable view relative to the rightmost edge of the tile that can independently decode the view, in pixels.
  • Pic_crop_top_offset The ordinate of the topmost edge of the tile that can be independently decoded relative to the overlay of the independently decodable view, in pixels.
  • Pic_crop_bottom_of set Contains the ordinate of the bottom edge of the tile that can be independently decoded relative to the number of tiles that can independently decode the view, in pixels.
  • Profile_idc The conformable profile id of the encoding toolset in the independently decodable region.
  • Level_idc The lowest level id that the decoder needs to satisfy. Calculating the code rate and maximum buffer of the independently decodable decoding area according to the ratio of the area of the independently decodable area to the entire picture, for example: decoding the entire picture with a code rate of x, the maximum buffer is y, and the independently decodable area occupies the entire picture. The ratio of the area is r, then the code rate of the independently decodable region is x*r, and the maximum buffer bit is y*r. According to the profile_idc, the code rate of the independently decodable region x*r and the maximum buffer bit y*r, the minimum level matching the decoding performance is found, and the level_idc is set to the minimum leveL.
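The level_idc derivation above can be sketched as follows. This is an illustrative sketch only: the level table `LEVELS` and the function name `pick_level_idc` are hypothetical, not the actual HEVC level limits.

```python
# Illustrative sketch of the level_idc derivation described above.
# The level limits below are hypothetical, not the real HEVC level table.
# Each entry: (level_idc, max_code_rate, max_buffer_size).
LEVELS = [
    (30, 10_000, 1_000),   # hypothetical "level 3.0"
    (40, 20_000, 2_000),   # hypothetical "level 4.0"
    (41, 50_000, 5_000),   # hypothetical "level 4.1"
]

def pick_level_idc(full_rate, full_buffer, area_ratio):
    """Scale the whole-picture rate/buffer by the area ratio r of the
    independently decodable region, then pick the smallest level whose
    limits cover x*r and y*r."""
    region_rate = full_rate * area_ratio      # x * r
    region_buffer = full_buffer * area_ratio  # y * r
    for level_idc, max_rate, max_buffer in LEVELS:  # ordered low to high
        if region_rate <= max_rate and region_buffer <= max_buffer:
            return level_idc
    raise ValueError("no level covers the sub-stream")
```

For instance, if the whole stream needs twice the rate and buffer of the sub-stream and the region ratio is 0.5, a lower level than that of the full stream may suffice.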
• Step S606: Encode the i-th frame picture; the specific encoding process is described in detail in FIG. 14 below.
• Step S607: The AU corresponding to the current i-th frame picture is output and stored in the encoder's own or an external storage device, for example the video memory 101 of the source device 100 or the buffer 1021 of the encoder 102, or is directly transmitted through the network to the remote receiving device 200.
• FIG. 14 is a flow chart of the method for encoding one frame picture in step S606 of the video encoding method shown in FIG. 13.
• Step S701: Determine the tile division scheme according to the configuration file of the video to be encoded, and divide the picture.
• Tile ids are assigned in order from left to right and then top to bottom.
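The raster-order id assignment above can be sketched as follows; the function name and grid representation are illustrative, not from the patent.

```python
# Sketch: assign tile ids in raster order (left to right, then top to
# bottom) for a picture divided into `rows` x `cols` tiles.
def assign_tile_ids(rows, cols):
    """Return a dict mapping (tile_row, tile_col) -> tile id."""
    return {(r, c): r * cols + c for r in range(rows) for c in range(cols)}
```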
• Step S702: According to the independently decodable view of each stitching-type and flip-type picture in the picture sequence determined in step S603, determine the set of tiles covering the independently decodable view; each tile must meet the LCU alignment requirement.
• The region corresponding to the tiles covering the independently decodable view is the independently decodable region.
• The tile_num field of the SEI message is set to the number of tiles covering the independently decodable region, and the tile_ids field is set to the set of ids of those tiles, i.e., an array of tile ids.
• The corresponding cropping fields of the SEI message are set according to the cropping information.
• Step S703: Determine whether the current tile(k) belongs to the set of tiles covering the independently decodable view determined in step S702. If tile(k) belongs to that set, step S704 is performed; otherwise step S705 is performed.
• Step S704: If tile(k) belongs to the tiles covering the independently decodable view, set the independently decodable region of previously encoded pictures with the same stitching type and flip type as the inter-frame reference candidate region of the current picture's tile(k).
• Step S705: If tile(k) does not belong to the tiles covering the independently decodable view, set the entire picture region of previously encoded pictures as the inter-frame reference candidate region of the current picture's tile(k).
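Steps S703 to S705 can be sketched as follows. The function and the dictionary keys are hypothetical names chosen for illustration; the patent does not define this API.

```python
# Sketch of steps S703-S705: pick the inter-frame reference candidate
# region for tile(k). Names here are illustrative, not from the patent.
def reference_candidate_region(tile_id, decodable_tile_ids, ref_picture):
    """If tile_id covers the independently decodable view, candidates are
    restricted to the reference picture's independently decodable region;
    otherwise the whole reference picture may be searched."""
    if tile_id in decodable_tile_ids:
        return ref_picture["independently_decodable_region"]
    return ref_picture["whole_picture"]
```

This restriction is what guarantees that blocks inside the region never reference pixels outside it, while blocks outside keep the full search range.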
• Step S706: Encode tile(k) using an intra prediction or inter prediction algorithm.
• When inter prediction is used, the optimal reference region is selected from the inter-frame reference candidate regions described in steps S704 and S705.
• The encoding method provided in the foregoing FIG. 13 and FIG. 14 adds an auxiliary message, carried by an SEI message, to the encoded code stream.
• The profile and level information in the SEI message applies only to the sub-code stream formed by the independently decodable region, which reduces the requirements on decoder performance.
• Different inter-frame reference candidate regions are set for tiles inside and outside the independently decodable region in steps S704 and S705, respectively. This ensures that coding blocks inside the region can be decoded independently, while widening the reference range of coding blocks outside it; a coding block more similar to the current block can therefore be referenced during encoding, which improves coding efficiency and saves transmission data.
• A decoder decodes a video code stream encoded by the process shown in FIG. 13 and FIG. 14, i.e., decodes a picture sequence of length M, as follows:
  • Step S801 Receive a video code stream to be decoded, where the video code stream includes a plurality of AUs, and each AU corresponds to one frame of the encoded picture.
  • Step S802 Acquire an AU from the video code stream.
• Step S803: Determine whether the current AU contains a picture in frame-packing format. The conditions are: 1) the current AU contains an FPA (frame packing arrangement) message whose cancel flag is 0; or 2) the current AU does not contain an FPA message, but the last FPA message received earlier in the code stream to be decoded has a cancel flag of 0. If either condition is satisfied, step S804 is performed; otherwise step S815 is performed.
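The two-condition test in step S803 can be sketched as follows; the function name and the use of `None` for "no FPA message" are illustrative assumptions.

```python
# Sketch of the frame-packing test in step S803: a picture is in frame
# stitching format if the current AU carries an FPA message with cancel
# flag 0, or if no FPA is present in the current AU but the last FPA
# received earlier in the stream had cancel flag 0.
def is_frame_packed(current_fpa_cancel_flag, last_fpa_cancel_flag):
    """Each argument is the FPA cancel flag (0 or 1), or None when the
    corresponding FPA message does not exist."""
    if current_fpa_cancel_flag is not None:
        return current_fpa_cancel_flag == 0
    return last_fpa_cancel_flag == 0
```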
• Step S804: Determine whether the current AU contains the SEI message, or whether the SEI message has been received earlier in the code stream. If yes, proceed to step S805; otherwise go to step S816.
• Step S805: Extract the encoded picture information from the current AU.
• Step S806: According to the FPA message, determine whether the stitching type and flip type of the current i-th frame picture are the same as those of the previous frame picture. If so, step S811 is performed; otherwise step S807 is performed.
• Step S807: According to the stitching type and flip type of the current frame, find the parameters corresponding to that type in the SEI message, obtaining the independently decodable area identification information, the cropping information, and the profile and level information for that type.
• Step S808: Determine whether the performance of the decoder satisfies the profile and level in the SEI message. If not, the stream cannot be decoded and the process ends directly; if so, go to step S809.
  • Step S809 Initialize the decoder according to the profile and level information in the SEI message.
• Step S810: Obtain from the foregoing SEI message the set of tile ids corresponding to the independently decodable region.
• Step S811: Extract the tiles covering the independently decodable region according to the set of tile ids described above.
• Step S812: Decode the pictures in the tiles covering the independently decodable area; the decoding method is determined by the corresponding encoding method used during encoding.
• Step S813: Crop the pictures within the tiles according to the cropping information in the SEI message. If cropping_enable_flag is false, no cropping is required; otherwise, the region identified by pic_crop_left_offset, pic_crop_right_offset, pic_crop_top_offset, and pic_crop_bottom_offset is extracted from the tiles, i.e., the independently decodable view within the tiles.
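The cropping in step S813 can be sketched as follows; the pixel-array representation and the function name are illustrative assumptions, not the patent's own data structures.

```python
# Sketch of step S813: cut the independently decodable view out of the
# region covered by the tiles, using the four crop offsets (in pixels)
# carried in the auxiliary message.
def crop_view(tile_region, offsets, cropping_enable_flag):
    """tile_region is a 2D pixel array (list of rows);
    offsets = (left, right, top, bottom) relative to the tile edges."""
    if not cropping_enable_flag:
        return tile_region  # the view already fills the tiles exactly
    left, right, top, bottom = offsets
    height = len(tile_region)
    width = len(tile_region[0])
    return [row[left:width - right] for row in tile_region[top:height - bottom]]
```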
• Step S814: Output the independently decodable view to the display device 201 shown in FIG. 4, or, if its output time has not yet arrived, temporarily store it in the buffer 2021 shown in FIG. 6.
  • Step S815 If the current AU is the last AU in the code stream, the decoding ends; otherwise, step S802 is performed.
• Step S816: Perform the normal decoding process.
• With the above method, when the decoder receives a 3D video code stream encoded with frame packing 3DTV technology and a 2D display device is connected, the decoder can, according to the SEI message, extract only one of the two views for decoding, i.e., decode only the independently decodable regions, which reduces the decoder performance requirements and saves the decoder's computational and memory resources.
• In addition, the profile and level of the sub-code stream corresponding to the independently decodable region generally lower the performance and storage requirements on the decoder, so after the decoder is initialized, decoding time and power consumption are reduced and decoder storage is saved.
• If the decoder does not meet the profile and level requirements of the original 3D video stream but meets those of the independently decodable region's sub-stream, the decoder's support for 2D-display-compatible 3D video streams with high resolution or bit-rate requirements is improved.
• In another embodiment of the present invention, a video encoding method is provided in which the auxiliary message is carried by an SPS message.
• The SPS message may include an independently decodable area identifier, and may also include independently decodable area location identification information, cropping information, and profile and level information.
• The independently decodable region corresponds to a rectangular region composed of one or more independent tiles.
• The area outside the independently decodable region corresponds to a rectangular area composed of one or more dependent tiles.
• The video to be encoded, that is, the sequence of pictures to be encoded, has the same stitching type and flip type throughout.
• The encoding process is similar to steps S403 and S404 shown in FIG. 11, and the decoding process is similar to step S807 shown in FIG. 15, except that the auxiliary messages are carried by SEI messages and SPS messages, respectively.
• The encoding method shown in FIG. 16 is a process by which an encoder encodes a video consisting of a picture sequence of length M, in which each frame is exemplified as stitched from two views, one of which is the portion displayed by a 2D display device, i.e., the independently decodable area described above.
• Pictures stitched from two or more views can also be used, and their encoding methods can be obtained by simple adaptation of this method.
• Step S901: Read the configuration file of the video to be encoded; the configuration file stores the predetermined input parameters required during encoding, such as coding tools, coding restrictions, attributes of the pictures to be encoded, and so on.
• The independently decodable view of each frame picture is preset in the configuration file. Since the picture sequence in this embodiment has the same stitching type and flip type throughout, the independently decodable view of every frame picture is the same.
• Step S904: According to the independently decodable view of each frame picture determined in step S902, determine the set of tiles covering the independently decodable view; tiles in the set are defined as independent tiles, and tiles outside the set are dependent tiles.
• The tiles covering the independently decodable view must have their top, bottom, left, and right borders LCU-aligned.
• Step S905: Set the parameters in the SPS message; each field in the SPS message is set according to the identification information, cropping information, profile, and level information of the independently decodable area, defined as follows.
• (Fragment of the SPS syntax table: profile_tier_level( ProfilePresentFlag, MaxNumSubLayersMinus1 ) { ..., with a Descriptor column.)
• ue(v) in the table indicates that the field has variable length; u(n) indicates that the field is n bits long; u(1) indicates that the field is 1 bit long.
• Indec_rgn_present_flag: set to true if there is an independently decodable view in the video, otherwise false.
• Independently decodable area location identification information:
• Tile_num: the number of tiles covering the independently decodable view.
• Tile_ids: the array of tile ids contained in the independently decodable area, indicating the set of ids of the tiles covering the independently decodable view.
• Cropping_enable_flag: if the width of the independently decodable view equals the total width of the tiles covering the independently decodable view, and the height of the independently decodable view equals the total height of those tiles, cropping_enable_flag is set to false; otherwise it is set to true.
• Pic_crop_left_offset: the abscissa, in pixels, of the independently decodable view relative to the leftmost edge of the tiles covering it.
• Pic_crop_right_offset: the abscissa, in pixels, of the independently decodable view relative to the rightmost edge of the tiles covering it.
• Pic_crop_top_offset: the ordinate, in pixels, of the independently decodable view relative to the topmost edge of the tiles covering it.
• Pic_crop_bottom_offset: the ordinate, in pixels, of the independently decodable view relative to the bottom edge of the tiles covering it.
• New_profile_flag: indicates whether the profile of the independently decodable area's sub-code stream is the same as the profile of the entire code stream. A value of 0 means the same; a value of 1 means different.
• New_level_flag: indicates whether the level of the independently decodable area's sub-code stream is the same as the level of the entire code stream. A value of 0 means the same; a value of 1 means different.
• Profile_idc: the profile id that the coding toolset used in the independently decodable region conforms to.
• Level_idc: the minimum level id that the decoder needs to satisfy. The code rate and maximum buffer of the independently decodable region are calculated from the ratio of its area to that of the entire picture. For example, if decoding the entire picture requires a code rate of x and a maximum buffer of y, and the independently decodable region occupies a ratio r of the picture area, then the code rate of the independently decodable region is x*r and its maximum buffer is y*r. According to profile_idc, the code rate x*r, and the maximum buffer y*r, the minimum level matching this decoding performance is found, and level_idc is set to that minimum level.
• Step S907: Encode the i-th frame picture; the specific encoding process is described in detail in FIG. 17 below.
  • FIG. 17 is a schematic flowchart of a method for specifically encoding a frame of picture in step S907 in the video encoding method shown in FIG. 16.
• Step S1002: Determine whether the current tile(k) appears in the tile_ids field of the SPS message, i.e., whether it is one of the tiles covering the independently decodable view. If tile(k) is one of those tiles, step S1003 is performed; otherwise step S1004 is performed.
• Step S1003: If tile(k) is one of the tiles covering the independently decodable view, set the independently decodable region of previously encoded pictures with the same stitching type and flip type as the inter-frame reference candidate region of the current picture's tile(k).
• Step S1004: If tile(k) is not one of the tiles covering the independently decodable view, set the entire picture region of previously encoded pictures as the inter-frame reference candidate region of the current picture's tile(k).
• Step S1005: Encode tile(k) using an intra prediction or inter prediction algorithm; when the inter prediction algorithm is used, select the optimal reference region from the inter-frame reference candidate regions described in steps S1003 and S1004.
• When the intra prediction algorithm is used, if tile(k) is not one of the tiles covering the independently decodable region, i.e., it is a dependent tile, image blocks in adjacent independent tiles may be included in the candidate range from which the optimal reference block for tile(k) is selected.
• The coding method provided in the foregoing FIG. 16 and FIG. 17 adds new fields identifying the information related to the independently decodable area to the existing SPS message in the encoded code stream, implementing the function of the auxiliary message.
• The profile and level information in the SPS message applies only to the sub-code stream formed by the independently decodable region, which reduces the requirements on decoder performance.
• Different inter-frame reference candidate regions are set for tiles inside and outside the independently decodable region in steps S1003 and S1004, respectively, which ensures that coding blocks inside the region can be decoded independently while widening the reference range of coding blocks outside it.
• The decoder decodes a video code stream encoded by the process shown in FIG. 16 and FIG. 17, i.e., decodes a picture sequence of length M, as follows:
  • Step S1101 Receive a video code stream to be decoded, where the video code stream includes a plurality of AUs, and each AU corresponds to one frame of the encoded picture.
• Step S1102: Obtain the SPS message from the video code stream and determine whether the indec_rgn_present_flag field in the SPS message is true. If so, continue decoding; otherwise step S1114 is performed.
• Step S1103: Obtain the profile and level information from the SPS message in the video code stream and determine whether the performance of the decoder satisfies that profile and level. If not, the stream cannot be decoded and the process ends directly; if so, step S1104 is performed.
  • Step S1104 Initialize the decoder according to the above profile and level information.
  • Step S1105 Obtain a tile id set corresponding to the independently decodable view from the SPS message.
• Step S1107: Determine whether the current AU contains a picture in frame-packing format. The conditions are: 1) the current AU contains an FPA (frame packing arrangement) message whose cancel flag is 0; or 2) the current AU does not contain an FPA message, but the last FPA message received earlier in the code stream has a cancel flag of 0. If either condition is satisfied, go to the next step; otherwise go to step S1114.
  • Step S1108 The encoded picture information in the current AU is taken out.
• Step S1109: According to the set of tile ids covering the independently decodable view obtained in step S1105, extract the tiles covering the independently decodable view.
• Step S1110: Decode the pictures in the tiles covering the independently decodable view; the decoding method is determined by the corresponding encoding method used during encoding.
• Step S1111: Crop the pictures in the tiles according to the cropping information in the SPS message. If cropping_enable_flag is false, no cropping is required; otherwise, the region identified by pic_crop_left_offset, pic_crop_right_offset, pic_crop_top_offset, and pic_crop_bottom_offset is extracted from the tiles, i.e., the independently decodable view within the tiles.
• Step S1112: Output the independently decodable view to the display device 201 shown in FIG. 4, or, if its output time has not yet arrived, temporarily store it in the buffer 2021 shown in FIG. 6.
  • Step S1113 If the current AU is the last AU in the code stream, the decoding ends; otherwise, step S1106 is performed.
• Step S1114: Perform the normal decoding process.
• With the above method, when the decoder receives a 3D video code stream encoded with frame packing 3DTV technology and a 2D display device is connected, the decoder can, according to the SPS message, extract only one of the two views for decoding, i.e., decode only the independently decodable regions, which reduces the decoder performance requirements and saves the decoder's computational and memory resources.
• In addition, the profile and level corresponding to the independently decodable region's sub-code stream generally lower the performance and storage requirements on the decoder, so after the decoder is initialized, decoding time and power consumption are reduced and decoder storage is saved.
• If the decoder does not meet the profile and level requirements of the original 3D video stream but meets those of the corresponding independently decodable region's sub-stream, the decoder's support for 2D-display-compatible 3D video streams with high resolution or bit-rate requirements is improved.
• The video encoding and decoding methods in the foregoing embodiments can be implemented by hardware executing program instructions; the program can be stored in a readable storage medium.
• When executed, the program performs the corresponding steps of the above methods.
• The storage medium can be as follows:

Abstract

The present invention discloses a video encoding and decoding method, apparatus, and system. The video encoding method includes: determining, according to a configuration file corresponding to the video, an independently decodable view in a picture to be encoded; dividing the picture into at least two tiles, where the region corresponding to the one or more tiles covering the independently decodable view is an independently decodable region; generating an auxiliary message corresponding to the picture, where the auxiliary message contains an independently decodable region location identifier composed of one or more tile ids; and encoding all tiles contained in the picture to form an encoded video code stream that includes the auxiliary message. Decoding the encoded video code stream reduces the performance requirements on the decoder and improves decoding efficiency.

Description

Video encoding and decoding method, apparatus, and system

Technical Field: The present invention relates to image processing technology, and in particular to a video encoding and decoding method, apparatus, and system.
Background

3DTV (Three-Dimensional Television) is the most common stereoscopic television technology. It displays two independent left and right views on one screen, with the left and right eyes receiving different views, to achieve a 3D stereoscopic effect. At present, when providing 3DTV services, service providers hope to reuse the existing 2DTV (Two-Dimensional Television) encoding tools and transmission equipment as much as possible, to save video production and transmission equipment costs.

To meet this need, frame packing 3DTV technology stitches the left and right views into one frame, then encodes and transmits it with a 2D encoder and transmission equipment. A message describing how the two views are stitched, or directly indicating the position of each view within the frame, is added to the encoded code stream; after decoding, the decoder outputs the two views according to this message.

There are several view stitching types in frame packing 3DTV technology; FIG. 1 shows two examples, side-by-side and top-bottom. Within one stitching type, different flip types produce different arrangements; the flip type indicates whether the order of the side-by-side views, or of the top-bottom views, is reversed. FIG. 1 shows the pictures formed by different combinations of stitching type and flip type.

Existing decoders of various types all consist of two main parts: a decoding module and a local memory. The local memory stores encoded pictures before decoding, as well as decoded pictures that are needed as reference frames for subsequent pictures or whose output time has not yet arrived. The decoder must allocate sufficient storage resources for the local memory, and the decoding module consumes the decoder's computational resources.

The video to be transmitted is encoded into a code stream, and each code stream carries its profile and level information. The profile indicates which coding tools the encoder used (for example, in the main profile the pixel bit depth can only be 8 bits, the picture parameter set id PPS id cannot exceed 63, tile coding is disabled, and so on, whereas the high profile has none of these restrictions); a decoder that does not support one of those coding tools cannot decode the stream. The level indicates the computing power and storage resources the decoder needs. For example, the current HEVC draft defines level 4 and level 4.1, meaning that decoders conforming to these levels can reach 32 frames/s and 64 frames/s respectively when decoding a 1920*1080 high-definition stream, while decoders conforming only to levels below level 4 cannot decode a 1920*1080 high-definition stream.

In practice, if the decoder receives a 3D video code stream encoded with frame packing 3DTV technology while a 2D display device is connected, only one of the two views is taken from the decoded picture and output to the 2D display device, as shown in FIG. 2. With this prior-art scheme, since the profile and level required to encode or decode a 3D video stream are higher than those for a 2D video stream, a higher-level decoder is needed to decode the 3D stream before outputting to the 2D display device. Moreover, for a 2D display device, pictures that do not need to be displayed must still be decoded, wasting the decoder's computational and storage resources.
Summary of the Invention

In view of this, to solve the above problem of wasting the computational and storage resources of the decoder or encoder, embodiments of the present invention adopt the following technical solutions:

A video encoding method, where the video consists of a picture sequence, including: determining, according to a configuration file corresponding to the video, an independently decodable view in a picture to be encoded; dividing the picture into at least two tiles, where the region corresponding to the one or more tiles covering the independently decodable view is an independently decodable region; generating an auxiliary message corresponding to the picture, containing an independently decodable region location identifier composed of one or more tile ids; and encoding all tiles contained in the picture to form an encoded video code stream that includes the auxiliary message.

A video decoding method, including: receiving a video code stream, where the video code stream includes a video to be decoded and an auxiliary message, the video to be decoded consisting of a picture sequence to be decoded; obtaining a picture to be decoded; obtaining, according to the auxiliary message, the independently decodable region location identifier of the independently decodable region of the picture to be decoded, the identifier being composed of the tile ids of one or more tiles; and obtaining the independently decodable region of the picture according to the identifier and decoding it.

A video encoder, where the video consists of a picture sequence, including: an independently decodable view confirmation unit, configured to determine, according to the configuration file corresponding to the video, the independently decodable view in a picture to be encoded; a tile division unit, configured to divide the picture into at least two tiles, where the region corresponding to the one or more tiles covering the independently decodable view is an independently decodable region; an auxiliary message generation unit, configured to generate an auxiliary message corresponding to the picture, containing the independently decodable region location identifier composed of one or more tile ids; and an encoding execution unit, configured to encode all tiles contained in the picture to form an encoded video code stream that includes the auxiliary message. Optionally, the encoding execution unit further includes a judging unit, configured to judge whether the current tile to be encoded is a tile within the independently decodable region; if so, the independently decodable region of encoded pictures is set as the inter-frame reference candidate region of the current tile; if not, the entire picture region of encoded pictures is set as the inter-frame reference candidate region of the current tile; when inter-frame coding is used, the optimal reference region is selected from the inter-frame reference candidate region corresponding to the tile to be encoded.

A video decoder, including: a receiving unit, configured to receive a video code stream, the video code stream including a video to be decoded and an auxiliary message, the video to be decoded consisting of a picture sequence to be decoded; and a decoding execution unit, configured to obtain a picture to be decoded; obtain, according to the auxiliary message, the independently decodable region location identifier of the independently decodable region of the picture to be decoded, the identifier being composed of the tile ids of one or more tiles; and obtain the independently decodable region of the picture according to the identifier and decode it.

An encoder, for encoding a video consisting of a picture sequence, including: one or more processors;
one or more memories; and one or more programs, wherein the one or more programs are stored in the one or more memories and are executed by the one or more processors, the one or more programs including: instructions for determining, according to the configuration file corresponding to the video, the independently decodable view in the picture to be encoded; instructions for dividing the picture into at least two tiles, where the region corresponding to the one or more tiles covering the independently decodable view is an independently decodable region; instructions for generating an auxiliary message corresponding to the picture, containing the independently decodable region location identifier composed of one or more tile ids; and instructions for encoding all tiles contained in the picture to form an encoded video code stream that includes the auxiliary message.

51. The encoder according to claim 50, wherein the auxiliary message further includes one of the following: an independently decodable region identifier, cropping information for decoding the independently decodable region, profile information for decoding the independently decodable region, and level information for decoding the independently decodable region.

A decoder, including: one or more processors; one or more memories; and one or more programs, wherein the one or more programs are stored in the one or more memories and are executed by the one or more processors, the one or more programs including: instructions for receiving a video code stream, the video code stream including a video to be decoded and an auxiliary message, the video to be decoded consisting of a picture sequence to be decoded; instructions for obtaining a picture to be decoded; instructions for obtaining, according to the auxiliary message, the independently decodable region location identifier of the independently decodable region of the picture to be decoded, the identifier being composed of the tile ids of one or more tiles; and instructions for obtaining the independently decodable region of the picture according to the identifier and decoding it.

An encoder, provided in a source apparatus that processes video, for encoding a video consisting of a picture sequence, including: one or more circuits configured to determine, according to the configuration file corresponding to the video, the independently decodable view in the picture to be encoded; divide the picture into at least two tiles, where the region corresponding to the one or more tiles covering the independently decodable view is an independently decodable region; generate an auxiliary message corresponding to the picture, containing the independently decodable region location identifier composed of one or more tile ids; and encode all tiles contained in the picture to form an encoded video code stream that includes the auxiliary message.

A decoder, provided in a receiving apparatus that processes video, including:
one or more circuits configured to receive a video code stream, the video code stream including a video to be decoded and an auxiliary message, the video to be decoded consisting of a picture sequence to be decoded; obtain a picture to be decoded; obtain, according to the auxiliary message, the independently decodable region location identifier of the independently decodable region of the picture to be decoded, the identifier being composed of the tile ids of one or more tiles; and obtain the independently decodable region of the picture according to the identifier and decode it.

A computer-readable storage medium storing instructions that, when executed by a device, cause the device to: determine, according to the configuration file corresponding to a video, the independently decodable view in a picture of the video to be encoded; divide the picture into at least two tiles, where the region corresponding to the one or more tiles covering the independently decodable view is an independently decodable region; generate an auxiliary message corresponding to the picture, containing the independently decodable region location identifier composed of one or more tile ids; and encode all tiles contained in the picture to form an encoded video code stream that includes the auxiliary message.

A computer-readable storage medium storing instructions that, when executed by a device, cause the device to: receive a video code stream, the video code stream including a video to be decoded and an auxiliary message, the video to be decoded consisting of a picture sequence to be decoded; obtain a picture to be decoded; obtain, according to the auxiliary message, the independently decodable region location identifier of the independently decodable region of the picture to be decoded, the identifier being composed of the tile ids of one or more tiles; and obtain the independently decodable region of the picture according to the identifier and decode it.

In the various embodiments above, optionally, the auxiliary message may further include an independently decodable region identifier, used to identify whether the picture contains an independently decodable region. Optionally, the auxiliary message further includes cropping information for decoding the independently decodable region, composed of the abscissa or ordinate of the independently decodable view relative to the top, bottom, left, or right border of the independently decodable region. Optionally, the auxiliary message further includes profile information for decoding the independently decodable region, identifying the coding toolset of the independently decodable region. Optionally, the auxiliary message further includes level information for decoding the independently decodable region, identifying the level that the decoder needs to satisfy, calculated according to the proportion of the picture occupied by the independently decodable region. In the various encoding embodiments, optionally, the step of encoding all tiles of the picture further includes: judging whether the current tile to be encoded is within the independently decodable region; if so, setting the independently decodable region of encoded pictures as the inter-frame reference candidate region of the current tile; if not, setting the entire picture region of encoded pictures as the inter-frame reference candidate region of the current tile; and, when inter-frame coding is used, selecting the optimal reference region from the inter-frame reference candidate region of the tile to be encoded. In the various embodiments, optionally, the picture sequence includes pictures of different stitching types and flip types; the configuration file stores the stitching type and flip type of each picture in the sequence, and the independently decodable views corresponding to pictures of the different stitching and flip types. Optionally, the auxiliary message further includes the independently decodable region location identifiers corresponding to pictures of different stitching and flip types, and optionally the cropping information, profile information, and level information for decoding the independently decodable regions of those pictures. In the various decoding embodiments, optionally, the method further includes: cropping the independently decodable region according to the cropping information in the auxiliary message to obtain the independently decodable view. Optionally, the auxiliary message is carried by Supplemental Enhancement Information (SEI). Optionally, the auxiliary message is carried by a Sequence Parameter Set (SPS).

A video encoding and decoding system consists of an encoder as provided in any of the above embodiments and a decoder as provided in any of the above embodiments.

A video code stream, including a video to be decoded and an auxiliary message, the video to be decoded consisting of a picture sequence to be decoded; the auxiliary message includes an independently decodable region location identifier indicating the independently decodable region of the picture sequence, the identifier being composed of the tile ids of one or more tiles.

The technical effects of the above embodiments are analyzed as follows:

With the above encoding embodiments, an auxiliary message is added to the encoded code stream; the profile and level information in the auxiliary message applies only to the sub-code stream formed by the independently decodable region, which reduces the requirements on decoder performance.

With the above decoding embodiments, the decoder can, according to the auxiliary message, take out only the independently decodable region of the picture for decoding, i.e., decode only the independently decodable region, which reduces the decoder performance requirements and saves the decoder's computational and storage resources. In addition, the profile and level requirements of the independently decodable region's sub-code stream generally lower the performance and storage requirements on the decoder, so after the decoder is initialized, decoding time, power consumption, and decoder storage are saved. If the decoder does not meet the profile and level requirements of the original video code stream but meets those of the independently decodable region's sub-stream, the decoder's support for 2D-display-compatible 3D video streams with high resolution or bit-rate requirements is improved.

Brief Description of the Drawings
To describe the technical solutions of the embodiments of the present invention more clearly, the drawings required in the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention, and a person of ordinary skill in the art may derive other drawings from them without creative effort.

FIG. 1 is a schematic diagram of view stitching types of a video code stream using frame packing 3DTV technology in the prior art;
FIG. 2 is a schematic diagram of the prior-art process of outputting a video code stream encoded with frame packing 3DTV technology to a 2D display device;
FIG. 3 is a schematic structural diagram of a picture to be encoded in the prior art;
FIG. 4 is an architecture diagram of a video encoding and decoding system according to an embodiment of the present invention;
FIG. 5 is a hardware structure diagram of an encoder according to an embodiment of the present invention;
FIG. 6 is a hardware structure diagram of a decoder according to an embodiment of the present invention;
FIG. 7 is a functional module diagram of an encoder according to an embodiment of the present invention;
FIG. 8 is a functional module diagram of a decoder according to an embodiment of the present invention;
FIG. 9 is a hardware structure diagram of an encoder according to an embodiment of the present invention;
FIG. 10 is a flowchart of a video encoding method according to an embodiment of the present invention;
FIG. 11 is a flowchart of the method of encoding one frame picture in the flow shown in FIG. 10;
FIG. 12 is a flowchart of a video decoding method according to an embodiment of the present invention;
FIG. 13 is a flowchart of another video encoding method according to an embodiment of the present invention;
FIG. 14 is a flowchart of the method of encoding one frame picture in the flow shown in FIG. 13;
FIG. 15 is a flowchart of another decoding method according to an embodiment of the present invention;
FIG. 16 is a flowchart of still another video encoding method according to an embodiment of the present invention;
FIG. 17 is a flowchart of the method of encoding one frame picture in the flow shown in FIG. 16;
FIG. 18 is a flowchart of still another decoding method according to an embodiment of the present invention.

Detailed Description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.

Basic concepts for understanding the solution of the present invention are introduced first:

Largest coding unit (LCU): the smallest picture division unit in high efficiency video coding (HEVC), shown as the small squares in FIG. 3. An LCU can be a 64*64-pixel block. Before encoding a frame, an HEVC encoder first divides the picture into a grid of LCUs.

Coding unit (CU): the encoder dynamically determines the optimal coding unit division according to the amount of texture detail in the picture; one LCU can be divided into one or more coding units, and both encoding and decoding are performed per CU.

Tile: a higher-level division of a picture into m rows and n columns, each division block being called a tile. In FIG. 3 the picture is divided into 1 row and 3 columns. The rows and columns of tiles take the LCU as the smallest unit, i.e., one LCU cannot belong to two tiles at the same time. Once the tile division is determined, the encoder assigns tile ids in order from left to right and then top to bottom. The tile division is generally determined according to the configuration file corresponding to the picture, which stores the predetermined input parameters needed during encoding, such as coding tools, coding restrictions, attributes of the pictures to be encoded, and so on.

Independent tile: a type of tile; in intra prediction, CUs in independent tiles cannot reference each other.

Dependent tile: a type of tile; in intra prediction, CUs in a dependent tile can reference CUs in independent tiles.
The HEVC encoder encodes with the LCU as the smallest unit. If the width and height of a picture are not integer multiples of the LCU size, the picture must be padded before encoding; in FIG. 3, the hatched LCUs are the picture region and the blank LCUs are the padding. This is called LCU alignment. Before the decoder outputs the decoded video, the previously padded LCU portion must be cut off, which is called cropping.
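The padding arithmetic of LCU alignment can be sketched as follows; this assumes a 64-pixel LCU as in the example above, and the function name is illustrative.

```python
# Sketch of LCU alignment: pad picture width/height up to multiples of
# the LCU size (64 assumed here, per the example in the text).
def lcu_aligned_size(width, height, lcu=64):
    """Round each dimension up to the next multiple of the LCU size."""
    pad = lambda v: (v + lcu - 1) // lcu * lcu
    return pad(width), pad(height)
```

For example, a 1920*1080 picture is padded to 1920*1088 before encoding, and the extra rows are cropped away before output.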
As shown in FIG. 3, each LCU is assigned an address starting from 0, in order from left to right and then top to bottom, called the LCU address. From the tile division, the tile to which any LCU address belongs can be computed, i.e., a lookup table from LCU address to tile id can be built.
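Such a lookup table can be sketched as follows. The function name and the boundary-list representation of the tile grid are illustrative assumptions, not from the patent.

```python
# Sketch of the LCU-address -> tile-id lookup table: given the tile grid
# boundaries (in LCU units), map each raster-order LCU address to a tile id.
def build_lcu_to_tile_table(lcu_cols, lcu_rows, col_bounds, row_bounds):
    """col_bounds/row_bounds list the first LCU column/row of each tile
    column/row, e.g. col_bounds=[0, 2] splits columns into [0,1] and [2..]."""
    def index_of(bounds, v):
        i = 0
        while i + 1 < len(bounds) and v >= bounds[i + 1]:
            i += 1
        return i
    n_tile_cols = len(col_bounds)
    table = []
    for addr in range(lcu_cols * lcu_rows):
        r, c = divmod(addr, lcu_cols)  # LCU row/column from raster address
        table.append(index_of(row_bounds, r) * n_tile_cols + index_of(col_bounds, c))
    return table
```

For the 1-row, 3-column division of FIG. 3, a 6-LCU-wide row of LCUs maps to tile ids 0, 0, 1, 1, 2, 2.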
Usually a piece of video to be encoded can be regarded as a picture sequence, which after encoding forms a video code stream; the video code stream includes the encoded picture sequence and the parameter sets needed to decode the pictures. An access unit (AU) contains one frame of encoded picture together with the parameter sets needed to decode it, or an AU contains only one frame of encoded picture.

Among the picture decoding parameter sets, the video usability information (VUI) parameter structure defines a 1-bit identifier tile_splittable_flag, indicating that the tiles in the code stream satisfy the following properties:
1. The tile division remains unchanged for every picture in the picture sequence;
2. Between different frames of the sequence, only tiles with the same id can be used as prediction references for each other;
3. Each tile is loop-filtered separately. A complete picture is reassembled from a number of decoded CUs; the CUs in a picture may be predicted from different parts of different reference frames, so the predictively decoded picture may differ from the original, causing discontinuities at the boundaries of adjacent CUs. Loop filtering is a filtering operation over the whole picture intended to remove such discontinuities.
FIG. 4 shows the system architecture for implementing the video encoding and decoding methods provided by the embodiments of the present invention. The source apparatus 100 is a network-side video head-end device, comprising a video memory 101 for storing the video (i.e., picture sequence) before and after encoding, an encoder 102 for encoding the picture sequence, and a transmitter 103 for transmitting the encoded code stream to another apparatus. The source apparatus 100 may further comprise a video capture device, for example a camera, to capture video and store it in the video memory 101, and may also include other elements such as intra-encoder elements, various filters, and so on.

The video memory 101 typically includes a relatively large storage space. For example, the video memory 101 may comprise dynamic random-access memory (DRAM) or FLASH memory; in other embodiments, the video memory 101 may comprise non-volatile memory or any other data storage device.

The encoder 102 may be part of a device that performs video encoding. As a specific embodiment, the encoder may comprise a chipset for video encoding and decoding, including some combination of hardware, software, firmware, a processor, or digital signal processing (DSP).

The transmitter 103 modulates the video code stream and sends it to the receiving end over a wired network, a wireless network, or the like.

The receiving apparatus 200 is a user-side terminal device, comprising a receiver 203 for receiving the encoded video code stream, a decoder 202 for decoding the video code stream, and a display device 201, for example an LED television, for outputting the decoded video to the end user. The receiving apparatus 200 may further include other elements such as a modem, a signal amplifier, a memory, and so on.

The decoder 202 may be part of a device that performs video decoding. As a specific embodiment, the decoder may comprise a chipset for video encoding and decoding, including some combination of hardware, software, firmware, a processor, or digital signal processing (DSP).

The display device 201 may be a 2D display device, or a display device compatible with both 2D and 3D, for example a monitor, a television, a projector, and so on.
如图 5所示, 为图 4中所示编码器 102进一步细化的硬件结构图。编码器 102包含緩存器 1021和处理器 1022。所述緩存器 1021包含相对于视频存储器 101来说较小且较快的存储空间。 举例来说, 緩存器 1021可包括同步随机存 取存储器 (SRAM )。 緩存器 1021可包括 "芯片上" 存储器, 其与编码器 102 的其他组件集成,以在处理器 1022密集型编码过程中提供非常快的数据存取。 在编码给定图片序列期间,待编码的图片序列可以依次从视频存储器 101加载 到緩存器 1021。 此外, 緩存器 1021还用于存储待编码视频的配置文件, 以及 用于执行特定编码算法的软件程序等。 部分情况下, 緩存器 1021还用于存储 已完成编码的图片,该图片还未到发送时间或者用于给编码下一帧图片提供参 考。 在其它实施例中, 具备存储功能的存储器均可以用作该緩存器 1021。 处 理器 1022从緩存器 1021获取待编码的图片, 并对图片进行编码, 直到视频中 包含的图片序列均被编码。
如图 6所示, 为图 4中所示解码器 202进一步细化的硬件结构图。 解码器 202包含緩存器 2021和处理器 2022。 所述緩存器 2021为较小且较快的存储空间。 举例来说, 緩存器 2021可包括同步随机存取存储器( SRAM )。 緩存器 2021可包括 "芯片上" 存储器, 其与解码器 202的其他组件集成, 以在处理器 2022密集型解码过程中提供非常快的数据存取。 在解码给定图片序列期间, 待解码的图片序列可以加载到緩存器 2021, 此外, 緩存器 2021还存储用于执行特定解码算法的软件程序等。 并且, 緩存器 2021还可用于存储已完成解码, 但还未到显示时间或者需要作为后续图片的解码参考帧的图片。 在其它实施例中, 具备存储功能的存储器均可以用作该緩存器 2021。 处理器 2022从緩存器 2021获取待解码的图片, 并对图片进行解码, 直到视频码流包含的图片序列均被解码。
在本发明提供的实施例中, 编码器 102进行编码的过程中, 在编码后的码流中添加辅助消息, 例如: 放在辅助增强信息 ( Supplemental Enhancement Information, SEI )或序列参数集( Sequence Parameter Set, SPS ) 中, 以帮助解码器 202解码。 首先, 辅助消息中可添加可独立解码区域标识, 用于标识编码后的码流中存在可独立解码区域。 3D视频码流中每帧图片最终被 2D显示设备显示的视图部分为可独立解码视图; 与 tile划分对应的, 覆盖可独立解码视图的区域为可独立解码区域。 此外, 辅助消息本身也可认为是一种可独立解码标识, 编码后的视频码流中若包含辅助消息, 则认为码流中存在可独立解码区域; 否则, 不存在可独立解码区域。 解码器 202取出编码后视频码流(即: 图片序列) 中的每个图片的可独立解码区域, 然后进行正常解码。
本发明实施例的编码器 102在确定可独立解码区域时, 需要保证可独立解码区域的特性如下:
1. 在拼接类型和翻转类型相同的每帧图片之间, 可独立解码区域的位置和大小保持不变;
2. 可独立解码区域内的 CU从相同的拼接类型和翻转类型图片的可独立解码区域内选取帧间预测的参考图片块;
3. 可独立解码区域内单独作环路滤波。
辅助消息中还可以包括如下信息: 可独立解码区域位置标识、可独立解码 区域输出显示时的 cropping信息、可独立解码区域内的子图片所组成的子图片 序列 (即: 子码流 ) 的 profile和 level信息。
作为可选的实施例, 图 5所示的编码器 102的緩存器 1021可通过如图 7所示的存储单元 1023实现; 此外, 处理器 1022可通过如图 7所示的可独立解码视图确认单元 1024、 分块划分单元 1025、 辅助消息生成单元 1026, 以及执行编码单元 1027实现。 作为可选的实施例, 图 5所示的编码器 102可执行如图 10、 图 11、 图 13、 图 14、 图 16以及图 17所提供的编码方法。
作为可选的实施例, 图 6所示的解码器 202的緩存器 2021可通过如图 8所示的存储单元 2023实现; 此外, 处理器 2022可通过如图 8所示的接收单元 2024和执行解码单元 2025实现。 图 6所示的解码器 202可执行如图 12、 图 15和图 18所提供的解码方法。
如图 7所示, 为本发明实施例提供的一种编码器的功能模块组成示意图。 编码器 102包括: 存储单元 1023、 可独立解码视图确认单元 1024、 分块划分 单元 1025、 辅助消息生成单元 1026和执行编码单元 1027。 存储单元 1023, 用于存储所述视频的配置文件; 可独立解码视图确认单元 1024, 用于根据所 述视频对应的配置文件, 确定待编码图片中的可独立解码视图; 分块划分单元 1025 , 用于将所述图片划分成至少两个分块(tile ), 其中覆盖可独立解码视图 的一个或多个 tile对应的区域为可独立解码区域; 辅助消息生成单元 1026,用 于生成与所述图片对应的辅助消息,所述辅助消息中包含所述可独立解码区域 位置标识, 所述可独立解码区域位置标识由一个或多个分块标识(tile id )组 成; 以及, 执行编码单元 1027, 用于编码所述图片包含的所有 tile, 以至形成 编码后的视频码流, 所述编码后的视频码流中包括所述辅助消息。该执行编码 单元还包括判断单元,用于判断当前待编码 tile是否为可独立解码区域内的 tile; 如果是, 设置已编码图片的可独立解码区域为当前 tile的帧间参考候选区域; 如果不是, 设置已编码图片的整个图片区域为当前 tile的帧间参考候选区域; 当采用帧间算法编码时,根据上述待编码 tile对应的帧间参考候选区域选择最 优参考区域。
更为详细地, 存储单元 1023还用于加载待编码的图片, 以及加载执行编 码单元 1027完成编码的图片。可独立解码视图确认单元 1024还可以执行如图 10所示步骤 S301和步骤 S302,以及图 11所示步骤 S401 ;分块划分单元 1025 还可以执行如图 11所示步骤 S402; 辅助消息生成单元 1026还可以执行如图 11所示步骤 S403和步骤 S404; 以及, 执行编码单元 1027还可以执行如图 11 所示步骤 S405至步骤 S409。
在本发明的另一实施例中, 存储单元 1023还用于加载待编码的图片, 以 及加载执行编码单元 1027 完成编码的图片。 可独立解码视图确认单元 1024 还可以执行如图 13所示步骤 S601至步骤 S603; 分块划分单元 1025还可以执 行如图 14所示步骤 S701和步骤 S702;辅助消息生成单元 1026还可以执行如 图 13所示步骤 S604; 以及, 执行编码单元 1027还可以执行如图 14所示步骤 S703至步骤 S707。
在本发明的又一实施例中, 存储单元 1023还用于加载待编码的图片, 以及加载执行编码单元 1027完成编码的图片。 可独立解码视图确认单元 1024还可以执行如图 16所示步骤 S901和步骤 S902; 分块划分单元 1025还可以执行如图 16所示步骤 S903和步骤 S904; 辅助消息生成单元 1026还可以执行如图 16所示步骤 S905; 以及, 执行编码单元 1027还可以执行如图 17所示步骤 S1001至步骤 S1006。
如图 8所示, 为本发明实施例提供的一种解码器的功能模块组成示意图。 所述解码器 202包括: 存储单元 2023、 接收单元 2024和执行解码单元 2025。 存储单元 2023, 用于存储待解码的图片, 以及加载执行解码单元 2025解码完成但未到显示时间的图片。 接收单元 2024, 用于接收视频码流, 所述视频码流包括待解码的视频和辅助消息, 所述待解码的视频由待解码的图片序列组成; 以及执行解码单元 2025, 用于获取待解码图片; 根据所述辅助消息, 获得所述待解码图片的可独立解码区域的可独立解码区域位置标识, 所述可独立解码区域位置标识由一个或多个分块(tile ) 的分块标识(tile id )组成; 以及根据可独立解码区域位置标识获得所述待解码图片的可独立解码区域, 解码所述可独立解码区域。
更为详细地,接收单元 2024还用于执行如图 12所示的步骤 S501 ; 以及, 执行解码单元 2025还用于执行如图 12所示的步骤 S502至步骤 S515。
在本发明的另一实施例中, 接收单元 2024还用于执行如图 15所示的步骤 S801 ; 以及, 执行解码单元 2025还用于执行如图 15所示的步骤 S802至步骤
在本发明的又一实施例中, 接收单元 2024还用于执行如图 18所示的步骤 S1101 ; 以及, 执行解码单元 2025还用于执行如图 18所示的步骤 S1102至步骤 S1114。
如图 9所示, 为本发明实施例提供的一种编码器的具体实现结构图。 Fn当前 1001为待编码视频中的当前待编码的一帧图片, F'n-1 参考 1002为待编码视频中已编码的一帧图片, 为当前待编码的图片提供编码参考。
输入的帧 Fn当前 1001按帧内或帧间预测编码的方法进行处理。 如果采用帧间预测编码, 其预测值 PRED (图中用 P表示)是由 Fn当前 1001之前已编码的参考图像经运动补偿 1006 ( MC )后得出, 其中参考图像用 F'n-1 参考 1002表示。 为了提高预测精度, 从而提高压缩比, 实际的参考图像可在已编码、 解码、 重建或滤波的帧中进行选择。 预测值 PRED 和当前块相减后, 产生一个残差块 Dn, 经块变换、 量化后产生一组量化后的变换系数 X, 再经熵编码 1014, 与解码所需的一些边信息 (如预测模式、 量化参数、 运动矢量等) 一起组成一个压缩后的码流, 经 NAL (网络自适应层)供传输和存储用。
正如上述, 为了提供进一步预测用的参考图像, 编码器必须有重建图像的功能。 因此必须使残差图像经反量化、 反变换后得到的重建残差与预测值 P相加, 得到 uF'n (未经滤波的帧)。 为了去除编码解码环路中产生的噪声, 提高参考帧的图像质量, 从而提高压缩图像性能, 设置了一个环路滤波器, 滤波后的输出 F'n 重建 1003即重建图像可用作参考图像。
本发明在 ME和帧内预测模式选择时,取消了可独立解码区域外编码单元 对参考范围的限制 (具体参见图 10至图 18的描述), 因此可以选择更为相似 的参考单元, 提高预测精度, 从而提高压缩比。
如图 10所示,为本发明实施例提供的一种视频编码方法,在本实施例中, 上述辅助消息承载于 SEI消息中。 3D视频码流中每帧图片最终被 2D显示设 备显示的视图部分为可独立解码视图; 与 tile划分对应的, 覆盖可独立解码视 图的区域为可独立解码区域。该 SEI消息中包括可独立解码区域标识。 可独立 解码区域与一个 tile对应, 即采用一种 tile的划分, 将可独立解码视图包含在 一个 tile的范围之内。 待编码视频, 即待编码的图片序列, 可以包含具有不同 拼接类型和不同翻转类型的图片。
图 10所示的编码方法, 为编码器编码一视频的过程, 该视频为一个长度 为 M帧的图片序列, 其中每帧图片以两个视图拼接的形式为例, 其中一个视 图为 2D显示设备最终显示的部分, 即上述可独立解码视图。 对于本领域的技 术人员而言, 采用两个或两个以上视图拼接的图片,其编码方法均可经由此方 法简单变换得到。
步骤 S301 : 读取待编码视频的配置文件, 配置文件中存放编码器编码过 程中所需要预先确定的输入参数, 例如编码工具、 编码限制、 待编码图片的属 性等等。
步骤 S302: 根据配置文件确定图片序列中需要独立解码的视图。 本实施 例中配置文件中预设了整个图片序列中每种拼接类型和翻转类型的图片对应 的可独立解码视图, 例如, 一帧图片中的左视图或右视图。 相同拼接类型和翻 转类型图片对应的可独立解码视图相同,不同拼接类型和翻转类型图片对应的 可独立解码视图不同。
步骤 S303: 获取图片序列中第 i帧图片, 设置 i=1。
步骤 S304: 编码第 i帧图片, 具体的编码过程将在下图 11详细介绍。 步骤 S305: 将当前第 i帧图片编码后对应的 AU输出,保存在自身或外挂 的存储装置,例如:源装置 100的视频存储器 101或编码器 102的緩存器 1021 中, 或者直接通过网络传输给远程的接收装置 200。
步骤 S306: 如果 i等于 M (即: 第 i帧为最后一帧)则编码结束; 否则 i=i+1, 执行步骤 S303。
图 11为图 10所示视频编码方法中步骤 S304中具体编码一帧图片的方法 流程示意图。
步骤 S401 :根据配置文件,获取当前第 i帧图片的拼接类型和翻转类型对 应的可独立解码视图。
步骤 S402: 根据覆盖可独立解码视图的最小 tile, 确定 tile划分。 即将可独立解码视图包含在一个 tile的范围之内, 并且, 该 tile的划分上下左右均要满足 LCU对齐的要求。 该 tile对应的区域即为可独立解码区域。 则第 i帧图片被划分成覆盖可独立解码区域的最小 tile和除覆盖可独立解码区域的最小 tile以外的区域组成的 tile, 一共两个 tile, 设置 tile的数量 N等于 2。 在本发明的其他实施例中, tile的数量不限于 2个。 根据先从左向右, 再从上到下的顺序分配 tile id, 预设覆盖可独立解码视图的最小 tile的 id为 s。 在本发明的其他实施例中, tile的划分, 不一定以覆盖可独立解码视图的最小 tile为基准, 只要满足该 tile覆盖可独立解码视图, 且 tile的上下左右边界满足 LCU对齐的要求即可。
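步骤 S402 中 "覆盖可独立解码视图且四边 LCU对齐的最小矩形" 的边界可按如下 Python 片段示意计算(仅为说明, 假设视图以像素矩形 (x, y, w, h) 给出, 函数名为本文虚构):

```python
def minimal_aligned_tile(view_x, view_y, view_w, view_h, lcu=64):
    """将视图矩形向外扩展到最近的 LCU 边界,
    得到覆盖该视图的最小 LCU 对齐矩形 (x, y, w, h)。"""
    left = view_x // lcu * lcu                      # 左边界向下取整到 LCU 倍数
    top = view_y // lcu * lcu                       # 上边界向下取整到 LCU 倍数
    right = (view_x + view_w + lcu - 1) // lcu * lcu    # 右边界向上取整
    bottom = (view_y + view_h + lcu - 1) // lcu * lcu   # 下边界向上取整
    return left, top, right - left, bottom - top
```

例如, 左右拼接的 1920x1080 图片中右半视图 (960, 0, 960, 1080), 在 LCU为 64 的假设下对应的最小对齐矩形为 (960, 0, 960, 1088)。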
步骤 S403: 判断是否需要生成 SEI消息, 判断条件为: 当前第 i帧图片是 待编码视频码流中第 1帧图片,或者不是第 1帧图片但是与上一帧拼接类型或 翻转类型不相同, 则需要生成 SEI消息, 执行步骤 S404; 或者, 当前第 i帧图 片不是待编码视频码流中的第 1帧图片, 并且当前第 i帧图片与上一帧图片的 拼接类型与翻转类型相同, 则执行步骤 S405。 也就是说, 在本实施例中, 连 续的相同拼接类型与翻转类型的图片对应一个 SEI消息,如果前后两帧图片的 拼接类型与翻转类型不同, 则需要生成新的 SEI消息。
步骤 S404: 创建 SEI消息, 本发明暂命名为 INDEC_RGN_SEI, 设置此 SEI消息各字段, 其定义如下表。
INDEC_RGN_SEI( payloadSize ) {                              Descriptor
    tile_id                                                 ue(v)
    cropping_enable_flag                                    u(1)
    new_profile_flag                                        u(1)
    new_level_flag                                          u(1)
    if( cropping_enable_flag ) {
        pic_crop_left_offset                                ue(v)
        pic_crop_right_offset                               ue(v)
        pic_crop_top_offset                                 ue(v)
        pic_crop_bottom_offset                              ue(v)
    }
    if( new_profile_flag ) {
        profile_idc                                         ue(v)
    }
    if( new_level_flag ) {
        level_idc                                           ue(v)
    }
}

表中的 ue(v)表示该字段的长度可变, u(n)表示该字段的长度为 n位(bit), u(1)表示该字段的长度为 1位。
可独立解码区域位置标识信息:
tile_id: 覆盖该可独立解码视图的最小 tile的 id, 本例中为 s。
以下为可独立解码区域的 cropping信息:
cropping_enable_flag: 如果 tile(s)中包含的可独立解码视图的宽度与 tile(s)的宽度相等, 且可独立解码视图的高度与 tile(s)的高度相等, 则设置 cropping_enable_flag为 false, 否则为 true。
pic_crop_left_offset: 包含可独立解码视图相对于 tile(s)的左边缘的横坐标, 以像素为单位。
pic_crop_right_offset: 包含可独立解码视图相对于 tile(s)的右边缘的横坐 标, 以像素为单位。
pic_crop_top_offset:包含可独立解码视图相对于 tile(s)的上边缘的纵坐标, 以像素为单位。
pic_crop_bottom_offset: 包含可独立解码视图相对于 tile(s)的下边缘的纵坐标, 以像素为单位。
以下为可独立解码区域(即 tile ( s )对应的区域 )子码流的 profile和 level 信息:
new_profile_flag: 表示可独立解码区域子码流的 profile是否与整个码流的 profile的标识符相同, 若其值为 0, 则表示相同; 若其值为 1, 则表示不同。
new_level_flag: 表示可独立解码区域子码流的 level是否与整个码流的 level的标识符相同, 若其值为 0, 则表示相同; 若其值为 1, 则表示不同。 profile_idc : 可独立解码区域中的编码工具集符合的 profile id。
level_idc: 解码器需要满足的最低 level id。 根据 tile(s)占整个图片的面积的比例, 计算解码 tile(s)的码率和最大缓存, 例如: 解码整个图片的码率为 x, 最大缓存为 y, tile(s)占整个图片的面积的比例为 r, 则 tile(s)的码率为 x*r, 最大缓存为 y*r。 根据 profile_idc、 tile(s)的码率 x*r和最大缓存 y*r查找符合这个解码性能的最小 level, 设置 level_idc为上述最小 level。
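level_idc 的选取逻辑可以用如下 Python 片段示意(仅为说明: 其中的 level 限制表数值为虚构, 只用于演示 "按面积比例 r 折算码率 x*r 与缓存 y*r 后, 取满足要求的最小 level" 这一步骤):

```python
def pick_level_idc(levels, full_rate, full_buffer, area_ratio):
    """levels: [(level_id, 最大码率, 最大缓存), ...], 按性能从低到高排序。
    子码流的码率按 x*r、最大缓存按 y*r 估算,
    返回第一个同时满足两项要求的 level id。"""
    need_rate = full_rate * area_ratio
    need_buffer = full_buffer * area_ratio
    for level_id, max_rate, max_buffer in levels:
        if max_rate >= need_rate and max_buffer >= need_buffer:
            return level_id
    raise ValueError("no level satisfies the sub-stream requirements")
```

例如整个码流的码率与缓存均为 16000、可独立解码区域占一半面积时, 按上述虚构的 level 表会选出能承载 8000 的最小 level。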
设定当前 tile id为 k, k=1。
步骤 S405: 如果 k为 s, 即当前 tile为覆盖可独立解码视图的 tile, 执行步骤 S406, 否则执行步骤 S407。
步骤 S406:设置已编码的拼接类型与翻转类型都与当前第 i帧图片相同的 图片中 tile(s)为当前第 i帧 tile(s)的帧间参考候选区域。
步骤 S407:设置已编码的帧图片中所有图片区域为 tile(k)的帧间参考候选 区域。
步骤 S408: 选择使用帧内预测或帧间预测算法编码 tile(k), 当采用帧间预 测算法编码时,从步骤 S406和 S407得到的各帧间参考候选区域中选择最优的 参考区域进行编码。
步骤 S409: 如果 k小于 N, 即 tile(k)不是待编码图片中最后一个 tile, N为一帧图片中 tile的总数量, 则 k=k+1, 转到步骤 S405; 如果 k等于 N, 编码结束。
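步骤 S405至 S407 中帧间参考候选区域的选取可以概括为如下 Python 片段(仅为说明, 区域用任意对象表示, 函数名为本文虚构):

```python
def inter_ref_candidate_region(tile_id, indep_tile_id,
                               coded_indep_region, coded_full_picture):
    """可独立解码区域内的 tile 只能参考已编码图片中的可独立解码区域,
    以保证该区域可独立解码; 其余 tile 可参考整幅已编码图片,
    从而扩大参考范围、提高压缩比。"""
    if tile_id == indep_tile_id:
        return coded_indep_region
    return coded_full_picture
```

帧间编码时, 再从返回的候选区域中选择最优参考区域即可。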
上述图 10和图 11提供的编码方法, 在编码后的码流中添加了辅助消息, 由 SEI消息承载,该 SEI消息中的 Profile和 Level信息仅针对可独立解码区域 形成的子码流, 降低了对解码器性能的要求。 此外,在步骤 S406和 S407分别 针对可独立解码区域 tile和非可独立解码区域 tile设置不同的帧间参考候选区 域,既保证了区域内的编码块可独立解码,又扩大了区域外编码块的参考范围, 从而能够在编码中参考与当前块更为相似的编码块,提高了编码效率, 节省传 输数据量。
如图 12所示,为本发明实施例提供的一种视频解码方法,在本实施例中, 解码器解码经过如图 10和图 11所示流程编码后的视频码流, 即, 解码一长度 为 M的图片序列的过程如下:
步骤 S501 : 接收待解码的视频码流, 该视频码流包括若干 AU, 每一 AU 与一帧经过编码的图片对应。
步骤 S502: 从码流中获取一个 AU。
步骤 S503: 判断当前 AU是否包含帧拼接格式的图片, 判断方法如下: 1 ) 当前 AU包含 FPA ( frame packing arrangement )消息, 且消息中的取消标志位为 0; 2 ) 当前 AU不包含 FPA消息, 但待解码视频码流中之前接收到的最后一个 FPA消息, 其取消标志位为 0。 满足 2个条件之一, 执行步骤 S504, 否则转步骤 S515。
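步骤 S503 的两个判断条件可以示意为如下 Python 片段(仅为说明, FPA消息用含 cancel_flag 字段的 dict 表示, 字段名为本文假设):

```python
def au_is_frame_packed(current_fpa, last_received_fpa):
    """条件 1: 当前 AU 含 FPA 消息, 且其取消标志位为 0;
    条件 2: 当前 AU 不含 FPA 消息, 但之前收到的最后一个 FPA 的取消标志位为 0。
    任一条件满足即认为当前 AU 包含帧拼接格式的图片。"""
    if current_fpa is not None:
        return current_fpa["cancel_flag"] == 0
    return last_received_fpa is not None and last_received_fpa["cancel_flag"] == 0
```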
步骤 S504: 判断当前 AU是否包含 SEI消息, 是则执行步骤 S506, 否则 执行步骤 S505。
步骤 S505 : 之前接收到的 AU中是否包含 SEI消息, 是则重用此消息内 的参数解码和输出当前 AU中的图片, 执行步骤 S509, 否则执行步骤 S515。
步骤 S506: 判断解码器的性能是否符合 SEI消息中的 profile和 level要求, 如果不符合则无法解码, 直接结束; 如果符合, 执行步骤 S507。 步骤 S507: 根据 SEI消息中的 profile和 level信息初始化解码器。
步骤 S508: 从上述 SEI消息中获取可独立解码区域对应的 tile id, 本实施 例中可独立解码区域对应的 tile id为 s。
步骤 S509:取出 AU中的图片信息,该图片信息是经过编码的图片信息, 待解码器解码。
步骤 S510: 根据 SEI消息中获取的可独立解码区域对应的 tile id, 取出图 片中 tile(s)内的图片。
步骤 S511: 解码 tile(s)内的图片, 解码方法根据编码过程中相应的编码方法而定。
步骤 S512: 根据上述 SEI消息中的 cropping信息裁减 tile(s)内图片。 如果 cropping_enable_flag 为 false, 则不需要裁减; 否则取出 tile(s)内 pic_crop_left_offset、 pic_crop_right_offset、 pic_crop_top_offset 和 pic_crop_bottom_offset标识的区域, 即 tile(s)内可独立解码的视图。
步骤 S513: 输出可独立解码区域内可独立解码视图。
步骤 S514: 如果当前 AU为码流中最后一个 AU, 则解码结束; 否则执行步骤 S502。
步骤 S515: 正常的解码流程。
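步骤 S512 的裁减操作可以用如下 Python 片段示意(仅为说明: 这里假设四个 offset 的语义为从对应边裁掉的像素数, 与 HEVC 的一致性裁剪窗口类似; 图片用二维列表表示, 字段名沿用上文 SEI 定义):

```python
def crop_independent_view(tile_pixels, sei):
    """根据 SEI 中的 cropping 信息, 从 tile 图像中裁出可独立解码视图。"""
    if not sei["cropping_enable_flag"]:
        return tile_pixels                      # 无需裁减, 直接输出
    top = sei["pic_crop_top_offset"]
    bottom = sei["pic_crop_bottom_offset"]
    left = sei["pic_crop_left_offset"]
    right = sei["pic_crop_right_offset"]
    height, width = len(tile_pixels), len(tile_pixels[0])
    return [row[left:width - right] for row in tile_pixels[top:height - bottom]]
```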
采用本实施例提供的解码方法, 当解码器接收的是采用帧封装 3DTV技术编码的 3D视频码流, 而连接的是 2D显示设备时, 解码器可以根据 SEI消息只取出两个视图中的一个进行解码, 即仅对可独立解码区域进行解码, 降低了对解码器性能的要求, 并且节约了解码器的计算和存储资源。 此外, 对应可独立编码区域子码流的 profile和 level通常会降低对解码器的性能和存储要求, 所以初始化解码器后可节省解码器解码时间和电力消耗, 节省解码器存储需求。 如果解码器不满足原来 3D视频码流的 profile和 level要求, 而满足对应可独立编码区域子码流的 profile和 level要求, 则提高了解码器对高分辨率或高码率的兼容 2D显示的 3D视频码流的支持。
如图 13所示, 为本发明实施例提供的另一种视频编码方法, 在本实施例 中上述辅助消息也是由 SEI消息承载, 但该 SEI消息不同于图 10、 图 11和图 12中的 SEI消息。 一个 SEI消息中包括各种拼接类型和翻转类型图片对应的 不同可独立解码区域标识、 cropping信息、 profile和 level信息。 3D视频码流 中每帧图片最终被 2D显示设备显示的视图为可独立解码视图。 覆盖可独立解 码视图的区域用多个 tile组成的矩形区域表示,且每个 tile均需要满足 LCU对 齐的要求, 该多个 tile组成的矩形区域为可独立解码区域。 待编码视频, 即待 编码的图片序列, 可以包含具有不同拼接类型和不同翻转类型的图片。
图 13所示的编码方法, 为编码器编码一视频的过程, 该视频为一个长度 为 M的图片序列, 其中每帧图片以两个视图拼接的形式为例, 其中一个视图 为 2D显示设备最终显示的部分, 即上述可独立解码视图。 对于本领域的技术 人员而言, 采用两个或两个以上视图拼接图片, 其编码方法均可经由此方法简 单变换得到。
步骤 S601 : 读取待编码视频的配置文件, 配置文件中存放编码器编码过 程中所需要预先确定的输入参数, 例如编码工具、 编码限制、 待编码图片的属 性等等。本实施例中配置文件中预设了整个图片序列中每种拼接类型和翻转类 型的图片对应的可独立解码视图, 例如, 一帧图片中的左视图或右视图。 相同 拼接类型和翻转类型图片对应的可独立解码视图相同,不同拼接类型和翻转类 型图片对应的可独立解码视图不同。
步骤 S602: 根据待编码视频的配置文件, 获取图片序列中每帧图片的拼 接类型和翻转类型的组合。
步骤 S603 : 根据配置文件中预设的每种拼接类型和翻转类型的图片对应 的可独立解码视图,以及根据配置文件获取的每帧图片的拼接类型和翻转类型 信息, 确定图片序列中每帧图片的可独立解码的视图。
步骤 S604: 创建 SEI消息。 该 SEI消息在一个图片序列中只发送一次, 并且, 该 SEI消息中包括各种拼接类型和翻转类型图片对应的不同可独立解码区域标识、 cropping信息、 profile和 level信息。
设置此 SEI消息各字段, 其定义如下表。 根据从配置文件获取的图片序列的拼接类型和翻转类型的组合, 设置 SEI中开头的 arrange_leftright_no_flip、 arrange_leftright_flip、 arrange_topdown_no_flip 以及 arrange_topdown_flip字段, 再分别设置不同拼接类型和翻转类型图片对应的可独立解码区域的位置标识信息、 cropping信息、 profile和 level信息。 例如: 当 arrange_leftright_no_flip 为 1时, 则在 if (arrange_leftright_no_flip) {… }中存放着左右拼接且无翻转类型图片对应的参数, 解码左右拼接且无翻转的图片时均采用这个区域内的参数, 其它的拼接和翻转组合情况与此类似。
INDEC_RGN_SEI( payloadSize ) {                              Descriptor
    arrange_leftright_no_flip                               u(1)
    arrange_leftright_flip                                  u(1)
    arrange_topdown_no_flip                                 u(1)
    arrange_topdown_flip                                    u(1)
    if( arrange_leftright_no_flip ) {
        tile_num                                            ue(v)
        for( i = 0; i < tile_num; i++ ) {
            tile_ids[ i ]                                   ue(v)
        }
        cropping_enable_flag                                u(1)
        if( cropping_enable_flag ) {
            pic_crop_left_offset                            ue(v)
            pic_crop_right_offset                           ue(v)
            pic_crop_top_offset                             ue(v)
            pic_crop_bottom_offset                          ue(v)
        }
        profile_idc                                         ue(v)
        level_idc                                           ue(v)
    }
    if( arrange_leftright_flip ) {
        tile_num                                            ue(v)
        for( i = 0; i < tile_num; i++ ) {
            tile_ids[ i ]                                   ue(v)
        }
        cropping_enable_flag                                u(1)
        if( cropping_enable_flag ) {
            pic_crop_left_offset                            ue(v)
            pic_crop_right_offset                           ue(v)
            pic_crop_top_offset                             ue(v)
            pic_crop_bottom_offset                          ue(v)
        }
        profile_idc                                         ue(v)
        level_idc                                           ue(v)
    }
    if( arrange_topdown_no_flip ) {
        tile_num                                            ue(v)
        for( i = 0; i < tile_num; i++ ) {
            tile_ids[ i ]                                   ue(v)
        }
        cropping_enable_flag                                u(1)
        if( cropping_enable_flag ) {
            pic_crop_left_offset                            ue(v)
            pic_crop_right_offset                           ue(v)
            pic_crop_top_offset                             ue(v)
            pic_crop_bottom_offset                          ue(v)
        }
        profile_idc                                         ue(v)
        level_idc                                           ue(v)
    }
    if( arrange_topdown_flip ) {
        tile_num                                            ue(v)
        for( i = 0; i < tile_num; i++ ) {
            tile_ids[ i ]                                   ue(v)
        }
        cropping_enable_flag                                u(1)
        if( cropping_enable_flag ) {
            pic_crop_left_offset                            ue(v)
            pic_crop_right_offset                           ue(v)
            pic_crop_top_offset                             ue(v)
            pic_crop_bottom_offset                          ue(v)
        }
        profile_idc                                         ue(v)
        level_idc                                           ue(v)
    }
}

表中的 ue(v)表示该字段的长度可变, u(n)表示该字段的长度为 n位(bit), u(1)表示该字段的长度为 1位。
arrange_leftright_no_flip: 图片为左右拼接且无左右视图翻转。
arrange_leftright_flip: 图片为左右拼接且左右视图翻转。
arrange_topdown_no_flip: 图片为上下拼接且无左右视图翻转。
arrange_topdown_flip: 图片为上下拼接且左右视图翻转。
可独立解码区域位置标识信息:
tile_num: 覆盖可独立解码视图包含的 tile数量。
tile_ids: 覆盖可独立解码视图包含的 tile id数组, 指示覆盖可独立解码区 域的若干 tile对应的 id集合。
以下为可独立解码区域的 cropping信息:
cropping_enable_flag: 如果可独立解码视图的宽度与可独立解码区域的宽度相等, 且可独立解码视图的高度与可独立解码区域的高度相等, 则设置 cropping_enable_flag为 false, 否则为 true。
pic_crop_left_offset:包含可独立解码视图相对于覆盖可独立解码视图若干 tile的最左边缘的横坐标, 以像素为单位。
pic_crop_right_offset: 包含可独立解码视图相对于覆盖可独立解码视图若 干 tile的最右边缘的横坐标, 以像素为单位。
pic_crop_top_offset:包含可独立解码视图相对于覆盖可独立解码视图若干 tile的最上边缘的纵坐标, 以像素为单位。
pic_crop_bottom_offset: 包含可独立解码视图相对于覆盖可独立解码视图若干 tile的最下边缘的纵坐标, 以像素为单位。
以下为可独立解码区域子码流的 profile和 level信息: profile_idc: 可独立解码区域中的编码工具集符合的 profile id。 level_idc: 解码器需要满足的最低 level id。 根据可独立解码区域占整个图片的面积的比例, 计算解码可独立解码区域的码率和最大缓存, 例如: 解码整个图片的码率为 x, 最大缓存为 y, 可独立解码区域占整个图片的面积的比例为 r, 则可独立解码区域的码率为 x*r, 最大缓存为 y*r。 根据 profile_idc、 可独立解码区域的码率 x*r和最大缓存 y*r查找符合这个解码性能的最小 level, 设置 level_idc为上述最小 level。
步骤 S605: 获取第 i帧图片, 设置 i=1。
步骤 S606: 编码第 i帧图片, 具体的编码过程将在下图 14详细介绍。 步骤 S607: 将当前第 i帧图片编码后对应的 AU输出,保存在自身或外挂 的存储装置,例如:源装置 100的视频存储器 101或编码器 102的緩存器 1021 中, 或者直接通过网络传输给远程的接收装置 200。
步骤 S608: 如果 i等于 M (即: 第 i帧为最后一帧)则编码结束, 否则 i=i+1, 执行步骤 S605。
图 14为图 13所示视频编码方法中步骤 S606中具体编码一帧图片的方法 流程示意图。
步骤 S701: 根据待编码视频的配置文件确定 tile划分方案并划分图片。 按先从左向右, 再从上到下的顺序分配 tile id。 对于相同拼接类型且相同翻转类型的帧图片 tile划分相同。 设置当前 tile id为 k, k=1, tile的总数为 N。
步骤 S702: 根据步骤 S603确定的图片序列中不同拼接类型和翻转类型图片的可独立解码的视图, 确定覆盖可独立解码视图的若干 tile集合, 且每个 tile均需要满足 LCU对齐的要求。 该覆盖可独立解码视图的若干 tile对应的区域为可独立解码区域。 相应地, 将覆盖可独立解码区域的 tile的数量设置到 SEI消息的 tile_num字段, 根据覆盖可独立解码区域的 tile id集合设置 tile_ids字段, 该字段为若干个 tile的 id数组。 LCU对齐后, 根据相应的 cropping信息设置 SEI消息 cropping信息对应的字段。
步骤 S703: 根据步骤 S702, 确定当前 tile(k)是否属于覆盖可独立解码视 图的若干 tile之一, 如果 tile ( k )属于覆盖可独立解码视图的若干 tile之一, 则执行步骤 S704, 否则执行步骤 S705。
步骤 S704: 如果 tile ( k )属于覆盖可独立解码视图的若干 tile之一, 设置 之前已编码的拼接类型与翻转类型都相同的图片中的可独立解码区域为当前 图片 tile(k)的帧间参考候选区域。
步骤 S705: 如果 tile ( k )不属于覆盖可独立解码视图若干 tile之一, 则设 置之前已编码的图片中所有图片区域为当前图片 tile(k)的帧间参考候选区域。
步骤 S706: 选择使用帧内预测或帧间预测算法编码 tile(k), 当采用帧间预 测算法编码时,从步骤 S704和 S705所述的帧间参考候选区域中选择最优的参 考区域。
步骤 S707: 如果 k小于 N, 即 tile(k)不是待编码图片中最后一个 tile, 则 k=k+1, 执行步骤 S703; 如果 k等于 N, 则编码结束。
上述图 13和图 14提供的编码方法, 在编码后的码流中添加了辅助消息, 由 SEI消息承载,该 SEI消息中的 Profile和 Level信息仅针对可独立解码区域 形成的子码流, 降低了对解码器性能的要求。 此外,在步骤 S704和 S705分别 针对可独立解码区域 tile和非可独立解码区域 tile设置不同的帧间参考候选区 域,既保证了区域内的编码块可独立解码,又扩大了区域外编码块的参考范围, 从而能够在编码中参考与当前块更为相似的编码块,提高了编码效率, 节省传 输数据量。
如图 15所示,为本发明实施例提供的一种视频解码方法,在本实施例中, 解码器解码经过如图 13和图 14所示流程编码后的视频码流, 即, 解码一长度 为 M的图片序列的过程如下:
步骤 S801 : 接收待解码的视频码流, 该视频码流包括若干 AU, 每一 AU 与一帧经过编码的图片对应。
步骤 S802: 从视频码流中获取一个 AU。
步骤 S803 :判断当前 AU是否包含帧拼接格式的图片,判断方法如下: 1 ) 当前 AU包含 FPA ( frame packing arrangement )消息, 且消息中的取消标志位 为 0; 2 ) 当前 AU不包含 FPA消息, 但待解码视频码流中之前接收到的最后 一个 FPA, 其取消标志位为 0。 满足 2个条件之一, 执行步骤 S804, 否则执行 步骤 S815。
步骤 S804: 判断当前 AU是否包含 SEI消息, 或之前的码流中已接收了 SEI消息, 是则继续执行步骤 S805, 否则转步骤 S816。
步骤 S805: 取出当前 AU中的经过编码的图片信息。
步骤 S806: 根据 FPA消息, 判断当前第 i帧图片的拼接类型和翻转类型是否与前一帧图片都相同, 若相同则执行步骤 S811, 否则执行步骤 S807。
步骤 S807: 根据当前帧的拼接类型和翻转类型, 找到 SEI消息中与此类型相应的参数, 获得此种类型对应的可独立解码区域标识信息、 cropping信息和 profile、 level信息。
步骤 S808: 判断解码器的性能是否符合 SEI消息中的 profile和 level, 如 果不符合则无法解码, 直接结束; 如果符合, 执行步骤 S809。
步骤 S809: 根据 SEI消息中的 profile和 level信息初始化解码器。
步骤 S810:从上述 SEI消息中获取可独立解码区域对应的若干 tile id集合。 步骤 S811 :根据上述 tile id的集合,取出覆盖可独立解码区域的若干 tile。 步骤 S812:解码覆盖可独立解码区域若干 tile内的图片,解码方法根据编 码过程中相应的编码方法而定。
步骤 S813: 根据上述 SEI消息中的 cropping信息裁减若干 tile内图像。 如果 cropping_enable_flag 为 false, 则不需要裁减; 否则取出若干 tile内 pic_crop_left_offset、 pic_crop_right_offset、 pic_crop_top_offset 和 pic_crop_bottom_offset标识的区域, 即若干 tile内可独立解码的视图。
步骤 S814: 输出可独立解码视图到如图 4所示的显示设备 201, 或者, 如 果输出时间未到, 暂存至如图 5所示緩存器 2021。
步骤 S815: 如果当前 AU为码流中最后一个 AU, 则解码结束; 否则执行 步骤 S802。
步骤 S816: 正常的解码流程。
采用本实施例提供的解码方法, 当解码器接收的是采用帧封装 3DTV技术编码的 3D视频码流, 而连接的是 2D显示设备时, 解码器可以根据 SEI消息只取出两个视图中的一个进行解码, 即仅对可独立解码区域进行解码, 降低了对解码器性能的要求, 并且节约了解码器的计算和存储资源。 参见步骤 S809, 对应可独立编码区域子码流的 profile和 level通常会降低对解码器的性能和存储要求, 所以初始化解码器后可节省解码器解码时间和电力消耗, 节省解码器存储需求。 如果解码器不满足原来 3D视频码流的 profile和 level要求, 而满足对应可独立编码区域子码流的 profile和 level要求, 则提高了解码器对高分辨率或高码率的兼容 2D显示的 3D视频码流的支持。
如图 16所示, 为本发明实施例提供的一种视频编码方法, 在本实施例中, 上述辅助消息由 SPS消息承载。 在 SPS消息中可包括可独立解码区域标识, 还可以包括可独立解码区域位置标识信息、 cropping信息、 profile和 level信息。 可独立解码区域与一个或多个 independent tile组成的矩形区域对应。 可独立解码区域外的区域与一个或多个 dependent tile组成的矩形区域对应。 在本实施例中, 待编码的视频, 即待编码的图片序列, 具有相同的拼接类型和翻转类型。 如果图片序列具有不同的拼接类型和翻转类型, 其编码过程与如图 11所示步骤 S403、 步骤 S404类似, 其解码过程与如图 15所示的步骤 S807类似, 不同在于辅助消息分别由 SEI消息和 SPS消息承载。
图 16所示的编码方法, 为编码器编码一视频的过程, 该视频为一个长度 为 M的图片序列, 其中每帧图片以两个视图拼接的形式为例, 其中一个视图 为 2D显示设备最终显示的部分, 即上述可独立解码区域。 对于本领域的技术 人员而言, 采用两个或两个以上视图拼接图片, 其编码方法均可经由此方法简 单变换得到。
步骤 S901 : 读取待编码视频的配置文件, 配置文件中存放编码器编码过 程中所需要预先确定的输入参数, 例如编码工具、 编码限制、 待编码图片的属 性等等。本实施例中配置文件中预设了每帧图片的可独立解码视图。 由于本实 施例中图片序列具有相同的拼接类型和翻转类型,因此每帧图片的可独立解码 视图相同。
步骤 S902: 根据配置文件确定图片序列中每帧图片的可独立解码视图, 例如, 一帧图片中的左视图或右视图。 设置当前帧为第 i帧, i=1。 步骤 S903: 根据配置文件确定 tile划分方案并划分图片。 按先从左向右, 再从上到下的顺序分配 tile id。 对于相同拼接类型且相同翻转类型的帧图片 tile划分相同。 设置当前 tile id为 k, k=1, tile的总数为 N。
步骤 S904: 根据步骤 S902确定的每帧图片的可独立解码的视图, 确定覆 盖可独立解码视图的若干 tile集合,定义这个集合中的 tile为 independent tile, 这个集合以外的 tile为 dependent tile。 该覆盖可独立解码视图的若干 tile均需 要满足上下左右边界 LCU对齐的要求。
步骤 S905: 设置 SPS 消息中参数, 根据可独立解码区域的标识信息、 cropping信息、 profile和 level信息设置 SPS消息中各字段, 其定义如下表。
profile_tier_level( ProfilePresentFlag, MaxNumSubLayersMinus1 ) {      Descriptor
    if( ProfilePresentFlag ) {
        ...
    }
    general_level_idc                                       u(8)
    indec_rgn_present_flag                                  u(1)
    if( indec_rgn_present_flag ) {
        tile_num                                            ue(v)
        for( i = 0; i < tile_num; i++ ) {
            tile_ids[ i ]                                   ue(v)
        }
        cropping_enable_flag                                u(1)
        new_profile_flag                                    u(1)
        new_level_flag                                      u(1)
        if( cropping_enable_flag ) {
            pic_crop_left_offset                            ue(v)
            pic_crop_right_offset                           ue(v)
            pic_crop_top_offset                             ue(v)
            pic_crop_bottom_offset                          ue(v)
        }
    }
    if( new_profile_flag ) {
        profile_idc                                         ue(v)
    }
    if( new_level_flag ) {
        level_idc                                           ue(v)
    }
    for( i = 0; i < MaxNumSubLayersMinus1; i++ ) {
        ...
    }
}

表中的 ue(v)表示该字段的长度可变, u(n)表示该字段的长度为 n位(bit), u(1)表示该字段的长度为 1位。
可独立解码区域标识:
indec_rgn_present_flag: 当视频中存在可独立解码视图时, 设为 true, 否则为 false。
可独立解码区域位置标识信息:
tile_num: 覆盖可独立解码视图包含的 tile数量。
tile_ids: 覆盖可独立解码视图包含的 tile id数组, 指示覆盖可独立解码视 图的若干 tile对应的 id集合。
以下为可独立解码区域的 cropping信息:
cropping_enable_flag: 如果可独立解码视图的宽度与覆盖可独立解码视图 若干 tile的总宽度相等, 且可独立解码视图的高度与覆盖可独立解码视图若干 tile的总高度相等, 则设置 cropping_enable_flag为 false, 否则为 true。
pic_crop_left_offset: 包含可独立解码视图相对于覆盖可独立解码视图若干 tile的最左边缘的横坐标, 以像素为单位。
pic_crop_right_offset: 包含可独立解码视图相对于覆盖可独立解码视图若 干 tile的最右边缘的横坐标, 以像素为单位。
pic_crop_top_offset:包含可独立解码视图相对于覆盖可独立解码视图若干 tile的最上边缘的纵坐标, 以像素为单位。
pic_crop_bottom_offset: 包含可独立解码视图相对于覆盖可独立解码视图若干 tile的最下边缘的纵坐标, 以像素为单位。
以下为可独立解码区域子码流的 profile和 level信息: new_profile_flag: 表示可独立解码区域子码流的 profile是否与整个码流的 profile的标识符相同, 若其值为 0, 则表示相同; 若其值为 1, 则表示不同。
new_level_flag: 表示可独立解码区域子码流的 level是否与整个码流的 level的标识符相同, 若其值为 0, 则表示相同; 若其值为 1, 则表示不同。
profile_idc: 可独立解码区域中的编码工具集符合的 profile id。
level_idc: 解码器需要满足的最低 level id。 根据可独立解码区域占整个图片的面积的比例, 计算解码可独立解码区域的码率和最大缓存, 例如: 解码整个图片的码率为 x, 最大缓存为 y, 可独立解码区域占整个图片的面积的比例为 r, 则可独立解码区域的码率为 x*r, 最大缓存为 y*r。 根据 profile_idc、 可独立解码区域的码率 x*r和最大缓存 y*r查找符合这个解码性能的最小 level, 设置 level_idc为上述最小 level。
步骤 S906: 获取第 i帧图片, 设置 i=1。
步骤 S907: 编码第 i帧图片, 具体的编码过程将在下图 17详细介绍。 步骤 S908: 将当前第 i帧图片编码后对应的 AU输出,保存在自身或外挂 的存储装置,例如:源装置 100的视频存储器 101或编码器 102的緩存器 1021 中, 或者直接通过网络传输给远程的接收装置 200。
步骤 S909: 如果 i等于 M (第 i帧为最后一帧)则编码结束, 否则 i=i+1, 执行步骤 S906。
图 17为图 16所示视频编码方法中步骤 S907中具体编码一帧图片的方法 流程示意图。
步骤 S1001: tile id为 k, 设置 k=1。
步骤 S1002: 根据 SPS消息中的 tile_ids字段内容, 确定当前 tile(k)是否属于覆盖可独立解码视图的若干 tile之一, 如果 tile ( k )属于覆盖可独立解码视图若干 tile之一, 则执行步骤 S1003, 否则执行步骤 S1004。
步骤 S1003 : 如果 tile ( k )属于覆盖可独立解码视图若干 tile之一, 设置 之前已编码的拼接类型与翻转类型都相同的帧图片中的可独立解码区域为当 前帧图片 tile(k)的帧间参考候选区域。
步骤 S1004: 如果 tile ( k ) 不属于覆盖可独立解码视图若干 tile之一, 则 设置之前已编码的帧图片中所有图片区域为当前帧图片 tile(k)的帧间参考候选 区域。
步骤 S1005: 选择使用帧内预测或帧间预测算法编码 tile(k), 当采用帧间预测算法编码时, 从步骤 S1003和 S1004所述的帧间参考候选区域中选择最优的参考区域。 当采用帧内预测算法编码时, 如果 tile(k)不属于覆盖可独立解码区域的若干 tile之一, 即为 dependent tile, 则 tile(k)可以将邻近的 independent tile中的图像块作为选择最优参考块的候选范围。
步骤 S1006: 如果 k小于 N, 即 tile(k)不是待编码图片中最后一个 tile, 则 k=k+1, 执行步骤 S1002; 如果 k等于 N, 则编码结束。
上述图 16和图 17提供的编码方法, 在编码后的码流的现有 SPS消息中新增字段标识可独立解码区域相关信息, 以实现辅助消息的功能。 该 SPS消息中的 Profile和 Level信息仅针对可独立解码区域形成的子码流, 降低了对解码器性能的要求。 此外, 在步骤 S1003和 S1004分别针对可独立解码区域 tile和非可独立解码区域 tile设置不同的帧间参考候选区域, 既保证了区域内的编码块可独立解码, 又扩大了区域外编码块的参考范围, 从而能够在编码中参考与当前块更为相似的编码块, 提高了编码效率, 节省传输数据量。 如图 18所示, 为本发明实施例提供的一种视频解码方法, 在本实施例中, 解码器解码经过如图 16和图 17所示流程编码后的视频码流, 即, 解码一长度为 M的图片序列的过程如下:
步骤 S1101 : 接收待解码的视频码流, 该视频码流包括若干 AU,每一 AU 与一帧经过编码的图片对应。
步骤 S1102: 获取视频码流中的 SPS消息, 判断 SPS消息中的 indec_rgn_present_flag 字段是否为 true, 是则继续解码, 否则执行步骤 S1114;
步骤 S1103: 获取视频码流中的 SPS消息中的 profile和 level信息, 判断解码器的性能是否符合 SPS消息中的 profile和 level, 如果不符合则无法解码, 直接结束; 如果符合, 执行步骤 S1104。
步骤 S1104: 根据上述的 profile和 level信息初始化解码器。
步骤 S1105: 从 SPS消息中获取覆盖可独立解码视图对应的 tile id集合。 步骤 S1106: 从视频码流中获取一个 AU。
步骤 S1107: 判断当前 AU是否包含帧拼接格式的图片, 判断方法如下: 1 ) 当前 AU包含 FPA ( frame packing arrangement ) 消息, 且消息中的取消标志位为 0; 2 ) 当前 AU不包含 FPA消息, 但码流中之前接收到的最后一个 FPA消息, 其取消标志位为 0。 满足 2个条件之一, 则转下一步, 否则转步骤 S1114。
步骤 S1108: 取出当前 AU中的经过编码的图片信息。
步骤 S1109: 根据步骤 S1105获得的覆盖可独立解码视图对应的 tile id集 合, 取出覆盖可独立解码视图的若干 tile。 步骤 S1110: 解码覆盖可独立解码视图若干 tile内的图片, 解码方法根据 编码过程中相应的编码方法而定。
步骤 S1111: 根据上述 SPS消息中的 cropping信息裁减若干 tile内图片。 如果 cropping_enable_flag 为 false, 则不需要裁减; 否则取出若干 tile内 pic_crop_left_offset、 pic_crop_right_offset、 pic_crop_top_offset 和 pic_crop_bottom_offset标识的区域, 即若干 tile内可独立解码的视图。
步骤 S1112: 输出可独立解码视图到如图 4所示的显示设备 201, 或者, 如果输出时间未到, 暂存至如图 5所示緩存器 2021。
步骤 S1113: 如果当前 AU为码流中最后一个 AU, 则解码结束; 否则执 行步骤 S1106。
步骤 S1114: 正常的解码流程。
采用本实施例提供的解码方法, 当解码器接收的是采用帧封装 3DTV技术编码的 3D视频码流, 而连接的是 2D显示设备时, 解码器可以根据 SEI消息只取出两个视图中的一个进行解码, 即仅对可独立解码区域进行解码, 降低了对解码器性能的要求, 并且节约了解码器的计算和存储资源。 参见步骤 S1104, 对应可独立编码区域子码流的 profile和 level通常会降低对解码器的性能和存储要求, 所以初始化解码器后可节省解码器解码时间和电力消耗, 节省解码器存储需求。 如果解码器不满足原来 3D视频码流的 profile和 level要求, 而满足对应可独立编码区域子码流的 profile和 level要求, 则提高了解码器对高分辨率或高码率的兼容 2D显示的 3D视频码流的支持。
本领域普通技术人员可以理解,实现上述实施例所示视频编码和解码方法 可以通过程序指令相关的硬件来完成,所述的程序可以存储于可读取存储介质 中, 该程序在执行时执行上述方法中的对应步骤。 所述的存储介质可以如:
ROM/RAM、 磁碟、 光盘等。
以上所述仅是本发明的优选实施方式, 应当指出,对于本技术领域的普通 技术人员来说, 在不脱离本发明原理的前提下, 还可以做出若干改进和润饰, 这些改进和润饰也应视为本发明的保护范围。

Claims

权利要求
1、 一种视频编码方法, 所述视频由图片序列组成, 其特征在于, 包括: 根据所述视频对应的配置文件, 确定待编码图片中的可独立解码视图; 将所述图片划分成至少两个分块(tile ), 其中覆盖可独立解码视图的一个 或多个 tile对应的区域为可独立解码区域;
生成与所述图片对应的辅助消息,所述辅助消息中包含所述可独立解码区 域位置标识, 所述可独立解码区域位置标识由一个或多个分块标识 (tile id ) 组成; 以及
编码所述图片包含的所有 tile, 以至形成编码后的视频码流, 所述编码后 的视频码流中包括所述辅助消息。
2、 如权利要求 1所述的方法, 其特征在于, 所述辅助消息进一步包括可 独立解码区域标识,所述可独立解码区域标识用以标识所述图片是否包含可独 立解码区域。
3、 如权利要求 1或 2所述的方法, 其特征在于, 所述辅助消息进一步包 括解码所述可独立解码区域的裁减(cropping )信息, 所述 cropping信息由所 述可独立解码视图相对于可独立解码区域的上、 下、左或右边界的横坐标或纵 坐标组成。
4、 如权利要求 1至 3任一所述的方法, 其特征在于, 所述辅助消息进一 步包括解码所述可独立解码区域的档次(profile )信息, 所述 profile信息用以 标识所述可独立解码区域中的编码工具集。
5、 如权利要求 1至 4任一所述的方法, 其特征在于, 所述辅助消息进一 步包括解码所述可独立解码区域的级别 (level )信息, 所述 level信息用以标 识解码器需要满足的 level信息。
6、 如权利要求 1至 5任一所述的方法, 其特征在于, 所述编码所述图片 包含的所有 tile的步骤进一步包括:
判断当前待编码 tile是否为可独立解码区域内的 tile;
如果是,设置已编码图片的可独立解码区域为当前 tile的帧间参考候选区 域;
如果不是,设置已编码图片的整个图片区域为当前 tile的帧间参考候选区 域;
当采用帧间算法编码时,根据上述待编码 tile对应的帧间参考候选区域选 择最优参考区域。
7、 如权利要求 1至 6所述的方法, 所述图片序列包括不同拼接类型和翻 转类型的图片;所述配置文件存储所述图片序列中每帧图片的拼接类型和翻转 类型, 以及不同拼接类型和翻转类型图片对应的可独立解码视图。
8、 如权利要求 7所述的方法, 所述辅助消息中进一步包括不同拼接类型 和翻转类型图片对应的可独立解码区域位置标识。
9、 如权利要求 1至 8任一所述的方法, 其特征在于, 所述辅助消息由辅 助增强信息 ( Supplemental Enhancement Information, SEI )承载。
10、 如权利要求 1至 8任一所述的方法, 其特征在于, 所述辅助消息由序列参数集 ( Sequence Parameter Set, SPS )承载。
11、 一种视频解码方法, 其特征在于, 包括:
接收视频码流, 所述视频码流包括待解码的视频和辅助消息, 所述待解码 的视频由待解码的图片序列组成; 获取待解码图片;
根据所述辅助消息,获得所述待解码图片的可独立解码区域的可独立解码 区域位置标识, 所述可独立解码区域位置标识由一个或多个分块(tile ) 的分 块标识( tile id )组成; 以及
根据可独立解码区域位置标识获得所述待解码图片的可独立解码区域,解 码所述可独立解码区域。
12、 如权利要求 11所述的方法, 其特征在于, 所述辅助消息进一步包括 可独立解码区域标识,所述可独立解码区域标识用以标识所述图片是否包含可 独立解码区域。
13、 如权利要求 11或 12所述的方法, 其特征在于, 所述辅助消息进一步 包括解码所述可独立解码区域对应的裁减( cropping )信息, 所述 cropping信 息由可独立解码视图相对于可独立解码区域的上、 下、左或右边界的横坐标或 纵坐标组成。
14、 如权利要求 13所述的方法, 其特征在于, 该方法进一步包括: 根据辅助消息中的 cropping信息,裁减可独立解码区域得到可独立解码视 图。
15、 如权利要求 11至 13任一所述的方法, 其特征在于, 所述辅助消息进 一步包括解码所述可独立解码区域的档次(profile )信息, 所述 profile信息用 以标识所述可独立解码区域中的编码工具集。
16、 如权利要求 11至 15任一所述的方法, 其特征在于, 所述辅助消息进 一步包括解码所述可独立解码区域的级别 (level )信息, 所述 level信息用以 标识解码器需要满足的 level信息。
17、如权利要求 11至 16所述的方法, 所述待解码的图片序列包括不同拼 接类型和翻转类型的图片。
18、 如权利要求 17所述的方法, 所述辅助消息中进一步包括不同拼接类 型和翻转类型图片对应的可独立解码区域位置标识。
19、 如权利要求 18所述的方法, 所述辅助消息中进一步包括不同拼接类 型和翻转类型图片对应的解码所述可独立解码区域的 cropping信息。
20、 如权利要求 18所述的方法, 所述辅助消息中进一步包括不同拼接类 型和翻转类型图片对应的解码所述可独立解码区域的 profile信息。
21、 如权利要求 18所述的方法, 所述辅助消息中进一步包括不同拼接类 型和翻转类型图片对应的解码所述可独立解码区域的 level信息。
22、 如权利要求 11至 21任一所述的方法, 其特征在于, 所述辅助消息由 辅助增强信息 ( Supplemental Enhancement Information, SEI )承载。
23、 如权利要求 11至 21任一所述的方法, 其特征在于, 所述辅助消息由序列参数集( Sequence Parameter Set, SPS )承载。
24、 一种视频编码器, 所述视频由图片序列组成, 其特征在于, 包括: 存储单元, 用于存储所述视频的配置文件;
可独立解码视图确认单元, 用于根据所述视频对应的配置文件, 确定待编 码图片中的可独立解码视图;
分块划分单元, 用于将所述图片划分成至少两个分块(tile ), 其中覆盖可 独立解码视图的一个或多个 tile对应的区域为可独立解码区域;
辅助消息生成单元, 用于生成与所述图片对应的辅助消息, 所述辅助消息 中包含所述可独立解码区域位置标识,所述可独立解码区域位置标识由一个或 多个分块标识( tile id )组成; 以及
执行编码单元, 用于编码所述图片包含的所有 tile, 以至形成编码后的视 频码流, 所述编码后的视频码流中包括所述辅助消息。
25、 如权利要求 24所述的视频编码器, 其特征在于, 所述执行编码单元 进一步包括:
判断单元, 用于判断当前待编码 tile是否为可独立解码区域内的 tile; 如 果是, 设置已编码图片的可独立解码区域为当前 tile的帧间参考候选区域; 如 果不是, 设置已编码图片的整个图片区域为当前 tile的帧间参考候选区域; 当 采用帧间算法编码时,根据上述待编码 tile对应的帧间参考候选区域选择最优 参考区域。
26、 一种视频解码器, 其特征在于, 包括:
接收单元, 用于接收视频码流, 所述视频码流包括待解码的视频和辅助消 息, 所述待解码的视频由待解码的图片序列组成; 以及
执行解码单元, 用于获取待解码图片; 根据所述辅助消息, 获得所述待解 码图片的可独立解码区域的可独立解码区域位置标识,所述可独立解码区域位 置标识由一个或多个分块(tile ) 的分块标识(tile id )组成; 以及根据可独立 解码区域位置标识获得所述待解码图片的可独立解码区域,解码所述可独立解 码区域。
27、一种编码器,用于编码视频,所述视频由图片序列组成,其特征在于, 包括:
一个或多个处理器;
一个或多个存储器; 一个或多个程序, 其中, 所述一个或多个程序存储于所述一个或多个存储 器中, 并且, 所述一个或多个程序用于被所述一个或多个处理器执行, 所述一 个或多个程序包括:
指令, 用于根据所述视频对应的配置文件,确定待编码图片中的可独立解 码视图;
指令, 用于将所述图片划分成至少两个分块(tile ), 其中覆盖可独立解码 视图的一个或多个 tile对应的区域为可独立解码区域;
指令, 用于生成与所述图片对应的辅助消息, 所述辅助消息中包含所述可 独立解码区域位置标识,所述可独立解码区域位置标识由一个或多个分块标识 ( tile id )组成; 以及
指令, 用于编码所述图片包含的所有 tile, 以致形成编码后的视频码流, 所述编码后的视频码流中包括所述辅助消息。
28、 如权利要求 27所述的编码器, 其特征在于, 所述辅助消息进一步包 括如下信息之一: 可独立解码区域标识、 解码所述可独立解码区域的裁减 ( cropping )信息、 解码可独立解码区域的档次( profile )信息、 以及解码可 独立解码区域的级别 (level )信息。
29、 一种解码器, 其特征在于, 包括:
一个或多个处理器;
一个或多个存储器; 一个或多个程序, 其中, 所述一个或多个程序存储于所述一个或多个存储 器中, 并且, 所述一个或多个程序用于被所述一个或多个处理器执行, 所述一 个或多个程序包括:
指令, 用于接收视频码流, 所述视频码流包括待解码的视频和辅助消息, 所述待解码的视频由待解码的图片序列组成;
指令, 用于获取待解码图片;
指令, 用于根据所述辅助消息, 获得所述待解码图片的可独立解码区域的 可独立解码区域位置标识, 所述可独立解码区域位置标识由一个或多个分块 ( tile ) 的分块标识(tile id )组成; 以及
指令,用于根据可独立解码区域位置标识获得所述待解码图片的可独立解 码区域, 解码所述可独立解码区域。
30、 如权利要求 29所述的解码器, 其特征在于, 所述辅助消息进一步包 括如下信息之一: 可独立解码区域标识、 解码所述可独立解码区域的裁减
( cropping )信息、 解码可独立解码区域的档次( profile )信息、 以及解码可 独立解码区域的级别 (level )信息。
31、 一种编码器, 设置于处理视频的源装置中, 用于编码视频, 所述视频由图片序列组成, 其特征在于, 包括:
一个或多个电路, 用于根据所述视频对应的配置文件,确定待编码图片中 的可独立解码视图; 将所述图片划分成至少两个分块(tile ), 其中覆盖可独立 解码视图的一个或多个 tile对应的区域为可独立解码区域; 生成与所述图片对 应的辅助消息, 所述辅助消息中包含所述可独立解码区域位置标识,所述可独 立解码区域位置标识由一个或多个分块标识(tile id )组成; 以及, 编码所述 图片包含的所有 tile, 以致形成编码后的视频码流, 所述编码后的视频码流中 包括所述辅助消息。
32、 如权利要求 31所述的编码器, 其特征在于, 所述辅助消息进一步包 括如下信息之一: 可独立解码区域标识、 解码所述可独立解码区域的裁减 ( cropping )信息、 解码可独立解码区域的档次( profile )信息、 以及解码可 独立解码区域的级别 (level )信息。
33、 一种解码器, 设置于处理视频的接收装置中, 其特征在于, 包括: 一个或多个电路, 用于接收视频码流, 所述视频码流包括待解码的视频和 辅助消息, 所述待解码的视频由待解码的图片序列组成; 获取待解码图片; 根 据所述辅助消息,获得所述待解码图片的可独立解码区域的可独立解码区域位 置标识, 所述可独立解码区域位置标识由一个或多个分块(tile ) 的分块标识 ( tile id )组成; 根据可独立解码区域位置标识获得所述待解码图片的可独立 解码区域, 解码所述可独立解码区域。
34、 如权利要求 33所述的解码器, 其特征在于, 所述辅助消息进一步包 括如下信息之一: 可独立解码区域标识、 解码所述可独立解码区域的裁减
( cropping )信息、 解码可独立解码区域的档次( profile )信息、 以及解码可 独立解码区域的级别 (level )信息。
35、 一种计算机可读存储介质, 所述计算机可读存储介质存储若干指令, 当所述若干指令被设备执行时, 将触发所述设备执行如下操作:
根据视频对应的配置文件,确定所述视频中待编码图片中的可独立解码视 图;
将所述图片划分成至少两个分块(tile ), 其中覆盖可独立解码视图的一个 或多个 tile对应的区域为可独立解码区域;
生成与所述图片对应的辅助消息,所述辅助消息中包含所述可独立解码区 域位置标识, 所述可独立解码区域位置标识由一个或多个分块标识 (tile id ) 组成; 以及
编码所述图片包含的所有 tile, 以致形成编码后的视频码流, 所述编码后 的视频码流中包括所述辅助消息。
36、 如权利要求 35所述的计算机可读存储介质, 其特征在于, 所述辅助消息进一步包括如下信息之一: 可独立解码区域标识、 解码所述可独立解码区域的裁减 (cropping )信息、 解码可独立解码区域的档次(profile )信息、 以及解码可独立解码区域的级别 (level )信息。
37、 一种计算机可读存储介质, 所述计算机可读存储介质存储若干指令, 当所述若干指令被设备执行时, 将触发所述设备执行如下操作:
接收视频码流, 所述视频码流包括待解码的视频和辅助消息, 所述待解码 的视频由待解码的图片序列组成;
获取待解码图片;
根据所述辅助消息,获得所述待解码图片的可独立解码区域的可独立解码 区域位置标识, 所述可独立解码区域位置标识由一个或多个分块(tile ) 的分 块标识( tile id )组成; 以及
根据可独立解码区域位置标识获得所述待解码图片的可独立解码区域,解 码所述可独立解码区域。
38、 如权利要求 37所述的计算机可读存储介质, 其特征在于, 所述辅助消息进一步包括如下信息之一: 可独立解码区域标识、 解码所述可独立解码区域的裁减 (cropping )信息、 解码可独立解码区域的档次(profile )信息、 以及解码可独立解码区域的级别 (level )信息。
39、 一种视频编码及解码的系统, 其特征在于, 包括: 一个如权利要求 24或 25或 27或 28或 31或 32所述的编码器, 以及, 一个如权利要求 26或 29或 30或 33或 34所述的解码器。
40、 一种视频码流, 其特征在于, 包括: 待解码的视频和辅助消息, 所述 待解码的视频由待解码的图片序列组成;所述辅助消息包括指示所述图片系列 的可独立解码区域的可独立解码区域位置标识,所述可独立解码区域位置标识 由一个或多个分块(tile ) 的分块标识 (tile id )组成。
PCT/CN2012/082494 2012-09-29 2012-09-29 视频编码及解码方法、装置及系统 WO2014047943A1 (zh)

Priority Applications (11)

Application Number Priority Date Filing Date Title
CN201810107420.8A CN108419076B (zh) 2012-09-29 2012-09-29 视频编码及解码方法、装置及系统
PCT/CN2012/082494 WO2014047943A1 (zh) 2012-09-29 2012-09-29 视频编码及解码方法、装置及系统
CN201810107231.0A CN108429917B (zh) 2012-09-29 2012-09-29 视频编码及解码方法、装置及系统
JP2015533407A JP6074509B2 (ja) 2012-09-29 2012-09-29 映像符号化及び復号化方法、装置及びシステム
CN201280001898.3A CN103907350B (zh) 2012-09-29 2012-09-29 视频编码及解码方法、装置及系统
KR1020157010905A KR101661436B1 (ko) 2012-09-29 2012-09-29 비디오 인코딩 및 디코딩 방법, 장치 및 시스템
EP12885466.8A EP2887663B1 (en) 2012-09-29 2012-09-29 Method, apparatus and system for encoding and decoding video
AU2012391251A AU2012391251B2 (en) 2012-09-29 2012-09-29 Method, apparatus and system for encoding and decoding video
US14/631,658 US11089319B2 (en) 2012-09-29 2015-02-25 Video encoding and decoding method, apparatus and system
US14/631,645 US20150172692A1 (en) 2012-09-29 2015-02-25 Video encoding and decoding method, apparatus and system
US17/375,936 US11533501B2 (en) 2012-09-29 2021-07-14 Video encoding and decoding method, apparatus and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2012/082494 WO2014047943A1 (zh) 2012-09-29 2012-09-29 视频编码及解码方法、装置及系统

Related Child Applications (2)

Application Number Title Priority Date Filing Date
US14/631,658 Continuation US11089319B2 (en) 2012-09-29 2015-02-25 Video encoding and decoding method, apparatus and system
US14/631,645 Continuation US20150172692A1 (en) 2012-09-29 2015-02-25 Video encoding and decoding method, apparatus and system

Publications (1)

Publication Number Publication Date
WO2014047943A1 true WO2014047943A1 (zh) 2014-04-03

Family

ID=50386905

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2012/082494 WO2014047943A1 (zh) 2012-09-29 2012-09-29 视频编码及解码方法、装置及系统

Country Status (7)

Country Link
US (3) US11089319B2 (zh)
EP (1) EP2887663B1 (zh)
JP (1) JP6074509B2 (zh)
KR (1) KR101661436B1 (zh)
CN (3) CN108429917B (zh)
AU (1) AU2012391251B2 (zh)
WO (1) WO2014047943A1 (zh)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109417642A (zh) * 2016-07-01 2019-03-01 Sk电信有限公司 用于高分辨率影像流的影像比特流生成方法和设备
WO2019137313A1 (zh) * 2018-01-12 2019-07-18 华为技术有限公司 一种媒体信息的处理方法及装置
CN114501070A (zh) * 2022-04-14 2022-05-13 全时云商务服务股份有限公司 视频会议同步额外信息的编解码方法、处理方法和系统

Families Citing this family (20)

JP5893346B2 (ja) * 2011-11-07 2016-03-23 キヤノン株式会社 画像符号化装置、画像符号化方法及びプログラム、画像復号装置、画像復号方法及びプログラム
EP2835969A4 (en) * 2012-04-06 2015-09-09 Sony Corp DECODING APPARATUS, DECODING METHOD, ENCODING APPARATUS, AND ENCODING METHOD
RU2674312C2 (ru) * 2013-07-22 2018-12-06 Сони Корпорейшн Устройство и способ обработки информации
CN104935933B (zh) * 2015-06-05 2019-11-26 广东中星微电子有限公司 一种视频编解码方法
US10034026B2 (en) 2016-04-22 2018-07-24 Akila Subramaniam Device for and method of enabling the processing of a video stream
US20180054613A1 (en) 2016-08-22 2018-02-22 Mediatek Inc. Video encoding method and apparatus with in-loop filtering process not applied to reconstructed blocks located at image content discontinuity edge and associated video decoding method and apparatus
CN115834909B (zh) * 2016-10-04 2023-09-19 有限公司B1影像技术研究所 图像编码/解码方法和计算机可读记录介质
CN109587478B (zh) * 2017-09-29 2023-03-31 华为技术有限公司 一种媒体信息的处理方法及装置
EP3547704A1 (en) * 2018-03-30 2019-10-02 Thomson Licensing Method, apparatus and stream for volumetric video format
WO2019199025A1 (ko) * 2018-04-09 2019-10-17 SK Telecom Co., Ltd. Method and apparatus for encoding/decoding an image
CN112544084B (zh) * 2018-05-15 2024-03-01 Sharp Kabushiki Kaisha Image encoding device, encoded stream extraction device, and image decoding device
CN109525842B (zh) * 2018-10-30 2022-08-12 深圳威尔视觉科技有限公司 Position-based multi-tile arrangement encoding method, apparatus, and device, and decoding method
CN109587490B (zh) * 2018-11-05 2022-05-31 深圳威尔视觉传媒有限公司 Tile slice padding method, apparatus, device, and storage medium, and decoding method
WO2020146662A1 (en) * 2019-01-09 2020-07-16 Futurewei Technologies, Inc. Sub-picture identifier signaling in video coding
GB2585042A (en) * 2019-06-25 2020-12-30 Sony Corp Image data encoding and decoding
US11303935B2 (en) * 2019-07-10 2022-04-12 Qualcomm Incorporated Deriving coding system operational configuration
CN110446070A (zh) * 2019-07-16 2019-11-12 重庆爱奇艺智能科技有限公司 Video playback method and apparatus
US11785239B2 (en) * 2021-06-29 2023-10-10 Tencent America LLC Independent coded region output supplementary enhancement information message
CN113660529A (zh) * 2021-07-19 2021-11-16 镕铭微电子(济南)有限公司 Tile-coding-based video stitching, encoding, and decoding method and apparatus
WO2023131937A1 (ko) * 2022-01-07 2023-07-13 LG Electronics Inc. Feature encoding/decoding method and apparatus, recording medium storing a bitstream, and bitstream transmission method

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1965321A (zh) * 2003-09-07 2007-05-16 Microsoft Corporation Slice layer in a video codec
CN101075462A (zh) * 2006-05-19 2007-11-21 Sony Corporation Recording/reproducing/editing apparatus, method, and program
CN102461173A (zh) * 2009-06-09 2012-05-16 Thomson Licensing Decoding apparatus, decoding method, and editing apparatus

Family Cites Families (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1982517A4 (en) * 2006-01-12 2010-06-16 Lg Electronics Inc MULTIVATE VIDEO PROCESSING
US7949054B2 (en) * 2006-06-01 2011-05-24 Microsoft Corporation Flexible data organization for images
JP5326234B2 (ja) * 2007-07-13 2013-10-30 Sony Corporation Image transmission device, image transmission method, and image transmission system
CN101583029B (zh) * 2008-05-13 2011-01-19 Novatek Microelectronics Corp. Entropy decoding circuit and method, and entropy decoding method using pipelining
US8897359B2 (en) 2008-06-03 2014-11-25 Microsoft Corporation Adaptive quantization for enhancement layer video coding
US9071843B2 (en) * 2009-02-26 2015-06-30 Microsoft Technology Licensing, Llc RDP bitmap hash acceleration using SIMD instructions
WO2011003231A1 (zh) * 2009-07-06 2011-01-13 Huawei Technologies Co., Ltd. Transmission method, receiving method, and apparatus for scalable video coding files
US20110286533A1 (en) * 2010-02-23 2011-11-24 Fortney Douglas P Integrated recording and video on demand playback system
WO2011126284A2 (en) 2010-04-05 2011-10-13 Samsung Electronics Co., Ltd. Method and apparatus for encoding video by using adaptive prediction filtering, method and apparatus for decoding video by using adaptive prediction filtering
GB2481612A (en) * 2010-06-30 2012-01-04 Skype Ltd Updating image regions in a shared image system
US9596447B2 (en) * 2010-07-21 2017-03-14 Qualcomm Incorporated Providing frame packing type information for video coding
US20120170648A1 (en) * 2011-01-05 2012-07-05 Qualcomm Incorporated Frame splitting in video coding
JP5747559B2 (ja) * 2011-03-01 2015-07-15 Fujitsu Limited Video decoding method, video encoding method, video decoding apparatus, and video decoding program
US9118928B2 (en) * 2011-03-04 2015-08-25 Ati Technologies Ulc Method and system for providing single view video signal based on a multiview video coding (MVC) signal stream
US9398307B2 (en) * 2011-07-11 2016-07-19 Sharp Kabushiki Kaisha Video decoder for tiles
ES2805313T3 * 2011-08-11 2021-02-11 Sun Patent Trust Image encoding method, image decoding method, image encoding apparatus, image decoding apparatus, and image encoding/decoding apparatus
US20130188709A1 (en) * 2012-01-25 2013-07-25 Sachin G. Deshpande Video decoder for tiles with absolute signaling
EP2868092A4 (en) * 2012-07-02 2016-05-04 Nokia Technologies Oy METHOD AND DEVICE FOR VIDEO CODING
TWI669952B (zh) * 2012-09-18 2019-08-21 Vid Scale, Inc. Method and apparatus for region-of-interest video coding using tiles and tile groups
ITTO20120901A1 * 2012-10-15 2014-04-16 Rai Radiotelevisione Italiana Method for encoding and decoding a digital video, and related encoding and decoding devices

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP2887663A4 *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109417642A (zh) * 2016-07-01 2019-03-01 SK Telecom Co., Ltd. Method and device for generating a video bitstream for high-resolution video streaming
CN109417642B (zh) * 2016-07-01 2021-06-22 SK Telecom Co., Ltd. Method and device for generating a video bitstream for high-resolution video streaming
WO2019137313A1 (zh) * 2018-01-12 2019-07-18 Huawei Technologies Co., Ltd. Media information processing method and apparatus
CN110035331A (zh) * 2018-01-12 2019-07-19 Huawei Technologies Co., Ltd. Media information processing method and apparatus
US11172239B2 (en) 2018-01-12 2021-11-09 Huawei Technologies Co., Ltd. Media information processing method and apparatus
CN114501070A (zh) * 2022-04-14 2022-05-13 全时云商务服务股份有限公司 Encoding/decoding method, processing method, and system for synchronizing supplementary information in video conferencing

Also Published As

Publication number Publication date
CN108419076A (zh) 2018-08-17
US20150172693A1 (en) 2015-06-18
CN108419076B (zh) 2022-04-29
AU2012391251B2 (en) 2016-04-21
US11089319B2 (en) 2021-08-10
CN108429917A (zh) 2018-08-21
EP2887663A1 (en) 2015-06-24
KR101661436B1 (ko) 2016-09-29
JP2015534376A (ja) 2015-11-26
EP2887663B1 (en) 2017-02-22
CN103907350A (zh) 2014-07-02
CN108429917B (zh) 2022-04-29
CN103907350B (zh) 2018-02-23
US20210344942A1 (en) 2021-11-04
JP6074509B2 (ja) 2017-02-01
AU2012391251A1 (en) 2015-05-07
US20150172692A1 (en) 2015-06-18
KR20150063126A (ko) 2015-06-08
EP2887663A4 (en) 2015-08-26
US11533501B2 (en) 2022-12-20

Similar Documents

Publication Publication Date Title
WO2014047943A1 (zh) Video encoding and decoding method, apparatus, and system
JP6644903B2 (ja) Video encoding/decoding apparatus, method, and computer program
US20200252608A1 (en) Sub-partition intra prediction
US10841619B2 (en) Method for decoding a video bitstream
JP5947405B2 (ja) Video encoding method and apparatus
US9578326B2 (en) Low-delay video buffering in video coding
JP2015535405A (ja) Method and apparatus for video coding
KR20210095959A (ko) 비디오 인코더, 비디오 디코더 및 대응하는 방법
US20230060709A1 (en) Video coding supporting subpictures, slices and tiles
JP2023041687A (ja) Encoder, decoder, and corresponding methods for signaling tile configuration
JP2023515175A (ja) Encoder, decoder, and corresponding methods for simplifying signaling of slice header syntax elements
JP6120667B2 (ja) Image processing apparatus, imaging apparatus, image processing method, program, and recording medium
RU2810966C1 Methods for signaling the combination of reference picture resampling and spatial scalability
RU2806281C1 Method for signaling subpicture partitioning in a coded video stream
WO2024091953A1 (en) Systems and methods of video decoding with dynamic noise reconstruction
WO2023056179A1 (en) Histogram of gradient generation
WO2024064447A1 (en) Systems and methods of video decoding with dynamic noise reconstruction
WO2024030692A1 (en) Systems and methods of video decoding with improved buffer storage and bandwidth efficiency

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 12885466; Country of ref document: EP; Kind code of ref document: A1)
REEP Request for entry into the european phase (Ref document number: 2012885466; Country of ref document: EP)
WWE Wipo information: entry into national phase (Ref document number: 2012885466; Country of ref document: EP)
ENP Entry into the national phase (Ref document number: 2015533407; Country of ref document: JP; Kind code of ref document: A)
NENP Non-entry into the national phase (Ref country code: DE)
ENP Entry into the national phase (Ref document number: 20157010905; Country of ref document: KR; Kind code of ref document: A)
ENP Entry into the national phase (Ref document number: 2012391251; Country of ref document: AU; Date of ref document: 20120929; Kind code of ref document: A)