WO2018001208A1 - Encoding and decoding method and device - Google Patents

Info

Publication number
WO2018001208A1
WO2018001208A1 PCT/CN2017/090067 CN2017090067W WO2018001208A1 WO 2018001208 A1 WO2018001208 A1 WO 2018001208A1 CN 2017090067 W CN2017090067 W CN 2017090067W WO 2018001208 A1 WO2018001208 A1 WO 2018001208A1
Authority
WO
WIPO (PCT)
Prior art keywords
spatial
image
panoramic image
layout format
default
Prior art date
Application number
PCT/CN2017/090067
Other languages
English (en)
French (fr)
Inventor
马祥
杨海涛
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to EP17819205.0A (EP3468199B1)
Priority to EP21167189.6A (EP3934255A1)
Priority to KR1020197001684A (KR102243120B1)
Publication of WO2018001208A1
Priority to US16/234,107 (US10805606B2)

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/597 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/103 Selection of coding mode or of prediction mode
    • H04N19/105 Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/119 Adaptive subdivision aspects, e.g. subdivision of a picture into rectangular or non-rectangular coding blocks
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/167 Position within a video image, e.g. region of interest [ROI]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/176 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/70 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards

Definitions

  • The present invention relates to the field of video encoding, decoding, and compression, and in particular to an encoding and decoding method and apparatus suitable for panoramic images.
  • Digital video capabilities can be incorporated into a wide range of devices, including digital televisions, digital live broadcast systems, wireless broadcast systems, personal digital assistants (PDAs), laptop or desktop computers, tablet computers, e-book readers, digital cameras, digital recording devices, digital media players, video game devices, video game consoles, cellular or satellite radio telephones, video conferencing devices, video streaming devices, and the like.
  • Digital video devices implement video compression techniques, such as those described in the standards defined by MPEG-2, MPEG-4, ITU-T H.263, ITU-T H.264/MPEG-4 Part 10 Advanced Video Coding (AVC), and ITU-T H.265 High Efficiency Video Coding (HEVC), and in the extensions of those standards, so that digital video information can be transmitted and received more efficiently.
  • Video devices may transmit, receive, encode, decode, and/or store digital video information more efficiently by implementing these video codec techniques.
  • In the field of video coding, a frame is a complete image. A video is played by showing frames one after another in a given order and at a given frame rate.
  • When the frame rate is high enough, the interval between two frames is shorter than the resolution limit of the human eye and a brief persistence of vision occurs, so the images appear to move on the screen.
  • The basis for compressing video files is the compression coding of single-frame digital images.
  • Within one frame, much of the spatial structure is the same or similar. For example, the colors of sampling points in the same object or background are closely correlated and similar.
  • Across frames, a frame is usually strongly correlated with the frame before or after it, and the pixel values describing them differ only slightly. All of this is content that can be compressed.
  • In the same way, a video file contains not only spatially redundant information but also a large amount of temporally redundant information, which results from how video is composed.
  • The frame rate of video sampling is generally 25 to 30 frames per second, and in special cases 60 frames per second. That is, the sampling interval between two adjacent frames is about 1/30 to 1/25 second. Over such a short period the sampled images contain a large amount of similar information and are strongly correlated.
  • Video compression coding uses various techniques to remove redundant information from the video sequence in order to reduce storage space and save transmission bandwidth.
  • Video compression processing technologies mainly include intra prediction, inter prediction, transform and quantization, entropy coding, and deblocking filtering.
  • The compression coding approaches described below are chroma sampling, predictive coding, transform coding, and quantization coding.
  • Chroma sampling: This method makes full use of the psychovisual characteristics of the human eye and tries, starting from the underlying data representation, to minimize the amount of data used to describe a single element.
  • Most television systems use luminance-chrominance-chrominance (YUV) color coding, the standard widely adopted by European television systems.
  • The YUV color space includes a luminance signal Y and two color-difference signals U and V, and the three components are independent of each other.
  • The YUV color model is more flexible in representation and takes up less transmission bandwidth, which makes it superior to the traditional red, green, and blue (RGB) color model.
  • The YUV 4:2:0 format means that the two chrominance components U and V each have only half as many samples as the luminance component Y in both the horizontal and the vertical direction. That is, among four sampled pixels there are four luminance samples Y but only one U sample and one V sample.
  • Represented in this way, the amount of data is further reduced, to about 50% of the full-resolution original (six samples for every four pixels instead of twelve).
  • Exploiting the physiological visual characteristics of the human eye through this kind of chroma sampling is one of the most widely used video data compression methods.
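The sample counts behind chroma sampling can be checked with a short calculation. The sketch below is only illustrative: the function name and the table of subsampling divisors are assumptions introduced here, and the baseline is taken to be a full-resolution 4:4:4 frame with three equally sized planes.

```python
def samples_per_frame(width, height, scheme):
    """Count luma plus chroma samples in one frame for a chroma sampling scheme."""
    # (horizontal divisor, vertical divisor) applied to each chroma plane.
    # This table is an illustrative assumption, not part of the source text.
    divisors = {"4:4:4": (1, 1), "4:2:2": (2, 1), "4:2:0": (2, 2)}
    h_div, v_div = divisors[scheme]
    luma = width * height
    chroma_plane = (width // h_div) * (height // v_div)
    return luma + 2 * chroma_plane  # one Y plane plus the U and V planes


full = samples_per_frame(1920, 1080, "4:4:4")
sub = samples_per_frame(1920, 1080, "4:2:0")
ratio = sub / full  # fraction of the 4:4:4 sample count kept by 4:2:0
print(full, sub, ratio)
```

By this count a 4:2:0 frame holds half as many samples as a 4:4:4 frame of the same size, because each chroma plane shrinks to a quarter while the luma plane is untouched.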
  • Predictive coding: The data of previously encoded frames is used to predict the frame currently being encoded.
  • Prediction yields a predicted value, which is not exactly equal to the actual value, so a residual remains. The better the prediction, the closer the predicted value is to the actual value and the smaller the residual, so encoding the residual greatly reduces the amount of data. At the decoding end, the residual is added to the predicted value to restore and reconstruct the original image.
  • This is the basic idea of predictive coding. In the mainstream coding standards, predictive coding is divided into two basic types: intra prediction and inter prediction.
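The residual idea can be illustrated with a deliberately simplified one-dimensional predictor. This is a sketch under stated assumptions, not the intra or inter prediction of any standard: each sample is predicted from the previous sample, the initial prediction of 0 is an arbitrary convention shared by both sides, and all function names are hypothetical.

```python
def encode_residuals(samples):
    """Predict each sample from the previous one and keep only the residual."""
    residuals = []
    prediction = 0  # assumed initial prediction, known to encoder and decoder
    for value in samples:
        residuals.append(value - prediction)
        prediction = value  # the next sample is predicted from this one
    return residuals


def decode_residuals(residuals):
    """Rebuild the samples by adding each residual to the running prediction."""
    samples = []
    prediction = 0
    for residual in residuals:
        value = prediction + residual
        samples.append(value)
        prediction = value
    return samples


row = [100, 102, 103, 103, 105, 110]
res = encode_residuals(row)
print(res)                     # small values after the first sample
print(decode_residuals(res))   # reconstructs the original row
```

Apart from the first sample, the residuals are much smaller than the original values, which is what makes them cheaper to encode.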
  • Transform coding: Instead of encoding the original spatial-domain information directly, the sample values are converted from the current domain into another, artificially defined domain (usually called the transform domain) by some transform function. Compression coding is then performed according to the distribution of the information in the transform domain.
  • The reason for transform coding is that video image data tends to be highly correlated in the spatial domain, which produces a large amount of redundant information, so encoding it directly requires a large number of bits.
  • In the transform domain the data correlation is greatly reduced, so the redundant information to be encoded is reduced and the amount of data required for encoding drops accordingly. This yields a higher compression ratio and a better compression effect.
  • Typical transform coding methods include the Karhunen-Loève (K-L) transform, the Fourier transform, and the like.
  • The integer discrete cosine transform (DCT) is a transform coding method commonly used in many international standards.
  • Quantization coding: The transform coding described above does not itself compress data.
  • The quantization process is the powerful means of compressing data, and it is also the main source of data loss in lossy compression.
  • Quantization is the process of forcibly mapping input values with a large dynamic range onto a smaller set of output values. Because the range of quantizer inputs is large, many bits are needed to represent them, whereas the range of outputs after this forced mapping is small and can be represented with only a few bits.
  • Each quantizer input is normalized to one quantizer output, that is, quantized to a certain level, commonly called a quantization level (usually specified by the encoder).
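Transform and quantization can be seen working together in a small one-dimensional sketch. It uses a floating-point orthonormal DCT-II rather than the integer DCT of the standards, and a single uniform step size; both choices, and all function names, are assumptions made for illustration.

```python
import math


def dct(x):
    """Orthonormal 1-D DCT-II of a list of samples."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        out.append(scale * s)
    return out


def idct(c):
    """Inverse of dct() (an orthonormal DCT-III)."""
    n = len(c)
    out = []
    for i in range(n):
        s = 0.0
        for k in range(n):
            scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
            s += scale * c[k] * math.cos(math.pi * (i + 0.5) * k / n)
        out.append(s)
    return out


def quantize(coeffs, step):
    """Map each coefficient to an integer quantization level (lossy)."""
    return [round(c / step) for c in coeffs]


def dequantize(levels, step):
    """Map quantization levels back to approximate coefficient values."""
    return [level * step for level in levels]


flat = [50] * 8                      # a perfectly flat block
flat_levels = quantize(dct(flat), 4)
print(flat_levels)                   # only the first (DC) level is non-zero

block = [52, 55, 61, 66, 70, 61, 64, 73]
step = 4
recon = idct(dequantize(quantize(dct(block), step), step))
max_err = max(abs(a - b) for a, b in zip(block, recon))
print(max_err)                       # bounded by the step size, not zero
```

The transform alone is invertible and loses nothing; the loss comes from `quantize`, which is consistent with the statement above that quantization is where lossy compression discards data.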
  • The encoder control module selects the coding mode for each image block according to the local characteristics of the different image blocks in the video frame. Frequency-domain or spatial-domain prediction is performed on intra-prediction-coded blocks, and motion-compensated prediction is performed on inter-prediction-coded blocks. The prediction residual is then transformed and quantized to form residual coefficients, and finally the code stream is generated by an entropy encoder.
  • The intra or inter prediction reference signals are obtained by the decoding module at the encoding end.
  • The transformed and quantized residual coefficients are reconstructed by inverse quantization and inverse transform and then added to the prediction reference signal to obtain a reconstructed image.
  • Loop filtering performs pixel correction on the reconstructed image to improve its coding quality.
  • A variety of coding techniques for different scenes can be used to encode and transmit 360-degree omnidirectional visual information of three-dimensional space.
  • The receiving end decodes the received encoded data and then reproduces the 360-degree omnidirectional visual content through a dedicated display device, giving the user an immersive visual experience.
  • The application of this technology has led to a series of virtual reality (VR) products, such as Oculus Rift, Gear VR, and HTC Vive, which have rapidly become popular and have correspondingly placed higher requirements on coding technology.
  • The present invention provides an encoding and decoding method capable of improving coding efficiency. The method is suitable for encoding a two-dimensional panoramic image, particularly when the two-dimensional panoramic image includes a plurality of sub-images and the spatial positional relationship of those sub-images has a large influence on overall coding efficiency.
  • In a first aspect, an encoding method includes: determining the applicable spatial layout format of a two-dimensional panoramic image to be encoded and the spatial positional relationship of the sub-images of the two-dimensional panoramic image in the applicable spatial layout format; determining whether that spatial positional relationship is the same as the default spatial positional relationship in the applicable spatial layout format; and, when it is the same, encoding indication information indicating that the sub-images of the two-dimensional panoramic image adopt the default spatial positional relationship in the applicable spatial layout format, and encoding the sub-images of the two-dimensional panoramic image according to the applicable spatial layout format and the default spatial positional relationship to generate an encoded code stream.
  • The encoding method introduces, for the sub-images of the two-dimensional panoramic image, a default spatial positional relationship under the spatial layout format applicable to the image. Because the spatial positional relationship of the sub-images is normally relatively fixed, a single piece of indication information is enough to describe the commonly used spatial positional relationship clearly, whereas encoding the spatial positional relationship of each sub-image one by one costs more codewords. This saves codeword overhead and improves coding efficiency.
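The saving from a single indication can be sketched by listing the syntax elements an encoder would have to write in each case. Everything named here is hypothetical: `default_position_flag` and `face_index` are illustrative element names, and the default order is borrowed from the six-face example given later in this description.

```python
# Assumed default arrangement order of six sub-images (illustrative).
DEFAULT_ORDER = ["Left", "Front", "Right", "Top", "Rear", "Bottom"]


def signal_positions(actual_order, default_order=DEFAULT_ORDER):
    """Return the (name, value) syntax elements describing the sub-image order."""
    if actual_order == default_order:
        # One flag is enough: the decoder already knows the default order.
        return [("default_position_flag", 1)]
    # Otherwise fall back to describing every sub-image explicitly.
    elements = [("default_position_flag", 0)]
    for face in actual_order:
        elements.append(("face_index", default_order.index(face)))
    return elements


default_case = signal_positions(list(DEFAULT_ORDER))
custom_case = signal_positions(["Bottom", "Rear", "Left", "Front", "Right", "Top"])
print(len(default_case), len(custom_case))
```

In the default case one element is written instead of seven, and the gap grows with the number of sub-images.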
  • In the encoding method, the applicable spatial layout format of the two-dimensional panoramic image to be encoded may be obtained by derivation. Specifically, the width and height of the two-dimensional panoramic image are obtained, and the applicable spatial layout format is determined from the aspect ratio of the image and a preset correspondence table between aspect ratios and spatial layout formats.
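A derivation of this kind amounts to a table lookup keyed by the reduced aspect ratio. The table contents and layout names below are assumptions made for illustration; the source only states that a preset correspondence table exists.

```python
from fractions import Fraction

# Hypothetical correspondence table between aspect ratios and layout formats.
ASPECT_TO_LAYOUT = {
    Fraction(4, 3): "cube_4x3",
    Fraction(3, 2): "cube_3x2",
    Fraction(6, 1): "cube_6x1",
    Fraction(1, 6): "cube_1x6",
}


def derive_layout(width, height):
    """Return the layout format for an image size, or None if no entry matches."""
    # Fraction reduces width/height exactly, so 4096x3072 matches the 4:3 entry.
    return ASPECT_TO_LAYOUT.get(Fraction(width, height))


print(derive_layout(4096, 3072))
print(derive_layout(3000, 2000))
print(derive_layout(100, 100))
```

Using exact fractions avoids floating-point comparison problems; a size with no table entry returns `None`, in which case the format would have to be signaled some other way.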
  • The applicable spatial layout format of the two-dimensional panoramic image may also be encoded, for example as an index in the code stream, so that the decoding end can determine it directly from the index.
  • There may be more than one default spatial positional relationship; that is, multiple default spatial positional relationships may be available for the sub-images of the two-dimensional panoramic image.
  • In that case, the encoding method further includes: encoding a default spatial positional relationship index for the sub-images of the two-dimensional panoramic image, the index uniquely indicating one default spatial positional relationship among the multiple different default spatial positional relationships.
  • The advantage of this solution is that the optimal default spatial positional relationship of the sub-images differs between application scenarios, that is, between the kinds of content covered by the two-dimensional panoramic image. To improve coding efficiency in this situation, the present invention proposes setting multiple default spatial positional relationships to choose from.
  • In a further extension, the encoding method includes: determining whether the applicable spatial layout format of the two-dimensional panoramic image is the same as a default spatial layout format. Correspondingly, the step of determining whether the spatial positional relationship of the sub-images in the applicable spatial layout format is the same as the default spatial positional relationship changes as well: the decision is made according to whether the applicable spatial layout format is the same as the default spatial layout format, and when the applicable spatial layout format is the default spatial layout format, the spatial positional relationship of the sub-images of the two-dimensional panoramic image is determined to be the default spatial positional relationship.
  • In this way, the spatial layout format of the two-dimensional panoramic image and the spatial positional relationship of its sub-images are handled together, which further saves coding overhead. At the same time, for coding flexibility, when the spatial layout format of the two-dimensional panoramic image is not the default spatial layout format, the spatial positional relationship of the sub-images is still allowed to be the default spatial positional relationship.
  • The process may be: when the applicable spatial layout format of the two-dimensional panoramic image is different from the default spatial layout format, determining whether the spatial positional relationship of the sub-images in the applicable spatial layout format is the same as the default spatial positional relationship in the applicable spatial layout format.
  • In actual use, the spatial positional relationship of the sub-images of the two-dimensional panoramic image may not be exactly the same as the default setting and yet be partly the same.
  • For example, the default order of the sub-images is Left->Front->Right->Top->Rear->Bottom, but the order actually used is Bottom->Rear->Left->Front->Right->Top; only the last four sub-images are arranged in the same order as in the default setting.
  • The corresponding encoding method may include: when the spatial positional relationship of the sub-images in the applicable spatial layout format is different from the default spatial positional relationship in the applicable spatial layout format, determining whether it is partly the same as the default spatial positional relationship; and if so, encoding, starting from the currently processed sub-image, indication information indicating that a certain range of sub-images adopts the default spatial positional relationship.
  • This approach is particularly suitable when the spatial layout format of the two-dimensional panoramic image includes many sub-spaces, such as 12 or 20, and the two-dimensional panoramic image correspondingly includes that number of sub-images.
  • The encoding method may further include determining the range of sub-images whose spatial positional relationship is the same as the default spatial positional relationship, for example determining that the fourth to eleventh sub-images are encoded using the same spatial positional relationship as the default.
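One possible way to find such a range is to search for the longest contiguous run of sub-images that appears in both the actual and the default order. This is a sketch of one interpretation of "partly the same" (matching relative order rather than matching absolute position); the function name and the brute-force search are assumptions, not the method prescribed by the source.

```python
def longest_default_run(actual, default):
    """Find the longest contiguous run of sub-images shared by both orders.

    Returns (start_in_actual, start_in_default, length).
    """
    best = (0, 0, 0)
    for i in range(len(actual)):
        for j in range(len(default)):
            length = 0
            # Extend the run while both sequences keep agreeing.
            while (i + length < len(actual)
                   and j + length < len(default)
                   and actual[i + length] == default[j + length]):
                length += 1
            if length > best[2]:
                best = (i, j, length)
    return best


default_order = ["Left", "Front", "Right", "Top", "Rear", "Bottom"]
actual_order = ["Bottom", "Rear", "Left", "Front", "Right", "Top"]
start_actual, start_default, run_length = longest_default_run(actual_order, default_order)
print(start_actual, start_default, run_length)
print(actual_order[start_actual:start_actual + run_length])
```

For the six-face example in the text, the run covers the last four sub-images of the actual order, so only the first two would need to be described explicitly.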
  • In the encoding method and the various extension schemes provided by the first aspect of the present invention, the spatial positional relationship of the sub-images of the panoramic image includes the arrangement order of the sub-images, or the rotation angle of each sub-image, or both the arrangement order and the rotation angle.
  • In a second aspect, a decoding method includes: receiving an encoded code stream of a two-dimensional panoramic image and determining the applicable spatial layout format of the two-dimensional panoramic image; parsing the encoded code stream of the two-dimensional panoramic image to determine whether the spatial positional relationship of the sub-images of the two-dimensional panoramic image is the default spatial positional relationship in the applicable spatial layout format; and, if it is, decoding the encoded code stream of the two-dimensional panoramic image according to the applicable spatial layout format and the default spatial positional relationship.
  • The decoding method introduces, for the sub-images of the two-dimensional panoramic image, a default spatial positional relationship under the spatial layout format applicable to the image. Because the spatial positional relationship of the sub-images is normally relatively fixed, the decoder only needs to decode one piece of indication information to obtain the spatial positional relationship of all sub-images, instead of decoding the spatial positional relationship of each sub-image one by one. This reduces decoding complexity, saves decoding time, and reduces the buffer space needed during decoding.
  • In the decoding method, the applicable spatial layout format of the two-dimensional panoramic image may be obtained by derivation: the width and height of the two-dimensional panoramic image are obtained, and the applicable spatial layout format is determined from the aspect ratio of the image and a preset correspondence between aspect ratios and spatial layout formats.
  • Alternatively, the applicable spatial layout format may be obtained directly by parsing the code stream; specifically, the encoded code stream of the two-dimensional panoramic image may be parsed to obtain it.
  • There may be more than one default spatial positional relationship; that is, multiple default spatial positional relationships may be available for the sub-images of the two-dimensional panoramic image.
  • In that case, the decoding method further includes: if the spatial positional relationship of the sub-images of the two-dimensional panoramic image is a default spatial positional relationship in the applicable spatial layout format, parsing the encoded code stream of the two-dimensional panoramic image to obtain a default spatial positional relationship index; obtaining, according to the index, the corresponding default spatial positional relationship of the sub-images from the multiple default spatial positional relationships; and decoding the encoded code stream of the two-dimensional panoramic image according to the applicable spatial layout format and that default spatial positional relationship.
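The decoder-side logic can be sketched as follows. Entropy decoding is abstracted away: the input is assumed to be an already-parsed sequence of integer symbols (flag, then index or explicit face indices). The layout name, the second default order, and the symbol layout are all hypothetical.

```python
# Hypothetical table of default arrangement orders per layout format.
DEFAULT_RELATIONS = {
    "cube_3x2": [
        ["Left", "Front", "Right", "Top", "Rear", "Bottom"],
        ["Right", "Rear", "Left", "Top", "Front", "Bottom"],  # illustrative only
    ],
}


def parse_positions(symbols, layout, defaults=DEFAULT_RELATIONS):
    """Recover the sub-image order from parsed symbols for a given layout."""
    it = iter(symbols)
    use_default = next(it)            # 1: a default relationship is used
    candidates = defaults[layout]
    if use_default:
        # The index is only present when several defaults exist.
        index = next(it) if len(candidates) > 1 else 0
        return list(candidates[index])
    # Explicit signaling: one face index per sub-image.
    names = candidates[0]
    return [names[next(it)] for _ in range(len(names))]


first_default = parse_positions([1, 0], "cube_3x2")
second_default = parse_positions([1, 1], "cube_3x2")
explicit = parse_positions([0, 5, 4, 0, 1, 2, 3], "cube_3x2")
print(first_default)
print(second_default)
print(explicit)
```

With a default in use, the decoder reads two symbols and looks up the rest, which is the reduction in parsing work described above.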
  • In a further extension, the decoding method includes: parsing the encoded code stream of the two-dimensional panoramic image to determine whether the applicable spatial layout format of the two-dimensional panoramic image is a default spatial layout format. Correspondingly, the step of determining whether the spatial positional relationship of the sub-images in the applicable spatial layout format is the default spatial positional relationship changes as well: the decision is made according to whether the applicable spatial layout format is the default spatial layout format, and when it is, the spatial positional relationship of the sub-images of the two-dimensional panoramic image is the default spatial positional relationship.
  • The solution may be: when the applicable spatial layout format of the two-dimensional panoramic image is not the default spatial layout format, determining whether the spatial positional relationship of the sub-images in the applicable spatial layout format is a default spatial positional relationship, and if it is, decoding according to the applicable spatial layout format and the default spatial positional relationship.
  • The decoding method provided by the second aspect of the present invention takes into account that, in actual use, the spatial positional relationship of the sub-images of the two-dimensional panoramic image may not be exactly the same as the default setting and yet be partly the same.
  • For example, the default order of the sub-images is Left->Front->Right->Top->Rear->Bottom, but the order actually used is Bottom->Rear->Left->Front->Right->Top; only the last four sub-images are arranged in the same order as in the default setting.
  • The corresponding decoding method may include: when parsing the code stream shows that the spatial positional relationship of the sub-images in the applicable spatial layout format is not the default spatial positional relationship, parsing the code stream to determine whether it is partly the same as the default spatial positional relationship in the applicable spatial layout format; and if so, decoding, starting from the currently processed sub-image, a certain range of sub-images in the decoding mode that uses the default spatial positional relationship.
  • This processing method is particularly suitable when the spatial layout format of the two-dimensional panoramic image includes many sub-spaces, such as 12 or 20, and the two-dimensional panoramic image correspondingly includes that number of sub-images.
  • The decoding method may further include parsing the code stream to determine the range of sub-images whose spatial positional relationship is the same as the default spatial positional relationship, for example determining that the fourth to eleventh sub-images are decoded using the same spatial positional relationship as the default.
  • In the decoding method, the spatial positional relationship of the sub-images of the panoramic image includes the arrangement order of the sub-images, or the rotation angle of each sub-image, or both the arrangement order and the rotation angle.
  • An encoding apparatus includes: a spatial layout format and spatial positional relationship determining unit, configured to determine the applicable spatial layout format of a two-dimensional panoramic image to be encoded and the spatial positional relationship of the sub-images of the two-dimensional panoramic image in the applicable spatial layout format; a judging unit, configured to determine whether the spatial positional relationship of the sub-images in the applicable spatial layout format is the same as the default spatial positional relationship in the applicable spatial layout format; and an encoding unit, configured to, when the two are the same, encode indication information indicating that the sub-images of the two-dimensional panoramic image adopt the default spatial positional relationship in the applicable spatial layout format, and encode the sub-images of the two-dimensional panoramic image according to the applicable spatial layout format and the default spatial positional relationship to generate an encoded code stream.
  • The encoding apparatus introduces, for the sub-images of the two-dimensional panoramic image, a default spatial positional relationship under the spatial layout format applicable to the image. Because the spatial positional relationship of the sub-images is normally relatively fixed, a single piece of indication information is enough to describe the commonly used spatial positional relationship clearly, whereas encoding the spatial positional relationship of each sub-image one by one costs more codewords. This saves codeword overhead and improves coding efficiency.
  • an apparatus for implementing the decoding method of the second aspect of the present invention includes: a spatial layout format determining unit, configured to receive an encoded code stream of the two-dimensional panoramic image and determine the applicable spatial layout format of the two-dimensional panoramic image; a spatial positional relationship determining unit, configured to parse the code stream to determine whether the spatial positional relationship of each sub-image of the two-dimensional panoramic image in the applicable spatial layout format is a default spatial positional relationship; and a decoding unit, configured to, if the spatial positional relationship of each sub-image of the two-dimensional panoramic image is the default spatial positional relationship, decode the encoded code stream of the two-dimensional panoramic image according to the applicable spatial layout format and the default spatial positional relationship.
  • the decoding apparatus introduces, for the sub-images of the two-dimensional panoramic image, a default spatial positional relationship in the spatial layout format applicable to that image. Because the spatial positional relationship of the sub-images of a two-dimensional panoramic image is normally relatively fixed, the decoder only needs to decode one piece of indication information to obtain the spatial positional relationship of all sub-images, instead of decoding the spatial positional relationship of each sub-image one by one. This reduces decoding complexity, saves decoding time, and reduces the buffer space needed during decoding.
  • an apparatus for implementing the encoding method of the first aspect of the invention, comprising a non-volatile storage medium and a central processing unit, the non-volatile storage medium storing an executable program; the central processing unit is coupled to the non-volatile storage medium and executes the executable program to implement the encoding method, and extensions thereof, provided by the first aspect of the present invention.
  • an apparatus for implementing the decoding method of the second aspect of the present invention, comprising a non-volatile storage medium and a central processing unit, the non-volatile storage medium storing an executable program; the central processing unit is coupled to the non-volatile storage medium and executes the executable program to implement the decoding method, and extensions thereof, provided by the second aspect of the present invention.
  • a codec system comprising the encoding apparatus provided by the third aspect of the invention and the decoding apparatus provided by the fourth aspect of the invention, the decoding apparatus being used to decode an encoded code stream from the encoding apparatus.
  • FIG. 1 is a block diagram of a VR device system to which an embodiment of the present invention is applied;
  • FIG. 2 is a latitude and longitude map of a panoramic image;
  • FIG. 3 is a two-dimensional panorama in a polyhedron format;
  • FIG. 4 is a schematic diagram of an image acquisition device to which an embodiment of the present invention is applied;
  • FIG. 5 is a schematic diagram of an image encoder to which an embodiment of the present invention is applied;
  • FIG. 6 is a schematic diagram of a VR display device to which an embodiment of the present invention is applied;
  • FIG. 7 is a comparative diagram of different spatial layout formats of a two-dimensional panoramic image;
  • FIG. 8 is a schematic diagram showing the position marking of each subspace in a hexahedron spatial layout format;
  • FIG. 10 is an illustration of an encoding process of a two-dimensional panoramic image in a hexahedron spatial layout format;
  • FIG. 11 is a flow chart of an encoding method according to an embodiment of the present invention;
  • FIG. 12 is a flow chart of a decoding method according to an embodiment of the present invention;
  • FIG. 13 is a block diagram of an encoding apparatus according to an embodiment of the present invention;
  • FIG. 14 is a block diagram of a decoding apparatus according to an embodiment of the present invention.
  • FIG. 1 is a system architecture diagram of a VR device 100 to which an embodiment of the present invention is applied.
  • the VR device 100 has an image acquisition device 110 connected to the VR display device 150 via a transmission device 130.
  • the transmission device 130 may be constituted by a LAN (Local Area Network), a WLAN (Wireless Local Area Network), the Internet, an LTE network, or the like.
  • the image acquisition device 110 includes N cameras arranged in the area for which a panoramic image needs to be generated. The N cameras acquire images of their respective viewpoints, and the images of the respective viewpoints are stitched to form a spherical panorama.
  • the spherical panorama can be understood as an image in a spherical format centered on the image acquisition device 110.
  • an image in spherical format cannot be conveniently represented, stored, and indexed, so the prior art usually expands the spherical panorama into a two-dimensional planar panorama, which is then compressed, processed, stored, transmitted, and so on.
  • the operation of expanding a three-dimensional spherical panorama to obtain a two-dimensional planar panorama is called mapping.
  • there are a variety of mapping methods, corresponding to a variety of two-dimensional planar panorama formats.
  • the most common panorama format is called the latitude and longitude map. As shown in Figure 2, in the latitude and longitude map, the image near the north and south pole regions is stretched, and there is severe distortion and data redundancy.
  • the spherical panorama content can be mapped to a polyhedron.
  • the polyhedron can be a cube (hexahedron), so that the spherical panorama is represented by the faces of the polyhedron.
  • in that case the spherical panorama is represented by the six equal-sized square faces of the cube, and the images mapped onto the six faces of the cube are unfolded directly according to their spatial adjacency.
  • a cross-shaped two-dimensional image is obtained, as shown in FIG. 3(b).
  • an image in this format can also be called a two-dimensional image in hexahedron format.
  • the cross-shaped image shown in FIG. 3(b) can be processed directly, and the smallest rectangular area that encloses the cross-shaped image can be selected as the processing target. Because the cross-shaped image does not cover the whole rectangular area, the rectangle still contains redundant space after all of the cross-shaped content has been placed in it. This redundant portion can be filled with default content, for example all black or all white, or padded by extending pixel values of the cross-shaped image.
  • the six faces of the hexahedron may also be separated, that is, the cross-shaped image is disassembled and rearranged into a rectangular area according to a given geometric arrangement, for example 1x6, 2x3, 3x2, or 6x1, as shown in FIG. 3(c). Such an arrangement is referred to herein as a spatial layout format.
  • Figure 3(d) shows the hexahedral spatial layout format of the 3x2 arrangement.
  • after the three-dimensional spherical panorama has been mapped to a two-dimensional planar panoramic image using one of the mapping methods described above, or another mapping method not listed here, the image acquisition device 110 applies further image encoding (compression, post-processing) to the panoramic image and then transmits it to the VR display device 150 via the transmission device 130.
  • some parameter information of the N cameras 111-1 to 111-N that is needed for mapping the three-dimensional spherical panorama into a two-dimensional planar panorama also needs to be transmitted through the transmission device 130.
  • this parameter information includes, but is not limited to, focal length, image depth, distortion, camera tilt angle, distance between cameras, the angle covered by the field of view, and the like.
  • the VR display device 150 is configured to receive the compressed data transmitted from the image acquisition device 110, that is, the encoded panoramic image, to decode the compressed data to reconstruct the panoramic image, and to display the panoramic image on the display screen of the VR display device 150 based on the reconstructed panoramic image data.
  • FIG. 4 is a system block diagram of the image acquisition device 110, which includes a camera portion 111 comprising N cameras 111-1 to 111-N, an image processing portion 113, and a data transmission portion 115.
  • the N cameras 111-1 to 111-N of the camera portion 111 are arranged in the area for which a panoramic image needs to be generated, to acquire the sub-images required for stitching the panoramic image.
  • the panoramic image is preferably a spherical panoramic image that carries image information over 360 degrees of the three-dimensional space, including above and below.
  • the camera 111-1 has an imager 111-1a and a signal processor 111-1b.
  • the imager 111-1a includes an imaging lens and an image sensing device (not shown).
  • the imaging lens is a fixed-focus or zoom optical device composed of one or more groups of coaxial lenses.
  • the image sensing device may be a CCD (Charge Coupled Device) or a CMOS (Complementary Metal Oxide Semiconductor) sensor.
  • the signal processor 111-1b performs sampling, gain control, analog-to-digital conversion, white balance adjustment, and gamma correction on the analog image signal output from the imager 111-1a to generate a digital image.
  • the image processing unit 113 includes an image splicer 1132, an image mapper 1134, and an image encoder 1136.
  • the image splicer 1132 stitches the N digital images from the N cameras 111-1 to 111-N of the camera portion 111. The stitching process can be understood simply as follows: the spatial positional relationship of the N digital images is determined from the spatial positional relationship of the N cameras; feature matching is then performed on adjacent digital images, and the adjacent digital images are corrected and aligned according to the result of the feature matching, so that the N digital images are stitched into a three-dimensional panoramic image.
  • the image mapper 1134 is configured to map the three-dimensional panoramic image generated by the image splicer 1132 into a two-dimensional panoramic image. The mapping process is as described above. The mapped two-dimensional panoramic image may be in latitude and longitude format or in a polyhedron format; polyhedron formats such as the hexahedron format keep image distortion lower.
  • the image encoder 1136 is configured to encode the two-dimensional panoramic image generated by the image mapper 1134 and output it to the data transmission portion 115.
  • the image encoder 1136 can operate in accordance with a video compression standard (e.g., the High Efficiency Video Coding (HEVC) H.265 standard) and can follow the HEVC Test Model (HM).
  • alternatively, the image encoder 1136 can operate in accordance with other proprietary or industry standards, including ITU-T H.261, ISO/IEC MPEG-1 Visual, ITU-T H.262 or ISO/IEC MPEG-2 Visual, ITU-T H.263, ISO/IEC MPEG-4 Visual, and ITU-T H.264 (also known as ISO/IEC MPEG-4 AVC), including its scalable video coding (SVC) and multiview video coding (MVC) extensions.
  • Image encoder 1136 may be implemented using a variety of processing circuits, such as one or more microprocessors, digital signal processors (DSPs), application specific integrated circuits (ASICs), field programmable gate arrays (FPGAs), discrete logic, hardware, or any combination thereof. If the technology is implemented partially or wholly in software, the device may store the instructions of the software in a suitable non-transitory computer readable storage medium and may execute the instructions in hardware using one or more processors to perform the techniques of the present invention. Any of the foregoing (including hardware, software, a combination of hardware and software, etc.) can be considered as one or more processors.
  • the image encoder 1136 encodes the two-dimensional panoramic image. Temporally consecutive frames of two-dimensional panoramic images constitute a panoramic video, and the panoramic video is encoded by the image encoder 1136 to form a panoramic video code stream.
  • the code stream contains encoded information of the panoramic video data in the form of a bit stream.
  • the encoded information may include encoded picture data and associated data.
  • Associated data can include sequence parameter sets (SPS), picture parameter sets (PPS), and other syntax structures.
  • An SPS can contain parameters that are applied to zero or more sequences.
  • the PPS can contain parameters that are applied to zero or more pictures.
  • a syntax structure refers to a collection of zero or more syntax elements arranged in a specified order in a code stream.
  • image encoder 1136 may segment the image to be encoded into a grid of coding tree blocks (CTBs).
  • a CTB may be referred to as a "tree block," a "largest coding unit" (LCU), or a "coding tree unit."
  • the CTB is not limited to a particular size and may include one or more coding units (CUs).
  • Each CTB can be associated with a block of pixels of equal size within the picture.
  • Each pixel can correspond to one luminance (luma) sample and two chrominance (chroma) samples.
  • each CTB can be associated with one luma sample block and two chroma sample blocks.
  • the CTBs of a picture can be divided into one or more slices.
  • each slice contains an integer number of CTBs.
  • image encoder 1136 can generate the encoded information of each slice of the picture, that is, of the CTBs within the slice.
  • image encoder 1136 can recursively perform quadtree partitioning on the pixel block associated with a CTB, dividing it into progressively smaller pixel blocks. Each of the smaller pixel blocks can be associated with a CU.
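  • The recursive quadtree partitioning of a CTB described above can be sketched as follows. This is only an illustration: the `should_split` callback, the `min_size` limit, and the example split rule are hypothetical stand-ins for the encoder's actual decision (which would typically be based on rate-distortion cost).

```python
def quadtree_partition(x, y, size, should_split, min_size=8):
    """Return the leaf blocks (x, y, size) obtained by recursively
    splitting the square block at (x, y) into four equal quadrants."""
    if size <= min_size or not should_split(x, y, size):
        return [(x, y, size)]  # leaf: this block becomes one CU
    half = size // 2
    leaves = []
    for dy in (0, half):          # top row of quadrants, then bottom row
        for dx in (0, half):      # left quadrant, then right quadrant
            leaves.extend(
                quadtree_partition(x + dx, y + dy, half, should_split, min_size)
            )
    return leaves


# Hypothetical split rule: split the 64x64 root and its top-left 32x32 child.
def split_top_left(x, y, size):
    return size == 64 or (size == 32 and x == 0 and y == 0)


cus = quadtree_partition(0, 0, 64, split_top_left)
```

  • Whatever split rule is used, the leaves always tile the original CTB exactly, which is why each leaf can be handled as an independent CU.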
  • FIG. 5 is a schematic block diagram of an image encoder 1136 according to an embodiment of the present invention, including an encoding end prediction module 1361, a transform quantization module 1362, an entropy encoding module 1363, an encoding reconstruction module 1364, and an encoding end filtering module 1365, specifically:
  • the encoding end prediction module 1361 is configured to generate prediction data.
  • the encoding end prediction module 1361 may generate one or more prediction units (PUs) for each CU that is no longer partitioned. Each PU of a CU may be associated with a different pixel block within the pixel block of the CU.
  • the encoding end prediction module 1361 can generate a predictive pixel block for each PU of the CU.
  • the encoding end prediction module 1361 may use intra prediction or inter prediction to generate the predictive pixel block of a PU. If it uses intra prediction, the encoding end prediction module 1361 may generate the predictive pixel block of the PU based on the decoded pixels of the picture associated with the PU.
  • if it uses inter prediction, the encoding end prediction module 1361 may generate the predictive pixel block of the PU based on the decoded pixels of one or more pictures other than the picture associated with the PU.
  • the encoding end prediction module 1361 may generate a residual pixel block of the CU based on the predictive pixel blocks of the PUs of the CU.
  • the residual pixel block of the CU indicates the difference between the sample values in the predictive pixel blocks of the PUs of the CU and the corresponding sample values in the original pixel block of the CU.
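  • As a minimal sketch of the residual computation just described, the residual is the sample-wise difference between the original block and the predictive block. The function name and the list-of-rows block representation are assumptions made for illustration only.

```python
def residual_block(original, predicted):
    """Sample-wise difference between an original pixel block and its
    predictive pixel block (both given as equally sized lists of rows)."""
    return [
        [o - p for o, p in zip(orig_row, pred_row)]
        for orig_row, pred_row in zip(original, predicted)
    ]


original = [[52, 55], [61, 59]]
predicted = [[50, 50], [60, 60]]
residual = residual_block(original, predicted)
```

  • The better the prediction, the closer the residual values are to zero, and the fewer bits the later transform, quantization, and entropy coding stages need.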
  • Transform quantization module 1362 is for processing the predicted residual data.
  • Image encoder 1136 may perform recursive quadtree partitioning on the residual pixel blocks of the CU to partition the residual pixel blocks of the CU into one or more smaller residual pixel blocks associated with the transform units (TUs) of the CU. Because the pixels in the pixel block associated with the TU each correspond to one luma sample and two chroma samples, each TU can be associated with one luma residual sample block and two chroma residual sample blocks.
  • Image encoder 1136 may apply one or more transforms to the residual sample block associated with a TU to generate a coefficient block.
  • the transform can be a DCT or a variant thereof.
  • the two-dimensional transform is computed by applying a one-dimensional transform in the horizontal and vertical directions, which yields the coefficient block.
  • Image encoder 1136 may perform a quantization procedure on each coefficient block. Quantization generally refers to mapping the coefficients to a smaller set of values in order to reduce the amount of data used to represent them, thereby providing further compression.
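  • The separable transform and the quantization step can be illustrated with the sketch below. It uses a floating-point orthonormal DCT-II and a single uniform quantization step, both of which are simplifying assumptions; real codecs such as HEVC use integer transform approximations and more elaborate quantizers, and the function names here are hypothetical.

```python
import math


def dct_1d(v):
    """Orthonormal one-dimensional DCT-II of a sequence of numbers."""
    n = len(v)
    out = []
    for k in range(n):
        s = sum(v[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


def dct_2d(block):
    """Two-dimensional DCT computed as a horizontal 1-D pass over the rows
    followed by a vertical 1-D pass over the columns."""
    rows = [dct_1d(r) for r in block]               # horizontal pass
    cols = [dct_1d(list(c)) for c in zip(*rows)]    # vertical pass
    return [list(r) for r in zip(*cols)]            # back to row-major order


def quantize(coeffs, step):
    """Uniform quantization: divide by the step size and round."""
    return [[int(round(c / step)) for c in row] for row in coeffs]


flat = [[10] * 4 for _ in range(4)]   # a constant 4x4 residual block
coeffs = dct_2d(flat)
levels = quantize(coeffs, 8)
```

  • For a constant block, all of the energy ends up in the single DC coefficient at position (0, 0), which shows why smooth residuals compress well after transform and quantization.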
  • Image encoder 1136 may generate a set of syntax elements that represent coefficients in the quantized coefficient block.
  • Image encoder 1136 may apply an entropy encoding operation (e.g., a context adaptive binary arithmetic coding (CABAC) operation) to some or all of the above syntax elements through the entropy encoding module 1363.
  • entropy encoding module 1363 can binarize the syntax elements to form a binary sequence that includes one or more bits (referred to as "bins").
  • Entropy encoding module 1363 may encode some of the bins using regular encoding, and may encode the other bins using bypass encoding.
  • image encoder 1136 may apply inverse quantization and inverse transform to the transformed coefficient block to reconstruct the residual sample block from the transformed coefficient block.
  • Image encoder 1136 may add the reconstructed residual sample block to a corresponding sample block of one or more predictive sample blocks to produce a reconstructed sample block.
  • image encoder 1136 can reconstruct the block of pixels associated with the TU. The pixel block of each TU of the CU is reconstructed in this way until the entire pixel block reconstruction of the CU is completed.
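  • The reconstruction step can be sketched as adding the reconstructed residual back onto the prediction. Clipping the result to the valid sample range is an assumption added here for illustration (the text above does not mention it), as are the function name and the 8-bit default.

```python
def reconstruct_block(predicted, residual, bit_depth=8):
    """Add a reconstructed residual block to a predictive block and clip
    each sample to the range [0, 2**bit_depth - 1] (assumed here)."""
    max_val = (1 << bit_depth) - 1
    return [
        [min(max(p + r, 0), max_val) for p, r in zip(pred_row, res_row)]
        for pred_row, res_row in zip(predicted, residual)
    ]


recon = reconstruct_block([[250, 10], [100, 128]], [[10, -20], [3, -4]])
```

  • Because the encoder reconstructs blocks from the same quantized data the decoder will see, both sides use identical reference pixels for later predictions.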
  • after image encoder 1136 reconstructs the pixel block of the CU, it performs a deblocking filtering operation through the encoding end filtering module 1365 to reduce the blocking artifacts of the pixel block associated with the CU. After the deblocking filtering operation, image encoder 1136 may use a sample adaptive offset (SAO) to modify the reconstructed pixel blocks of the CTBs of the picture. After performing these operations, image encoder 1136 may store the reconstructed pixel block of the CU in a decoded picture buffer for use in generating predictive pixel blocks for other CUs.
  • the data transmission portion 115 is responsible for transmitting the encoded data (the code stream) to the VR display device 150 via the transmission device 130.
  • the data transmission portion 115 can be a radio interface circuit that is coupled to a controller and adapted to generate wireless communication signals, for example for communicating with a cellular communication network, a wireless communication system, or a wireless local area network; the signals are sent to other device(s) via an antenna coupled to the radio interface circuit.
  • the VR display device 150 is typically a head mounted viewing device, typically a pair of glasses with a built-in illuminated screen for displaying video images.
  • the device is equipped with a position and direction sensing system that can track various movements of the user's head and present the video image content of the corresponding position and direction to the screen. Therefore, as shown in FIG. 6, the VR display device 150 includes at least a data receiving unit 151, a data processing unit 153, and a data display unit 155.
  • the VR terminal device may further include a high-level interactive function module such as a user's gaze tracking system, and present the user's area of interest to the screen.
  • the data receiving unit 151 is configured to receive the data (the code stream) transmitted by the image acquisition device 110 via the transmission device 130 and to pass the received data to the data processing unit 153.
  • the data receiving unit 151 may consist of an antenna and a radio interface circuit: the antenna receives the data transmitted from another data transmitting device via a cellular communication network, a wireless communication system, or a wireless local area network and passes it to the radio interface circuit, which sends it to the data processing unit 153.
  • the data processing unit 153 includes at least an image decoder 1532, for decoding the code stream from the data receiving unit 151, and a memory 1534. The code stream contains, in the form of a bitstream, the encoded information of the video data encoded by the image encoder 1136.
  • the memory 1534 is for storing an image to be displayed after decoding.
  • the image decoder 1532 includes an entropy decoding module 1532a, a decoding reconstruction module 1532b, and a decoding filtering module 1532c;
  • Entropy decoding module 1532a parses the code stream to extract syntax elements from the code stream.
  • the image decoder 1532 may perform regular decoding on some of the bins and bypass decoding on the other bins. The bins in the code stream have a mapping relationship with the syntax elements, and the syntax elements are obtained by parsing the bins.
  • the decoding reconstruction module 1532b may reconstruct a picture of the video data based on the syntax elements extracted from the code stream.
  • the process of reconstructing video data based on syntax elements is generally the inverse of the process performed by image encoder 1136 to generate the syntax elements.
  • image decoder 1532 can generate a predictive pixel block of a PU of a CU based on syntax elements associated with the CU.
  • image decoder 1532 can inverse quantize the coefficient blocks associated with the TUs of the CU.
  • Image decoder 1532 may perform an inverse transform on the inverse quantized coefficient block to reconstruct a residual pixel block associated with the TU of the CU.
  • Image decoder 1532 may reconstruct a block of pixels of the CU based on the predictive pixel block and the residual pixel block.
  • after the decoding reconstruction module 1532b reconstructs the pixel block of the CU, the decoding filtering module 1532c performs a deblocking filtering operation to reduce the blocking artifacts of the pixel block associated with the CU. Additionally, image decoder 1532 can perform the same SAO operations as image encoder 1136 based on one or more SAO syntax elements. After performing these operations, image decoder 1532 may store the pixel block of the CU in a decoded picture buffer, namely the memory 1534. The decoded picture buffer can provide reference pictures for subsequent motion compensation, intra prediction, and presentation by the display device.
  • the data display unit 155 is configured to read the decoded image stored in the memory 1534 and display it on a display screen.
  • the display screen may be a liquid crystal display (LCD), a plasma display panel, a thin-film transistor (TFT) display, an organic light-emitting diode (OLED) display, a surface-conduction electron-emitter display (SED), a laser video display, a carbon nanotube display, a quantum dot display, or an interferometric modulator display (IMOD).
  • when the encoding end prediction module 1361 of the image acquisition device 110 performs predictive encoding on an image block to be encoded in the two-dimensional panoramic image generated by the image mapper 1134, the reconstructed pixels at adjacent positions around the image block are used to generate the predicted value of the image block to be encoded.
  • the closer the prediction information is to the pixels of the image block to be encoded, the closer the obtained predicted value is to the pixel values of that block, the smaller the residual, and the lower the corresponding bit rate. The correlation between the reconstructed pixel values around the image block to be encoded and the pixel values of the block itself therefore directly affects the coding efficiency.
  • in inter-frame coding, if the arrangement of the images on the faces of the polyhedron is not ideal, the coding efficiency is also affected when a boundary between adjacent images is crossed during motion estimation.
  • for a two-dimensional panoramic image in hexahedron format, six sub-images constitute the image, and the six sub-images generally have four spatial layout formats, as shown in FIG. 3(d). For any spatial layout format, the six sub-images can be arranged in any order, and each sub-image can also be rotated by an angle.
  • the order of the sub-images can be set according to the specific situation, even within the same layout format, for example the 3x2 layout format.
  • the six sub-images of the two-dimensional panoramic image in hexahedron format generally have 4 spatial layout formats (Type_idx: 0 to 3).
  • each spatial layout format needs to be divided into subspaces whose size is similar or identical to that of the sub-images of the two-dimensional panoramic image in hexahedron format, and the subspaces are numbered in a predetermined order, for example 0 to 5. This can also be understood as indexing the spatial positions in a given spatial layout format, the values 0 to 5 being the index values of the different positions, as shown in FIG. 8. Note that other position marking methods may also be used; for example, the 1x6 format can also be marked from the bottom up.
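  • The position indexing can be sketched as a mapping from a subspace index to a grid cell. The sketch assumes raster-order numbering (left to right, then top to bottom), which is only one of the predetermined orders the text allows, and it describes a layout by an assumed (rows, cols) pair rather than by Type_idx, since the Type_idx assignment is defined in the figure.

```python
def subspace_position(index, rows, cols):
    """Map a subspace index to its (row, col) cell, assuming the subspaces
    are numbered in raster order (an illustrative assumption)."""
    if not 0 <= index < rows * cols:
        raise ValueError("index outside the layout")
    return divmod(index, cols)


def subspace_origin(index, rows, cols, face_size):
    """Top-left pixel coordinate (x, y) of the subspace with this index."""
    row, col = subspace_position(index, rows, cols)
    return col * face_size, row * face_size


# Six subspaces arranged as 2 rows by 3 columns of 512x512 faces.
positions = [subspace_position(i, 2, 3) for i in range(6)]
origin_of_4 = subspace_origin(4, 2, 3, 512)
```

  • A different marking order, such as bottom-up for a single-column layout, would only change `subspace_position`; the rest of the codec can keep working with index values 0 to 5.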
  • the six sub-images may be named separately, as shown in the figure: Left, Front, Right, Rear, Top, Bottom.
  • arranging the sub-images according to a correspondence between sub-images and position indices, for example {Left->0, Front->1, Right->2, Rear->3, Top->4, Bottom->5}, yields a two-dimensional panorama with a specific arrangement order in a specific spatial layout format.
  • this image can also be referred to as a reordered two-dimensional panoramic image.
  • the sub-images of the two-dimensional panoramic image in hexahedron format can be arranged in an arbitrary order so as to preserve, as far as possible, the spatial correlation between the individual sub-images. The encoding/decoding syntax design is therefore required to support this free arrangement order; an exemplary syntax is shown in Table 1 below:
  • num_of_layout_face_minus1: its value plus 1 is the number of faces of the polyhedron, here the number of faces of the hexahedron;
  • layout_face[i]: indicates the sub-image placed in the subspace at the i-th position in the current layout format. Only the index of the current face among the faces remaining in the FaceArray array needs to be signaled, and only for the first 5 faces (the last remaining face is placed in the subspace at the last position).
  • layout_rotation[i]: indicates the rotation angle of the sub-image in the subspace at the i-th position.
  • the correspondence between the rotation angle indication information and the actual rotation angle can be as shown in Table 2 below. In actual use, other correspondences may also be used, as long as the encoder and decoder agree on them.
  • this syntax structure supports a free arrangement order, as well as a rotation angle, for each sub-image in the different spatial layout formats.
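  • The "index among the remaining faces" signaling of layout_face[i] can be sketched as follows. The initial contents of FaceArray are assumed here to be Left, Front, Right, Rear, Top, Bottom (the naming order used earlier in the text); the function names are hypothetical and no bit-level coding is modeled.

```python
# Assumed initial order of FaceArray; both sides must agree on it.
FACE_ARRAY = ["Left", "Front", "Right", "Rear", "Top", "Bottom"]


def encode_face_order(order):
    """Turn an arrangement of faces into layout_face values: for each of the
    first N-1 positions, the index of that face among the faces not yet used."""
    remaining = list(FACE_ARRAY)
    layout_face = []
    for face in order[:-1]:
        idx = remaining.index(face)
        layout_face.append(idx)
        remaining.pop(idx)
    return layout_face


def decode_face_order(layout_face):
    """Rebuild the arrangement; the one face left over goes to the last position."""
    remaining = list(FACE_ARRAY)
    order = [remaining.pop(idx) for idx in layout_face]
    order.extend(remaining)
    return order


order = ["Left", "Front", "Right", "Top", "Rear", "Bottom"]
signaled = encode_face_order(order)
decoded = decode_face_order(signaled)
```

  • Indexing into the shrinking list keeps the signaled values small: the value for position i can never exceed the number of faces still unplaced, and the last face needs no value at all.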
  • in fact, for a typical panoramic image, the relative positions of the sub-images are fixed, so the way to arrange and rotate the sub-images in a given spatial layout format so as to preserve the spatial correlation between them as far as possible is generally also fixed. Under a given spatial layout format it is therefore not necessary to always encode the sub-image arrangement order and rotation angles; always encoding them makes the current coding of the sub-image arrangement of panoramic images inefficient.
  • the present invention provides an encoding method for improving the efficiency of codec.
  • as shown in FIG. 11, the encoding method 110 mainly includes:
  • S112 determining a spatial layout format applicable to the two-dimensional panoramic image to be encoded and a spatial positional relationship of each sub-image of the two-dimensional panoramic image in the applicable spatial layout format;
  • the above determination may be implemented by means of rate-distortion optimization or predictive analysis, and is again described using the two-dimensional panoramic image in hexahedron format as an example. Note that "applicable" generally refers to the optimal spatial layout format of the two-dimensional panoramic image and the optimal spatial positional relationship of its sub-images under that format. In some special cases, however, considering coding complexity, delay, or other encoding and decoding constraints, the selected spatial layout format and the corresponding spatial positional relationship are not necessarily optimal but are the ones that best suit the codec requirements; this is also allowed in the present invention.
  • Step 1: once the current video format is known to be the hexahedron format, it is necessary to determine which of the four spatial layout formats is used. This can be determined from the aspect ratio of the two-dimensional panoramic image, for example:
  • the height/width of a face of the hexahedron is the greatest common divisor of vertical_size and horizontal_size.
  • dividing vertical_size and horizontal_size by this face size gives the number of face rows and columns, from which the spatial layout format of the hexahedron can be obtained, corresponding to one of the Type_idx values 0 to 3 in FIG. 3(d).
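  • Step 1 can be sketched with the greatest common divisor as below. The function name and the returned (face_size, rows, cols) tuple are illustrative assumptions; the mapping from a (rows, cols) pair to a Type_idx value is defined in the figure and is therefore not reproduced here.

```python
import math


def detect_hexahedron_layout(horizontal_size, vertical_size):
    """Derive the face size and the grid of faces from the picture size.

    The face side length is the greatest common divisor of the picture
    width and height; the grid must contain exactly six faces."""
    face_size = math.gcd(horizontal_size, vertical_size)
    cols = horizontal_size // face_size
    rows = vertical_size // face_size
    if rows * cols != 6:
        raise ValueError("picture size does not match a six-face layout")
    return face_size, rows, cols


wide = detect_hexahedron_layout(1536, 1024)    # three faces across, two down
column = detect_hexahedron_layout(256, 1536)   # one face across, six down
```

  • The check on `rows * cols` rejects sizes such as 1920x1080, whose aspect ratio does not correspond to any of the four six-face layouts.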
  • Step 2: after the spatial layout format has been determined, the spatial positional relationship of each sub-image of the two-dimensional panoramic image in hexahedron format needs to be determined;
  • num_of_layout_face_minus1 is parsed to obtain the number of sub-images of the hexahedron-format panorama.
  • the sub-image indication information layout_face[i] and layout_rotation[i] is then parsed in sequence. Each parsed layout_face[i] is mapped to the corresponding sub-image (Left, Front, Right, Rear, Top, Bottom) using a correspondence agreed between the encoder and decoder, and the specific rotation angle is determined from the parsed rotation angle indication information, likewise using a correspondence between indication information and rotation angle agreed between the encoder and decoder.
  • S114 determining whether a spatial positional relationship of each sub-image of the two-dimensional panoramic image in the applicable spatial layout format is the same as a default spatial positional relationship in the applicable spatial layout format;
  • The default setting here means a convention that both the encoder and the decoder follow.
  • Once the decoder knows that the default setting is used, it does not need to parse the specific syntax element information; instead, it directly sets each parameter to the value corresponding to the default setting, as agreed.
  • the spatial positional relationship may refer to a spatial arrangement order of each sub-image or a rotation angle of the sub-image, or a spatial arrangement order and a rotation angle.
  • For example, in the 3x2 spatial layout format, the sub-images of the two-dimensional panoramic image are generally arranged in the order Left, Front, Right, Top, Rear, Bottom and rotated by 0°, 0°, 0°, -90°, 90°, -90° respectively, which better preserves the correlation between the sub-images. This spatial positional relationship of the sub-images can therefore be set as the default setting for that spatial layout format.
  • With such a default, the encoding end, such as the image encoder 1136 in the image acquisition device 110, does not need to encode the arrangement order and the rotation angles of the sub-images, and the decoding end, such as the image decoder 1532 in the VR display device 150, can directly set the sub-image arrangement order and/or the sub-image rotation angles to the values corresponding to the default setting once it learns that the default setting is in use.
  • The above describes the default settings of the sub-image arrangement order and/or the sub-image rotation angle taking only the 3x2 spatial layout format as an example. In practical applications, based on similar concepts, the default settings can be extended to the spatial layout format, the sub-image order, the sub-image rotation angles, the number of subspaces in the polyhedral format, or a subset of these parameters.
  • When the spatial positional relationship of each sub-image of the two-dimensional panoramic image in the applicable spatial layout format is the same as the default spatial positional relationship in the applicable spatial layout format, indication information is encoded to indicate that, in the applicable spatial layout format, each sub-image of the two-dimensional panoramic image adopts the default spatial positional relationship, and each sub-image of the two-dimensional panoramic image is encoded according to the applicable spatial layout format and the default spatial positional relationship to generate an encoded code stream.
  • The present invention mainly introduces the concept of a default sub-image arrangement order and/or sub-image rotation angle: only when the arrangement order and/or rotation angle actually used differs from the default setting are the actual arrangement order and/or rotation angle further encoded. This effectively reduces the number of bits required to represent the spatial position information of each sub-image of the two-dimensional panoramic image, thereby improving coding efficiency.
  • Further, following step S114, when the spatial positional relationship of each sub-image of the two-dimensional panoramic image in the applicable spatial layout format is not the same as the default spatial positional relationship in the applicable spatial layout format, the applicable spatial layout format of the two-dimensional panoramic image and the spatial positional relationship of each sub-image of the two-dimensional panoramic image under the applicable spatial layout format are encoded.
  • The present invention further provides a decoding method. As shown in FIG. 12, the decoding method mainly includes:
  • S122: Receive an encoded code stream of the two-dimensional panoramic image, and determine an applicable spatial layout format of the two-dimensional panoramic image;
  • With the decoding method of the present invention, decoding the code stream using the default spatial layout format and spatial positional relationship can effectively reduce the time and buffer space required by the decoding process and improve decoding efficiency.
  • Below, a two-dimensional panoramic image in the hexahedron format is taken as an example to illustrate several preferred implementations of the present invention.
  • The present invention is not limited to the hexahedron format; any other polyhedral format may also be used, such as a tetrahedron, pentahedron, octahedron, dodecahedron, or icosahedron.
  • the coding information generated in the following embodiments may be set at a position such as sequence level header information, image level header information, and Supplemental Enhancement Information (SEI) of the code stream.
  • Embodiment 1:
  • In this embodiment, the spatial layout format of the hexahedron-format two-dimensional panoramic image is derived from the aspect ratio, and the spatial positional relationship of each sub-image of the two-dimensional panoramic image, such as the arrangement order and whether to rotate, is obtained by rate-distortion optimization or pre-analysis;
  • a default spatial positional relationship of each sub-image of the two-dimensional panoramic image is set.
  • The number of sub-images of the two-dimensional panoramic image is known and fixed, so no additional syntax element needs to be designed to indicate that number.
  • Alternatively, a syntax element may be designed to declare the number of sub-images of the two-dimensional panoramic image in a given format.
  • The encoder and decoder may agree in advance that, when the specific rotation angle of each sub-image of the two-dimensional panoramic image is not specified, each sub-image defaults to no rotation, or to a rotation on which the encoder and decoder agree identically.
  • Table 3 is a syntax element design according to Embodiment 1 of the present invention.
  • default_order_flag describes whether the default spatial positional relationship is adopted.
  • layout_face[i] describes the indication information of the sub-image placed in the subspace at the i-th position in the current layout format. In subsequent embodiments, syntax elements with the same name have the same meaning as described in this embodiment, and their description is not repeated for brevity.
  • The specific implementation process of Embodiment 1 at the encoder, such as the image encoder 1136 in the image acquisition device 110, may be:
  • Step 1: Select the applicable spatial layout format and the spatial positional relationship of the sub-images.
  • The optimal spatial layout format of the two-dimensional panoramic image and the corresponding optimal spatial positional relationship of its sub-images, including but not limited to the arrangement order and/or whether to rotate, are selected.
  • Step 2: Encode default_order_flag.
  • A specific spatial layout format and the spatial positional relationship of its sub-images can be set as the default setting at the encoder and decoder, that is, the default spatial layout format and the default spatial positional relationship of the corresponding sub-images. default_order_flag is set to 1 if the selected spatial positional relationship is the same as the default setting and to 0 otherwise, and is then encoded.
  • Step 3: Determine whether to encode layout_face[i] according to default_order_flag.
  • If default_order_flag is 1, encoding of layout_face[i] is skipped; otherwise layout_face[i] is encoded. The specific coding method includes, but is not limited to, entropy coding, variable length coding, fixed length coding, or other coding methods; the same applies to the coding of syntax elements in subsequent embodiments.
  • The specific implementation process of Embodiment 1 at the decoder, such as the image decoder 1532 in the VR display device 150, may be:
  • Parse the syntax element default_order_flag. If it is 1, the spatial layout format of the two-dimensional panoramic image and the corresponding spatial positional relationship adopt the default setting, so the syntax elements in Table 3 that describe the spatial positional relationship of each sub-image need not be parsed, that is, the related syntax elements need not be decoded. For example, where the spatial positional relationship is the arrangement order used by the sub-images in a given spatial layout format and the default setting is adopted, the subsequent arrangement-order syntax elements layout_face[i] are not parsed; the order of the sub-images is directly set to the default order, and decoding proceeds according to the default order. If it is 0, the default setting is not adopted, and the decoder needs to further parse layout_face[i], that is, decode the related syntax elements.
  • Before performing the above steps, the decoder needs to obtain the spatial layout format of the two-dimensional panoramic image. It may be derived from the aspect ratio: the height and width of the current two-dimensional panoramic image are obtained, the aspect ratio is calculated, and the specific spatial layout format is determined from it. For example, an aspect ratio of 1:6 corresponds to the 1x6 layout format, an aspect ratio of 3:2 corresponds to the 3x2 layout format, and so on.
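  • The encoder and decoder behaviour of Embodiment 1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the syntax of Table 3: the list of integers stands in for the coded syntax elements (binarization and entropy coding are left out), layout_face[i] is assumed to be the index of the face in the default order, and the function names are hypothetical.

```python
# Default arrangement order taken from the description above.
DEFAULT_ORDER = ["Left", "Front", "Right", "Top", "Rear", "Bottom"]


def encode_embodiment1(order):
    """Return the syntax element values that would be written for one picture."""
    if order == DEFAULT_ORDER:
        return [1]                      # default_order_flag = 1, nothing else is sent
    elements = [0]                      # default_order_flag = 0
    # layout_face[i]: assumed here to be the face's index in DEFAULT_ORDER.
    elements += [DEFAULT_ORDER.index(face) for face in order]
    return elements


def decode_embodiment1(elements):
    """Recover the sub-image order from the parsed syntax element values."""
    if elements[0] == 1:                # default_order_flag
        return list(DEFAULT_ORDER)      # no layout_face[i] needs to be parsed
    return [DEFAULT_ORDER[v] for v in elements[1:1 + len(DEFAULT_ORDER)]]
```

  • In this sketch the default case costs a single element, whereas a non-default order costs the flag plus six layout_face[i] values, which is the saving the default setting is meant to provide.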
  • Embodiment 2:
  • In this embodiment, the spatial layout format of the two-dimensional panoramic image is not derived from the aspect ratio but is signalled by encoding layout format indication information. For each of the different layout formats, there is usually a specific sub-image spatial positional relationship that is optimal, that is, that yields the most accurate prediction result. Such a specific spatial layout format and the spatial positional relationship of its sub-images can be set as the default setting at the encoder and decoder; the spatial positional relationship includes, but is not limited to, the arrangement order of the sub-images of the two-dimensional panoramic image.
  • Table 4 is a table of syntax elements corresponding to Embodiment 2 of the present invention.
  • layout_type_index describes the spatial layout format index of the current two-dimensional panoramic image, from which the specific spatial layout format can be obtained;
  • The specific implementation process of Embodiment 2 at the encoder, such as the image encoder 1136 in the image acquisition device 110, may be:
  • Step 1: Select the applicable spatial layout format and the spatial positional relationship of the sub-images.
  • Step 2: Encode layout_type_index.
  • The indication information corresponding to the applicable spatial layout format, that is, layout_type_index, is encoded.
  • Step 3: Encode default_order_flag.
  • This step is the same as Step 2 of Embodiment 1, to which reference can be made for the specific implementation.
  • Step 4: Determine whether to encode layout_face[i] according to default_order_flag.
  • If default_order_flag is 1, encoding of the spatial positional relationship of the sub-images, such as the arrangement order layout_face[i], is skipped. Otherwise, layout_face[i] is encoded.
  • The specific implementation process of Embodiment 2 at the decoder, such as the image decoder 1532 in the VR display device 150, may be:
  • Step 1: Parse layout_type_index to obtain the spatial layout format.
  • The corresponding spatial layout format can be obtained by table lookup using the layout format index value.
  • Step 2: Parse default_order_flag and determine whether to parse layout_face[i].
  • Parse the syntax element default_order_flag. If it is 1, the spatial positional relationship of each sub-image of the two-dimensional panoramic image in the determined spatial layout format adopts the default setting, so the syntax elements in Table 3 concerning the spatial positional relationship of each sub-image need not be parsed.
  • Embodiment 3:
  • This embodiment sets the optimal spatial layout format of the two-dimensional panoramic image and the optimal spatial position information of each sub-image of the two-dimensional panoramic image in the optimal spatial layout format to a default setting.
  • Table 5 is a syntax element design corresponding to Embodiment 3 of the present invention.
  • default_layout_order_flag describes whether the current two-dimensional panoramic image adopts the default spatial layout format and spatial positional relationship.
  • The specific implementation process of Embodiment 3 at the encoder, such as the image encoder 1136 in the image acquisition device 110, may be:
  • Step 1: Select the applicable spatial layout format and the spatial positional relationship of the sub-images.
  • Step 2: Encode default_layout_order_flag.
  • If the selected spatial layout format and spatial positional relationship are the same as the default setting, default_layout_order_flag is set to 1; otherwise it is set to 0. Finally, default_layout_order_flag is encoded.
  • Step 3 Determine whether to encode layout_type_index and layout_face[i] according to default_layout_order_flag
  • The specific implementation process of Embodiment 3 at the decoder, such as the image decoder 1532 in the VR display device 150, may be:
  • Step 1: Parse default_layout_order_flag.
  • Step 2: Determine whether to parse layout_type_index and layout_face[i] according to default_layout_order_flag.
  • If default_layout_order_flag is 1, the spatial layout format of the two-dimensional panoramic image and the corresponding spatial positional relationship adopt the default setting, and parsing of layout_type_index and layout_face[i] is skipped.
  • If default_layout_order_flag is 0, the spatial layout format of the two-dimensional panoramic image and the corresponding spatial positional relationship do not adopt the default setting, and the syntax elements layout_type_index and layout_face[i] need to be parsed from the code stream.
  • Embodiment 4:
  • the spatial layout of the default two-dimensional panoramic image and the spatial positional relationship of each sub-image of the default two-dimensional panoramic image may be separately set.
  • the design of the specific syntax elements is shown in Table 6.
  • The specific implementation process of Embodiment 4 at the encoder, such as the image encoder 1136 in the image acquisition device 110, may be:
  • Step 1: Select the applicable spatial layout format and the corresponding applicable spatial positional relationship of the sub-images.
  • Step 2: Encode default_layout_order_flag.
  • Step 3: Encode layout_type_index.
  • Whether this step is performed depends on Step 2: if default_layout_order_flag is 0, this step is performed; otherwise it is skipped.
  • Step 4: Encode default_order_flag.
  • Step 5: Determine whether to encode layout_face[i] according to default_order_flag.
  • The specific implementation process of Embodiment 4 at the decoder, such as the image decoder 1532 in the VR display device 150, may be:
  • Step 1: Parse default_layout_order_flag.
  • default_layout_order_flag is parsed first. If it is 1, the two-dimensional panoramic image uses the default spatial layout format and the spatial positional relationship of each of its sub-images also uses the default setting, so the subsequent parsing steps for layout_type_index, default_order_flag, and layout_face[i] are skipped. Otherwise, the parsing steps for layout_type_index, default_order_flag, and layout_face[i] are performed.
  • Step 2: Parse layout_type_index.
  • Step 3: Parse default_order_flag and determine whether to parse layout_face[i].
  • Parse the syntax element default_order_flag. If it is 1, the spatial positional relationship of each sub-image of the two-dimensional panoramic image in the current spatial layout format is the same as the default setting, so the subsequent syntax elements for the spatial positional relationship of the sub-images, such as layout_face[i], need not be parsed; the spatial positional relationship of each sub-image is directly set to the default spatial positional relationship agreed between the encoder and the decoder. If it is 0, the spatial positional relationship of each sub-image in the current spatial layout format differs from the default setting, so layout_face[i] needs to be further parsed, and the code stream is decoded according to the parsed result.
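  • The two-level signalling of Embodiment 4 can be sketched as below. This is an illustration under stated assumptions rather than the syntax of Table 6: the default layout index, the per-layout default orders, and the use of a face's index in FACES as its layout_face[i] value are all hypothetical, and the (name, value) pairs stand in for coded syntax elements.

```python
FACES = ["Left", "Front", "Right", "Top", "Rear", "Bottom"]
DEFAULT_LAYOUT_TYPE = 0                      # hypothetical default layout index
DEFAULT_ORDERS = {                           # hypothetical per-layout default orders
    0: ["Left", "Front", "Right", "Top", "Rear", "Bottom"],
    1: ["Rear", "Bottom", "Top", "Left", "Front", "Right"],
}


def encode_embodiment4(layout_type, order):
    """Return (name, value) pairs in the order they would be written."""
    same_order = order == DEFAULT_ORDERS[layout_type]
    if layout_type == DEFAULT_LAYOUT_TYPE and same_order:
        return [("default_layout_order_flag", 1)]
    out = [("default_layout_order_flag", 0),
           ("layout_type_index", layout_type),
           ("default_order_flag", int(same_order))]
    if not same_order:
        out += [(f"layout_face[{i}]", FACES.index(f)) for i, f in enumerate(order)]
    return out


def decode_embodiment4(pairs):
    """Return (layout_type, order) recovered from the parsed pairs."""
    values = [v for _, v in pairs]
    if values[0] == 1:                       # default layout and default order
        return DEFAULT_LAYOUT_TYPE, list(DEFAULT_ORDERS[DEFAULT_LAYOUT_TYPE])
    layout_type = values[1]                  # layout_type_index
    if values[2] == 1:                       # default order for this layout
        return layout_type, list(DEFAULT_ORDERS[layout_type])
    return layout_type, [FACES[v] for v in values[3:3 + len(FACES)]]
```

  • Under these assumptions, one, three, or nine elements are written depending on whether everything, only the order, or nothing matches the default.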
  • Embodiment 5:
  • In this embodiment, the number of faces of the polyhedron onto which the three-dimensional panoramic image is mapped is described in the encoding process, and the spatial position information of each sub-image of the mapped two-dimensional panoramic image in a specific spatial layout format is defined.
  • The syntax elements num_of_layout_face_minus1 and layout_rotation[i] specify, respectively, the number of faces of the polyhedron and the rotation angle of each sub-image of the two-dimensional panoramic image.
  • A given arrangement order usually corresponds to a given set of rotation angles, so the arrangement order and the rotation angle of each sub-image of the two-dimensional panoramic image can be designed together.
  • Taking Embodiment 4 as the basis, the number of faces of the hexahedron is specified and each sub-image of the two-dimensional panoramic image can be rotated; the specific rotation may follow Table 2 above or another scheme agreed between the encoder and the decoder. The syntax elements corresponding to Embodiment 5 of the present invention can therefore be designed as shown in Table 8:
  • default_layout_order_rotation_flag describes whether the spatial layout format of the two-dimensional panoramic image, together with the arrangement order and the rotation angle of each sub-image of the two-dimensional panoramic image, is the same as the default setting.
  • default_order_rotation_flag describes whether the arrangement order and the rotation angle of each sub-image of the two-dimensional panoramic image are the same as the default setting.
  • The specific implementation process of Embodiment 5 at the encoder, such as the image encoder 1136 in the image acquisition device 110, may be:
  • Step 1: Select the applicable layout format and the applicable arrangement order and rotation angle of the sub-images.
  • Step 2: Encode num_of_layout_face_minus1.
  • num_of_layout_face_minus1 + 1 is the number of faces of the hexahedron.
  • Step 3: Encode default_layout_order_rotation_flag.
  • Determine whether the spatial layout format currently applicable to the two-dimensional panoramic image, and the applicable arrangement order and rotation angle of its sub-images, are the same as the default setting. If they are the same, default_layout_order_rotation_flag is set to 1; if not, it is set to 0. Finally, default_layout_order_rotation_flag is encoded. If default_layout_order_rotation_flag is 1, the encoding steps for the subsequent syntax elements layout_type_index to layout_rotation[num_of_layout_face_minus1] may be skipped. Otherwise, the encoding steps for the subsequent syntax elements are performed.
  • Step 4: Encode layout_type_index.
  • Step 5: Encode default_order_rotation_flag.
  • default_order_rotation_flag is encoded.
  • If default_order_rotation_flag is 1, the encoding of the subsequent syntax elements layout_face[i] to layout_rotation[num_of_layout_face_minus1] is skipped; otherwise these syntax elements are encoded, that is, the following Step 6 is performed:
  • Step 6: Encode the remaining syntax elements layout_face[i], layout_rotation[i], ..., layout_rotation[num_of_layout_face_minus1].
  • The specific implementation process of Embodiment 5 at the decoder, such as the image decoder 1532 in the VR display device 150, may be:
  • Step 1: Parse num_of_layout_face_minus1 to obtain the number of faces.
  • Step 2: Parse default_layout_order_rotation_flag.
  • If default_layout_order_rotation_flag is 1, the decoder skips the remaining parsing steps for layout_type_index to layout_rotation[num_of_layout_face_minus1]. Otherwise, the parsing steps for the subsequent syntax elements are performed.
  • Step 3: Parse layout_type_index to obtain the current spatial layout format of the two-dimensional panoramic image.
  • Step 4: Parse default_order_rotation_flag.
  • If default_order_rotation_flag is 1, the decoder skips the parsing steps for layout_face[i], layout_rotation[i], ..., layout_rotation[num_of_layout_face_minus1]. Otherwise, the parsing steps for the subsequent syntax elements are continued.
  • Step 5: Parse the syntax elements layout_face[i], layout_rotation[i], ..., layout_rotation[num_of_layout_face_minus1].
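  • The decoder steps of Embodiment 5 can be sketched as a single parsing routine. This is an illustration only: the rotation code table, the default layout index, the use of one default order and rotation set for every layout, and the interleaving of layout_face[i] with layout_rotation[i] are assumptions, and the input list of integers stands in for already-decoded syntax element values.

```python
FACES = ["Left", "Front", "Right", "Top", "Rear", "Bottom"]
ROTATIONS = [0, 90, 180, -90]            # hypothetical code -> angle table
DEFAULT_LAYOUT_TYPE = 0                  # hypothetical index of the default layout
DEFAULT_FACES = ["Left", "Front", "Right", "Top", "Rear", "Bottom"]
DEFAULT_ROTATIONS = [0, 0, 0, -90, 90, -90]


def parse_embodiment5(values):
    """Parse syntax element values in the order of decoder Steps 1 to 5."""
    it = iter(values)
    num_faces = next(it) + 1                     # Step 1: num_of_layout_face_minus1
    layout_type = DEFAULT_LAYOUT_TYPE
    faces, rotations = list(DEFAULT_FACES), list(DEFAULT_ROTATIONS)
    if next(it) == 0:                            # Step 2: default_layout_order_rotation_flag
        layout_type = next(it)                   # Step 3: layout_type_index
        if next(it) == 0:                        # Step 4: default_order_rotation_flag
            faces, rotations = [], []
            for _ in range(num_faces):           # Step 5: explicit face/rotation pairs
                faces.append(FACES[next(it)])            # layout_face[i]
                rotations.append(ROTATIONS[next(it)])    # layout_rotation[i]
    return {"num_faces": num_faces, "layout_type": layout_type,
            "faces": faces, "rotations": rotations}
```

  • For example, the input [5, 1] yields the default layout, order, and rotations after only two elements have been read.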
  • Embodiment 6:
  • When the arrangement order of the sub-images of the two-dimensional panoramic image actually used is not exactly the same as the default setting, part of it may still be identical.
  • For example, the default sub-image order is Left->Front->Right->Top->Rear->Bottom, but the order actually used is Bottom->Rear->Left->Front->Right->Top; only the last four sub-images are arranged in the same order as in the default setting.
  • In this case, starting from the third subspace of the applicable spatial layout format, there is no need to signal the sub-image corresponding to each remaining subspace; it is only necessary to tell the decoder that, from a certain position onwards, the default sub-image order is used, and how many subspaces follow that order.
  • partial_default_order_flag describes whether some of the sub-images of the two-dimensional panoramic image are arranged in the same order as the default.
  • The specific implementation process of Embodiment 6 at the encoder, such as the image encoder 1136 in the image acquisition device 110, may be:
  • Step 1: Select the applicable spatial layout format and the spatial positional relationship of the sub-images.
  • The optimal spatial layout format of the two-dimensional panoramic image and the corresponding optimal spatial positional relationship of its sub-images, including but not limited to the arrangement order and/or whether to rotate, are selected.
  • Step 2: Encode default_order_flag.
  • A specific spatial layout format and the spatial positional relationship of its sub-images can be set as the default setting at the encoder and decoder, that is, the default spatial layout format and the default spatial positional relationship of the corresponding sub-images.
  • Step 3: Determine whether to encode partial_default_order_flag and layout_face[i] according to default_order_flag.
  • If default_order_flag is 1, the encoding of partial_default_order_flag and layout_face[i] is skipped. If it is 0, the spatial positional relationship of the sub-images of the two-dimensional panoramic image, such as the arrangement order and/or the rotation angle, is not exactly the same as the default setting. Part of the order may nevertheless match the default setting, so the spatial positional relationship currently used is compared with the default spatial positional relationship: partial_default_order_flag is 1 if part of the spatial positional relationship is the same as the corresponding part of the default setting, and 0 otherwise. Finally, partial_default_order_flag is encoded.
  • The encoder and decoder need to agree on how to decide that a partial order is the same as the default setting. For example, in this embodiment, if the current spatial layout format is 1x6 or 6x1, Left is located in the subspace at position 0, 1, or 2, and the three consecutive faces that follow it are in the same order as the three consecutive faces that follow Left in the default setting, the current faces are considered to be partially in the same order as the default setting.
  • Step 4: Determine the encoding method of layout_face[i] according to partial_default_order_flag.
  • The specific implementation process of Embodiment 6 at the decoder, such as the image decoder 1532 in the VR display device 150, may be:
  • Step 1: Parse default_order_flag and determine whether to parse partial_default_order_flag and layout_face[i].
  • Parse the syntax element default_order_flag. If it is 1, the spatial layout format of the two-dimensional panoramic image and the corresponding spatial positional relationship adopt the default setting, so partial_default_order_flag and the syntax elements in Table 3 that describe the spatial positional relationship of each sub-image need not be parsed, that is, the related syntax elements need not be decoded. For example, where the spatial positional relationship is the arrangement order used by the sub-images in a given spatial layout format and the default setting is adopted, the subsequent arrangement-order syntax elements layout_face[i] are not parsed; the sub-image arrangement order is directly set to the default order, and decoding proceeds according to the default order.
  • Before performing the above steps, the decoder needs to obtain the spatial layout format of the two-dimensional panoramic image. It may be derived from the aspect ratio: the height and width of the current two-dimensional panoramic image are obtained, the aspect ratio is calculated, and the specific spatial layout format is determined from it. For example, an aspect ratio of 1:6 corresponds to the 1x6 layout format, an aspect ratio of 3:2 corresponds to the 3x2 layout format, and so on.
  • Step 3: Parse partial_default_order_flag and determine the parsing method of layout_face[i] accordingly.
  • When parsing the indication information, the positions covered by the default-ordered segment are skipped: if the current position is 2, two positions are skipped, because the face indication information of the last subspace position does not need to be parsed in any case; otherwise three positions are skipped.
  • In this embodiment, the spatial layout format is by default derived from the aspect ratio; in actual use, the type of spatial layout format used may instead be encoded directly. The coding of the number of faces of the hexahedron and the rotation angle of each face is not considered in this embodiment; in actual use, these two factors may also be taken into account.
  • This embodiment only considers two spatial layout formats (layout_6x1, layout_1x6). In actual use, it can be extended to other spatial layout formats in a similar manner.
  • In this embodiment, the encoder and decoder agree that the identical sequence segment starts at Left; in actual use it may start from some other face, or the indication information of the starting face may be encoded directly.
  • In this embodiment, the length of the identical sequence segment is 4; in actual use it may be another number, or the length of the identical sequence segment may be encoded directly.
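  • The partial-default idea of Embodiment 6 can be sketched as follows, under loudly stated assumptions: the identical segment starts at Left and has length 4 as agreed above, layout_face[i] is taken to be the face's index in the default order, the inference of the last face is not modelled, and the function names and the (name, value) representation are hypothetical.

```python
DEFAULT_ORDER = ["Left", "Front", "Right", "Top", "Rear", "Bottom"]
RUN_LENGTH = 4   # Left plus the three faces that follow it in the default order


def has_partial_default(order):
    """True if Left sits at position 0, 1 or 2 and starts the agreed segment."""
    p = order.index("Left")
    return p <= 2 and order[p:p + RUN_LENGTH] == DEFAULT_ORDER[:RUN_LENGTH]


def encode_embodiment6(order):
    if order == DEFAULT_ORDER:
        return [("default_order_flag", 1)]
    partial = has_partial_default(order)
    out = [("default_order_flag", 0), ("partial_default_order_flag", int(partial))]
    i = 0
    while i < len(order):
        out.append((f"layout_face[{i}]", DEFAULT_ORDER.index(order[i])))
        # After Left, the rest of the agreed segment is implied and not written.
        i += RUN_LENGTH if (partial and order[i] == "Left") else 1
    return out


def decode_embodiment6(pairs):
    values = [v for _, v in pairs]
    if values[0] == 1:
        return list(DEFAULT_ORDER)
    partial, pos, order = values[1], 2, []
    while len(order) < len(DEFAULT_ORDER):
        face = DEFAULT_ORDER[values[pos]]
        pos += 1
        if partial and face == "Left":
            order.extend(DEFAULT_ORDER[:RUN_LENGTH])   # fill in the implied segment
        else:
            order.append(face)
    return order
```

  • For the order Bottom->Rear->Left->Front->Right->Top used as the example above, this sketch writes only three layout_face[i] values instead of six.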
  • Embodiment 7:
  • There may be several default spatial positional relationships of the sub-images, that is, several default arrangement orders.
  • If the sub-image order selected by the encoding end is one of them, then in addition to indicating that the sub-image arrangement order currently used is a default setting, it is also necessary to tell the decoding end which default sub-image arrangement order is used.
  • Table 10 shows the syntax element design in this case.
  • default_order_index describes the index of the default spatial positional relationship of the sub-images.
  • The specific implementation process of Embodiment 7 at the encoder, such as the image encoder 1136 in the image acquisition device 110, may be:
  • Step 1: Select the applicable spatial layout format and the spatial positional relationship of the sub-images.
  • The optimal spatial layout format of the two-dimensional panoramic image and the corresponding optimal spatial positional relationship of its sub-images, including but not limited to the arrangement order and/or whether to rotate, are selected.
  • Step 2: Encode default_order_flag.
  • For a spatial layout format determined from the width and height of the image, there is usually a specific sub-image spatial positional relationship that is optimal, that is, that yields the most accurate prediction result. Such a specific spatial layout format and the spatial positional relationship of its sub-images can be set as the default setting at the encoder and decoder, that is, the default spatial layout format and the default spatial positional relationship of the corresponding sub-images.
  • Step 3: Determine the encoding of default_order_index and layout_face[i] according to default_order_flag.
  • If default_order_flag is 1, default_order_index is encoded to indicate which default arrangement order is used (the encoder and decoder need to agree on the correspondence between the indication information and the multiple sub-image arrangement orders), and the encoding of the sub-image arrangement order layout_face[i] is skipped. Otherwise, layout_face[i] is encoded.
  • The specific implementation process of Embodiment 7 at the decoder, such as the image decoder 1532 in the VR display device 150, may be:
  • Step 1: Parse default_order_flag and determine whether to parse default_order_index and layout_face[i].
  • Parse the syntax element default_order_flag. If it is 1, the spatial layout format of the two-dimensional panoramic image and the corresponding spatial positional relationship adopt a default setting; default_order_index then needs to be parsed to determine which default sub-image spatial positional relationship is used in the current spatial layout format, and the arrangement-order syntax elements layout_face[i] need not be parsed. For example, where the spatial positional relationship is the arrangement order used by the sub-images in a given spatial layout format, the sub-image arrangement order is directly set to the indicated default order, and decoding proceeds according to that order. If it is 0, the default setting is not adopted, and the decoder needs to further parse layout_face[i], that is, decode the related syntax elements.
  • Before performing the above steps, the decoder needs to obtain the spatial layout format of the two-dimensional panoramic image. It may be derived from the aspect ratio: the height and width of the current two-dimensional panoramic image are obtained, the aspect ratio is calculated, and the specific spatial layout format is determined from it. For example, an aspect ratio of 1:6 corresponds to the 1x6 layout format, an aspect ratio of 3:2 corresponds to the 3x2 layout format, and so on.
  • default_order_flag: whether a default sub-image arrangement order is used
  • default_order_index: which default sub-image arrangement order is used
  • The sub-image arrangement orders (each agreed between the encoder and the decoder) are shown in Table 11.
  • In this embodiment the spatial layout format is derived from the aspect ratio by default; in actual use, the spatial layout format type may instead be encoded directly.
  • The coding of the number of faces of the hexahedron and of the rotation angle of each face is not considered here; in actual use these two factors may also be taken into account.
  • Only a plurality of default sub-image arrangement orders is specified for each spatial layout format. In actual use, a plurality of default combinations of spatial layout format and sub-image arrangement order may also be specified (without using the width and height).
  • the coding space layout instruction information is required, or for specifying a plurality of default spatial arrangement formats and sub-image arrangement orders, and also for each spatial layout format.
  • The plurality of default sub-image arrangement orders may also be extended and used in combination with rotation angles.
  • The encoding method 110 may be implemented by a hardware device having the corresponding function, such as an encoding device.
  • An encoding device 1300 for implementing all of the above possible encoding methods of the present invention comprises:
  • a spatial layout format and spatial positional relationship determining unit 1301, configured to determine the applicable spatial layout format of a two-dimensional panoramic image to be encoded and the spatial positional relationship of the sub-images of the two-dimensional panoramic image in the applicable spatial layout format;
  • a determining unit 1303, configured to determine whether the spatial positional relationship of the sub-images of the two-dimensional panoramic image in the applicable spatial layout format is the same as the default spatial positional relationship in the applicable spatial layout format; and
  • an encoding unit 1305, configured to, when the spatial positional relationship of the sub-images of the two-dimensional panoramic image in the applicable spatial layout format is the same as the default spatial positional relationship in that format, encode indication information indicating that the sub-images of the two-dimensional panoramic image use the default spatial positional relationship in the applicable spatial layout format, and encode the sub-images of the two-dimensional panoramic image according to the applicable spatial layout format and the default spatial positional relationship to generate an encoded bitstream.
  • With the above apparatus, the present invention can effectively reduce the number of bits required to represent the spatial position information of the sub-images of a two-dimensional panoramic image, thereby improving coding efficiency.
  • The units of the encoding device 1300 may be functional units implemented by general-purpose or dedicated hardware.
  • The decoding method 120 may be implemented by a hardware device having the corresponding function, such as a decoding device.
  • a spatial layout format determining unit 1401, configured to receive an encoded bitstream of the two-dimensional panoramic image and determine the applicable spatial layout format of the two-dimensional panoramic image;
  • a spatial positional relationship determining unit 1403, configured to parse the bitstream to determine whether, in the applicable spatial layout format, the spatial positional relationship of the sub-images of the two-dimensional panoramic image is the default spatial positional relationship; and
  • a decoding unit 1405, configured to, if the spatial positional relationship of the sub-images of the two-dimensional panoramic image is the default spatial positional relationship in the applicable spatial layout format, decode the encoded bitstream of the two-dimensional panoramic image according to the applicable spatial layout format and the default spatial positional relationship.
  • The units of the decoding device 1400 may be functional units implemented by general-purpose or dedicated hardware.
  • the decoding apparatus of the present invention can effectively reduce the time required for the decoding process, and the required buffer space, and improve the decoding efficiency.
  • the functions described may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software, the functions may be stored on or transmitted as one or more instructions or code via a computer-readable medium and executed by a hardware-based processing unit.
  • The computer-readable medium can comprise a computer-readable storage medium (which corresponds to a tangible medium such as a data storage medium) or a communication medium, including, for example, any medium that facilitates transfer of a computer program from one place to another in accordance with a communication protocol.
  • computer readable media generally may correspond to (1) a non-transitory tangible computer readable storage medium, or (2) a communication medium such as a signal or carrier wave.
  • Data storage media may be any available media that can be accessed by one or more computers or one or more processors to retrieve instructions, code and/or data structures for use in carrying out the techniques described herein.
  • the computer program product can comprise a computer readable medium.
  • Certain computer-readable storage media may comprise RAM, ROM, EEPROM, CD-ROM or other optical disc storage, magnetic disk storage or other magnetic storage devices, flash memory, or any other medium that can be used to store desired program code in the form of instructions or data structures and that can be accessed by a computer. Also, any connection is properly termed a computer-readable medium. For example, if instructions are transmitted from a website, server, or other remote source using coaxial cable, fiber optic cable, twisted pair, digital subscriber line (DSL), or wireless technologies (e.g., infrared, radio, and microwave), then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies are included in the definition of medium.
  • Disk and disc, as used herein, include compact disc (CD), laser disc, optical disc, digital versatile disc (DVD), floppy disk, and Blu-ray disc, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.
  • DSPs digital signal processors
  • ASIC application specific integrated circuit
  • FPGA field programmable gate array
  • processors may refer to any of the foregoing structures or any other structure suitable for implementing the techniques described herein.
  • the functionality described herein may be provided within dedicated hardware and/or software modules configured for encoding and decoding, or incorporated in a combined codec.
  • the techniques can be fully implemented in one or more circuits or logic elements.
  • The techniques of the present invention can be implemented in a wide variety of devices or apparatuses, including a wireless handset, an integrated circuit (IC), or a set of ICs (e.g., a chipset).
  • IC integrated circuit
  • Various components, modules, or units are described in this disclosure to emphasize functional aspects of apparatuses configured to perform the disclosed techniques, but they do not necessarily need to be implemented by different hardware units. Rather, as described above, various units may be combined in a codec hardware unit or provided by a collection of interoperable hardware units (including one or more processors as described above) in conjunction with suitable software and/or firmware.
  • The terms "system" and "network" are used interchangeably herein. It should be understood that the term "and/or" herein merely describes an association between associated objects and indicates that three relationships are possible; for example, "A and/or B" may mean that A exists alone, that both A and B exist, or that B exists alone. In addition, the character "/" herein generally indicates an "or" relationship between the objects before and after it.
  • B corresponding to A means that B is associated with A, and B can be determined according to A.
  • determining B from A does not mean that B is only determined based on A, and that B can also be determined based on A and/or other information.
  • the disclosed systems, devices, and methods may be implemented in other manners.
  • the device implementations described above are merely illustrative.
  • the division of the unit is only a logical function division.
  • In actual implementation there may be other divisions; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not performed.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, device or unit, and may be in an electrical, mechanical or other form.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the present embodiment.
  • each functional unit in each embodiment of the present invention may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.


Abstract

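The present invention provides a decoding method for use in the field of video processing. The method comprises: receiving an encoded bitstream of a two-dimensional panoramic image and determining the applicable spatial layout format of the two-dimensional panoramic image; parsing the encoded bitstream of the two-dimensional panoramic image to determine whether, in the applicable spatial layout format, the spatial positional relationship of the sub-images of the two-dimensional panoramic image is a default spatial positional relationship; and, if it is, decoding the encoded bitstream of the two-dimensional panoramic image according to the applicable spatial layout format and the default spatial positional relationship.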

Description

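Encoding and Decoding Method and Device
Technical Field
The present invention relates to the field of video coding and compression, and in particular to an encoding and decoding method and apparatus suitable for panoramic images.
Background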
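Digital video capabilities can be incorporated into a wide range of devices, including digital televisions, digital direct broadcast systems, wireless broadcast systems, personal digital assistants (PDAs), laptop or desktop computers, tablet computers, e-book readers, digital cameras, digital recording devices, digital media players, video gaming devices, video game consoles, cellular or satellite radio telephones, video conferencing devices, video streaming devices, and the like. Digital video devices implement video compression techniques, such as those described in the standards defined by MPEG-2, MPEG-4, ITU-T H.263, ITU-T H.264/MPEG-4 Part 10 Advanced Video Coding (AVC), and ITU-T H.265 High Efficiency Video Coding (HEVC), and in extensions of those standards, in order to transmit and receive digital video information more efficiently. By implementing these video coding techniques, video devices can transmit, receive, encode, decode, and/or store digital video information more efficiently.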
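In the field of video coding, a frame is a complete image; a sequence of frames arranged in a given order and played at a given frame rate forms a video. Once the frame rate is high enough that the interval between two frames falls below the resolving limit of the human eye, persistence of vision occurs and the images appear to move on the screen. The basis of video compression is the compression coding of single digital frames. A digitized image contains a great deal of repeated information, called redundant information. Within one frame there are often many regions with the same or similar spatial structure; for example, the colors of sample points belonging to the same object or background are usually closely correlated and similar. Within a group of frames, a frame is generally highly correlated with the preceding or following frame, and the pixel values that describe the information differ only slightly; all of this can be compressed. By the same reasoning, a video file contains not only spatial redundancy but also a large amount of temporal redundancy, which results from the way video is composed. For example, video is typically sampled at 25 to 30 frames per second, and in special cases at 60 frames per second. The sampling interval between two adjacent frames is therefore typically between 1/30 and 1/25 of a second. In such a short time, the sampled pictures almost always contain a large amount of similar information and are strongly correlated. In the original digital video capture system, however, each frame is recorded independently, without considering or exploiting this coherence and similarity, which produces a very large amount of repeated, redundant data. In addition, research has shown that, in terms of the psychological characteristic of human visual sensitivity, video information also contains a part that can be compressed, namely visual redundancy. Visual redundancy refers to appropriately compressing the video bitstream by exploiting the physiological characteristic that the human eye is relatively sensitive to changes in luminance but relatively insensitive to changes in chrominance. In high-luminance regions, the sensitivity of human vision to luminance changes tends to decline; the eye is instead more sensitive to the edges of objects and less sensitive to interior regions, and more sensitive to overall structure and less sensitive to changes in interior detail. Because the ultimate consumers of video image information are human beings, these characteristics of the human eye can be fully exploited to compress the original video image information and achieve better compression. Besides the spatial, temporal, and visual redundancy mentioned above, video image information also contains other kinds of redundancy, such as information-entropy redundancy, structural redundancy, knowledge redundancy, and importance redundancy. The purpose of video compression coding is to remove the redundant information from a video sequence by various technical means, thereby reducing storage space and saving transmission bandwidth.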
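As far as the current state of the art is concerned, video compression techniques mainly include intra prediction, inter prediction, transform and quantization, entropy coding, and deblocking filtering. Among internationally used video compression coding standards, there are four mainstream compression coding approaches: chroma subsampling, predictive coding, transform coding, and quantization coding.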
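Chroma subsampling: this approach makes full use of the psychovisual characteristics of the human eye and tries, starting from the underlying data representation, to minimize the amount of data used to describe a single element. Most television systems use luminance-chrominance-chrominance (YUV) color coding, a standard widely adopted in European television systems. The YUV color space comprises one luminance signal Y and two color-difference signals U and V, and the three components are independent of one another. Representing the components separately makes YUV more flexible and less bandwidth-hungry in transmission than the traditional red-green-blue (RGB) color model. For example, the YUV 4:2:0 format means that the two chrominance components U and V each have only half the resolution of the luminance component Y in both the horizontal and vertical directions; that is, for every 4 sampled pixels there are 4 luminance samples Y but only one U sample and one V sample. Represented this way, the data volume is reduced further, to about 50% of the original (6 samples instead of 12 for every 4 pixels). Achieving video compression through chroma subsampling, by exploiting the physiological characteristics of human vision, is one of the most widely used video data compression approaches today.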
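As a rough check on the data reduction obtained from chroma subsampling, the following sketch counts the samples in one frame for a few common sampling schemes. The function name `samples_per_frame`, the scheme table, and the 1920x1080 frame size are illustrative assumptions rather than anything defined in this document, and every sample is assumed to use the same bit depth.

```python
def samples_per_frame(width, height, scheme="4:2:0"):
    """Return the total number of Y, U and V samples in one frame."""
    # (horizontal, vertical) chroma subsampling factors for each scheme.
    factors = {"4:4:4": (1, 1), "4:2:2": (2, 1), "4:2:0": (2, 2)}
    sub_w, sub_h = factors[scheme]
    luma = width * height
    # One U plane and one V plane, each subsampled by the same factors.
    chroma = 2 * (width // sub_w) * (height // sub_h)
    return luma + chroma


full = samples_per_frame(1920, 1080, "4:4:4")
subsampled = samples_per_frame(1920, 1080, "4:2:0")
print(subsampled / full)  # 0.5
```

Under these assumptions a 4:2:0 frame carries half as many samples as a 4:4:4 frame of the same size.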
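Predictive coding: data from previously encoded frames is used to predict the frame currently being encoded. Prediction yields a predicted value that is not exactly equal to the actual value; the difference between them is the residual. The better the prediction, the closer the predicted value is to the actual value and the smaller the residual, so encoding the residual greatly reduces the amount of data. At the decoder, the residual is added to the predicted value to restore and reconstruct the original image. This is the basic idea of predictive coding. In mainstream coding standards, predictive coding is divided into two basic types: intra prediction and inter prediction.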
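The relationship between actual value, predicted value, and residual can be illustrated with a minimal sketch. The helper names and the sample values below are hypothetical, and the sketch leaves out transform and quantization, so the reconstruction is exact.

```python
def compute_residual(actual, predicted):
    """Residual = actual sample minus predicted sample, element by element."""
    return [a - p for a, p in zip(actual, predicted)]


def reconstruct(predicted, residual):
    """Decoder side: add the residual back onto the prediction."""
    return [p + r for p, r in zip(predicted, residual)]


actual = [100, 102, 105, 107]
predicted = [100, 101, 104, 108]   # e.g. taken from a previously coded frame
residual = compute_residual(actual, predicted)
print(residual)                    # [0, 1, 1, -1]
print(reconstruct(predicted, residual) == actual)  # True
```

The residual values are much smaller in magnitude than the samples themselves, which is why coding them takes fewer bits.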
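Transform coding: instead of encoding the original spatial-domain information directly, the sample values are converted by some transform function from the current domain into another, artificially defined domain (usually called the transform domain), and compression coding is then performed according to the distribution of the information in the transform domain. The reason for transform coding is that video image data is usually highly correlated in the spatial domain, which leads to a large amount of redundant information, so that direct coding requires a large number of bits. In the transform domain the correlation is greatly reduced, so less redundant information is encoded and the amount of data required for coding decreases accordingly, giving a higher compression ratio and a better compression result. Typical transforms include the Karhunen-Loève (K-L) transform and the Fourier transform. The integer discrete cosine transform (DCT) is a transform coding method commonly adopted in many international standards.
Quantization coding: the transform coding described above does not itself compress data. Quantization is the step that actually compresses the data, and it is also the main cause of data "loss" in lossy compression. Quantization maps input values with a large dynamic range onto a smaller set of output values. Because the quantizer input has a wide range, it needs many bits to represent, whereas the output after this forced mapping has a narrow range and needs only a few bits. Each quantizer input is normalized to one quantizer output, that is, quantized to one of a set of levels; these levels are usually called quantization levels (and are usually specified by the encoder).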
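A minimal sketch of uniform scalar quantization shows how the value range shrinks and where the loss comes from. The rounding rule, the step size of 10, and the function names are assumptions made for illustration; real codecs derive the step from a quantization parameter and use more elaborate rounding.

```python
def quantize(coeff, step):
    """Map a coefficient to an integer quantization level (round to nearest)."""
    sign = -1 if coeff < 0 else 1
    return sign * ((abs(coeff) + step // 2) // step)


def dequantize(level, step):
    """Approximate the original coefficient from its quantization level."""
    return level * step


coeffs = [187, -43, 12, 3, -2, 0]
step = 10
levels = [quantize(c, step) for c in coeffs]
print(levels)                                  # [19, -4, 1, 0, 0, 0]
print([dequantize(l, step) for l in levels])   # [190, -40, 10, 0, 0, 0]
```

The levels span a much smaller range than the inputs, and the reconstructed values differ from the originals, which is the "loss" referred to above.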
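In coding algorithms based on a hybrid coding architecture, the above compression coding approaches are used in combination. The encoder control module selects the coding mode for each image block according to the local characteristics of the different image blocks in the video frame. Blocks coded with intra prediction undergo frequency-domain or spatial-domain prediction, and blocks coded with inter prediction undergo motion-compensated prediction. The prediction residual is then transformed and quantized to form residual coefficients, and finally the entropy encoder generates the final bitstream. To avoid the accumulation of prediction errors, the reference signal for intra or inter prediction is obtained through a decoding module at the encoder. The transformed and quantized residual coefficients are inverse quantized and inverse transformed to reconstruct the residual signal, which is added to the prediction reference signal to obtain the reconstructed image. Loop filtering applies pixel corrections to the reconstructed image to improve its quality.
On the basis of these fundamental coding techniques and this coding framework, a variety of coding techniques for different scenarios have been derived, such as panorama encoding. This technique encodes 360-degree omnidirectional visual information of three-dimensional space and transmits it to a receiver; the receiver decodes the received data and reproduces the 360-degree omnidirectional visual content on a dedicated display device, giving the user an immersive visual experience. The application of this technique has led to the emergence and rapid spread of a series of virtual reality (VR) products such as Oculus Rift, Gear VR, and HTC Vive, and has correspondingly placed higher demands on coding technology.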
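Summary of the Invention
The present invention provides an encoding and decoding method that can improve coding efficiency. The method is suitable for encoding two-dimensional panoramic images, and in particular for cases in which the two-dimensional panoramic image comprises a plurality of sub-images whose spatial positional relationship has a considerable influence on overall coding efficiency.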
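According to a first aspect of the present invention, an encoding method comprises: determining the applicable spatial layout format of a two-dimensional panoramic image to be encoded and the spatial positional relationship of the sub-images of the two-dimensional panoramic image in the applicable spatial layout format; determining whether that spatial positional relationship is the same as the default spatial positional relationship in the applicable spatial layout format; and, when it is the same, encoding indication information indicating that the sub-images of the two-dimensional panoramic image use the default spatial positional relationship in the applicable spatial layout format, and encoding the sub-images of the two-dimensional panoramic image according to the applicable spatial layout format and the default spatial positional relationship to generate an encoded bitstream.
This encoding method introduces a default spatial positional relationship for the sub-images of a two-dimensional panoramic image in the spatial layout format applicable to that image. Because the spatial positional relationship of the sub-images is usually fairly fixed, using a default relationship, rather than encoding the spatial positional relationship of each sub-image one by one, allows the commonly used spatial positional relationship of the sub-images to be described with a single piece of indication information. This saves considerable codeword overhead and improves coding efficiency.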
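On the basis of the encoding method of the first aspect, and to further reduce codeword overhead, the applicable spatial layout format of the two-dimensional panoramic image to be encoded may be obtained by derivation. Specifically, the width and height of the two-dimensional panoramic image are obtained, and the applicable spatial layout format is determined from the aspect ratio of the image and a preset table of correspondences between aspect ratios and spatial layout formats. Alternatively, taking into account coding complexity, delay, and network conditions, the applicable spatial layout format may be encoded into the bitstream as an index, so that the decoder can determine the applicable spatial layout format directly from that index.
On the basis of the encoding method of the first aspect, and for flexibility, there may be more than one default spatial positional relationship; that is, a plurality of default spatial positional relationships may be available for the sub-images of the two-dimensional panoramic image. Correspondingly, the encoding method further comprises: encoding a default spatial positional relationship index for the sub-images of the two-dimensional panoramic image, the index uniquely identifying one of the plurality of different default spatial positional relationships. The advantage of this scheme is that the best default spatial positional relationship of the sub-images differs between application scenarios, that is, with the content covered by the two-dimensional panoramic image; to improve coding efficiency in such cases, the present invention proposes providing several default spatial positional relationships to choose from.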
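On the basis of the encoding method of the first aspect, and for simplicity of the coding scheme, a default spatial layout format for the two-dimensional panoramic image may also be introduced. Correspondingly, the encoding method comprises: determining whether the applicable spatial layout format of the two-dimensional panoramic image is the same as the default spatial layout format. The step of determining whether the spatial positional relationship of the sub-images in the applicable spatial layout format is the same as the default spatial positional relationship in that format is adapted accordingly: when the applicable spatial layout format is the default spatial layout format, the spatial positional relationship of the sub-images is determined to be the default spatial positional relationship. This approach considers the spatial layout format and the spatial positional relationship of the sub-images jointly, which further reduces coding overhead. For flexibility, when the spatial layout format of the two-dimensional panoramic image is not the default spatial layout format, it is still permitted to determine whether the spatial positional relationship of the sub-images is the default one, which adds flexibility and is also a way of improving coding efficiency. Specifically, when the applicable spatial layout format differs from the default spatial layout format, it is determined whether the spatial positional relationship of the sub-images in the applicable spatial layout format is the same as the default spatial positional relationship in that format.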
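On the basis of the encoding method of the first aspect, consider that in actual use the spatial positional relationship of the sub-images may not be completely identical to the default setting but may be partly identical. For example, the default sub-image arrangement order is Left->Front->Right->Top->Rear->Bottom, but the order actually used is Bottom->Rear->Left->Front->Right->Top, in which only the last four sub-images follow the default order. In this case, starting from the third subspace of the applicable spatial layout format, there is no need to determine the arrangement order of the sub-images for the remaining subspaces; it suffices to tell the decoder that, from a certain position onward, the default sub-image arrangement order is used, and how many subspaces use it. Correspondingly, the encoding method may comprise: when the spatial positional relationship of the sub-images in the applicable spatial layout format is not the same as the default spatial positional relationship in that format, determining whether the two are partly the same and, if so, encoding a certain range of sub-images, starting from the currently processed sub-image, using the default spatial positional relationship. This is particularly suitable when the spatial layout format comprises many subspaces, for example 12 or 20, and the two-dimensional panoramic image comprises a corresponding number of sub-images. The encoding method may further comprise, for example, determining the range of sub-images that match the default spatial positional relationship, for example encoding the 4th to 11th sub-images in the same way as for the default spatial positional relationship.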
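In the encoding method of the first aspect and its various extensions, the spatial positional relationship of the sub-images of the panoramic image comprises the arrangement order of the sub-images, or the rotation angles of the sub-images, or both the arrangement order and the rotation angles of the sub-images.
According to a second aspect of the present invention, a decoding method comprises: receiving an encoded bitstream of a two-dimensional panoramic image and determining the applicable spatial layout format of the two-dimensional panoramic image; parsing the encoded bitstream to determine whether, in the applicable spatial layout format, the spatial positional relationship of the sub-images of the two-dimensional panoramic image is the default spatial positional relationship; and, if it is, decoding the encoded bitstream of the two-dimensional panoramic image according to the applicable spatial layout format and the default spatial positional relationship.
This decoding method introduces a default spatial positional relationship for the sub-images of a two-dimensional panoramic image in the spatial layout format applicable to that image. Because the spatial positional relationship of the sub-images is usually fairly fixed, using a default relationship, rather than decoding the spatial positional relationship of each sub-image one by one, allows the spatial positional relationship of the sub-images to be obtained by decoding a single piece of indication information. This reduces decoding complexity, saves decoding time, and reduces the buffer space needed during decoding.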
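On the basis of the decoding method of the second aspect, and considering coding flexibility and codeword overhead, the applicable spatial layout format of the two-dimensional panoramic image may be obtained by derivation: the width and height of the image are obtained, and the applicable spatial layout format is determined from the aspect ratio and a preset table of correspondences between aspect ratios and spatial layout formats. Alternatively, it may be obtained directly by parsing the bitstream: the encoded bitstream of the two-dimensional panoramic image is parsed to obtain the applicable spatial layout format index, and the applicable spatial layout format is determined from that index.
On the basis of the decoding method of the second aspect, and for flexibility, there may be more than one default spatial positional relationship; that is, a plurality of default spatial positional relationships may be available for the sub-images of the two-dimensional panoramic image. Correspondingly, the decoding method further comprises: if, in the applicable spatial layout format, the spatial positional relationship of the sub-images is a default spatial positional relationship, parsing the encoded bitstream to obtain a default spatial positional relationship index, obtaining from the plurality of default spatial positional relationships the one corresponding to that index, and decoding the encoded bitstream of the two-dimensional panoramic image according to the applicable spatial layout format and that default spatial positional relationship.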
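On the basis of the decoding method of the second aspect, and for simplicity of the coding scheme, a default spatial layout format for the two-dimensional panoramic image may also be introduced. Correspondingly, the decoding method comprises: parsing the encoded bitstream to determine whether the applicable spatial layout format of the two-dimensional panoramic image is the default spatial layout format. The step of determining whether the spatial positional relationship of the sub-images in the applicable spatial layout format is the default spatial positional relationship in that format is adapted accordingly: when the applicable spatial layout format is the default spatial layout format, the spatial positional relationship of the sub-images is determined to be the default spatial positional relationship. Moreover, to limit decoder complexity, when the spatial layout format of the two-dimensional panoramic image is not the default spatial layout format, the spatial positional relationship of each sub-image still need not be decoded one by one. Instead, the decoder determines whether the spatial positional relationship of the sub-images is the default one and thereby decides whether to decode the sub-images using the default positional relationship, which reduces decoding complexity and hardware cost. Specifically, when the applicable spatial layout format is not the default spatial layout format, the decoder determines whether, in the applicable spatial layout format, the spatial positional relationship of the sub-images is the default spatial positional relationship; if it is, the encoded bitstream is decoded according to the applicable spatial layout format and the default spatial positional relationship.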
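On the basis of the decoding method of the second aspect, consider that in actual use the spatial positional relationship of the sub-images may not be completely identical to the default setting but may be partly identical. For example, the default sub-image arrangement order is Left->Front->Right->Top->Rear->Bottom, but the order actually used is Bottom->Rear->Left->Front->Right->Top, in which only the last four sub-images follow the default order. In this case, starting from the third subspace of the applicable spatial layout format, there is no need to determine the arrangement order of the sub-images for the remaining subspaces; it suffices to tell the decoder that, from a certain position onward, the default sub-image arrangement order is used, and how many subspaces use it. Correspondingly, the decoding method may comprise: when parsing the bitstream shows that the spatial positional relationship of the sub-images in the applicable spatial layout format is not the default spatial positional relationship in that format, parsing the bitstream to determine whether the two are partly the same and, if so, decoding a certain range of sub-images, starting from the currently processed sub-image, using the default spatial positional relationship. This is particularly suitable when the spatial layout format comprises many subspaces, for example 12 or 20, and the two-dimensional panoramic image comprises a corresponding number of sub-images. The decoding method may further comprise, for example, parsing the bitstream to determine the range of sub-images that match the default spatial positional relationship, for example decoding the 4th to 11th sub-images in the same way as for the default spatial positional relationship.
In the decoding method of the second aspect and its various extensions, the spatial positional relationship of the sub-images of the panoramic image comprises the arrangement order of the sub-images, or the rotation angles of the sub-images, or both the arrangement order and the rotation angles of the sub-images.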
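According to a third aspect of the present invention, an apparatus for implementing the encoding method of the first aspect is provided. The encoding apparatus comprises: a spatial layout format and spatial positional relationship determining unit, configured to determine the applicable spatial layout format of a two-dimensional panoramic image to be encoded and the spatial positional relationship of the sub-images of the two-dimensional panoramic image in the applicable spatial layout format; a judging unit, configured to determine whether that spatial positional relationship is the same as the default spatial positional relationship in the applicable spatial layout format; and an encoding unit, configured to, when it is the same, encode indication information indicating that the sub-images of the two-dimensional panoramic image use the default spatial positional relationship in the applicable spatial layout format, and encode the sub-images of the two-dimensional panoramic image according to the applicable spatial layout format and the default spatial positional relationship to generate an encoded bitstream.
This encoding apparatus introduces a default spatial positional relationship for the sub-images of a two-dimensional panoramic image in the spatial layout format applicable to that image. Because the spatial positional relationship of the sub-images is usually fairly fixed, using a default relationship, rather than encoding the spatial positional relationship of each sub-image one by one, allows the commonly used spatial positional relationship of the sub-images to be described with a single piece of indication information. This saves considerable codeword overhead and improves coding efficiency.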
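According to a fourth aspect of the present invention, an apparatus for implementing the decoding method of the second aspect is provided. The decoding apparatus comprises: a spatial layout format determining unit, configured to receive an encoded bitstream of a two-dimensional panoramic image and determine the applicable spatial layout format of the two-dimensional panoramic image; a spatial positional relationship determining unit, configured to parse the bitstream to determine whether, in the applicable spatial layout format, the spatial positional relationship of the sub-images of the two-dimensional panoramic image is the default spatial positional relationship; and a decoding unit, configured to, if it is, decode the encoded bitstream of the two-dimensional panoramic image according to the applicable spatial layout format and the default spatial positional relationship.
This decoding apparatus introduces a default spatial positional relationship for the sub-images of a two-dimensional panoramic image in the spatial layout format applicable to that image. Because the spatial positional relationship of the sub-images is usually fairly fixed, using a default relationship, rather than decoding the spatial positional relationship of each sub-image one by one, allows the spatial positional relationship of the sub-images to be obtained by decoding a single piece of indication information. This reduces decoding complexity, saves decoding time, and reduces the buffer space needed during decoding.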
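According to a fifth aspect of the present invention, an apparatus for implementing the encoding method of the first aspect comprises a non-volatile storage medium and a central processing unit. The non-volatile storage medium stores an executable program, and the central processing unit is connected to the non-volatile storage medium and executes the executable program to implement the encoding method of the first aspect and its extensions.
According to a sixth aspect of the present invention, an apparatus for implementing the decoding method of the second aspect comprises a non-volatile storage medium and a central processing unit. The non-volatile storage medium stores an executable program, and the central processing unit is connected to the non-volatile storage medium and executes the executable program to implement the decoding method of the second aspect and its extensions.
According to a seventh aspect of the present invention, a codec system comprises the encoding apparatus of the third aspect and the decoding apparatus of the fourth aspect, the decoding apparatus being configured to decode the encoded bitstream from the encoding apparatus.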
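Brief Description of the Drawings
To describe the technical solutions of the embodiments of the present invention more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the present invention, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
Figure 1 is a system framework diagram of a VR device to which an embodiment of the present invention is applied;
Figure 2 is a latitude-longitude map of a panoramic image;
Figure 3 shows two-dimensional panoramas in polyhedral format;
Figure 4 is a schematic diagram of an image acquisition device to which an embodiment of the present invention is applied;
Figure 5 is a schematic diagram of an image encoder to which an embodiment of the present invention is applied;
Figure 6 is a schematic diagram of a VR display device to which an embodiment of the present invention is applied;
Figure 7 is a comparison of a two-dimensional panoramic image in different spatial layout formats;
Figure 8 is a schematic diagram of the position labels of the subspaces in the hexahedral spatial layout formats;
Figure 9 is a schematic diagram of the naming of the sub-images of a two-dimensional panoramic image;
Figure 10 is an example of the encoding process for a two-dimensional panoramic image in a hexahedral spatial layout format;
Figure 11 is a flowchart of the encoding method according to an embodiment of the present invention;
Figure 12 is a flowchart of the decoding method according to an embodiment of the present invention;
Figure 13 is a block diagram of the encoding device according to an embodiment of the present invention;
Figure 14 is a block diagram of the decoding device according to an embodiment of the present invention.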
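Detailed Description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings, taking panoramic coding in a virtual reality (VR) scenario as an example. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.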
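Figure 1 is a system architecture diagram of a VR device 100 to which an embodiment of the present invention is applied. In the VR device 100, an image acquisition device 110 is connected to a VR display device 150 through a transmission device 130. The transmission device 130 may consist of a LAN (local area network), a WLAN (wireless local area network), the Internet, an LTE network, or the like.
The image acquisition device 110 includes N cameras arranged in the area for which a panoramic image is to be generated. They capture images from the viewpoints of the respective cameras, and the images from the viewpoints are stitched into a spherical panorama containing omnidirectional visual information covering 360 degrees in all directions (up, down, left, and right) of three-dimensional space. The spherical panorama can be understood as a spherical-format image roughly centered on the image acquisition device 110. However, a spherical-format image cannot be conveniently represented, stored, or indexed, so the prior art usually unfolds the spherical panorama into a two-dimensional planar panorama before compressing, processing, storing, or transmitting it. The operation of unfolding a three-dimensional spherical panorama into a two-dimensional planar panorama is called mapping. Several mapping methods currently exist, yielding several two-dimensional planar panorama formats. The most common format is the latitude-longitude map, shown in Figure 2. In a latitude-longitude map, the image regions near the north and south poles are obtained by stretching and therefore suffer from severe distortion and data redundancy.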
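To overcome the shortcomings of the latitude-longitude map, the content of the spherical panorama can be mapped onto a polyhedron, as shown in Figure 3(a); typically the polyhedron is a cube (hexahedron), so that the spherical panorama is represented by the images on the faces of the polyhedron. Taking the cube as an example, the spherical panorama is represented by the six equally sized square faces of the cube. Unfolding the images mapped onto the six faces directly according to their spatial adjacency gives a cross-shaped two-dimensional image, as shown in Figure 3(b); an image in this format may also be called a two-dimensional image in hexahedral format. In image processing such as compression, the cross-shaped image of Figure 3(b) can be processed directly, or the smallest rectangular region enclosing the cross-shaped image can be chosen as the object of processing. If that rectangular region still has unused space after accommodating all the content of the cross-shaped image, that is, the cross-shaped image does not exactly cover the rectangle, the unused part can be filled with default content, for example all black or all white, or padded by extending pixel values from the cross-shaped image. To reduce the area of the two-dimensional image in hexahedral format, the six faces of the hexahedron can be separated, that is, the cross shape is taken apart and the faces are rearranged into a rectangular region in a given geometric arrangement such as 1x6, 2x3, 3x2, or 6x1, as shown in Figure 3(c); such an arrangement is referred to here as a spatial layout format. Figure 3(d) shows a hexahedral spatial layout format in a 3x2 arrangement.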
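After the three-dimensional spherical panorama has been mapped to a two-dimensional planar panoramic image using one of the above mapping methods or another mapping method not listed here, the image acquisition device 110 further encodes the panoramic image (compression, post-processing) and transmits it via the transmission device 130 to the VR display device 150. Note that certain parameters of the N cameras 111-1 to 111-N that are needed for mapping the three-dimensional spherical panorama to the two-dimensional planar panorama must also be transmitted to the VR display device 150 through the transmission device 130. These parameters include, but are not limited to, focal length, image depth, distortion, camera tilt angle, distance between cameras, and the angle covered by the field of view.
The VR display device 150 receives the compressed data, that is, the encoded panoramic image, sent by the image acquisition device 110, decodes the compressed data to reconstruct the panoramic image data, and displays the panoramic image on the display screen of the VR display device 150 based on that data.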
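Figure 4 is a system framework diagram of the image acquisition device 110. The image acquisition device 110 includes a camera unit 111 comprising N cameras 111-1 to 111-N, an image processing unit 113, and a data transmission unit 115.
The N cameras 111-1 to 111-N of the camera unit 111 are arranged in the area for which a panoramic image is to be generated, in order to capture the sub-images needed to stitch the panoramic image. The sub-images are defined relative to the panoramic image, which is preferably a spherical panorama with image information over 360 degrees in all directions of three-dimensional space. The camera 111-1 has an imager 111-1a and a signal processor 111-1b. The imager 111-1a includes an imaging lens and an image sensing device (not shown). The imaging lens is a fixed-focus or zoom light-adjusting device composed of one or more groups of coaxial lenses. The image sensing device may be a CCD (charge-coupled device), a CMOS (complementary metal-oxide-semiconductor) sensor, or the like. The signal processor 111-1b performs sampling, gain control, analog-to-digital conversion, white balance adjustment, and gamma correction on the analog image signal output from the imager 111-1a to generate a digital image.
The image processing unit 113 includes an image stitcher 1132, an image mapper 1134, and an image encoder 1136.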
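The image stitcher 1132 stitches together the N digital images from the N cameras 111-1 to 111-N of the camera unit 111. In simple terms, the stitching process determines the spatial positional relationship of the N digital images from the spatial positional relationship of the N cameras, performs feature matching between adjacent digital images, and corrects and aligns the coordinates of adjacent digital images according to the matching results, thereby stitching the N digital images into one three-dimensional panoramic image.
The image mapper 1134 maps the three-dimensional panoramic image generated by the image stitcher into a two-dimensional panoramic image. The mapping process is as described above. The mapped two-dimensional panoramic image may be in latitude-longitude format or in a polyhedral format; a polyhedral format such as the hexahedral format keeps image distortion small.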
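The image encoder 1136 encodes the two-dimensional panoramic image generated by the image mapper 1134 and outputs it to the data transmission unit. The image encoder 1136 may operate according to a video compression standard (for example, the High Efficiency Video Coding standard H.265) and may conform to the HEVC Test Model (HM). The text of the H.265 standard, ITU-T H.265 (V3) (04/2015), was published on April 29, 2015 and can be downloaded from http://handle.itu.int/11.1002/1000/12455; the entire content of that document is incorporated herein by reference. Alternatively, the image encoder 1136 may operate according to other proprietary or industry standards, including ITU-T H.261, ISO/IEC MPEG-1 Visual, ITU-T H.262 or ISO/IEC MPEG-2 Visual, ITU-T H.263, ISO/IEC MPEG-4 Visual, and ITU-T H.264 (also known as ISO/IEC MPEG-4 AVC), including its Scalable Video Coding (SVC) and Multiview Video Coding (MVC) extensions. It should be understood that the techniques of the present invention are not limited to any particular coding standard or technique.
The image encoder 1136 may be implemented with a variety of processing circuits, such as one or more microprocessors, digital signal processors (DSPs), application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), discrete logic, hardware, or any combination thereof. If the techniques are implemented partly or wholly in software, the device may store the software instructions in a suitable non-transitory computer-readable storage medium and may execute the instructions in hardware using one or more processors to carry out the techniques of the present invention. Any of the foregoing (including hardware, software, a combination of hardware and software, and so on) may be regarded as one or more processors.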
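The image encoder 1136 encodes the two-dimensional panoramic images. Multiple temporally consecutive two-dimensional panoramic frames constitute a panoramic video, and encoding the panoramic video with the image encoder 1136 produces a panoramic video bitstream. The bitstream contains, as a sequence of bits, the coded information of the panoramic video data. The coded information may comprise coded picture data and associated data. The associated data may include sequence parameter sets (SPSs), picture parameter sets (PPSs), and other syntax structures. An SPS may contain parameters that apply to zero or more sequences. A PPS may contain parameters that apply to zero or more pictures. A syntax structure is a set of zero or more syntax elements arranged in a specified order in the bitstream.
During encoding, the image encoder 1136 may partition the picture to be encoded into a grid of coding tree blocks (CTBs). In some examples, a CTB may be called a "tree block", a "largest coding unit" (LCU), or a "coding tree unit". A CTB is not limited to a particular size and may contain one or more coding units (CUs). Each CTB may be associated with an equally sized block of pixels within the picture. Each pixel may correspond to one luminance (luma) sample and two chrominance (chroma) samples. Each CTB may therefore be associated with one block of luma samples and two blocks of chroma samples. The CTBs of a picture may be grouped into one or more slices. In some examples, each slice contains an integer number of CTBs. As part of encoding a picture, the image encoder 1136 may generate coded information for each slice of the picture, that is, encode the CTBs in the slice. To encode a CTB, the image encoder 1136 may recursively perform quadtree partitioning on the pixel block associated with the CTB to divide it into progressively smaller pixel blocks. These smaller pixel blocks may be associated with CUs.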
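Figure 5 is a schematic block diagram of the image encoder 1136 according to an embodiment of the present invention. It includes an encoder-side prediction module 1361, a transform and quantization module 1362, an entropy encoding module 1363, an encoding reconstruction module 1364, and an encoder-side filtering module 1365. Specifically:
The encoder-side prediction module 1361 generates prediction data. It may generate one or more prediction units (PUs) for each CU that is not further partitioned. Each PU of a CU may be associated with a different pixel block within the pixel block of the CU. The encoder-side prediction module 1361 may generate a predictive pixel block for each PU of the CU, using either intra prediction or inter prediction. If it uses intra prediction, it may generate the predictive pixel block of the PU based on decoded pixels of the picture associated with the PU. If it uses inter prediction, it may generate the predictive pixel block of the PU based on decoded pixels of one or more pictures other than the picture associated with the PU. The encoder-side prediction module 1361 may generate the residual pixel block of a CU based on the predictive pixel blocks of the PUs of the CU. The residual pixel block of the CU indicates the differences between the sample values in the predictive pixel blocks of the PUs of the CU and the corresponding sample values in the original pixel block of the CU.
The transform and quantization module 1362 processes the residual data produced by prediction. The image encoder 1136 may perform recursive quadtree partitioning on the residual pixel block of a CU to divide it into one or more smaller residual pixel blocks associated with the transform units (TUs) of the CU. Because each pixel in a pixel block associated with a TU corresponds to one luma sample and two chroma samples, each TU may be associated with one block of luma residual samples and two blocks of chroma residual samples. The image encoder 1136 may apply one or more transforms to the residual sample blocks associated with a TU to generate coefficient blocks (that is, blocks of coefficients). The transform may be a DCT or a variant of it. Using the DCT transform matrix, the two-dimensional transform is computed by applying a one-dimensional transform in the horizontal and vertical directions, which yields the coefficient block. The image encoder 1136 may perform a quantization procedure on each coefficient in the coefficient block. Quantization generally refers to the process in which coefficients are quantized to reduce the amount of data used to represent them, thereby providing further compression.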
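The separable computation of a two-dimensional transform from one-dimensional transforms can be sketched as follows. This is an illustrative floating-point orthonormal DCT-II written in plain Python, not the integer transform specified by any particular standard; the function names are assumptions.

```python
from math import cos, pi, sqrt


def dct_1d(x):
    """Orthonormal one-dimensional DCT-II of a list of samples."""
    n = len(x)
    out = []
    for k in range(n):
        scale = sqrt(1 / n) if k == 0 else sqrt(2 / n)
        total = sum(x[i] * cos(pi * (2 * i + 1) * k / (2 * n)) for i in range(n))
        out.append(scale * total)
    return out


def dct_2d(block):
    """Two-dimensional DCT: 1-D transform on each row, then on each column."""
    rows = [dct_1d(row) for row in block]
    cols = [dct_1d(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]  # transpose back to row-major


flat_block = [[10] * 4 for _ in range(4)]
coeffs = dct_2d(flat_block)
print(round(coeffs[0][0], 6))  # 40.0, all energy ends up in the DC coefficient
```

For a perfectly flat residual block, only the top-left (DC) coefficient is non-zero, which illustrates why the transform domain is easier to compress.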
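The image encoder 1136 may generate a set of syntax elements representing the coefficients in the quantized coefficient block. Through the entropy encoding module 1363, the image encoder 1136 may apply an entropy encoding operation (for example, a context-adaptive binary arithmetic coding (CABAC) operation) to some or all of these syntax elements. To apply CABAC encoding to a syntax element, the entropy encoding module 1363 may binarize the syntax element to form a binary sequence comprising one or more bits (called "bins"). The entropy encoding module 1363 may encode some of the bins using regular coding and the other bins using bypass coding.
In addition to entropy encoding the syntax elements of the coefficient block, the image encoder 1136, through the encoding reconstruction module 1364, may apply inverse quantization and an inverse transform to the transformed coefficient block to reconstruct the residual sample block from it. The image encoder 1136 may add the reconstructed residual sample block to the corresponding sample blocks of one or more predictive sample blocks to produce a reconstructed sample block. By reconstructing the sample block of each color component, the image encoder 1136 can reconstruct the pixel block associated with a TU. The pixel block of each TU of the CU is reconstructed in this way until the entire pixel block of the CU has been reconstructed.
After the image encoder 1136 has reconstructed the pixel block of a CU, it performs a deblocking filtering operation, through the encoder-side filtering module 1365, to reduce blocking artifacts in the pixel block associated with the CU. After the deblocking filtering operation, the image encoder 1136 may use sample adaptive offset (SAO) to modify the reconstructed pixel blocks of the CTBs of the picture. After these operations, the image encoder 1136 may store the reconstructed pixel block of the CU in a decoded picture buffer for use in generating predictive pixel blocks for other CUs.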
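After the image encoder 1136 has encoded the two-dimensional panoramic image generated by the image mapper 1134, the data transmission unit 115 transmits the encoded data, that is, the bitstream, to the VR display device 150 via the transmission device 130. The data transmission unit 115 may be a radio interface circuit that is connected to a controller and is suitable for generating wireless communication signals, for example for communication with a cellular communication network, a wireless communication system, or a wireless local area network; the wireless communication signals are sent to one or more other devices through an antenna connected to the radio interface circuit.
The VR display device 150 is usually a head-mounted viewing device, typically a pair of goggles with a built-in light-emitting screen for displaying video images. The device contains a position and orientation sensing system that tracks the movements of the user's head and presents the video image content for the corresponding position and orientation on the screen. As shown in Figure 6, the VR display device 150 therefore includes at least a data receiving unit 151, a data processing unit 153, and a data display unit 155. A VR terminal device may also include advanced interaction modules such as a user gaze tracking system, and present the user's region of interest on the screen.
The data receiving unit 151 receives the data, that is, the bitstream, transmitted by the image acquisition device 110 via the transmission device 130 and passes the received data to the data processing unit 153. The data receiving unit 151 may consist of an antenna and a radio interface circuit. The antenna receives the data sent by other data-sending devices over a cellular communication network, a wireless communication system, or a wireless local area network and passes it to the radio interface circuit, which forwards it to the data processing unit 153.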
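The data processing unit 153 includes at least an image decoder 1532 and a memory 1534. The decoder 1532 decodes the bitstream from the data receiving unit 151; the bitstream contains, as a sequence of bits, the coded information of the video data encoded by the image encoder 1136. The memory 1534 stores decoded images awaiting display.
The image decoder 1532 includes an entropy decoding module 1532a, a decoding reconstruction module 1532b, and a decoding filtering module 1532c.
The entropy decoding module 1532a parses the bitstream to extract syntax elements from it. When the image decoder 1532 performs CABAC decoding, it may perform regular decoding on some bins and bypass decoding on the other bins. The bins in the bitstream are mapped to syntax elements, and the syntax elements are obtained by parsing the bins.
The decoding reconstruction module 1532b may reconstruct the pictures of the video data based on the syntax elements extracted from the bitstream. The process of reconstructing video data from syntax elements is generally the inverse of the process performed by the image encoder 1136 to generate the syntax elements. For example, the image decoder 1532 may generate the predictive pixel blocks of the PUs of a CU based on the syntax elements associated with the CU. The image decoder 1532 may also inverse quantize the coefficient blocks associated with the TUs of the CU and apply an inverse transform to the inverse-quantized coefficient blocks to reconstruct the residual pixel blocks associated with the TUs of the CU. The image decoder 1532 may then reconstruct the pixel block of the CU from the predictive pixel blocks and the residual pixel blocks.
The decoding filtering module 1532c performs a deblocking filtering operation after the decoding reconstruction module 1532b has reconstructed the pixel block of a CU, to reduce blocking artifacts in the pixel block associated with the CU. In addition, based on one or more SAO syntax elements, the image decoder 1532 may perform the same SAO operation as the image encoder 1136. After these operations, the image decoder 1532 may store the pixel block of the CU in the decoded picture buffer, that is, the memory 1534. The decoded picture buffer provides reference pictures for subsequent motion compensation, intra prediction, and presentation on the display device.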
The display unit 155 reads the decoded images stored in the memory 1534 and shows them on a display screen. The display screen may be a liquid crystal display (LCD), a plasma display panel, a thin-film transistor display (TFT), an organic light-emitting diode display (OLED), a surface-conduction electron-emitter display (SED), a laser video display, a carbon nanotube display, a quantum dot display, or an interferometric modulator display (IMOD).
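In the encoding performed by the image acquisition device 110 and the decoding performed by the VR display device 150, the process of encoding or decoding an image by prediction is very important and is one of the core techniques of the whole codec. In a video signal, neighboring pixels within a picture are strongly correlated spatially, and adjacent pictures are strongly correlated temporally. Prediction removes this spatial and temporal correlation and thereby greatly improves coding efficiency.
For example, when the encoder-side prediction module 1361 of the image acquisition device 110 predictively encodes an image block of the two-dimensional panoramic image generated by the image mapper 1134, the reconstructed pixels at positions neighboring the block are used to generate the prediction for the block. The closer the prediction information is to the pixels of the block to be encoded, the closer the prediction is to the block's pixel values, the smaller the residual, and the lower the resulting bit rate. The correlation between the reconstructed pixel values around the block and the pixel values of the block therefore directly affects coding efficiency. For inter coding, if the images mapped onto the faces of the polyhedron do not join well, motion estimation that crosses a boundary (the boundary between adjacent images) will also reduce coding efficiency.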
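Taking the two-dimensional panoramic image in hexahedral format as an example, the image consists of six sub-images. These six sub-images generally have four spatial layout formats, as shown in Figure 3(d). In any of the spatial layout formats, the six sub-images can be arranged in any order, and each sub-image can also be rotated by some angle. To preserve the spatial correlation between sub-images as far as possible, and so improve the efficiency of predictive coding, the arrangement order of the sub-images can be set according to the specific situation. For example, both Figure 7(a) and Figure 7(b) use the 3x2 layout format, but in Figure 7(b) the faces clearly join more naturally and the spatial correlation between faces is better preserved. In Figure 7(a), the pixel values jump sharply where faces meet, so spatial correlation is poorly preserved, which reduces coding efficiency. To improve coding efficiency, the layout format of the sub-images in the two-dimensional panoramic image and their arrangement order are therefore particularly important, and the information describing them is encoded in a specific way during encoding and transmitted to the decoder.
Continuing with the two-dimensional panoramic image in hexahedral format, its six sub-images generally have 4 spatial layout formats (Type_idx: 0 to 3). To express the arrangement order of the sub-images in a given spatial layout format, each spatial layout format is first divided into subspaces that are similar or identical in size to the sub-images, and the resulting subspaces are numbered in a predetermined order, for example 0 to 5. This can also be understood as indexing the spatial positions in a given spatial layout format, with 0 to 5 being the index values of the different positions, as shown in Figure 8. Note that other position-labeling schemes are possible; for example, for 1x6 the positions could be labeled from bottom to top, or in some other way. Correspondingly, to distinguish and identify the sub-images of the three-dimensional panoramic image in hexahedral format, after that image has been unfolded into a cross-shaped two-dimensional panoramic image in hexahedral format, the six sub-images can be named Left, Front, Right, Rear, Top, and Bottom in the manner shown in Figure 9.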
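The identifiers of the six sub-images can be defined as the array FaceArray = {Left, Front, Right, Rear, Top, Bottom}. Placing the elements of FaceArray into the subspaces of a spatial layout format in some arrangement order, for example {Left->0, Front->1, Right->2, Rear->3, Top->4, Bottom->5}, yields a two-dimensional panoramic image with a particular arrangement order in a particular spatial layout format, which may also be called the reordered two-dimensional panoramic image.
On the basis of the above definitions, to improve the coding efficiency of the two-dimensional panoramic image in hexahedral format, its sub-images may be arranged in any order so as to maximize the spatial correlation between them. The encoding/decoding syntax must therefore support this free arrangement order. An exemplary syntax design is shown in Table 1 below:
Table 1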
Figure PCTCN2017090067-appb-000001
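num_of_layout_face_minus1: its value plus 1 is the number of faces of the polyhedron, here the number of faces of the hexahedron;
layout_face[i]: indication information for the sub-image placed in the subspace at the i-th position in the current layout format. Note that only the index of the current face among the faces remaining in the FaceArray array needs to be signalled, and only for the first 5 faces (the last remaining face is placed in the subspace at the last position).
layout_rotation[i]: the rotation angle of the sub-image in the subspace at the i-th position. The correspondence between the rotation angle indication information and the actual rotation angle may be as shown in Table 2 below. In actual use other correspondences may be used, provided the encoder and decoder agree on them: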
Table 2
layout_rotation[i]    rotation angle in degrees (counterclockwise)
0                     0
1                     -90
2                     +90
3                     +180
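A specific example of encoding according to the syntax in Table 1 above is shown in Figure 10. The number of elements in the face array decreases step by step, but the relative order of the remaining sub-images stays unchanged.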
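The shrinking face array described for layout_face[i] can be sketched as a pair of helper functions. The names `encode_layout_faces` and `decode_layout_faces` are hypothetical, and the sketch covers only the index derivation, not how the indices are written to the bitstream.

```python
FACE_ARRAY = ["Left", "Front", "Right", "Rear", "Top", "Bottom"]


def encode_layout_faces(order):
    """Turn an arrangement order into layout_face[i] values: each value is the
    index of the face among the faces still remaining in the face array."""
    remaining = list(FACE_ARRAY)
    indices = []
    for face in order[:-1]:           # the last face is implied
        idx = remaining.index(face)
        indices.append(idx)
        remaining.pop(idx)            # relative order of the rest is unchanged
    return indices


def decode_layout_faces(indices):
    """Rebuild the arrangement order from the layout_face[i] values."""
    remaining = list(FACE_ARRAY)
    order = []
    for idx in indices:
        order.append(remaining.pop(idx))
    order.extend(remaining)           # the single face left goes last
    return order


order = ["Left", "Front", "Right", "Top", "Rear", "Bottom"]
indices = encode_layout_faces(order)
print(indices)                                 # [0, 0, 0, 1, 0]
print(decode_layout_faces(indices) == order)   # True
```

Because the array shrinks after every placement, later indices need fewer distinct values, and the last face needs no index at all.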
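This syntax structure makes it possible to signal a free arrangement order and rotation angle for the sub-images in the different spatial layout formats. In practice, however, for a typical panoramic image the relative positions of the sub-images are fixed, so in a given spatial layout format the way of arranging and rotating the sub-images that best preserves the spatial correlation between them is generally fixed as well. In a given spatial layout format it is therefore unnecessary always to encode the sub-image arrangement order and face rotation angles, and doing so makes the current technique inefficient at encoding the arrangement order of the sub-images of a panoramic image.
In view of this, the present invention proposes an encoding method for improving coding efficiency. As shown in Figure 11, the encoding method 110 mainly comprises:
S112: Determine the applicable spatial layout format of the two-dimensional panoramic image to be encoded and the spatial positional relationship of the sub-images of the two-dimensional panoramic image in the applicable spatial layout format;
This determination can be made by rate-distortion optimization, predictive analysis, or similar means; it is illustrated below using the two-dimensional panoramic image in hexahedral format. Note that "applicable" normally means the best spatial layout format for the two-dimensional panoramic image and the best spatial positional relationship of its sub-images in that format. In some special cases, however, considering coding complexity, delay, or the capabilities of the codec, the chosen spatial layout format and spatial positional relationship may not be the best but are the most suitable for the coding requirements; this is also permitted in the present invention.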
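Step 1: Once the current video format is known to be the hexahedral format, it must be determined which of the four spatial layout formats is used. This can be determined from the height-to-width ratio of the two-dimensional panoramic image, for example:
vertical_size:horizontal_size = 1:6 → 1x6 layout format
vertical_size:horizontal_size = 2:3 → 2x3 layout format
vertical_size:horizontal_size = 3:2 → 3x2 layout format
vertical_size:horizontal_size = 6:1 → 6x1 layout format
The height/width of a hexahedron face is the greatest common divisor of vertical_size and horizontal_size.
At this point the spatial layout format of the hexahedron is known; it corresponds to one of Type_idx 0 to 3 in Figure 3(d).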
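The derivation in Step 1 can be sketched as follows, assuming square faces and image dimensions that are exact multiples of the face size. The function name `derive_layout` and the lookup table name are illustrative; the mapping itself follows the four ratios listed above.

```python
from math import gcd

# (rows, columns) after reducing vertical_size:horizontal_size -> layout name
LAYOUTS = {(1, 6): "1x6", (2, 3): "2x3", (3, 2): "3x2", (6, 1): "6x1"}


def derive_layout(vertical_size, horizontal_size):
    """Return (layout format, face size) for a hexahedral panoramic image."""
    face_size = gcd(vertical_size, horizontal_size)
    ratio = (vertical_size // face_size, horizontal_size // face_size)
    if ratio not in LAYOUTS:
        raise ValueError(f"no hexahedral layout for ratio {ratio[0]}:{ratio[1]}")
    return LAYOUTS[ratio], face_size


print(derive_layout(1536, 1024))  # ('3x2', 512)
print(derive_layout(512, 3072))   # ('1x6', 512)
```

Since the layout follows from the picture dimensions that the bitstream already carries, no additional syntax element is needed to signal it in this case.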
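Step 2: After the spatial layout format has been determined, the spatial positional relationship of the sub-images of the two-dimensional panoramic image in hexahedral format must be established;
Parse num_of_layout_face_minus1 to obtain the number of sub-images of the panorama in hexahedral format. Following the spatial positional relationship of the subspaces in the spatial layout format, such as their order, parse the sub-image indication information layout_face[i] and layout_rotation[i] in turn, and map the parsed indication information to the sub-images (Left, Front, Right, Rear, Top, Bottom); the encoder and decoder agree on how sub-image indication information maps to sub-images. Determine the actual rotation angle from the parsed rotation angle indication information; the encoder and decoder likewise agree on how indication information maps to rotation angles.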
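S114: Determine whether the spatial positional relationship of the sub-images of the two-dimensional panoramic image in the applicable spatial layout format is the same as the default spatial positional relationship in the applicable spatial layout format;
A default setting here means a convention that both the encoder and decoder follow: for certain parameters, if the decoder knows that they take the default setting, it does not need to parse the specific syntax element information and instead sets each parameter directly to the value given by the default setting. The spatial positional relationship may be the spatial arrangement order of the sub-images, the rotation angles of the sub-images, or both. Taking the 3x2 layout format in Figure 8 as an example, the correlation between sub-images is well preserved when the sub-images of the two-dimensional panoramic image are arranged in the order Left, Front, Right, Top, Rear, Bottom and rotated by 0, 0, 0, -90°, 90°, -90° respectively. This spatial positional relationship of the sub-images in this spatial layout format can therefore be made the default setting. During encoding, if the sub-image arrangement order and/or rotation angles actually used are the same as the default setting, the encoder, such as the image encoder 1136 in the image acquisition device 110, does not need to encode the sub-image arrangement order and rotation angles; the decoder, such as the image decoder 1532 in the VR display device 150, on learning that the current parameters use the default setting, can directly set the corresponding parameters, such as the sub-image arrangement order and/or rotation angles, to the values given by the default setting.
Note that the above uses the default sub-image arrangement order and/or rotation angles in the 3x2 spatial layout format only as an example. In practice, the same concept of a default setting can be extended to some or all of the parameters such as the spatial layout format, the sub-image arrangement order, the sub-image rotation angles, and the number of subspaces in the polyhedral format. In addition, the description above gives four common spatial layout formats for ease of understanding, corresponding to Type_idx = 0 to 3 in Figure 8. Depending on the application scenario and technical developments, the set of spatial layout formats may be varied and extended, or restricted to only some of the listed formats, such as the 3x2 spatial layout format; no limitation is imposed here.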
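S116: When the spatial positional relationship of the sub-images of the two-dimensional panoramic image in the applicable spatial layout format is the same as the default spatial positional relationship in that format, encode indication information indicating that the sub-images of the two-dimensional panoramic image use the default spatial positional relationship in the applicable spatial layout format, and encode the sub-images of the two-dimensional panoramic image according to the applicable spatial layout format and the default spatial positional relationship to generate an encoded bitstream.
In practice, to reduce the number of bits spent on header information and remove redundancy from the syntax design, the present invention introduces the concept of a default sub-image arrangement order and/or rotation angle: the sub-image arrangement order and/or rotation angles actually used are encoded only when they differ from the default setting. In this way the method effectively reduces the number of bits needed to represent the spatial position information of the sub-images of the two-dimensional panoramic image and improves coding efficiency. Corresponding to step S114, when the spatial positional relationship of the sub-images in the applicable spatial layout format differs from the default spatial positional relationship in that format, the applicable spatial layout format of the two-dimensional panoramic image and the spatial positional relationship of its sub-images in that format are encoded.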
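To give a rough feel for the saving, the sketch below counts the bits that explicit signalling of face order and rotation would need under a simple fixed-length coding assumption. That assumption, the function name, and the omission of num_of_layout_face_minus1 are all illustrative choices; the document leaves the actual coding method open (entropy, variable-length, or fixed-length coding).

```python
from math import ceil, log2


def explicit_signalling_bits(num_faces=6, num_rotations=4):
    """Bits for layout_face[i] and layout_rotation[i] with fixed-length codes."""
    # layout_face[i] indexes into the faces not yet placed, so the alphabet
    # shrinks from num_faces down to 2; the last face needs no index.
    face_bits = sum(ceil(log2(n)) for n in range(num_faces, 1, -1))
    # One rotation code per face.
    rotation_bits = num_faces * ceil(log2(num_rotations))
    return face_bits + rotation_bits


print(explicit_signalling_bits())  # 23
```

Under these assumptions, a hexahedral image that uses the default setting would need a single flag bit instead of 23 bits of explicit order and rotation information.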
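Based on the above encoding method, the present invention also provides a decoding method. As shown in Figure 12, the decoding method mainly comprises:
S122: Receive an encoded bitstream of a two-dimensional panoramic image and determine the applicable spatial layout format of the two-dimensional panoramic image;
S124: Parse the bitstream to determine whether, in the applicable spatial layout format, the spatial positional relationship of the sub-images of the two-dimensional panoramic image is the default spatial positional relationship;
S126: If, in the applicable spatial layout format, the spatial positional relationship of the sub-images of the two-dimensional panoramic image is the default spatial positional relationship, decode the encoded bitstream of the two-dimensional panoramic image according to the applicable spatial layout format and the default spatial positional relationship.
By decoding the bitstream using the default spatial layout format and spatial positional relationship, the decoding method of the present invention effectively reduces the time and buffer space required for decoding and improves decoding efficiency.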
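Several preferred implementations of the present invention are described below using the two-dimensional panoramic image in hexahedral format as an example. As noted above, however, the present invention is not limited to the hexahedral format and may use a polyhedral format with any other number of faces, such as a tetrahedron, pentahedron, octahedron, dodecahedron, or icosahedron. Note also that the coded information generated in the following embodiments may be placed in the sequence-level header, the picture-level header, supplemental enhancement information (SEI), or similar locations of the encoded bitstream.
Embodiment 1:
In this embodiment, the spatial layout format of the two-dimensional panoramic image in hexahedral format is derived from the height-to-width ratio, and the spatial positional relationship of the sub-images, such as the arrangement order and whether they are rotated, is obtained by rate-distortion optimization or pre-analysis. A default spatial positional relationship of the sub-images is set at the encoder, such as the image encoder 1136 in the image acquisition device 110. For a two-dimensional panoramic image of known format, the number of sub-images is fixed, so no additional syntax element is needed to indicate it. In a specific design, however, to suit different application scenarios, a syntax element declaring the number of sub-images for a given format may also be provided. In addition, the encoder and decoder may agree in advance that, when the rotation angles of the sub-images are not specified, none of the sub-images is rotated, or that both sides rotate the sub-images in the same way; the variables carrying sub-image rotation information can then be omitted from the bitstream. Table 3 shows the syntax element design of Embodiment 1.
Table 3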
Figure PCTCN2017090067-appb-000002
default_order_flag描述了是否采用默认的空间位置关系;
layout_face[i]描述了在当前布局格式下,第i个位置的子空间中放置的子图像的指示信息。注意,在后续实施方式中,相同的语法元素在未注明的情况下可以认为和本实施方式中所描述的内容相同,为了简洁起见,将不再重复描述。
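In Embodiment 1, the encoder, for example the image encoder 1136 in the image acquisition device 110, may operate as follows:
Step 1: Select the applicable spatial layout format and the spatial position relationship of the sub-images
Through operations such as rate-distortion optimization or pre-analysis, select the best spatial layout format and the corresponding optimal spatial position relationship of the sub-images of the two-dimensional panoramic image, including but not limited to the arrangement order and/or whether the sub-images are rotated.
Step 2: Encode default_order_flag
For a given spatial layout format, the layout format and hence the image width and height are already determined, and there is usually one particular spatial position relationship of the sub-images that is best, that is, that yields the most accurate prediction. That particular spatial layout format and its corresponding spatial position relationship can be set as the default setting at the encoder and the decoder, namely the default spatial layout format and its corresponding sub-image spatial position relationship. In this step, first determine whether, under the current spatial layout format, the spatial position relationship currently applied to the sub-images is the same as the one in the default setting for the same spatial layout format. If they are identical, set default_order_flag to 1; otherwise set it to 0. Finally, encode default_order_flag.
Step 3: Decide, according to default_order_flag, whether to encode layout_face[i]
If default_order_flag is 1, skip encoding the spatial position relationship of the sub-images, such as the arrangement order layout_face[i]. Otherwise, encode layout_face[i]. The specific coding method includes but is not limited to entropy coding, variable-length coding, fixed-length coding or other coding methods, and the same applies to the coding of syntax elements in the subsequent embodiments.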
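The encoder steps of Embodiment 1 can be sketched as follows. This is a minimal sketch under stated assumptions: Table 3 is only available as an image, so the exact syntax is not reproduced; faces are represented by hypothetical integer indices 0 to 5, every position is written for simplicity, and the output is a list of named syntax elements rather than an entropy-coded bitstream. The function name encode_embodiment1 is illustrative.

```python
# Assumed face indices: 0=Left, 1=Front, 2=Right, 3=Top, 4=Rear, 5=Bottom.
DEFAULT_ORDER = [0, 1, 2, 3, 4, 5]


def encode_embodiment1(order, default_order=DEFAULT_ORDER):
    """Return the (name, value) syntax elements an encoder would write."""
    elements = []
    # Step 2: compare with the default and emit default_order_flag.
    flag = 1 if list(order) == list(default_order) else 0
    elements.append(("default_order_flag", flag))
    # Step 3: layout_face[i] is written only when the default is not used.
    if flag == 0:
        for i, face in enumerate(order):
            elements.append((f"layout_face[{i}]", face))
    return elements
```

With the default order the output is a single flag; any other order costs the flag plus one element per position.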
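In Embodiment 1, the decoder, for example the image decoder 1532 in the VR display device 150, may operate as follows:
Parse default_order_flag and use it to decide whether to parse layout_face[i]
Parse the syntax element default_order_flag. If it is 1, the spatial layout format of the two-dimensional panoramic image and the corresponding spatial position relationship use the default setting, so the syntax elements in Table 3 that describe the spatial layout format and spatial position relationship of the sub-images do not need to be parsed, that is, decoded. The spatial position relationship may be the arrangement order used by the sub-images of the two-dimensional panoramic image under a given spatial layout format; if the default setting is used, the subsequent arrangement-order syntax elements layout_face[i] are not parsed, and the arrangement order of the sub-images is set directly to the default order and decoding proceeds according to it. If it is 0, the spatial layout format and the corresponding spatial position relationship do not use the default setting, and the decoder needs to further parse layout_face[i], that is, decode the related syntax elements.
If the spatial layout format of the two-dimensional panoramic image is not itself part of the default setting, and instead a default spatial position relationship of the sub-images is set for each of the different spatial layout formats, the decoder needs to obtain the spatial layout format before performing the above step. The format can be obtained from the height-to-width ratio of the image: obtain the height and width of the current two-dimensional panoramic image, compute the height-to-width ratio, and determine the spatial layout format from it. For example, a height-to-width ratio of 1:6 corresponds to the 1x6 layout format and a ratio of 3:2 corresponds to the 3x2 layout format; other cases follow by analogy.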
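The decoder side of Embodiment 1, including the height-to-width derivation, might look like the sketch below. The 1:6 and 3:2 entries come from the text; the 6:1 and 2:3 entries are assumptions made "by analogy". The names LAYOUTS, layout_from_size and decode_embodiment1 are hypothetical, and the bitstream is modelled as a plain list of already-decoded integer symbols.

```python
from fractions import Fraction

# Height-to-width ratio -> layout name. Only 1:6 and 3:2 are stated in the
# text; the other two entries are assumed by analogy.
LAYOUTS = {
    Fraction(1, 6): "1x6",
    Fraction(3, 2): "3x2",
    Fraction(6, 1): "6x1",
    Fraction(2, 3): "2x3",
}


def layout_from_size(height, width):
    """Derive the layout format from the picture height and width."""
    return LAYOUTS[Fraction(height, width)]


def decode_embodiment1(symbols, default_order):
    """Read default_order_flag, then layout_face[i] only if the flag is 0."""
    it = iter(symbols)
    if next(it) == 1:                      # default_order_flag
        return list(default_order)
    return [next(it) for _ in range(len(default_order))]  # layout_face[i]
```

Using Fraction keeps the ratio exact, so a 512x3072 picture (height x width) and a 256x1536 picture map to the same layout.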
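Embodiment 2:
In this embodiment, the spatial layout format of the two-dimensional panoramic image is not derived from the height-to-width ratio; instead, indication information of the specific layout format is encoded. For each of the different layout formats there is usually one particular spatial position relationship of the sub-images that is best, that is, that yields the most accurate prediction. That particular spatial layout format and its corresponding spatial position relationship can be set as the default setting at the encoder and the decoder. The spatial position relationship includes but is not limited to the arrangement order of the sub-images of the two-dimensional panoramic image. Table 4 shows the syntax elements of Embodiment 2.
Table 4
Figure PCTCN2017090067-appb-000003
layout_type_index describes the spatial layout format index of the current two-dimensional panoramic image; the specific spatial layout format can be obtained from this index;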
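In Embodiment 2, the encoder, for example the image encoder 1136 in the image acquisition device 110, may operate as follows:
Step 1: Select the applicable spatial layout format and the spatial position relationship of the sub-images;
Step 2: Encode layout_type_index
In this step, encode the indication information corresponding to the applicable spatial layout format, namely layout_type_index.
Step 3: Encode default_order_flag
This step is the same as Step 2 of Embodiment 1; refer to Step 2 of Embodiment 1 for the specific process.
Step 4: Decide, according to default_order_flag, whether to encode layout_face[i]
If default_order_flag is 1, skip encoding the spatial position relationship of the sub-images, such as the arrangement order layout_face[i]. Otherwise, encode layout_face[i].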
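In Embodiment 2, the decoder, for example the image decoder 1532 in the VR display device 150, may operate as follows:
Step 1: Parse layout_type_index to obtain the spatial layout format
Parse the syntax element layout_type_index. From the parsed value, that is, the index of the spatial layout format, obtain the corresponding spatial layout format according to the mapping agreed between the encoder and the decoder, and set it as the format actually in use. Specifically, the spatial layout format corresponding to the index can be found by table lookup.
Step 2: Parse default_order_flag and use it to decide whether to parse layout_face[i]
Parse the syntax element default_order_flag. If it is 1, then under the determined spatial layout format the spatial position relationship of the sub-images of the two-dimensional panoramic image uses the default setting, so the syntax elements in Table 3 that describe the spatial position relationship of the sub-images do not need to be parsed, that is, decoded. The spatial position relationship may be the arrangement order used by the sub-images under the determined spatial layout format; if the default setting is used, the subsequent arrangement-order syntax elements layout_face[i] are not parsed, and the arrangement order of the sub-images is set directly to the default order and the bitstream is decoded according to it. If it is 0, the spatial position relationship of the sub-images under the determined spatial layout format does not use the default setting, and the decoder needs to further parse layout_face[i], that is, decode the related syntax elements.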
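A possible shape for the Embodiment 2 decoder is sketched below. The index-to-layout mapping and the per-layout default orders are placeholders, since the text leaves both to the convention agreed between encoder and decoder; LAYOUT_TABLE, DEFAULT_ORDER and decode_embodiment2 are hypothetical names, and the input is a list of already-decoded integer symbols.

```python
# Assumed lookup table for layout_type_index (the real mapping is a
# convention between encoder and decoder and is not given in the text).
LAYOUT_TABLE = {0: "1x6", 1: "6x1", 2: "3x2", 3: "2x3"}

# Placeholder per-layout default orders (face indices 0..5).
DEFAULT_ORDER = {name: [0, 1, 2, 3, 4, 5] for name in LAYOUT_TABLE.values()}


def decode_embodiment2(symbols):
    """Parse layout_type_index, default_order_flag and, if needed, layout_face[i]."""
    it = iter(symbols)
    layout = LAYOUT_TABLE[next(it)]        # Step 1: table lookup
    if next(it) == 1:                      # Step 2: default_order_flag
        order = list(DEFAULT_ORDER[layout])
    else:
        order = [next(it) for _ in range(6)]
    return layout, order
```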
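Embodiment 3:
This embodiment sets the best spatial layout format of the two-dimensional panoramic image, together with the best spatial position information of its sub-images under that format, as the default setting. Table 5 shows the syntax element design of Embodiment 3.
Table 5
Figure PCTCN2017090067-appb-000004
default_layout_order_flag describes whether the current two-dimensional panoramic image uses the default spatial layout format and spatial position relationship;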
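In Embodiment 3, the encoder, for example the image encoder 1136 in the image acquisition device 110, may operate as follows:
Step 1: Select the applicable spatial layout format and the spatial position relationship of the sub-images
Step 2: Encode default_layout_order_flag
Determine whether the currently applicable spatial layout format and the spatial position relationship of the sub-images, such as the arrangement order, are the same as the default setting. If so, default_layout_order_flag is 1; otherwise it is 0. Finally, encode default_layout_order_flag.
Step 3: Decide, according to default_layout_order_flag, whether to encode layout_type_index and layout_face[i]
If default_layout_order_flag is set to 1, skip encoding the subsequent layout_type_index and layout_face[i]. Otherwise, encode the subsequent layout_type_index and layout_face[i].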
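In Embodiment 3, the decoder, for example the image decoder 1532 in the VR display device 150, may operate as follows:
Step 1: Parse default_layout_order_flag
Step 2: Decide, according to default_layout_order_flag, whether to parse layout_type_index and layout_face[i]
If default_layout_order_flag is 1, the spatial layout format of the two-dimensional panoramic image and the corresponding spatial position relationship use the default setting, and layout_type_index and layout_face[i] need not be parsed. If default_layout_order_flag is 0, the spatial layout format and the corresponding spatial position relationship do not use the default setting; the syntax elements layout_type_index and layout_face[i] must then be parsed, and the bitstream is decoded according to the parsed result.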
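The single-flag scheme of Embodiment 3 can be illustrated with a round trip. This is only a sketch: the default layout index and default order are assumed values, the helper names encode3 and decode3 are hypothetical, and the "bitstream" is a list of integer symbols with entropy coding abstracted away.

```python
DEFAULT_LAYOUT = 2                    # assumed index of the default layout
DEFAULT_ORDER = [0, 1, 2, 3, 4, 5]    # assumed default face order


def encode3(layout_idx, order):
    """Emit default_layout_order_flag, then the explicit fields if needed."""
    if layout_idx == DEFAULT_LAYOUT and list(order) == DEFAULT_ORDER:
        return [1]                               # default_layout_order_flag = 1
    return [0, layout_idx, *order]               # flag, layout_type_index, layout_face[i]


def decode3(symbols):
    """Inverse of encode3."""
    it = iter(symbols)
    if next(it) == 1:
        return DEFAULT_LAYOUT, list(DEFAULT_ORDER)
    layout_idx = next(it)
    return layout_idx, [next(it) for _ in range(6)]
```

Note that a default order under a non-default layout still costs all explicit fields here, which is the limitation that the separate flags of Embodiment 4 address.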
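Embodiment 4:
To make encoding more flexible, the default spatial layout format of the two-dimensional panoramic image and the default spatial position relationship of its sub-images can be set separately. The syntax element design is shown in Table 6.
Table 6
Figure PCTCN2017090067-appb-000005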
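In Embodiment 4, the encoder, for example the image encoder 1136 in the image acquisition device 110, may operate as follows:
Step 1: Select the applicable spatial layout format and the corresponding applicable spatial position relationship of the sub-images
Step 2: Encode default_layout_order_flag
Determine whether the currently applicable spatial layout format and sub-image spatial position relationship are the same as the default setting. If so, default_layout_order_flag is 1; otherwise it is 0. Finally, encode default_layout_order_flag. Note that if default_layout_order_flag is 1, the encoding steps for the subsequent syntax elements layout_type_index, default_order_flag and layout_face[i] are skipped. Otherwise, the encoding steps for layout_type_index, default_order_flag and/or layout_face[i] are performed.
Step 3: Encode layout_type_index
Whether this step is performed depends on Step 2: if default_layout_order_flag is 0, perform this step; otherwise skip it.
Step 4: Encode default_order_flag
Determine whether, under the current spatial layout format, the applicable spatial position relationship of the two-dimensional panoramic image, such as the arrangement order, is the same as the default spatial position relationship. If so, default_order_flag is 1; otherwise it is 0. Finally, encode default_order_flag.
Step 5: Decide, according to default_order_flag, whether to encode layout_face[i]
If default_order_flag is set to 1, skip encoding the sub-image arrangement order layout_face[i]. Otherwise, encode layout_face[i].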
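In Embodiment 4, the decoder, for example the image decoder 1532 in the VR display device 150, may operate as follows:
Step 1: Parse default_layout_order_flag
First parse default_layout_order_flag. If it is 1, the two-dimensional panoramic image actually uses the default spatial layout format and the spatial position relationship of its sub-images also uses the default setting; the subsequent parsing steps for layout_type_index, default_order_flag and layout_face[i] are skipped. Otherwise, those parsing steps are performed.
Step 2: Parse layout_type_index
Step 3: Parse default_order_flag and use it to decide whether to parse layout_face[i]
Parse the syntax element default_order_flag. If it is 1, then under the current spatial layout format the spatial position relationship of the sub-images uses the default setting, that is, it is the same as the default setting, so the subsequent syntax elements for the spatial position relationship, such as layout_face[i], need not be parsed; the spatial position relationship of the sub-images is set directly to the default spatial position relationship agreed between the encoder and the decoder. If it is 0, the spatial position relationship of the sub-images under the current spatial layout format actually differs from the default setting; layout_face[i] must be further parsed, and the bitstream is decoded according to the parsed result.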
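The two-level signalling of Embodiment 4 can be sketched as a round trip. The default layout index and the per-layout default orders below are placeholder values, the names encode4 and decode4 are hypothetical, and symbols are plain integers rather than entropy-coded bits.

```python
DEFAULT_LAYOUT = 2                                   # assumed default layout index
# Placeholder default order for each layout index (face indices 0..5).
DEFAULT_ORDERS = {idx: [0, 1, 2, 3, 4, 5] for idx in range(4)}


def encode4(layout_idx, order):
    order = list(order)
    # default_layout_order_flag: layout and order both match the default.
    if layout_idx == DEFAULT_LAYOUT and order == DEFAULT_ORDERS[layout_idx]:
        return [1]
    out = [0, layout_idx]                            # flag, layout_type_index
    if order == DEFAULT_ORDERS[layout_idx]:
        out.append(1)                                # default_order_flag = 1
    else:
        out.append(0)                                # default_order_flag = 0
        out.extend(order)                            # layout_face[i]
    return out


def decode4(symbols):
    it = iter(symbols)
    if next(it) == 1:                                # default_layout_order_flag
        return DEFAULT_LAYOUT, list(DEFAULT_ORDERS[DEFAULT_LAYOUT])
    layout_idx = next(it)                            # layout_type_index
    if next(it) == 1:                                # default_order_flag
        return layout_idx, list(DEFAULT_ORDERS[layout_idx])
    return layout_idx, [next(it) for _ in range(6)]  # layout_face[i]
```

Compared with a single combined flag, a non-default layout that still uses its own default order costs only three symbols in this sketch.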
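Embodiment 5:
When the number of faces of the polyhedron used to map the three-dimensional panoramic image needs to be signalled during encoding, and the spatial position information of the sub-images of the resulting two-dimensional panoramic image under a particular spatial layout format is defined to include both the arrangement order and the rotation angles, the syntax elements num_of_layout_face_minus1 and layout_rotation[i] need to be added to indicate the number of faces of the polyhedron and the rotation angle of each sub-image. For each of Embodiments 1 to 4 (denoted A, B, C and D respectively), there are four cases depending on whether syntax elements for num_of_layout_face_minus1 and layout_rotation[i] are needed. Since each of A, B, C and D has four cases, there are 4x4=16 cases in total, denoted M-m, where M=A,B,C,D and m=0,1,2,3. The combinations of the syntax elements for the number of faces and for the rotation angles are shown in Table 7.
Table 7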
m    Signal_face_number    With_rotation
0    1                     1
1    1                     0
2    0                     1
3    0                     0
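For readers who want to enumerate the sixteen M-m variants, a small sketch follows. The dictionary simply restates Table 7; the names VARIANTS and variant_names are illustrative and not part of the patent.

```python
from itertools import product

# m -> (Signal_face_number, With_rotation), restating Table 7.
VARIANTS = {0: (1, 1), 1: (1, 0), 2: (0, 1), 3: (0, 0)}

# Base embodiments 1..4 are denoted A..D in the text.
variant_names = [f"{base}-{m}" for base, m in product("ABCD", VARIANTS)]
```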
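In practice, a set of arrangement orders usually goes with a set of rotation angles, and within one default setting the arrangement order and rotation angles of the sub-images of the two-dimensional panoramic image can be designed together. Take D-0 as an example: on the basis of Embodiment 4, the number of faces of the hexahedron is signalled and the sub-images may be rotated; the rotation may follow Table 2 above or another method agreed between the encoder and the decoder. The syntax element design of Embodiment 5 can then be as shown in Table 8:
Table 8
Figure PCTCN2017090067-appb-000006
default_layout_order_rotation_flag describes whether the spatial layout format of the two-dimensional panoramic image, together with the arrangement order and rotation angles of its sub-images, is the same as the default setting.
default_order_rotation_flag describes whether the arrangement order and rotation angles of the sub-images of the two-dimensional panoramic image are the same as the default setting.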
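In Embodiment 5, the encoder, for example the image encoder 1136 in the image acquisition device 110, may operate as follows:
Step 1: Select the applicable layout format and the applicable arrangement order and rotation angles of the sub-images
Step 2: Encode num_of_layout_face_minus1
Here num_of_layout_face_minus1+1 is the number of faces of the hexahedron;
Step 3: Encode default_layout_order_rotation_flag
Determine whether the currently applicable spatial layout format of the two-dimensional panoramic image and the applicable arrangement order and rotation angles of its sub-images are the same as the default setting. If so, set default_layout_order_rotation_flag to 1; otherwise set it to 0. Finally, encode default_layout_order_rotation_flag. If default_layout_order_rotation_flag is 1, the encoding steps for the subsequent syntax elements layout_type_index through layout_rotation[num_of_layout_face_minus1] can be skipped. Otherwise, those encoding steps are performed.
Step 4: Encode layout_type_index
Step 5: Encode default_order_rotation_flag
Determine whether, under the current spatial layout format, the arrangement order and rotation angles applied to the sub-images are the same as the default setting. If so, default_order_rotation_flag is 1; otherwise it is 0. Finally, encode default_order_rotation_flag. If default_order_rotation_flag is 1, the encoding steps for the subsequent syntax elements layout_face[i] through layout_rotation[num_of_layout_face_minus1] are skipped; otherwise those syntax elements are encoded, as in Step 6 below:
Step 6: Encode the remaining syntax elements layout_face[i], layout_rotation[i] and layout_rotation[num_of_layout_face_minus1].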
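In Embodiment 5, the decoder, for example the image decoder 1532 in the VR display device 150, may operate as follows:
Step 1: Parse num_of_layout_face_minus1 to obtain the number of faces;
Adding 1 to num_of_layout_face_minus1 gives the number of faces of the hexahedron.
Step 2: Parse default_layout_order_rotation_flag;
If default_layout_order_rotation_flag is 1, the spatial layout format of the two-dimensional panoramic image and the arrangement order and rotation angles it uses are the same as the default setting. In this case the decoder skips the remaining parsing steps for layout_type_index through layout_rotation[num_of_layout_face_minus1]. Otherwise, the parsing steps for the subsequent syntax elements are performed.
Step 3: Parse layout_type_index to obtain the current spatial layout format of the two-dimensional panoramic image
Step 4: Parse default_order_rotation_flag
If default_order_rotation_flag is 1, then under the current spatial layout format the arrangement order and rotation angles actually used by the sub-images are the same as the default setting, and the decoder skips the parsing steps for layout_face[i], layout_rotation[i] and layout_rotation[num_of_layout_face_minus1]. Otherwise, it continues with the parsing steps for the subsequent syntax elements.
Step 5: Parse the syntax elements layout_face[i], layout_rotation[i] and layout_rotation[num_of_layout_face_minus1].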
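The D-0 decoder steps above could be sketched as follows. Several points are assumptions, because Table 8 is only available as an image: the listing "layout_face[i], layout_rotation[i], layout_rotation[num_of_layout_face_minus1]" is read as face and rotation interleaved for all but the last position, with the last face inferred as the only one remaining; rotation values are treated as opaque integers; and the DEFAULTS structure and decode5 name are hypothetical.

```python
# Placeholder defaults (layout index, per-layout order and rotation codes).
DEFAULTS = {
    "layout": 2,
    "order": {idx: [0, 1, 2, 3, 4, 5] for idx in range(4)},
    "rotation": {idx: [0, 0, 0, 3, 1, 3] for idx in range(4)},
}


def decode5(symbols, defaults=DEFAULTS):
    it = iter(symbols)
    num_faces = next(it) + 1                 # num_of_layout_face_minus1 + 1
    if next(it) == 1:                        # default_layout_order_rotation_flag
        layout = defaults["layout"]
        return layout, list(defaults["order"][layout]), list(defaults["rotation"][layout])
    layout = next(it)                        # layout_type_index
    if next(it) == 1:                        # default_order_rotation_flag
        return layout, list(defaults["order"][layout]), list(defaults["rotation"][layout])
    faces, rotations = [], []
    for _ in range(num_faces - 1):
        faces.append(next(it))               # layout_face[i]
        rotations.append(next(it))           # layout_rotation[i]
    # Assumed: the last face is not coded and is inferred from what is left.
    faces.append(next(f for f in range(num_faces) if f not in faces))
    rotations.append(next(it))               # layout_rotation[num_of_layout_face_minus1]
    return layout, faces, rotations
```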
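Embodiment 6:
In practice, the arrangement order of the sub-images actually used may not be exactly the same as the default setting, yet may partly coincide with it. For example, the default order is Left->Front->Right->Top->Rear->Bottom, but the order actually used is Bottom->Rear->Left->Front->Right->Top, so only the last four sub-images are arranged as in the default setting. In this case, from the third subspace of the applicable spatial layout format onward, there is no need to specify the arrangement order of the sub-images for the remaining subspaces. It is enough to tell the decoder that, starting from a certain position, the default arrangement order is used again, and to tell it the length of the run of subspaces that follows the default order.
Table 9 gives an example of a specific syntax design. For ease of description, this example only considers the 1x6 or 6x1 spatial layout format, only the case in which exactly four consecutive sub-images are arranged as in the default setting, and only the case in which that matching run starts at Left:
Table 9
Figure PCTCN2017090067-appb-000007
partial_default_order_flag describes whether some of the sub-images of the two-dimensional panoramic image are arranged in the same order as the default arrangement order.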
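In Embodiment 6, the encoder, for example the image encoder 1136 in the image acquisition device 110, may operate as follows:
Step 1: Select the applicable spatial layout format and the spatial position relationship of the sub-images
Through operations such as rate-distortion optimization or pre-analysis, select the best spatial layout format and the corresponding optimal spatial position relationship of the sub-images of the two-dimensional panoramic image, including but not limited to the arrangement order and/or whether the sub-images are rotated.
Step 2: Encode default_order_flag
For a given spatial layout format, the layout format and hence the image width and height are already determined, and there is usually one particular spatial position relationship of the sub-images that is best, that is, that yields the most accurate prediction. That particular spatial layout format and its corresponding spatial position relationship can be set as the default setting at the encoder and the decoder, namely the default spatial layout format and its corresponding sub-image spatial position relationship. In this step, first determine whether, under the current spatial layout format, the spatial position relationship currently applied to the sub-images is the same as the one in the default setting for the same spatial layout format. If they are identical, set default_order_flag to 1; otherwise set it to 0. Finally, encode default_order_flag.
Step 3: Decide, according to default_order_flag, whether to encode partial_default_order_flag and layout_face[i]
If default_order_flag is 1, skip encoding partial_default_order_flag and layout_face[i]. If it is 0, the spatial position relationship of the sub-images, such as the arrangement order and/or rotation angles, is not exactly the same as the default setting, but part of the order may still match it. The current spatial position relationship of the sub-images is therefore compared with the default spatial position relationship. If part of the spatial position relationship is the same as the corresponding part of the default setting, partial_default_order_flag is 1; otherwise it is 0. Finally, encode partial_default_order_flag. The encoder and the decoder must agree on a rule for deciding that part of the order matches the default setting. In this embodiment, for example, if the current spatial layout format is 1x6 or 6x1, Left is located in the subspace at position 0, 1 or 2, and exactly three consecutive faces follow it in the same order as the three faces that follow Left in the default setting, the current face arrangement order is considered partly the same as the default setting.
Step 4: Decide the encoding method of layout_face[i] according to partial_default_order_flag
If partial_default_order_flag is 0, encode layout_face[i] one by one without skipping anything. If partial_default_order_flag is 1, then when the current subspace position i<=2 and the face encoded at the current position is Left, skip encoding the sub-image indication information for the next 2 or 3 consecutive subspace positions: 2 if the current position is 2, because the last subspace position never needs explicit face indication information, and 3 otherwise.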
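In Embodiment 6, the decoder, for example the image decoder 1532 in the VR display device 150, may operate as follows:
Step 1: Parse default_order_flag and use it to decide whether to parse partial_default_order_flag and layout_face[i]
Parse the syntax element default_order_flag. If it is 1, the spatial layout format of the two-dimensional panoramic image and the corresponding spatial position relationship use the default setting, so there is no need to parse partial_default_order_flag or the syntax elements in Table 3 that describe the spatial layout format and spatial position relationship of the sub-images, that is, to decode the related syntax elements. The spatial position relationship may be the arrangement order used by the sub-images under a given spatial layout format; if the default setting is used, the subsequent arrangement-order syntax elements layout_face[i] are not parsed, and the arrangement order of the sub-images is set directly to the default order and decoding proceeds according to it. If it is 0, the spatial layout format and the corresponding spatial position relationship do not use the default setting, and the decoder needs to further parse partial_default_order_flag and layout_face[i] and set the arrangement order of the sub-images according to the parsed result.
If the spatial layout format of the two-dimensional panoramic image is not itself part of the default setting, and instead a default spatial position relationship of the sub-images is set for each of the different spatial layout formats, the decoder needs to obtain the spatial layout format before performing the above step. The format can be obtained from the height-to-width ratio of the image: obtain the height and width of the current two-dimensional panoramic image, compute the height-to-width ratio, and determine the spatial layout format from it. For example, a height-to-width ratio of 1:6 corresponds to the 1x6 layout format and a ratio of 3:2 corresponds to the 3x2 layout format; other cases follow by analogy.
Step 2: Parse partial_default_order_flag and use it to decide how to parse layout_face[i]
Parse the syntax element partial_default_order_flag. If it is 0, parse the subsequent layout_face[i] one by one. If it is 1, the current spatial layout format is 1x6 or 6x1, the current subspace position i<=2, and the sub-image just parsed is Left, skip parsing the sub-image indication information for the next 2 or 3 consecutive subspace positions: 2 if the current position is 2, because the last subspace position never needs explicit face indication information, and 3 otherwise.
Note: in this embodiment the spatial layout format is derived from the height-to-width ratio by default; in actual use the spatial layout format type may also be encoded directly. This embodiment does not consider encoding the number of faces of the hexahedron or the rotation angle of each face; in actual use both factors may also be taken into account. This embodiment considers only two spatial layout formats (layout_6x1, layout_1x6); in actual use it can be extended to other spatial layout formats in a similar way. In this embodiment the encoder and the decoder agree that the matching run starts at Left; in actual use it may start at some other face, or indication information of the starting face may be encoded directly. In this embodiment the length of the matching run is 4; in actual use it may be another number, or the length may be encoded directly. In this embodiment a partial match is tested for starting subspace positions i<=2; in actual use this may be set differently. The number of subspace positions skipped afterwards may also be changed, and it depends on the length of the matching run.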
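The partial-match scheme of Embodiment 6 can be sketched as a round trip under the restrictions stated in the text (1x6 or 6x1 layout, matching run of four faces starting at Left, Left at position 0, 1 or 2, last position never coded). The integer face indices and the names partial_match, encode6 and decode6 are assumptions; Table 9 itself is not reproduced.

```python
LEFT, FRONT, RIGHT, TOP, REAR, BOTTOM = range(6)   # assumed face indices
DEFAULT = [LEFT, FRONT, RIGHT, TOP, REAR, BOTTOM]


def partial_match(order):
    """Left sits at position 0..2 and is followed by Front, Right, Top."""
    p = order.index(LEFT)
    return p <= 2 and order[p:p + 4] == DEFAULT[:4]


def encode6(order):
    order = list(order)
    if order == DEFAULT:
        return [1]                                 # default_order_flag = 1
    partial = 1 if partial_match(order) else 0
    out = [0, partial]                             # default_order_flag, partial_default_order_flag
    i = 0
    while i < 5:                                   # position 5 is inferred, never coded
        out.append(order[i])                       # layout_face[i]
        if partial and order[i] == LEFT:
            i += 3                                 # skip Front, Right, Top
        i += 1
    return out


def decode6(symbols):
    it = iter(symbols)
    if next(it) == 1:
        return list(DEFAULT)
    partial = next(it)
    order = []
    while len(order) < 5:
        face = next(it)
        order.append(face)
        if partial and face == LEFT:
            order.extend(DEFAULT[1:4])             # restore the skipped run
    if len(order) < 6:                             # infer the last face
        order.append(next(f for f in DEFAULT if f not in order))
    return order
```

For the example in the text, Bottom->Rear->Left->Front->Right->Top, only three face symbols follow the two flags, because everything after Left is implied.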
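Embodiment 7:
In actual use, to increase flexibility, there may be several default spatial position relationships of the sub-images, such as several default arrangement orders. For example, for the 6x1 spatial layout format, both Left->Front->Right->Top->Rear->Bottom and Rear->Bottom->Left->Front->Right->Top can be set as default arrangement orders. If the arrangement order selected by the encoder is one of them, then besides indicating that the current arrangement order is a default setting, the encoder must also tell the decoder which default arrangement order is used. Table 10 gives the syntax element design for this case.
Table 10
Figure PCTCN2017090067-appb-000008
default_order_index describes the index of the default spatial position relationship of the sub-images.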
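In Embodiment 7, the encoder, for example the image encoder 1136 in the image acquisition device 110, may operate as follows:
Step 1: Select the applicable spatial layout format and the spatial position relationship of the sub-images
Through operations such as rate-distortion optimization or pre-analysis, select the best spatial layout format and the corresponding optimal spatial position relationship of the sub-images of the two-dimensional panoramic image, including but not limited to the arrangement order and/or whether the sub-images are rotated.
Step 2: Encode default_order_flag
For a given spatial layout format, the layout format and hence the image width and height are already determined, and there is usually one particular spatial position relationship of the sub-images that is best, that is, that yields the most accurate prediction. That particular spatial layout format and its corresponding spatial position relationship can be set as the default setting at the encoder and the decoder, namely the default spatial layout format and its corresponding sub-image spatial position relationship. In this step, first determine whether, under the current spatial layout format, the spatial position relationship currently applied to the sub-images is the same as the one in the default setting for the same spatial layout format. If they are identical, set default_order_flag to 1; otherwise set it to 0. Finally, encode default_order_flag.
Step 3: Decide, according to default_order_flag, whether to encode default_order_index or layout_face[i]
If default_order_flag is 1, further encode the indication information of the default arrangement order currently in use (the encoder and the decoder must agree on how the indication information maps to the several arrangement orders), and skip encoding the arrangement order layout_face[i]. Otherwise, encode layout_face[i].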
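In Embodiment 7, the decoder, for example the image decoder 1532 in the VR display device 150, may operate as follows:
Step 1: Parse default_order_flag and use it to decide whether to parse default_order_index or layout_face[i]
Parse the syntax element default_order_flag. If it is 1, the spatial layout format of the two-dimensional panoramic image and the corresponding spatial position relationship use a default setting; default_order_index must then be parsed to find out which default spatial position relationship of the sub-images is used under the current spatial layout format, and the subsequent arrangement-order syntax elements layout_face[i] need not be parsed. The spatial position relationship may be the arrangement order used by the sub-images under a given spatial layout format; if a default setting is used, the arrangement order of the sub-images is set directly to the indicated default order and decoding proceeds according to it. If it is 0, the spatial layout format and the corresponding spatial position relationship do not use a default setting, and the decoder needs to further parse layout_face[i], that is, decode the related syntax elements.
If the spatial layout format of the two-dimensional panoramic image is not itself part of the default setting, and instead a default spatial position relationship of the sub-images is set for each of the different spatial layout formats, the decoder needs to obtain the spatial layout format before performing the above step. The format can be obtained from the height-to-width ratio of the image: obtain the height and width of the current two-dimensional panoramic image, compute the height-to-width ratio, and determine the spatial layout format from it. For example, a height-to-width ratio of 1:6 corresponds to the 1x6 layout format and a ratio of 3:2 corresponds to the 3x2 layout format; other cases follow by analogy.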
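Note that this embodiment uses two syntax elements to indicate, respectively, whether a default sub-image arrangement order is used (default_order_flag) and which default arrangement order is used (default_order_index). In fact, default_order_index alone can be used for the decision. For example, when default_order_index is 0, no default arrangement order is used; when it is not 0, its value determines which default arrangement order is used (the encoder and the decoder agree on the mapping), as shown in Table 11.
Table 11
Figure PCTCN2017090067-appb-000009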
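Both signalling variants of Embodiment 7 can be sketched side by side. The two default orders are the ones given as examples in the text for the 6x1 layout; the function names and the symbol-list representation are assumptions, and the index-only variant follows the description of Table 11 (0 means no default, k greater than 0 selects default number k-1), which is one possible mapping rather than the confirmed one.

```python
DEFAULT_ORDERS = [
    ["Left", "Front", "Right", "Top", "Rear", "Bottom"],
    ["Rear", "Bottom", "Left", "Front", "Right", "Top"],
]


def encode7(order):
    """Two-element variant: default_order_flag, then index or explicit faces."""
    order = list(order)
    if order in DEFAULT_ORDERS:
        return [1, DEFAULT_ORDERS.index(order)]    # flag, default_order_index
    return [0, *order]                             # flag, layout_face[i]


def decode7(symbols):
    it = iter(symbols)
    if next(it) == 1:
        return list(DEFAULT_ORDERS[next(it)])
    return [next(it) for _ in range(6)]


def encode7_index_only(order):
    """Index-only variant: 0 = no default, k > 0 = default number k - 1."""
    order = list(order)
    if order in DEFAULT_ORDERS:
        return [DEFAULT_ORDERS.index(order) + 1]
    return [0, *order]


def decode7_index_only(symbols):
    it = iter(symbols)
    idx = next(it)
    if idx != 0:
        return list(DEFAULT_ORDERS[idx - 1])
    return [next(it) for _ in range(6)]
```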
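Note that in this embodiment the spatial layout format is derived from the height-to-width ratio by default; in actual use the spatial layout format type may also be encoded directly. This embodiment does not consider encoding the number of faces of the hexahedron or the rotation angle of each face; in actual use both factors may also be taken into account. This embodiment only specifies several default sub-image arrangement orders for each spatial layout format. In actual use, the same approach can also be used to specify several default combinations of spatial layout format and sub-image arrangement order (when the spatial layout format is not derived from the aspect ratio and indication information of the spatial layout format has to be encoded), or to specify several such default combinations while additionally specifying several default sub-image arrangement orders for each spatial layout format. These cases can also be extended to include rotation angles.
The names of the syntax elements described above, the coding methods of the syntax elements, and the decision logic in the syntax can all be designed and adjusted according to different requirements and are not limited to the forms given in the examples above.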
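The embodiments described above can all be implemented in hardware. For example, the encoding method 110 can be implemented by a hardware device, or by a device with the corresponding function, such as an encoding device. Referring to Figure 13, an encoding device 1300 for implementing all of the possible encoding methods of the present invention described above includes:
a spatial layout format and spatial position relationship determining unit 1301, configured to determine the applicable spatial layout format of a two-dimensional panoramic image to be encoded and the spatial position relationship of the sub-images of the two-dimensional panoramic image under the applicable spatial layout format;
a judging unit 1303, configured to judge whether the spatial position relationship of the sub-images of the two-dimensional panoramic image under the applicable spatial layout format is the same as the default spatial position relationship under the applicable spatial layout format; and
an encoding unit 1305, configured to, when the spatial position relationship of the sub-images of the two-dimensional panoramic image under the applicable spatial layout format is the same as the default spatial position relationship under the applicable spatial layout format, encode indication information indicating that, under the applicable spatial layout format, the sub-images of the two-dimensional panoramic image use the default spatial position relationship, and encode the sub-images of the two-dimensional panoramic image according to the applicable spatial layout format and the default spatial position relationship to generate an encoded bitstream.
With the above device, the present invention can reduce the number of bits needed to represent the spatial position information of the sub-images of the two-dimensional panoramic image and thereby improve coding efficiency. Each unit of the encoding device 1300 may be a functional unit implemented by general-purpose or dedicated hardware.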
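Similarly, the decoding method 120 can be implemented by a hardware device, or by a device with the corresponding function, such as a decoding device. Referring to Figure 14, a decoding device 1400 for implementing all of the possible decoding methods of the present invention described above includes:
a spatial layout format determining unit 1401, configured to receive an encoded bitstream of a two-dimensional panoramic image and determine the applicable spatial layout format of the two-dimensional panoramic image;
a spatial position relationship determining unit 1403, configured to parse the bitstream to determine whether, under the applicable spatial layout format, the spatial position relationship of the sub-images of the two-dimensional panoramic image is the default spatial position relationship;
a decoding unit 1405, configured to, if under the applicable spatial layout format the spatial position relationship of the sub-images of the two-dimensional panoramic image is the default spatial position relationship, decode the encoded bitstream of the two-dimensional panoramic image according to the applicable spatial layout format and the default spatial position relationship.
Each unit of the decoding device 1400 may be a functional unit implemented by general-purpose or dedicated hardware.
By decoding the bitstream with a default spatial layout format and spatial position relationship, the decoding device of the present invention can reduce the time and the buffer space that decoding requires and improve decoding efficiency.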
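In one or more examples, the functions described may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software, the functions may be stored on or transmitted over a computer-readable medium as one or more instructions or code and executed by a hardware-based processing unit. Computer-readable media may include computer-readable storage media, which correspond to tangible media such as data storage media, or communication media, which include any medium that facilitates transfer of a computer program from one place to another, for example according to a communication protocol. In this manner, computer-readable media generally may correspond to (1) non-transitory tangible computer-readable storage media, or (2) communication media such as signals or carrier waves. Data storage media may be any available media that can be accessed by one or more computers or one or more processors to retrieve instructions, code and/or data structures for implementing the techniques described in the present invention. A computer program product may include a computer-readable medium.
By way of example and not limitation, such computer-readable storage media may include RAM, ROM, EEPROM, CD-ROM or other optical disc storage, magnetic disk storage or other magnetic storage devices, flash memory, or any other medium that can be used to store desired program code in the form of instructions or data structures and that can be accessed by a computer. Also, any connection is properly termed a computer-readable medium. For example, if instructions are transmitted from a website, server, or other remote source using a coaxial cable, fiber optic cable, twisted pair, digital subscriber line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of medium. It should be understood, however, that computer-readable storage media and data storage media do not include connections, carrier waves, signals, or other transitory media, but are instead directed to non-transitory tangible storage media. As used herein, disk and disc include compact disc (CD), laser disc, optical disc, digital versatile disc (DVD), floppy disk and Blu-ray disc, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.
Instructions may be executed by one or more processors, such as one or more digital signal processors (DSPs), general-purpose microprocessors, application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), or other equivalent integrated or discrete logic circuitry. Accordingly, the term "processor" as used herein may refer to any of the foregoing structures or to any other structure suitable for implementing the techniques described herein. In addition, in some aspects, the functionality described herein may be provided within dedicated hardware and/or software modules configured for encoding and decoding, or incorporated in a combined codec. Also, the techniques could be fully implemented in one or more circuits or logic elements.
The techniques of the present invention may be implemented in a wide variety of devices or apparatuses, including a wireless handset, an integrated circuit (IC), or a set of ICs (for example, a chipset). Various components, modules, or units are described in the present invention to emphasize functional aspects of devices configured to perform the disclosed techniques, but they do not necessarily require realization by different hardware units. Rather, as described above, various units may be combined in a codec hardware unit or provided by a collection of interoperating hardware units (including one or more processors as described above) in conjunction with suitable software and/or firmware.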
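It should be understood that "one embodiment" or "an embodiment" mentioned throughout the specification means that a particular feature, structure, or characteristic related to the embodiment is included in at least one embodiment of the present invention. Therefore, "in one embodiment" or "in an embodiment" appearing throughout the specification does not necessarily refer to the same embodiment. Moreover, these particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
It should be understood that, in the various embodiments of the present invention, the sequence numbers of the above processes do not imply an order of execution. The order of execution of the processes should be determined by their functions and internal logic, and should not constitute any limitation on the implementation of the embodiments of the present invention.
In addition, the terms "system" and "network" are often used interchangeably herein. It should be understood that the term "and/or" herein merely describes an association between associated objects and indicates that three relationships may exist. For example, A and/or B may indicate three cases: A alone, both A and B, and B alone. In addition, the character "/" herein generally indicates an "or" relationship between the associated objects before and after it.
In the embodiments provided in this application, it should be understood that "B corresponding to A" means that B is associated with A and that B can be determined according to A. It should also be understood, however, that determining B according to A does not mean that B is determined according to A alone; B may also be determined according to A and/or other information.
A person of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To clearly illustrate the interchangeability of hardware and software, the composition and steps of each example have been described above in general terms of function. Whether these functions are performed in hardware or software depends on the specific application and the design constraints of the technical solution. A skilled person may use different methods to implement the described functions for each specific application, but such implementation should not be considered beyond the scope of the present invention.
A person skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the systems, devices, and units described above may refer to the corresponding processes in the foregoing method embodiments, and are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed systems, devices, and methods may be implemented in other ways. For example, the device embodiments described above are merely illustrative. The division into units is only a division by logical function; in actual implementation there may be other divisions, for example multiple units or components may be combined or integrated into another system, or some features may be ignored or not performed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically alone, or two or more units may be integrated into one unit.
The foregoing is only a specific implementation of the present invention, but the scope of protection of the present invention is not limited thereto. Any change or substitution that a person skilled in the art can readily conceive within the technical scope disclosed by the present invention shall fall within the scope of protection of the present invention. Therefore, the scope of protection of the present invention shall be subject to the scope of protection of the claims.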

Claims (18)

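  1. A decoding method, comprising:
    receiving an encoded bitstream of a two-dimensional panoramic image, and determining an applicable spatial layout format of the two-dimensional panoramic image;
    parsing the encoded bitstream of the two-dimensional panoramic image to determine whether, under the applicable spatial layout format, a spatial position relationship of sub-images of the two-dimensional panoramic image is a default spatial position relationship; and
    if, under the applicable spatial layout format, the spatial position relationship of the sub-images of the two-dimensional panoramic image is the default spatial position relationship, decoding the encoded bitstream of the two-dimensional panoramic image according to the applicable spatial layout format and the default spatial position relationship.
  2. The decoding method according to claim 1, wherein parsing the encoded bitstream of the two-dimensional panoramic image to determine whether, under the applicable spatial layout format, the spatial position relationship of the sub-images of the two-dimensional panoramic image is the default spatial position relationship comprises:
    parsing the encoded bitstream of the two-dimensional panoramic image to obtain a default spatial position relationship indication flag, and when the default spatial position relationship indication flag has a first value, determining that, under the applicable spatial layout format, the spatial position relationship of the sub-images of the two-dimensional panoramic image is the default spatial position relationship.
  3. The decoding method according to claim 1 or 2, wherein determining the applicable spatial layout format of the two-dimensional panoramic image comprises:
    parsing the encoded bitstream of the two-dimensional panoramic image to obtain the width and height of the two-dimensional panoramic image, and determining the applicable spatial layout format of the two-dimensional panoramic image according to the aspect ratio of the two-dimensional panoramic image and a preset table of correspondences between aspect ratios and spatial layout formats; or
    parsing the encoded bitstream of the two-dimensional panoramic image to obtain an applicable spatial layout format index of the two-dimensional panoramic image, and determining the applicable spatial layout format of the two-dimensional panoramic image according to the spatial layout format index.
  4. The decoding method according to any one of claims 1 to 3, wherein the default spatial position relationship comprises a plurality of different default spatial position relationships, and correspondingly, decoding the encoded bitstream of the two-dimensional panoramic image according to the applicable spatial layout format and the default spatial position relationship if, under the applicable spatial layout format, the spatial position relationship of the sub-images of the two-dimensional panoramic image is the default spatial position relationship comprises:
    if, under the applicable spatial layout format, the spatial position relationship of the sub-images of the two-dimensional panoramic image is a default spatial position relationship, parsing the encoded bitstream of the two-dimensional panoramic image to obtain an index of the default spatial position relationship, obtaining, from the plurality of default spatial position relationships and according to the index, the default spatial position relationship of the sub-images of the two-dimensional panoramic image that corresponds to the index, and decoding the encoded bitstream of the two-dimensional panoramic image according to the applicable spatial layout format and the default spatial position relationship.
  5. The decoding method according to claim 1, 2 or 4, wherein determining the applicable spatial layout format of the two-dimensional panoramic image comprises: parsing the encoded bitstream of the two-dimensional panoramic image to determine whether the applicable spatial layout format of the two-dimensional panoramic image is a default spatial layout format;
    correspondingly, determining whether, under the applicable spatial layout format, the spatial position relationship of the sub-images of the two-dimensional panoramic image is the default spatial position relationship comprises:
    determining, according to whether the applicable spatial layout format of the two-dimensional panoramic image is the default spatial layout format, that when the applicable spatial layout format of the two-dimensional panoramic image is the default spatial layout format, the spatial position relationship of the sub-images of the two-dimensional panoramic image is the default spatial position relationship.
  6. The decoding method according to claim 5, wherein parsing the encoded bitstream of the two-dimensional panoramic image to determine whether the applicable spatial layout format of the two-dimensional panoramic image is the default spatial layout format comprises: parsing the encoded bitstream of the two-dimensional panoramic image to obtain a default spatial layout format indication flag, and when the default spatial layout format indication flag has a first value, determining that the applicable spatial layout format of the two-dimensional panoramic image is the default spatial layout format.
  7. The decoding method according to claim 1, 2, 4 or 5, wherein determining the applicable spatial layout format of the two-dimensional panoramic image comprises: parsing the encoded bitstream of the two-dimensional panoramic image to determine whether the applicable spatial layout format of the two-dimensional panoramic image is a default spatial layout format;
    correspondingly, determining whether, under the applicable spatial layout format, the spatial position relationship of the sub-images of the two-dimensional panoramic image is the default spatial position relationship comprises:
    when the applicable spatial layout format of the two-dimensional panoramic image is not the default spatial layout format, determining whether, under the applicable spatial layout format, the spatial position relationship of the sub-images of the two-dimensional panoramic image is the default spatial position relationship.
  8. The decoding method according to any one of claims 1 to 7, wherein the spatial position relationship of the sub-images of the panoramic image comprises the arrangement order of the sub-images of the panoramic image, or the rotation angles of the sub-images of the panoramic image, or both the arrangement order and the rotation angles of the sub-images of the panoramic image.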
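  9. An encoding method, comprising:
    determining an applicable spatial layout format of a two-dimensional panoramic image to be encoded and a spatial position relationship of sub-images of the two-dimensional panoramic image under the applicable spatial layout format;
    judging whether the spatial position relationship of the sub-images of the two-dimensional panoramic image under the applicable spatial layout format is the same as a default spatial position relationship under the applicable spatial layout format; and
    when the spatial position relationship of the sub-images of the two-dimensional panoramic image under the applicable spatial layout format is the same as the default spatial position relationship under the applicable spatial layout format, encoding indication information indicating that, under the applicable spatial layout format, the sub-images of the two-dimensional panoramic image use the default spatial position relationship, and encoding the sub-images of the two-dimensional panoramic image according to the applicable spatial layout format and the default spatial position relationship to generate an encoded bitstream.
  10. The encoding method according to claim 9, wherein encoding, when the spatial position relationship of the sub-images of the two-dimensional panoramic image under the applicable spatial layout format is the same as the default spatial position relationship under the applicable spatial layout format, the indication information indicating that, under the applicable spatial layout format, the sub-images of the two-dimensional panoramic image use the default spatial position relationship comprises: when the spatial position relationship of the sub-images of the two-dimensional panoramic image under the applicable spatial layout format is the same as the default spatial position relationship under the applicable spatial layout format, setting a default spatial position relationship indication flag to a first value, and encoding the default spatial position relationship indication flag set to the first value.
  11. The encoding method according to claim 9 or 10, wherein determining the applicable spatial layout format of the two-dimensional panoramic image to be encoded comprises:
    obtaining the width and height of the two-dimensional panoramic image, and determining the applicable spatial layout format of the two-dimensional panoramic image according to the aspect ratio of the two-dimensional panoramic image and a preset table of correspondences between aspect ratios and spatial layout formats.
  12. The encoding method according to claim 11, wherein the method further comprises: obtaining a spatial layout format index from the applicable spatial layout format of the two-dimensional panoramic image, and encoding the applicable spatial layout format index of the two-dimensional panoramic image.
  13. The encoding method according to any one of claims 9 to 12, wherein the default spatial position relationship comprises a plurality of different default spatial position relationships, and correspondingly, the encoding method further comprises: encoding a default spatial position relationship index of the sub-images of the two-dimensional panoramic image, wherein the default spatial position relationship index uniquely indicates one of the plurality of different default spatial position relationships.
  14. The encoding method according to any one of claims 9 to 13, wherein the method further comprises: judging whether the applicable spatial layout format of the two-dimensional panoramic image is the same as a default spatial layout format;
    correspondingly, judging whether the spatial position relationship of the sub-images of the two-dimensional panoramic image under the applicable spatial layout format is the same as the default spatial position relationship under the applicable spatial layout format comprises:
    determining, according to whether the applicable spatial layout format of the two-dimensional panoramic image is the same as the default spatial layout format, that when the applicable spatial layout format of the two-dimensional panoramic image is the default spatial layout format, the spatial position relationship of the sub-images of the two-dimensional panoramic image is the default spatial position relationship.
  15. The encoding method according to any one of claims 9 to 13, wherein the method further comprises: judging whether the applicable spatial layout format of the two-dimensional panoramic image is the same as a default spatial layout format;
    correspondingly, judging whether the spatial position relationship of the sub-images of the two-dimensional panoramic image under the applicable spatial layout format is the same as the default spatial position relationship under the applicable spatial layout format comprises:
    when the applicable spatial layout format of the two-dimensional panoramic image differs from the default spatial layout format, judging whether the spatial position relationship of the sub-images of the two-dimensional panoramic image under the applicable spatial layout format is the same as the default spatial position relationship under the applicable spatial layout format.
  16. The decoding method according to any one of claims 9 to 15, wherein the spatial position relationship of the sub-images of the panoramic image comprises the arrangement order of the sub-images of the panoramic image, or the rotation angles of the sub-images of the panoramic image, or both the arrangement order and the rotation angles of the sub-images of the panoramic image.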
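  17. A decoding device, comprising:
    a spatial layout format determining unit, configured to receive an encoded bitstream of a two-dimensional panoramic image and determine an applicable spatial layout format of the two-dimensional panoramic image;
    a spatial position relationship determining unit, configured to parse the bitstream to determine whether, under the applicable spatial layout format, a spatial position relationship of sub-images of the two-dimensional panoramic image is a default spatial position relationship;
    a decoding unit, configured to, if under the applicable spatial layout format the spatial position relationship of the sub-images of the two-dimensional panoramic image is the default spatial position relationship, decode the encoded bitstream of the two-dimensional panoramic image according to the applicable spatial layout format and the default spatial position relationship.
  18. An encoding device, comprising: a spatial layout format and spatial position relationship determining unit, configured to determine an applicable spatial layout format of a two-dimensional panoramic image to be encoded and a spatial position relationship of sub-images of the two-dimensional panoramic image under the applicable spatial layout format;
    a judging unit, configured to judge whether the spatial position relationship of the sub-images of the two-dimensional panoramic image under the applicable spatial layout format is the same as a default spatial position relationship under the applicable spatial layout format; and
    an encoding unit, configured to, when the spatial position relationship of the sub-images of the two-dimensional panoramic image under the applicable spatial layout format is the same as the default spatial position relationship under the applicable spatial layout format, encode indication information indicating that, under the applicable spatial layout format, the sub-images of the two-dimensional panoramic image use the default spatial position relationship, and encode the sub-images of the two-dimensional panoramic image according to the applicable spatial layout format and the default spatial position relationship to generate an encoded bitstream.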
PCT/CN2017/090067 2016-06-27 2017-06-26 Encoding and decoding method and device WO2018001208A1 (zh)

Priority Applications (4)

Application Number Priority Date Filing Date Title
EP17819205.0A EP3468199B1 (en) 2016-06-27 2017-06-26 Encoding and decoding method and device
EP21167189.6A EP3934255A1 (en) 2016-06-27 2017-06-26 Encoding method and device and decoding method and device
KR1020197001684A KR102243120B1 (ko) 2016-06-27 2017-06-26 Encoding method and device and decoding method and device
US16/234,107 US10805606B2 (en) 2016-06-27 2018-12-27 Encoding method and device and decoding method and device

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201610482290.7A CN107547907B (zh) 2016-06-27 2016-06-27 Encoding and decoding method and device
CN201610482290.7 2016-06-27

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US16/234,107 Continuation US10805606B2 (en) 2016-06-27 2018-12-27 Encoding method and device and decoding method and device

Publications (1)

Publication Number Publication Date
WO2018001208A1 true WO2018001208A1 (zh) 2018-01-04

Family

ID=60786598

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/090067 WO2018001208A1 (zh) 2016-06-27 2017-06-26 Encoding and decoding method and device

Country Status (5)

Country Link
US (1) US10805606B2 (zh)
EP (2) EP3468199B1 (zh)
KR (1) KR102243120B1 (zh)
CN (1) CN107547907B (zh)
WO (1) WO2018001208A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111801947A (zh) * 2018-03-02 2020-10-20 华为技术有限公司 Apparatus and method for picture coding with selective loop-filtering

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11006135B2 (en) * 2016-08-05 2021-05-11 Sony Corporation Image processing apparatus and image processing method
US10863198B2 (en) * 2017-01-03 2020-12-08 Lg Electronics Inc. Intra-prediction method and device in image coding system for 360-degree video
US10652578B2 (en) * 2018-02-05 2020-05-12 Apple Inc. Processing of multi-directional images in spatially-ordered video coding applications
WO2020009341A1 (ko) * 2018-07-06 2020-01-09 엘지전자 주식회사 Method and apparatus for transmitting and receiving metadata about the coordinate system of a dynamic viewpoint
CN109819234A (zh) * 2019-02-01 2019-05-28 广州卓远虚拟现实科技有限公司 H.265-based virtual reality video transmission and playback method and system
CN110266316B (zh) * 2019-05-08 2023-02-21 创新先进技术有限公司 Data compression and decompression method, apparatus and device
US11785214B2 (en) * 2019-11-14 2023-10-10 Mediatek Singapore Pte. Ltd. Specifying video picture information
DE102021132275A1 (de) 2021-12-08 2023-06-15 immerVR GmbH Device and method

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101002474A (zh) * 2004-08-13 2007-07-18 庆熙大学校产学协力团 Method and device for encoding and decoding icosahedral panoramic images
CN102209241A (zh) * 2011-05-25 2011-10-05 杭州华三通信技术有限公司 Multi-sub-image based video encoding and decoding method and device
US20150341552A1 (en) * 2014-05-21 2015-11-26 Here Global B.V. Developing a Panoramic Image

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2003141562A * 2001-10-29 2003-05-16 Sony Corp Image processing apparatus and image processing method for non-planar images, storage medium, and computer program
US7308131B2 (en) * 2002-12-03 2007-12-11 Ntt Docomo, Inc. Representation and coding of panoramic and omnidirectional images
US20130009980A1 (en) * 2011-07-07 2013-01-10 Ati Technologies Ulc Viewing-focus oriented image processing
US10319071B2 (en) * 2016-03-23 2019-06-11 Qualcomm Incorporated Truncated square pyramid geometry and frame packing structure for representing virtual reality video content
KR102014240B1 (ko) * 2016-09-08 2019-08-27 가온미디어 주식회사 Selective decoding method and encoding method for synchronized multi-view video using spatial structure information, and apparatus therefor
WO2018064967A1 (en) * 2016-10-07 2018-04-12 Mediatek Inc. Video encoding method and apparatus with syntax element signaling of employed projection layout and associated video decoding method and apparatus

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
MAO, YUMING: "The Technique and System for the Generation of Camouflage Scenario", CHINA MASTER'S THESES FULL-TEXT DATABASE, 15 February 2014 (2014-02-15), XP009511377, ISSN: 1674-0246 *
See also references of EP3468199A4 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111801947A (zh) * 2018-03-02 2020-10-20 华为技术有限公司 Apparatus and method for picture coding with selective loop-filtering
US11343504B2 (en) 2018-03-02 2022-05-24 Huawei Technologies Co., Ltd. Apparatus and method for picture coding with selective loop-filtering

Also Published As

Publication number Publication date
US20190158825A1 (en) 2019-05-23
EP3934255A1 (en) 2022-01-05
CN107547907A (zh) 2018-01-05
US10805606B2 (en) 2020-10-13
KR102243120B1 (ko) 2021-04-21
KR20190020083A (ko) 2019-02-27
EP3468199B1 (en) 2021-04-28
EP3468199A1 (en) 2019-04-10
EP3468199A4 (en) 2019-06-19
CN107547907B (zh) 2020-02-21

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17819205

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 20197001684

Country of ref document: KR

Kind code of ref document: A

ENP Entry into the national phase

Ref document number: 2017819205

Country of ref document: EP

Effective date: 20190103