WO2019127100A1 - 视频编码的方法、装置和计算机系统 - Google Patents

视频编码的方法、装置和计算机系统 Download PDF

Info

Publication number
WO2019127100A1
WO2019127100A1 PCT/CN2017/119000 CN2017119000W
Authority
WO
WIPO (PCT)
Prior art keywords
motion vector
vector value
image
frame number
pole
Prior art date
Application number
PCT/CN2017/119000
Other languages
English (en)
French (fr)
Inventor
周焰
郑萧桢
Original Assignee
深圳市大疆创新科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳市大疆创新科技有限公司 filed Critical 深圳市大疆创新科技有限公司
Priority to PCT/CN2017/119000 priority Critical patent/WO2019127100A1/zh
Priority to CN201780018384.1A priority patent/CN108886616A/zh
Publication of WO2019127100A1 publication Critical patent/WO2019127100A1/zh

Links

Images

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/176Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51Motion estimation or motion compensation
    • H04N19/513Processing of motion vectors
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/597Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding

Definitions

  • the present invention relates to the field of information technology and, more particularly, to a method, apparatus and computer system for video coding.
  • 360-degree panoramic video usually refers to a video with a horizontal viewing angle of 360 degrees (-180° to 180°) and a vertical viewing angle of 180 degrees (-90° to 90°), usually in the form of a three-dimensional spherical surface.
  • A 360-degree panoramic video is usually mapped to a two-dimensional planar video according to a certain geometric relationship, which is then put through digital image processing and encoding and decoding operations.
  • When mapped to a latitude and longitude (equirectangular) map, the upper and lower polar portions of the panoramic video show obvious stretching, and the splice between the left and right edges of the map also shows obvious discontinuities.
  • Because the human eye's attention is mainly in the vicinity of the equator of the panoramic video (vertical viewing angle near 0 degrees), that region is usually coded more carefully, while the polar regions are allocated relatively few bits and therefore have relatively poor coding quality.
  • When motion near the equator is smooth but motion in the two polar regions is intense, this coding method consumes a large number of extra bits in the smoothly moving region, while the intensely moving bipolar portions suffer poor coding quality from their few allocated bits, resulting in lower coding efficiency.
  • Therefore, a coding method for 360-degree panoramic video is needed that improves its coding efficiency.
  • Embodiments of the present invention provide a video coding method, apparatus, and computer system, which can improve coding efficiency of 360-degree panoramic video.
  • A first aspect provides a video encoding method, including: acquiring, by an image signal processor, motion vector information of a first preset frame number image in a 360-degree panoramic video; determining a minimum motion vector value and a pole motion vector value according to the motion vector information, where the minimum motion vector value is the motion vector value of the region with the smallest motion vector value in the first preset frame number image, and the pole motion vector value is the motion vector value corresponding to the two-pole region of the 360-degree panoramic video in the first preset frame number image; determining a rotation angle of a second preset frame number image in the 360-degree panoramic video according to the minimum motion vector value, the pole motion vector value, and a motion vector value threshold; and rotating the second preset frame number image according to the rotation angle and then encoding it.
  • A second aspect provides an apparatus for video encoding, comprising: an image signal processor, configured to acquire motion vector information of a first preset frame number image in a 360-degree panoramic video; a processing unit, configured to determine a minimum motion vector value and a pole motion vector value according to the motion vector information, where the minimum motion vector value is the motion vector value of the region with the smallest motion vector value in the first preset frame number image and the pole motion vector value is the motion vector value corresponding to the two-pole region of the 360-degree panoramic video in the first preset frame number image, and to determine a rotation angle of a second preset frame number image in the 360-degree panoramic video according to the minimum motion vector value, the pole motion vector value, and a motion vector value threshold; a rotation unit, configured to rotate the second preset frame number image according to the rotation angle; and a coding unit, configured to encode the rotated image.
  • A third aspect provides a computer system, comprising: a memory for storing computer executable instructions; and a processor for accessing the memory and executing the computer executable instructions to perform the operations of the method of the first aspect above.
  • A fourth aspect provides a computer storage medium having program code stored therein, the program code being operable to instruct performance of the method of the first aspect described above.
  • In the embodiments of the present invention, the motion vector information is obtained by the image signal processor, the minimum motion vector value and the pole motion vector value are determined according to the motion vector information, and the rotation angle of the 360-degree panoramic video is determined accordingly, so that rotation encoding of the 360-degree panoramic video can be achieved without a secondary encoding pass. This can improve the compression efficiency and encoding quality of the 360-degree panoramic video, thereby improving its encoding efficiency.
  • FIG. 1 is an architectural diagram of a technical solution to which an embodiment of the present invention is applied.
  • FIG. 2 is a processing architecture diagram of an encoder according to an embodiment of the present invention.
  • FIG. 3 is a schematic diagram of data to be encoded according to an embodiment of the present invention.
  • 4a is a diagram of generating a two-dimensional planar video by 360 degree panoramic video mapping according to an embodiment of the present invention.
  • Figure 4b is a diagram of video rotation in accordance with an embodiment of the present invention.
  • FIG. 5 is a schematic flowchart of a method for video encoding according to an embodiment of the present invention.
  • FIG. 6 is a flow chart of a method of video encoding in accordance with an embodiment of the present invention.
  • FIG. 7 is a schematic block diagram of an apparatus for video encoding according to an embodiment of the present invention.
  • Figure 8 is a schematic block diagram of a computer system in accordance with an embodiment of the present invention.
  • The sequence numbers of the processes in the various embodiments of the present invention do not imply an order of execution; the order of execution should be determined by their functions and internal logic, and the sequence numbers should not be construed as any limitation on the implementation of the embodiments of the present invention.
  • FIG. 1 is an architectural diagram of a technical solution to which an embodiment of the present invention is applied.
  • system 100 can receive data 102 to be encoded, encode the data 102 to be encoded to generate encoded data 108, and can further process the encoded data 108.
  • system 100 can receive 360 degree panoramic video data, and rotationally encode 360 degree panoramic video data to produce rotationally encoded data.
  • components in system 100 may be implemented by one or more processors, which may be processors in a computing device or processors in a mobile device (e.g., a drone).
  • the processor may be any type of processor, which is not limited in this embodiment of the present invention.
  • the processor may include an Image Signal Processor (ISP), an encoder, and the like.
  • One or more memories may also be included in system 100.
  • the memory can be used to store instructions and data, such as computer-executable instructions to implement the technical solution of the embodiments of the present invention, data to be encoded 102, encoded data 108, and the like.
  • the memory may be any kind of memory, which is not limited in this embodiment of the present invention.
  • the data to be encoded 102 may include text, images, graphic objects, animation sequences, audio, video, or any other data that needs to be encoded.
  • the data to be encoded 102 may include sensory data from sensors, which may be vision sensors (eg, cameras, infrared sensors), microphones, near field sensors (eg, ultrasonic sensors, radar), position sensors, Temperature sensor, touch sensor, etc.
  • the data to be encoded 102 may include information from a user, such as biometric information, which may include facial features, fingerprint scans, retinal scans, voice recordings, DNA sampling, and the like.
  • Encoding is necessary for efficient and/or secure transmission or storage of data.
  • the encoding of the data to be encoded 102 may include data compression, encryption, error correction encoding, format conversion, and the like.
  • compression of multimedia data, such as video or audio, can reduce the number of bits transmitted over the network.
  • Sensitive data, such as financial information and personally identifiable information, can be encrypted before transmission and storage to protect confidentiality and/or privacy. In order to reduce the bandwidth occupied by video storage and transmission, it is necessary to encode and compress the video data.
  • Any suitable coding technique can be used to encode the data 102 to be encoded.
  • the type of encoding depends on the data being encoded and the specific coding requirements.
  • the encoder can implement one or more different codecs.
  • Each codec may include code, instructions, or a computer program implementing a different encoding algorithm. Based on various factors, including the type and/or source of the data to be encoded 102, the receiving entity of the encoded data, the available computing resources, the network environment, the business environment, and applicable rules and standards, a suitable encoding algorithm can be selected to encode the data to be encoded 102.
  • the encoder can be configured to encode a series of video frames. Encoding the data in each frame can take a series of steps.
  • the encoding step can include processing steps such as prediction, transform, quantization, entropy encoding, and the like.
  • Prediction includes two types, intra prediction and inter prediction; its purpose is to use prediction block information to remove the redundant information of the current image block to be encoded.
  • Intra prediction uses the information of the current frame image to obtain prediction block data.
  • Inter prediction uses information from a reference frame to obtain prediction block data. The process comprises dividing the image block to be encoded into a number of sub-image blocks; then, for each sub-image block, searching the reference image for the block that best matches the current sub-image block and using it as the prediction block; thereafter, subtracting the corresponding pixel values of the prediction block from the sub-image block to obtain a residual, and combining the residuals of the sub-image blocks to obtain the residual of the image block.
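As an illustrative sketch of the search-and-subtract process just described (the helper names and the SAD matching cost are conventional choices, not taken from the patent):

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized pixel blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))

def best_match(sub_block, ref, top, left, search):
    """Search the reference image around (top, left) for the block that best
    matches sub_block; return the motion vector, prediction block, residual."""
    h, w = len(sub_block), len(sub_block[0])
    best = (None, None, float("inf"))
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + h > len(ref) or x + w > len(ref[0]):
                continue  # candidate block would fall outside the reference
            cand = [row[x:x + w] for row in ref[y:y + h]]
            cost = sad(sub_block, cand)
            if cost < best[2]:
                best = ((dy, dx), cand, cost)
    mv, pred, _ = best
    # residual = sub-image block minus prediction block, pixel by pixel
    residual = [[s - p for s, p in zip(srow, prow)]
                for srow, prow in zip(sub_block, pred)]
    return mv, pred, residual
```

An exact match in the reference yields an all-zero residual, which is the cheapest case to code.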
  • Transforming the residual block of the image with a transform matrix can remove the correlation of the residuals of the image block, that is, remove redundant information of the image block, so as to improve coding efficiency.
  • the transformation of the data block in the image block usually adopts a two-dimensional transformation. That is, the residual information of the data block is multiplied by an NxM transform matrix and its transposed matrix at the encoding end, and the transform coefficients are obtained after multiplication.
  • the transform coefficients are quantized to obtain quantized coefficients, and the quantized coefficients are entropy encoded; finally, the entropy-encoded bit stream, together with the coding mode information such as the intra prediction mode and motion vector information, is stored or sent to the decoder.
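A minimal Python sketch of the transform-and-quantize step, using plain lists and a toy 2x2 transform matrix (the matrix and the step size are illustrative only, not the patent's):

```python
def matmul(a, b):
    """Multiply two matrices held as nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def transform_2d(residual, t):
    """Separable two-dimensional transform T @ X @ T^T, as described above."""
    return matmul(matmul(t, residual), transpose(t))

def quantize(coeffs, step):
    """Uniform scalar quantization of the transform coefficients."""
    return [[round(c / step) for c in row] for row in coeffs]
```

With a Hadamard-like matrix `[[1, 1], [1, -1]]`, the residual `[[1, 2], [3, 4]]` transforms to `[[10, -2], [-4, 0]]`, concentrating the energy in the top-left coefficient.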
  • At the decoding end, the entropy-encoded bit stream is first obtained and entropy decoded to obtain the corresponding residual; the predicted image block corresponding to the image block is obtained according to the decoded information such as the motion vector or intra prediction mode; and the value of each pixel in the current sub-image block is then obtained from the predicted image block and the residual.
  • FIG. 2 is a block diagram showing the processing architecture of an encoder according to an embodiment of the present invention.
  • the prediction process may include intra prediction and inter prediction.
  • Through the prediction process, a residual corresponding to a data unit (for example, a pixel point) can be obtained: the reconstructed pixel of a reference pixel point can be obtained from the stored context, and the pixel residual corresponding to the pixel point is obtained from the reconstructed pixel of the reference pixel point and the pixel value of the pixel point itself.
  • the pixel residual is transformed and quantized, and then subjected to entropy coding.
  • the control of the quantization rate can be achieved by controlling the quantization parameter.
  • the quantized pixel residual corresponding to a certain pixel point may also be subjected to inverse quantization and inverse transform processing and then reconstructed to obtain the reconstructed pixel of that pixel point; the reconstructed pixel is stored so that, when the pixel point later serves as a reference pixel point, its reconstructed pixel can be used to obtain the pixel residuals corresponding to other pixel points.
  • the quantization parameter may include the quantization step size, an indication of the quantization step size, or a value related to the quantization step size, for example the quantization parameter (QP) in H.264, H.265, or similar encoders, or a quantization matrix or its reference matrix, etc.
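For concreteness, a sketch of the conventional H.264/H.265-style relationship between QP and quantization step size; the 0.625 base value and the doubling-every-6-QP rule are the conventional H.264 figures, an assumption here rather than something stated in this text:

```python
def quant_step_from_qp(qp):
    """H.264-style mapping: the quantization step roughly doubles every 6 QP.
    Base value 0.625 is the conventional H.264 figure (assumed, not from the
    patent text)."""
    return 0.625 * 2 ** (qp / 6)
```

Raising QP by 6 thus doubles the step size, halving the coefficient precision and the bit cost.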
  • FIG. 3 shows a schematic diagram of data to be encoded in an embodiment of the present invention.
  • the data 302 to be encoded may include a plurality of frames 304.
  • multiple frames 304 may represent consecutive image frames in a video stream.
  • Each frame 304 can include one or more strips 306.
  • Each strip 306 can include one or more macroblocks 308.
  • Each macroblock 308 can include one or more blocks 310.
  • Each block 310 can include one or more pixels 312.
  • Each pixel 312 can include one or more data sets corresponding to one or more data portions, such as a luminance data portion and a chrominance data portion.
  • the data unit can be a frame, a strip, a macroblock, a block, a pixel, or a group of any of the above.
  • the size of the data unit can vary.
  • one frame 304 may include 100 strips 306, each strip 306 may include 10 macroblocks 308, each macroblock 308 may include 4 (e.g., 2x2) blocks 310, and each block 310 may include 64 (e.g., 8x8) pixels 312.
  • the technical solution of the embodiment of the present invention can be used to encode 360-degree panoramic video, and can be applied to various products involving 360-degree panoramic video, for example, panoramic cameras, virtual reality products, head mounted devices (Head Mount Device, HMD), augmented reality products, video encoders, video decoders, etc.
  • A 360-degree panoramic video is usually mapped to a two-dimensional planar video according to a certain geometric relationship, which is then put through digital image processing and encoding and decoding operations.
  • A common format for the two-dimensional plane obtained by mapping a 360-degree panorama according to a specific geometric relationship is the equirectangular (latitude and longitude) map.
  • the latitude and longitude map is a two-dimensional plane obtained by uniformly sampling the complete sphere in azimuth and pitch angle, as shown in Figure 4a.
  • Other common two-dimensional plane formats obtained by mapping include the hexahedral, octahedral, and icosahedral formats.
  • Other mapping mechanisms can also be used to map the sphere onto a two-dimensional plane; the mapped two-dimensional planes compose a two-dimensional planar video, which can be encoded and compressed using general video coding and decoding standards such as HEVC/H.265, H.264/AVC, AVS1-P2, AVS2-P2, VP8, VP9, etc.
  • the two-dimensional planar video is obtained by spherical video mapping, and can also be obtained by partial spherical video mapping.
  • the spherical video or partial spherical video is typically captured by multiple cameras.
  • Unless otherwise stated, the processing of a 360-degree panoramic video herein refers to the processing of its two-dimensional planar video.
  • the embodiment of the present invention provides a rotation coding scheme applied to 360-degree panoramic video coding: the 360-degree panoramic video to be encoded is first rotated by a certain angle, and the rotated video is then encoded.
  • the rotation of a 360-degree panoramic video is the rotation of an ordinary geometric sphere. As shown in FIG. 4b, assuming that the rotation angle of the 360-degree panoramic video is (α, β, γ), each frame of the 360-degree panoramic video is rotated by the angles α, β, and γ around the x-axis, the y-axis, and the z-axis, respectively, to obtain the rotated video.
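The sphere rotation described above can be sketched as follows; applying the axes in x -> y -> z order is an assumption, since the text only states that each frame is rotated about the three axes:

```python
import math

def rotation_matrix(alpha, beta, gamma):
    """Rotation by alpha, beta, gamma (radians) about the x, y and z axes,
    composed in x -> y -> z order (an assumed convention)."""
    ca, sa = math.cos(alpha), math.sin(alpha)
    cb, sb = math.cos(beta), math.sin(beta)
    cg, sg = math.cos(gamma), math.sin(gamma)
    rx = [[1, 0, 0], [0, ca, -sa], [0, sa, ca]]
    ry = [[cb, 0, sb], [0, 1, 0], [-sb, 0, cb]]
    rz = [[cg, -sg, 0], [sg, cg, 0], [0, 0, 1]]

    def matmul(a, b):
        return [[sum(a[i][k] * b[k][j] for k in range(3))
                 for j in range(3)] for i in range(3)]

    return matmul(rz, matmul(ry, rx))

def rotate_point(p, alpha, beta, gamma):
    """Rotate one point on the sphere by the angle triple (alpha, beta, gamma)."""
    r = rotation_matrix(alpha, beta, gamma)
    return [sum(r[i][k] * p[k] for k in range(3)) for i in range(3)]
```

Rotating every sample of a frame with the same matrix realizes the per-frame rotation of Fig. 4b.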
  • FIG. 5 shows a schematic flowchart of a method 500 for video encoding according to an embodiment of the present invention.
  • the method 500 can be performed by the system 100 shown in FIG.
  • the motion vector information is acquired by the image signal processor, so that the secondary encoding process can be avoided.
  • the video is preprocessed by the image signal processor to obtain motion vector information for each pixel or each region of each frame of the image in the video.
  • Motion vector information for different image blocks in the image may be calculated based on motion vector information for each pixel or each region of the image.
  • the motion vector information of the first preset frame number image in the 360-degree panoramic video is acquired from the image signal processor.
  • the motion vector information of the first preset frame number image is a local motion vector or a global motion vector of the first preset frame number image in the image signal processor.
  • the pole motion vector value is a motion vector value corresponding to a two-pole region of the 360-degree panoramic video in the first preset frame number image.
  • the motion vector values of the different regions may be calculated based on the motion vector information of the first preset frame number image acquired from the image signal processor.
  • the area here may be an image block, that is, the image may be divided into image blocks, and the motion vector value of each image block is calculated, but the embodiment of the present invention does not limit this.
  • Specifically, the image is first divided into tiles, each image block obtained by the division is recorded as a block, and the motion vector value of each block is calculated from the motion vector information of each pixel or each region (smaller than a block).
  • the image block includes a plurality of coding tree units (CTUs), and the motion vector values of the image blocks are average values of motion vector values of the plurality of CTUs.
  • the motion vector value of each CTU may be the sum of the absolute value of the horizontal component of the motion vector and the absolute value of the vertical component of all the pixel points of each CTU.
  • the block can be 512x512 in size, and the CTU can be 64x64 in size, that is, one block is 8x8 CTUs.
  • the CTU is used as a motion vector (Motion Vector, MV) calculation unit.
  • the image is divided into blocks accordingly; where fewer than 8x8 CTUs remain at the image boundary, the actual remaining CTUs can form a block.
  • the value of MVctu can be the sum, over all the pixels in a 64x64 pixel block, of the absolute value of the horizontal component (MVx) and the absolute value of the vertical component (MVy) of the MV, namely: MVctu = Σ(|MVx| + |MVy|), where the sum runs over all pixels of the CTU.
  • In this way, the MV of each CTU in the image can be calculated; the MV of each block, recorded as MVblock, is then calculated from the MVs of the CTUs in the block.
  • the value of the MV of each block may be the average of the MVs of all CTUs in the block.
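The two definitions above can be sketched directly, with pixel motion vectors held as (MVx, MVy) pairs (the function names are illustrative):

```python
def mv_ctu(pixel_mvs):
    """MV value of one CTU: sum over its pixels of |MVx| + |MVy|,
    per the definition in the text."""
    return sum(abs(mvx) + abs(mvy) for mvx, mvy in pixel_mvs)

def mv_block(ctu_mvs):
    """MV value of one block: average of the MV values of its CTUs."""
    return sum(ctu_mvs) / len(ctu_mvs)
```

For a 512x512 block of 64x64 CTUs, `mv_block` would average the 8x8 = 64 `mv_ctu` values of that block.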
  • the minimum motion vector value and the pole motion vector value may be determined accordingly.
  • the minimum motion vector value is a motion vector value minMVblock of the image block with the smallest motion vector value in the frame image.
  • the minimum motion vector value is the minimum value among the averages of the motion vector values of the image blocks corresponding to the same position in the first preset frame number image, and is recorded as avgMinMVblock.
  • the minimum motion vector value is the minimum value among the average values of the motion vector values of the image blocks corresponding to the same position in the 5 frame image.
  • Since the spherical video is centrally symmetric, it is actually sufficient to search only the upper half of the spherical video. For example, for each block in the latitude and longitude map, first find the block that is centrally symmetric to it on the sphere, and use the average of the two MVblock values as the MV of both blocks; then only the upper half of the latitude and longitude map needs to be searched during the search calculation, saving half the amount of computation.
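A sketch of this symmetrization, assuming an equirectangular block layout in which the centrally symmetric partner of block (i, j) is (rows-1-i, (j + cols//2) % cols) (this pairing rule is an assumption about the layout, not a formula from the text):

```python
def symmetrize_block_mvs(mv, rows, cols):
    """Average each block's MVblock with that of its centrally symmetric
    partner on the sphere, so only the top half of the map needs searching."""
    out = [row[:] for row in mv]
    for i in range((rows + 1) // 2):       # top half (plus middle row if odd)
        for j in range(cols):
            si, sj = rows - 1 - i, (j + cols // 2) % cols
            avg = (mv[i][j] + mv[si][sj]) / 2.0
            out[i][j] = avg
            out[si][sj] = avg
    return out
```

After this pass, any minimum found in the top half is also the minimum of the whole map.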
  • the pole motion vector value is a motion vector value corresponding to the two-pole region of the 360-degree panoramic video in the first preset frame number image.
  • an average of the motion vector values of a predetermined number of blocks of the two-pole region may be taken as the pole motion vector value.
  • the top two rows of CTUs in the latitude and longitude map are selected to calculate the pole motion vector value. First, the MVctu of each CTU in the top two rows is averaged with the MVctu of its centrally symmetric CTU to give the actual MVctu value; then the MVctu values of all CTUs in the top two rows of the latitude and longitude map are averaged as the pole motion vector value, recorded as polarMVctu.
  • the average value of the pole motion vector values obtained in the above manner for the multi-frame image can be used as the pole motion vector value.
  • the pole motion vector value avgPolarMVctu of the 5 frames is the average value of the polarMVctu of the 5 frames.
  • the rotation angle of the second preset frame number image is determined by using the minimum motion vector value and the pole motion vector value obtained according to the first preset frame number image.
  • the first preset frame number image may be an image of a preset number of frames after the first random access point of the 360-degree panoramic video.
  • the second preset frame number image may be an image of a preset number of frames between the first random access point and the second random access point of the 360-degree panoramic video, where the second random access point is the next random access point after the first random access point.
  • the first preset number of frames is less than or equal to the second preset number of frames; that is to say, the rotation angle of an image with more frames can be determined from an image with fewer frames. For example, the rotation angle of all images between a random access point and the next random access point may be determined according to several frames of images after the random access point. Making the rotation determination at each random access point can avoid subsequent image rotation errors in the video and achieve more accurate rotation coding.
  • the motion vector value threshold may be determined according to a motion search range of the 360-degree panoramic video.
  • the motion search range may be determined by the image signal processor.
  • the motion vector value threshold MVlow can be calculated from the motion search range.
  • for example, the motion search range may be (±16, ±8); the motion vector value threshold MVlow calculated for this search range is 1536.
  • Making the rotation judgment with a motion vector value threshold designed according to the motion search range allows the scheme to be applied to encoding under various motion search conditions.
  • the rotation angle of the second preset frame number image may be determined.
  • If the pole motion vector value is less than the motion vector value threshold, the rotation angle is determined to be 0. This situation indicates that the motion vector value of the pole region is relatively small, so no rotation is required.
  • If the pole motion vector value is not less than the motion vector value threshold and the minimum motion vector value is less than the motion vector value threshold, the rotation angle is the first angle. This case indicates that the pole region motion vector value is large while some other region has a relatively small motion vector value, so rotation is required to bring the region with the smallest motion vector value to the pole region.
  • The specific rotation manner may be: first calculate the coordinates on the corresponding spherical surface from the coordinates of the center point of the region with the smallest motion vector value on the latitude and longitude map; then calculate the rotation angle (that is, the first angle) from the rotation formula; and finally rotate according to the rotation of the geometric sphere shown in Fig. 4b.
  • the first angle is an angle determined according to a position of a region in which the motion vector value is the smallest in the first preset frame number image.
  • the first angle may be determined according to a location of a region in which the motion vector value is the smallest in the first preset frame number image.
  • The coordinates of the center point of the block in the region with the smallest motion vector value on the latitude and longitude map are recorded as (m, n). Using the conversion relationship between a sampling point (m, n) on the latitude and longitude map and the corresponding point on the spherical surface under the latitude-longitude mapping, the spherical coordinates of that point can be calculated.
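A sketch of the map-to-sphere conversion, using the standard equirectangular convention; the patent's exact conversion formula is not reproduced in this text, so this convention is an assumption:

```python
import math

def equirect_to_sphere(m, n, width, height):
    """Map a sampling point (m, n) on a width x height latitude-longitude map
    to (azimuth, elevation) on the sphere, under the standard equirectangular
    convention (assumed here)."""
    azimuth = (m / width - 0.5) * 2.0 * math.pi   # -pi .. pi across the map
    elevation = (0.5 - n / height) * math.pi      # pi/2 at top, -pi/2 at bottom
    return azimuth, elevation
```

The center of the map lands on the equator at azimuth 0, and the top edge lands at the north pole, which matches the stretching behavior described for the polar regions.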
  • By rotating by the first angle (α, β, 0) in accordance with the rotation mode shown in Fig. 4b, the region where the motion vector value is the smallest can be rotated to the pole region.
  • If the pole motion vector value is not less than the motion vector value threshold, the minimum motion vector value is not less than the motion vector value threshold, and the pole motion vector value is greater than a predetermined multiple of the minimum motion vector value, the rotation angle is determined to be the first angle. This case indicates that although the minimum motion vector value is also large, the pole motion vector value is much larger than it, so it is still necessary to rotate by the first angle to bring the region with the smallest motion vector value to the pole region.
  • the predetermined multiple may be 8, but the embodiment of the present invention is not limited thereto; any multiple that indicates the pole motion vector value is much larger than the minimum motion vector value may be used.
  • If the pole motion vector value is not greater than the predetermined multiple of the minimum motion vector value (both values being not less than the motion vector value threshold), the rotation angle is determined to be 0. This situation indicates that both the minimum motion vector value and the pole motion vector value are large, and the pole motion vector value is not much larger than the minimum motion vector value, so no rotation is required.
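The decision branches above can be collected into one function; the name, the boolean return, and the default multiple of 8 (the example value) are illustrative:

```python
def rotation_decision(polar_mv, min_mv, mv_low, multiple=8):
    """Return True when the image should be rotated by the first angle,
    following the branches described in the text."""
    if polar_mv < mv_low:
        return False                      # pole region already calm: no rotation
    if min_mv < mv_low:
        return True                       # pole busy, some calm region exists: rotate it to the pole
    return polar_mv > multiple * min_mv   # both busy: rotate only if the pole is much busier
```

With the example threshold MVlow = 1536, a pole MV of 2000 and a minimum MV of 100 would trigger rotation, while a pole MV of 100 would not.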
  • If the rotation angle is 0, no rotation is performed; if the rotation angle is the first angle, the image is rotated by the first angle. After the rotation operation, the image is encoded.
  • the rotation and encoding operations for the 360 degree panoramic video may all be performed by an encoder or separately by different processing units.
  • the rotation angle may be calculated before the original video is sent to the video encoder, and then the rotation angle and the original video are sent to the video encoder for encoding.
  • the video encoder rotates the video and then encodes it.
  • the angle of the video rotation is written into the code stream, and the decoding end obtains the rotation angle from the code stream and then rotates the decoded video back.
  • the video rotation angle information may be written into a sequence header, a picture header, a slice header, a video parameter set (VPS), a sequence parameter set (SPS), a picture parameter set (PPS), Supplemental Enhancement Information (SEI), or extension data.
  • Alternatively, the rotation angle may be calculated before the original video is sent to the video encoder; the original video is then rotated by a tool capable of rotating video, and the rotated video is sent to the video encoder for encoding. Since 360-degree panoramic video is viewed through a head-mounted device or a virtual reality product, rotation does not affect the information that can be viewed, so the decoded video can simply be rotated back.
  • FIG. 6 is a flowchart of a method for video encoding according to an embodiment of the present invention. It should be understood that FIG. 6 is only an example and should not be construed as limiting the embodiments of the present invention.
  • before encoding the 360-degree panoramic video, it is first determined whether rotation is needed. By computing the MVs of the CTUs in the image, the pole MV and the minimum MV are obtained, and the rotation angle is calculated from the position corresponding to the minimum MV. Whether rotation is required is determined by comparing the pole MV and the minimum MV against the threshold; the pole MV and minimum MV obtained from the first preset frame number of images can be used to decide whether the second preset frame number of images needs to be rotated. If rotation is needed, the corresponding images are rotated by the above rotation angle and then encoded; if not, they are encoded directly.
  • in the technical solutions of the embodiments of the present invention, motion vector information is obtained by the image signal processor, the minimum motion vector value and the pole motion vector value are determined from it, and the rotation angle of the 360-degree panoramic video is determined accordingly. Rotation encoding of the 360-degree panoramic video can thus be achieved without secondary encoding, which improves the compression efficiency and encoding quality of the 360-degree panoramic video and thereby its overall encoding efficiency.
  • FIG. 7 shows a schematic block diagram of an apparatus 700 for video encoding in accordance with an embodiment of the present invention.
  • the apparatus 700 can perform the method of video encoding of the embodiments of the present invention described above.
  • the apparatus 700 can include:
  • the image signal processor 710 is configured to acquire motion vector information of the first preset frame number image in the 360-degree panoramic video.
  • the processing unit 720 is configured to determine, according to the motion vector information, a minimum motion vector value and a pole motion vector value, where the minimum motion vector value is the motion vector value of the region with the smallest motion vector value in the first preset frame number image, and the pole motion vector value is the motion vector value corresponding to the two-pole regions of the 360-degree panoramic video in the first preset frame number image; and to determine, according to the minimum motion vector value, the pole motion vector value, and a motion vector value threshold, a rotation angle of the second preset frame number image in the 360-degree panoramic video;
  • a rotation unit 730 configured to rotate the second preset frame number image according to the rotation angle
  • the encoding unit 740 is configured to encode the rotated image.
  • processing unit 720 is further configured to:
  • the motion vector value threshold is determined according to a motion search range of the 360-degree panoramic video.
  • the image signal processor is further configured to:
  • the motion search range is determined.
  • the motion vector information of the first preset frame number image is a local motion vector or a global motion vector of the first preset frame number image in the image signal processor.
  • the first preset frame number image is an image of a preset number of frames after the first random access point of the 360-degree panoramic video.
  • the second preset frame number image is an image of a preset number of frames between the first random access point and the second random access point of the 360-degree panoramic video, where the second random access point is the next random access point after the first random access point.
  • the first preset number of frames is less than or equal to the second preset number of frames.
  • processing unit 720 is specifically configured to:
  • if the pole motion vector value is less than the motion vector value threshold, determine that the rotation angle is 0; or, if the pole motion vector value is not less than the motion vector value threshold and the minimum motion vector value is less than the threshold, determine that the rotation angle is the first angle; or,
  • if the pole motion vector value is not less than the motion vector value threshold, the minimum motion vector value is not less than the threshold, and the pole motion vector value is greater than a predetermined multiple of the minimum motion vector value, determine that the rotation angle is the first angle; or,
  • if the pole motion vector value is not less than the motion vector value threshold, the minimum motion vector value is not less than the threshold, and the pole motion vector value is not greater than a predetermined multiple of the minimum motion vector value, determine that the rotation angle is 0;
  • the first angle is an angle determined according to the position of the region with the smallest motion vector value in the first preset frame number image.
  • the predetermined multiple is 8.
  • the processing unit 720 is further configured to determine the first angle according to the position of the region with the smallest motion vector value in the first preset frame number image.
  • the minimum motion vector value is the minimum among the averages of the motion vector values of image blocks corresponding to the same position across the first preset frame number of images, where an image block includes multiple pixel points.
  • the image block includes a plurality of coding tree units (CTUs), and the motion vector value of the image block is the average of the motion vector values of the plurality of CTUs, where the motion vector value of each CTU is the sum, over all pixels of that CTU, of the absolute values of the horizontal and vertical components of the motion vector.
  • the above-described rotation unit 730 and encoding unit 740 may both be implemented by an encoder, or implemented separately; for example, the rotation unit 730 may be implemented by a tool that can rotate video, and the encoding unit 740 by an encoder.
  • the device for video coding may be a chip, which may specifically be implemented by a circuit; the specific implementation manner is not limited in the embodiments of the present invention.
  • FIG. 8 shows a schematic block diagram of a computer system 800 in accordance with an embodiment of the present invention.
  • the computer system 800 can include a processor 810 and a memory 820.
  • computer system 800 may also include components that are generally included in other computer systems, such as input and output devices, communication interfaces, and the like, which are not limited by the embodiments of the present invention.
  • Memory 820 is for storing computer executable instructions.
  • the memory 820 may be any kind of memory, for example a high-speed random access memory (RAM), and may also include non-volatile memory, such as at least one disk memory; the embodiments of the present invention are not limited thereto.
  • the processor 810 is configured to access the memory 820 and execute the computer executable instructions to perform the operations in the method of video encoding of the embodiments of the present invention described above.
  • the processor 810 may include a microprocessor, a field-programmable gate array (FPGA), a central processing unit (CPU), a graphics processing unit (GPU), etc.; the embodiments of the present invention are not limited thereto.
  • the apparatus and computer system for video encoding of the embodiments of the present invention may correspond to the execution subject of the method of video encoding of the embodiments of the present invention, and the above and other operations and/or functions of the modules in the video encoding apparatus and the computer system are respectively for implementing the corresponding flows of the foregoing methods; for brevity, details are not repeated here.
  • Embodiments of the present invention also provide an electronic device, which may include the above-described video coding device or computer system of various embodiments of the present invention.
  • the embodiment of the invention further provides a computer storage medium, wherein the computer storage medium stores program code, and the program code can be used to indicate a method for performing the video encoding of the embodiment of the invention.
  • the term "and/or” is merely an association relationship describing an associated object, indicating that there may be three relationships.
  • a and/or B may indicate that A exists separately, and A and B exist simultaneously, and B cases exist alone.
  • the character "/" in this article generally indicates that the contextual object is an "or" relationship.
  • the disclosed systems, devices, and methods may be implemented in other manners.
  • the device embodiments described above are merely illustrative; for example, the division of units is only a logical functional division, and in actual implementation there may be other divisions: multiple units or components may be combined or integrated into another system, or some features may be ignored or not performed.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, device or unit, or an electrical, mechanical or other form of connection.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the embodiments of the present invention.
  • each functional unit in each embodiment of the present invention may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.
  • the above integrated unit can be implemented in the form of hardware or in the form of a software functional unit.
  • the integrated unit if implemented in the form of a software functional unit and sold or used as a standalone product, may be stored in a computer readable storage medium.
  • the technical solution of the present invention, in essence or in the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product stored in a storage medium, including a number of instructions to cause a computer device (which may be a personal computer, a server, or a network device, etc.) to perform all or part of the steps of the methods described in the various embodiments of the present invention.
  • the foregoing storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)
  • Studio Devices (AREA)

Abstract

Disclosed are a video encoding method, apparatus, and computer system. The method includes: acquiring, by an image signal processor, motion vector information of a first preset frame number of images in a 360-degree panoramic video; determining, according to the motion vector information, a minimum motion vector value and a pole motion vector value, where the minimum motion vector value is the motion vector value of the region with the smallest motion vector value in the first preset frame number of images, and the pole motion vector value is the motion vector value corresponding to the two-pole regions of the 360-degree panoramic video in those images; determining, according to the minimum motion vector value, the pole motion vector value, and a motion vector value threshold, a rotation angle for a second preset frame number of images in the 360-degree panoramic video; and rotating the second preset frame number of images according to the rotation angle before encoding. The technical solution of the embodiments of the present invention can improve the encoding efficiency of 360-degree panoramic video.

Description

视频编码的方法、装置和计算机系统
版权申明
本专利文件披露的内容包含受版权保护的材料。该版权为版权所有人所有。版权所有人不反对任何人复制专利与商标局的官方记录和档案中所存在的该专利文件或者该专利披露。
技术领域
本发明涉及信息技术领域,并且更具体地,涉及一种视频编码的方法、装置和计算机系统。
背景技术
近年来随着视频拍摄场景的扩展,360度全景视频应用逐渐普及。360度全景视频通常是指水平视角有360度(-180°~180°)、垂直视角有180度(-90°~90°)的视频,通常以三维球面形式呈现。360度全景视频通常会以某种几何关系映射生成二维平面视频,然后再经由数字图像处理及编解码操作。
在360度全景视频映射成二维平面进行编码时,比如说映射成经纬图格式,全景视频的上下两极部分(垂直视角为-90度和90度附近区域,也可以称为极点区域)会有明显的拉伸,经纬图左右两边的拼接处也会有明显的不连续。考虑到人眼关注区域主要在全景视频的赤道附近区域(垂直视角为0度附近区域),在对经纬图进行编码时,往往会对于经纬图中心区域分配更多的比特,编码的更加仔细;对于经纬图的上下两极以及左右两侧就分配的比特相对要少一些,视频编码的质量相对要差一些。但是对于某些两极部分运动剧烈,中心区域运动平缓的360度全景视频,这种编码方式消耗大量多余的比特在运动平缓区域,而对于运动剧烈的两极部分则由于分配的比特较少导致编码质量较差,从而导致编码效率较低。
因此,需要一种360度全景视频的编码方法,以提高360度全景视频的编码效率。
发明内容
本发明实施例提供了一种视频编码的方法、装置和计算机系统,能够提高360度全景视频的编码效率。
第一方面,提供了一种视频编码的方法,包括:通过图像信号处理器获取360度全景视频中第一预设帧数图像的运动矢量信息;根据所述运动矢量信息,确定最小运动矢量值和极点运动矢量值,其中,所述最小运动矢量值为所述第一预设帧数图像中运动矢量值最小的区域的运动矢量值,所述极点运动矢量值为所述第一预设帧数图像中对应所述360度全景视频的两极区域的运动矢量值;根据所述最小运动矢量值、所述极点运动矢量值,以及运动矢量值门限,确定所述360度全景视频中第二预设帧数图像的旋转角度;根据所述旋转角度,对所述第二预设帧数图像进行旋转,然后再进行编码。
第二方面,提供了视频编码的装置,包括:图像信号处理器,用于获取360度全景视频中第一预设帧数图像的运动矢量信息;处理单元,用于根据所述运动矢量信息,确定最小运动矢量值和极点运动矢量值,其中,所述最小运动矢量值为所述第一预设帧数图像中运动矢量值最小的区域的运动矢量值,所述极点运动矢量值为所述第一预设帧数图像中对应所述360度全景视频的两极区域的运动矢量值;以及,根据所述最小运动矢量值、所述极点运动矢量值,以及运动矢量值门限,确定所述360度全景视频中第二预设帧数图像的旋转角度;旋转单元,用于根据所述旋转角度,对所述第二预设帧数图像进行旋转;编码单元,用于对旋转后的图像进行编码。
第三方面,提供了一种计算机系统,包括:存储器,用于存储计算机可执行指令;处理器,用于访问所述存储器,并执行所述计算机可执行指令,以进行上述第一方面的方法中的操作。
第四方面,提供了一种计算机存储介质,该计算机存储介质中存储有程序代码,该程序代码可以用于指示执行上述第一方面的方法。
本发明实施例的技术方案,通过图像信号处理器获取运动矢量信息,根据运动矢量信息,确定最小运动矢量值和极点运动矢量值,并据此确定360度全景视频的旋转角度,能够不需要二次编码而实现360度全景视频的旋转编码,可以提高360度全景视频的压缩效率和编码质量,从而能够提高360度全景视频的编码效率。
附图说明
图1是应用本发明实施例的技术方案的架构图。
图2是本发明实施例的编码器的处理架构图。
图3是本发明实施例的待编码数据的示意图。
图4a是本发明实施例的360度全景视频映射生成二维平面视频的示图。
图4b是本发明实施例的视频旋转的示图。
图5是本发明实施例的视频编码的方法的示意性流程图。
图6是本发明实施例的视频编码的方法的流程图。
图7是本发明实施例的视频编码的装置的示意性框图。
图8是本发明实施例的计算机系统的示意性框图。
具体实施方式
下面将结合附图,对本发明实施例中的技术方案进行描述。
应理解,本文中的具体的例子只是为了帮助本领域技术人员更好地理解本发明实施例,而非限制本发明实施例的范围。
还应理解,本发明实施例中的公式只是一种示例,而非限制本发明实施例的范围,各公式可以进行变形,这些变形也应属于本发明保护的范围。
还应理解,在本发明的各种实施例中,各过程的序号的大小并不意味着执行顺序的先后,各过程的执行顺序应以其功能和内在逻辑确定,而不应对本发明实施例的实施过程构成任何限定。
还应理解,本说明书中描述的各种实施方式,既可以单独实施,也可以组合实施,本发明实施例对此并不限定。
除非另有说明,本发明实施例所使用的所有技术和科学术语与本发明的技术领域的技术人员通常理解的含义相同。本申请中所使用的术语只是为了描述具体的实施例的目的,不是旨在限制本申请的范围。本申请所使用的术语“和/或”包括一个或多个相关的所列项的任意的和所有的组合。
图1是应用本发明实施例的技术方案的架构图。
如图1所示，系统100可以接收待编码数据102，对待编码数据102进行编码，产生编码数据108，并可以进一步对编码数据108进行处理。例如，系统100可以接收360度全景视频数据，对360度全景视频数据进行旋转编码以产生旋转编码后的数据。在一些实施例中，系统100中的部件可以由一个或多个处理器实现，该处理器可以是计算设备中的处理器，也可以是移动设备（例如无人机）中的处理器。该处理器可以为任意种类的处理器，本发明实施例对此不做限定。在一些可能的设计中，该处理器可以包括图像信号处理器（Image Signal Processor，ISP）、编码器等。系统100中还可以包括一个或多个存储器。该存储器可用于存储指令和数据，例如，实现本发明实施例的技术方案的计算机可执行指令，待编码数据102、编码数据108等。该存储器可以为任意种类的存储器，本发明实施例对此也不做限定。
待编码数据102可以包括文本,图像,图形对象,动画序列,音频,视频,或者任何需要编码的其他数据。在一些情况下,待编码数据102可以包括来自传感器的传感数据,该传感器可以为视觉传感器(例如,相机、红外传感器),麦克风,近场传感器(例如,超声波传感器、雷达),位置传感器,温度传感器,触摸传感器等。在一些情况下,待编码数据102可以包括来自用户的信息,例如,生物信息,该生物信息可以包括面部特征,指纹扫描,视网膜扫描,嗓音记录,DNA采样等。
编码对于高效和/或安全的传输或存储数据是必需的。对待编码数据102的编码可以包括数据压缩,加密,纠错编码,格式转换等。例如,对多媒体数据(例如视频或音频)压缩可以减少在网络中传输的比特数量。敏感数据,例如金融信息和个人标识信息,在传输和存储前可以加密以保护机密和/或隐私。为了减少视频存储和传输所占用的带宽,需要对视频数据进行编码压缩处理。
任何合适的编码技术都可以用于编码待编码数据102。编码类型依赖于被编码的数据和具体的编码需求。
在一些实施例中,编码器可以实现一种或多种不同的编解码器。每种编解码器可以包括实现不同编码算法的代码,指令或计算机程序。基于各种因素,包括待编码数据102的类型和/或来源,编码数据的接收实体,可用的计算资源,网络环境,商业环境,规则和标准等,可以选择一种合适的编码算法编码给定的待编码数据102。
例如,编码器可以被配置为编码一系列视频帧。编码每个帧中的数据可以采用一系列步骤。在一些实施例中,编码步骤可以包括预测、变换、量化、熵编码等处理步骤。
预测包括帧内预测和帧间预测两种类型,其目的在于利用预测块信息 去除当前待编码图像块的冗余信息。帧内预测利用本帧图像的信息获得预测块数据。帧间预测利用参考帧的信息获得预测块数据,其过程包括将待编码图像块划分成若干个子图像块;然后,针对每个子图像块,在参考图像中搜索与当前子图像块最匹配的图像块作为预测块;其后,将该子图像块与预测块的相应像素值相减得到残差,并将得到的各子图像块对应的残差组合在一起,得到图像块的残差。
使用变换矩阵对图像的残差块进行变换可以去除图像块的残差的相关性,即去除图像块的冗余信息,以便提高编码效率,图像块中的数据块的变换通常采用二维变换,即在编码端将数据块的残差信息分别与一个NxM的变换矩阵及其转置矩阵相乘,相乘之后得到的是变换系数。变换系数经量化可得到量化后的系数,最后将量化后的系数进行熵编码,最后将熵编码得到的比特流及进行编码后的编码模式信息,如帧内预测模式、运动矢量信息等,进行存储或发送到解码端。在图像的解码端,首先获得熵编码比特流后进行熵解码,得到相应的残差,根据解码得到的运动矢量或帧内预测等信息图像块对应的预测图像块,根据预测图像块与图像块的残差得到当前子图像块中各像素点的值。
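The two-dimensional transform step described above (multiplying the residual block by an N×M transform matrix and its transpose) can be sketched as follows. The orthonormal DCT-II matrix used here is only an illustrative choice of transform matrix, not one mandated by the text:

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix T (one example of an NxN transform matrix)."""
    k = np.arange(n)
    t = np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    t[0, :] *= np.sqrt(1.0 / n)   # DC row scaling
    t[1:, :] *= np.sqrt(2.0 / n)  # AC row scaling
    return t

def transform_2d(residual):
    """2-D transform of a square residual block: Y = T @ X @ T^T."""
    t = dct_matrix(residual.shape[0])
    return t @ residual @ t.T
```

For a flat (constant) residual block, all the energy lands in the top-left (DC) coefficient, which is why the subsequent quantization and entropy coding can represent smooth blocks with very few bits.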
图2示出了本发明实施例的编码器的处理架构图。如图2所示,预测处理可以包括帧内预测和帧间预测。通过预测处理,可以得到数据单元(例如像素点)对应的残差,其中,在对某一像素点进行预测时,可以从存储的上下文中获取参考像素点重建后得到的像素,按照参考像素点重建后得到的像素与该像素点的像素,得到该像素点对应的像素残差。像素残差通过变换、量化后再进行熵编码。在量化处理时,可以通过对量化参数的控制,实现对码率的控制。对某一像素点对应的量化处理后的像素残差还可以进行反量化反变换处理,再进行重建处理,得到该像素点重建后的像素,并将该像素点重建后的像素进行存储,以便于在该像素点作为参考像素点时,利用该像素点重建后的像素获取其他像素点对应的像素残差。
量化参数可以包括量化步长,表示量化步长或者与量化步长相关的值,例如,H.264、H.265或者类似的编码器中的量化参数(Quantization Parameter,QP),或者,量化矩阵或其参考矩阵等。
图3示出了本发明实施例的待编码数据的示意图。
如图3所示,待编码数据302可以包括多个帧304。例如,多个帧304 可以表示视频流中的连续的图像帧。每个帧304可以包括一个或多个条带306。每个条带306可以包括一个或多个宏块308。每个宏块308可以包括一个或多个块310。每个块310可以包括一个或多个像素312。每个像素312可以包括一个或多个数据集,对应于一个或多个数据部分,例如,亮度数据部分和色度数据部分。数据单元可以为帧,条带,宏块,块,像素或以上任一种的组。在不同的实施例中,数据单元的大小可以变化。作为举例,一个帧304可以包括100个条带306,每个条带306可以包括10个宏块308,每个宏块308可以包括4个(例如,2x2)块310,每个块310可以包括64个(例如,8x8)像素312。
本发明实施例的技术方案可以用于对360度全景视频进行编码,可以应用于各种涉及360度全景视频的产品,例如,全景相机、虚拟现实产品、头戴式设备(Head Mount Device,HMD)、增强现实产品、视频编码器、视频解码器等。
360度全景视频通常会以某种几何关系映射生成二维平面视频,然后再经由数字图像处理及编解码操作。经360度全景图以特定几何关系映射成的二维平面图常见的格式有经纬图(Equirectangular)。经纬图表示的是一个完整的球面对方位角θ和俯仰角
Figure PCTCN2017119000-appb-000001
进行采样得到的二维平面图,如图4a所示。
除了经纬图,常见的映射成的二维平面图格式还有六面体、八面体、二十面体格式,还可以使用其它的映射机制将一个球面体映射成一个二维平面图,经映射之后的二维平面图组成二维平面视频,所述二维平面视频可以使用通用的视频编解码标准,如HEVC/H.265,H.264/AVC,AVS1-P2,AVS2-P2,VP8,VP9等,进行编码压缩。所述二维平面视频通过球面视频映射得到,也可经由部分球面视频映射得到。所述球面视频或部分球面视频通常由多个摄像机拍摄得到。
在本发明实施例中,除非另有说明,对360度全景视频的处理为对其二维平面视频的处理。
为了提高360度全景视频的压缩效率和编码质量,提高编码效率,本发明实施例提供了一种应用于360度全景视频编码中的旋转编码方案,对于待编码的360度全景视频,先经过某个角度的旋转,然后再将旋转后的视频编码。
360度全景视频的旋转就是普通几何球面的旋转。如图4b所示,假设360度全景视频的旋转角度是(α,β,γ),则对该360度全景视频的每帧图像分别绕x轴、y轴、z轴旋转α、β、γ角度即可得到旋转后的视频。
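The geometric sphere rotation described above can be sketched as follows; the x → y → z application order of the three per-axis rotations is an assumption, since the text only names the angles (α, β, γ):

```python
import numpy as np

def rot_x(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])

def rot_y(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])

def rot_z(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

def rotate_point(p, alpha, beta, gamma):
    """Rotate a 3-D point about the x, y and z axes by (alpha, beta, gamma)
    in degrees; the x -> y -> z order here is an assumption."""
    a, b, c = np.radians([alpha, beta, gamma])
    return rot_z(c) @ rot_y(b) @ rot_x(a) @ np.asarray(p, dtype=float)
```

Applying the same (α, β, γ) to every sample direction of a frame yields the rotated video of FIG. 4b.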
图5示出了本发明实施例的视频编码的方法500的示意性流程图。该方法500可以由图1所示的系统100执行。
510,通过图像信号处理器获取360度全景视频中第一预设帧数图像的运动矢量信息。
在本发明实施例中,通过图像信号处理器获取运动矢量信息,这样可以避免二次编码过程。
视频经过图像信号处理器的预处理可以得到视频中每帧图像的每个像素点或每一个区域的运动矢量信息。基于图像的每个像素点或每一个区域的运动矢量信息可以计算图像中不同图像块的运动矢量信息。
在本发明实施例中,从图像信号处理器获取360度全景视频中第一预设帧数图像的运动矢量信息。可选地,所述第一预设帧数图像的运动矢量信息为所述图像信号处理器中所述第一预设帧数图像的局部运动矢量或全局运动矢量。
520,根据所述运动矢量信息,确定最小运动矢量值和极点运动矢量值,其中,所述最小运动矢量值为所述第一预设帧数图像中运动矢量值最小的区域的运动矢量值,所述极点运动矢量值为所述第一预设帧数图像中对应所述360度全景视频的两极区域的运动矢量值。
根据从图像信号处理器获取的第一预设帧数图像的运动矢量信息可以计算不同区域的运动矢量值。可选地,此处的区域可以为图像块(Block),即可以将图像划分为图像块,计算每个图像块的运动矢量值,但本发明实施例对此并不限定。
具体而言,对于一帧图像,先将图像划块,划分得到的每个图像块记为一个Block,由每个像素点或每一个区域(小于Block)的运动矢量信息计算得到图像中每个Block的运动矢量值。
可选地,所述图像块包括多个编码树形单元(Coding Tree Unit,CTU),所述图像块的运动矢量值为所述多个CTU的运动矢量值的平均值。每个CTU的运动矢量值可以为所述每个CTU的所有像素点的运动矢量的水平分量的绝对值与垂直分量的绝对值之和。
例如,Block可以为512x512的大小,CTU可以为64x64大小,即一个Block为8x8个CTU。将CTU作为运动矢量(Motion Vector,MV)计算单元。按照8x8个CTU为一个Block对图像进行划分,图像边界处不足8x8个CTU的地方则可以按照实际剩余CTU个数为一个Block。将一个CTU的MV记作MVctu,则MVctu的值可以为一个64x64大小的像素块内所有像素点的MV的水平分量(MVx)的绝对值与垂直分量(MVy)的绝对值之和,即:
MVctu = Σ(|MVx| + |MVy|)（对该64x64像素块内的所有像素点求和）
根据上式可以计算出图像中每个CTU的MV。然后根据图像中每个CTU的MV计算每个Block的MV,记作MVblock。可选地,每个Block的MV的值可以是该Block中所有CTU的MV的平均值。
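The MVctu / MVblock computation just described can be sketched as follows; the function names are illustrative, and the per-pixel motion components are assumed to arrive as two arrays:

```python
import numpy as np

def ctu_mv(mvx, mvy):
    """MV value of one CTU: sum over its pixels of |MVx| + |MVy|."""
    return float(np.abs(mvx).sum() + np.abs(mvy).sum())

def block_mv(mvx, mvy, ctu=64):
    """MV value of an image block: mean of the MV values of its CTUs.
    mvx/mvy hold the per-pixel motion components for the whole block;
    partial CTUs at the borders are handled by the slicing."""
    h, w = mvx.shape
    vals = [ctu_mv(mvx[r:r + ctu, c:c + ctu], mvy[r:r + ctu, c:c + ctu])
            for r in range(0, h, ctu) for c in range(0, w, ctu)]
    return float(np.mean(vals))
```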
得到第一预设帧数图像中每一帧图像中每个Block的运动矢量值后,可以据此确定最小运动矢量值和极点运动矢量值。
可选地,在所述第一预设帧数为1时,即只取一帧图像时,所述最小运动矢量值为该帧图像中运动矢量值最小的图像块的运动矢量值minMVblock。
可选地,在所述第一预设帧数大于1时,所述最小运动矢量值为所述第一预设帧数图像中对应相同位置的图像块的运动矢量值的平均值中的最小值avgMinMVblock。例如,以第一预设帧数取5为例,最小运动矢量值为5帧图像中对应相同位置的图像块的运动矢量值的平均值中的最小值。
考虑到球面视频有对称性,实际上只要处理球面视频的上半部分。例如,对于经纬图中的每个Block,先找到该Block在球体中关于球心对称的Block,用两者MVblock的平均值作为这两个Block的MV,这样在搜索计算过程中只需要搜索经纬图的上半部分,可以节省一半的计算量。
极点运动矢量值为所述第一预设帧数图像中对应所述360度全景视频的两极区域的运动矢量值。可选地,对于一帧图像,可以取两极区域的预定数量的Block的运动矢量值的平均值作为极点运动矢量值。例如,选取经纬图中最上面两行CTU区域来计算极点运动矢量值。先将最上面两行CTU的MVctu与其关于中心对称的CTU的MVctu求均值作为实际的MVctu值,然后对经纬图最上面两行所有CTU的MVctu值求平均值作为极点运动矢量 值,记为polarMVctu。对于多帧图像,可以将多帧图像的按上述方式得到的极点运动矢量值的平均值作为极点运动矢量值。例如,以第一预设帧数取5为例,5帧图像的极点运动矢量值avgPolarMVctu为这5帧图像的polarMVctu的平均值。
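The pole motion vector computation described above can be sketched as follows. The antipodal ("center-symmetric") pairing of CTUs is modeled here as a row mirror plus a half-width column shift on the equirectangular grid, which is an assumption about the exact symmetry used:

```python
import numpy as np

def polar_mv(mv_ctu, rows=2):
    """Pole MV from a per-CTU MV grid: average each CTU with its assumed
    antipodal CTU (mirrored row, half-width column shift), then average
    the top `rows` rows of the symmetrized grid."""
    h, w = mv_ctu.shape
    anti = np.roll(mv_ctu[::-1, :], w // 2, axis=1)  # assumed antipodal map
    sym = (mv_ctu + anti) / 2.0
    return float(sym[:rows].mean())
```

For multiple frames, the per-frame results would simply be averaged again, matching the avgPolarMVctu description.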
530,根据所述最小运动矢量值、所述极点运动矢量值,以及运动矢量值门限,确定所述360度全景视频中第二预设帧数图像的旋转角度。
将所述最小运动矢量值和所述极点运动矢量值与运动矢量值门限比较,判断是否需要对所述360度全景视频中第二预设帧数图像进行旋转。
应理解,在本发明实施例中,若旋转角度为0,则表示不进行旋转;若旋转角度不为0,则表示进行旋转。
在本发明实施例中,利用根据所述第一预设帧数图像得到的所述最小运动矢量值、所述极点运动矢量值确定所述第二预设帧数图像的旋转角度。可选地,所述第一预设帧数图像可以为所述360全景视频的第一随机切入点后的预设帧数的图像。所述第二预设帧数图像可以为所述360全景视频的所述第一随机切入点到第二随机切入点之间的预设帧数的图像,其中,所述第二随机切入点为所述第一随机切入点的下一个随机切入点。
可选地,所述第一预设帧数小于或等于所述第二预设帧数。也就是说,可以根据较少帧数图像确定较多帧数图像的旋转角度。例如,可以根据随机切入点后的几帧图像确定该随机切入点到下一个随机切入点之间的所有图像的旋转角度。针对每个随机切入点进行旋转判断,能够避免视频中后面的图像旋转错误的现象,做到更加准确的旋转编码。
可选地,可以根据所述360度全景视频的运动搜索范围确定所述运动矢量值门限。可选地,可以通过所述图像信号处理器确定所述运动搜索范围。
具体而言,假设从图像信号处理器获取的运动搜索范围大小是(±H,±V),也就是水平方向的搜索范围是[-H,H],垂直方向的搜索范围是[-V,V],则运动矢量值门限MVlow可以为:
MVlow = (H + V) × 64
例如,在具体的实施方式中,运动搜索范围可以为(±16,±8),因此,按照上述公式计算得到的运动矢量值门限MVlow为1536。
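As a quick check of the threshold computation, a minimal sketch follows; the (H + V) × 64 form is inferred from the worked example value 1536, so treat it as an assumption rather than the patent's exact formula:

```python
def mv_threshold(h, v):
    """Motion-vector threshold MVlow from the search range (+/-h, +/-v).
    The (h + v) * 64 form is reconstructed from the example (16, 8) -> 1536."""
    return (h + v) * 64
```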
采用根据运动搜索范围设计的运动矢量值门限来做旋转判断，能够适用各种不同运动搜索条件的编码。
将上述的最小运动矢量值、极点运动矢量值,与运动矢量值门限比较,可以确定所述第二预设帧数图像的旋转角度。
可选地,若所述极点运动矢量值小于所述运动矢量值门限,则确定所述旋转角度为0。这种情况表明极点区域运动矢量值比较小,因此不需要做旋转。
可选地,若所述极点运动矢量值不小于所述运动矢量值门限,且所述最小运动矢量值小于所述运动矢量值门限,则确定对所述旋转角度为第一角度。这种情况表明极点区域运动矢量值较大,而其他区域存在运动矢量值比较小的区域,因此需要进行旋转,将运动矢量值最小的区域旋转到极点区域。可选地,具体旋转的方式可以是先通过该运动矢量值最小的区域的中心点在经纬图上的坐标计算出对应球面上的坐标,再由旋转公式计算出旋转角度(即所述第一角度,其计算公式下面将给出),最后可按照图4b所示的几何球面的旋转方式进行旋转。
所述第一角度为根据所述第一预设帧数图像中运动矢量值最小的区域的位置确定的角度。
具体而言，可以根据所述第一预设帧数图像中运动矢量值最小的区域的位置，确定所述第一角度。例如，将运动矢量值最小的区域（如Block）的中心点的坐标记为(m,n)，可以通过经纬图映射方式中经纬图上的采样点(m,n)与球面上的点(φ,θ)之间的转换关系计算得到(φ,θ)的坐标，具体的转换公式如下：
u = (m + 0.5) / W
v = (n + 0.5) / H
φ = (u - 0.5) × 2π
θ = (0.5 - v) × π
其中W、H是经纬图的宽和高,然后按照下面的公式计算出旋转角度,即所述第一角度(α,β,0):
α=90-θ,
Figure PCTCN2017119000-appb-000008
在需要旋转的情况下,旋转角度采用所述第一角度(α,β,0),按照图4b所示的旋转方式进行旋转,可以将运动矢量值最小的区域旋转到极点区域。
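The mapping from the minimum-MV block center (m, n) to the first angle can be sketched as follows. α = 90 − θ follows the text directly, while β = −φ is an assumption (the β formula appears only as an unrendered image in the source):

```python
def first_angle(m, n, w, h):
    """Map the center (m, n) of the min-MV block on a W x H equirectangular
    image to sphere coordinates and derive the rotation angle (alpha, beta, 0),
    in degrees. beta = -phi is an assumption; alpha = 90 - theta is from the text."""
    u = (m + 0.5) / w
    v = (n + 0.5) / h
    phi = (u - 0.5) * 360.0    # azimuth, degrees
    theta = (0.5 - v) * 180.0  # pitch, degrees
    return 90.0 - theta, -phi
```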
可选地,若所述极点运动矢量值不小于所述运动矢量值门限,所述最小运动矢量值不小于所述运动矢量值门限,且所述极点运动矢量值大于预定 倍数的所述最小运动矢量值,则确定对所述旋转角度为第一角度。这种情况表明虽然最小运动矢量值也较大,但极点运动矢量值远大于最小运动矢量值,因此也需要旋转所述第一角度,将运动矢量值最小的区域旋转到极点区域。
可选地,所述预定倍数可以为8,但本发明实施例对此并不限定,也就是说,只要能够表明极点运动矢量值远大于最小运动矢量值即可。
可选地,若所述极点运动矢量值不小于所述运动矢量值门限,所述最小运动矢量值不小于所述运动矢量值门限,且所述极点运动矢量值不大于预定倍数的所述最小运动矢量值,则确定所述旋转角度为0。这种情况表明最小运动矢量值和极点运动矢量值都较大,极点运动矢量值没有远大于最小运动矢量值,因此不需要做旋转。
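The four threshold cases described above can be collected into one decision function; this is an illustrative sketch, not the patent's implementation, and `first_angle` here is whatever angle was derived from the position of the minimum-MV region:

```python
def rotation_angle(pole_mv, min_mv, mv_low, first_angle, multiple=8):
    """Map the four threshold cases to a rotation angle (0 means no rotation)."""
    if pole_mv < mv_low:
        return 0                  # pole region barely moves: no rotation
    if min_mv < mv_low:
        return first_angle        # a calm region exists elsewhere: rotate it to the pole
    if pole_mv > multiple * min_mv:
        return first_angle        # pole moves much more than the calmest region
    return 0                      # both large and comparable: no rotation
```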
540,根据所述旋转角度,对所述第二预设帧数图像进行旋转,然后再进行编码。
具体地,若所述旋转角度为0,则不进行旋转;若所述旋转角度为所述第一角度,则将图像旋转所述第一角度。在旋转操作后,再对图像进行编码。
可选地,对360度全景视频的旋转和编码操作可以都由编码器执行,也可以分别由不同的处理单元执行。
可选地,在一种实施方式中,可以是在原始视频送入视频编码器编码之前先算好旋转角度,然后将旋转角度和原始视频送入视频编码器进行编码。由视频编码器对视频做旋转后再编码,在编码过程中会将视频旋转的角度写入码流中,解码端从码流中获取旋转角度之后再将解码出来的视频旋转回来。所述视频旋转角度信息可以写入序列头(sequence header)、图像头(picture header)、条带头(slice header)、视频参数集(video parameter set,VPS)、序列参数集(sequence parameter set,SPS)、图像参数集(picture parameter set,PPS)、附加增强信息(Supplemental Enhancement Information,SEI)、或扩展数据(extension data)中。
可选地,在另一种实施方式中,可以是在原始视频送入视频编码器编码之前先计算出旋转角度,然后由可以对视频进行旋转的工具对原始视频做旋转,将旋转后的视频再送入视频编码器进行编码。由于360度全景视频是用头戴式设备或虚拟现实产品等观看,视频经过旋转后并不会影响所能观看到的信息,所以可以不用再将解码后的视频旋转回来。
图6为本发明实施例的视频编码的方法的流程图。应理解,图6仅是示例,不应理解为对本发明实施例的限定。
如图6所示,在对360度全景视频编码前,先判断是否需要旋转。通过计算图像中CTU的MV,分别得到极点MV和最小MV,并根据最小MV对应的位置计算旋转角度。通过将极点MV和最小MV,与门限值进行比较,判断是否需要旋转,其中,可以利用根据第一预设帧数图像得到的极点MV和最小MV,确定第二预设帧数图像是否需要旋转。若需要旋转,则将相应图像旋转上述旋转角度,然后再进行编码;若不需要旋转,则直接进行编码。
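The overall flow of FIG. 6 can be sketched with injected callables; all names here are illustrative placeholders, and an angle of 0 means "encode directly":

```python
def rotate_and_encode(first_frames, second_frames, decide_angle, rotate, encode):
    """Sketch of the FIG. 6 flow: derive one rotation angle from the first
    preset frames, then apply it (if non-zero) to every frame of the second
    preset segment before encoding."""
    angle = decide_angle(first_frames)   # e.g. based on pole MV / min MV / threshold
    encoded = []
    for frame in second_frames:
        if angle != 0:
            frame = rotate(frame, angle)  # rotate first, then encode
        encoded.append(encode(frame))
    return encoded
```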
本发明实施例的技术方案,通过图像信号处理器获取运动矢量信息,根据运动矢量信息,确定最小运动矢量值和极点运动矢量值,并据此确定360度全景视频的旋转角度,能够不需要二次编码而实现360度全景视频的旋转编码,可以提高360度全景视频的压缩效率和编码质量,从而能够提高360度全景视频的编码效率。
上文中详细描述了本发明实施例的视频编码的方法,下面将描述本发明实施例的视频编码的装置和计算机系统。
图7示出了本发明实施例的视频编码的装置700的示意性框图。该装置700可以执行上述本发明实施例的视频编码的方法。
如图7所示,该装置700可以包括:
图像信号处理器710,用于获取360度全景视频中第一预设帧数图像的运动矢量信息;
处理单元720,用于根据所述运动矢量信息,确定最小运动矢量值和极点运动矢量值,其中,所述最小运动矢量值为所述第一预设帧数图像中运动矢量值最小的区域的运动矢量值,所述极点运动矢量值为所述第一预设帧数图像中对应所述360度全景视频的两极区域的运动矢量值;以及,根据所述最小运动矢量值、所述极点运动矢量值,以及运动矢量值门限,确定所述360度全景视频中第二预设帧数图像的旋转角度;
旋转单元730,用于根据所述旋转角度,对所述第二预设帧数图像进行旋转;
编码单元740,用于对旋转后的图像进行编码。
可选地,所述处理单元720还用于:
根据所述360度全景视频的运动搜索范围确定所述运动矢量值门限。
可选地,所述图像信号处理器还用于:
确定所述运动搜索范围。
可选地,所述第一预设帧数图像的运动矢量信息为所述图像信号处理器中所述第一预设帧数图像的局部运动矢量或全局运动矢量。
可选地,所述第一预设帧数图像为所述360全景视频的第一随机切入点后的预设帧数的图像。
可选地,所述第二预设帧数图像为所述360全景视频的所述第一随机切入点到第二随机切入点之间的预设帧数的图像,其中,所述第二随机切入点为所述第一随机切入点的下一个随机切入点。
可选地,所述第一预设帧数小于或等于所述第二预设帧数。
可选地,所述处理单元720具体用于:
若所述极点运动矢量值小于所述运动矢量值门限,则确定所述旋转角度为0;或者,
若所述极点运动矢量值不小于所述运动矢量值门限,且所述最小运动矢量值小于所述运动矢量值门限,则确定对所述旋转角度为第一角度;或者,
若所述极点运动矢量值不小于所述运动矢量值门限,所述最小运动矢量值不小于所述运动矢量值门限,且所述极点运动矢量值大于预定倍数的所述最小运动矢量值,则确定对所述旋转角度为第一角度;或者,
若所述极点运动矢量值不小于所述运动矢量值门限,所述最小运动矢量值不小于所述运动矢量值门限,且所述极点运动矢量值不大于预定倍数的所述最小运动矢量值,则确定所述旋转角度为0;
其中,所述第一角度为根据所述第一预设帧数图像中运动矢量值最小的区域的位置确定的角度。
可选地,所述预定倍数为8。
可选地,所述处理单元720还用于:
根据所述第一预设帧数图像中运动矢量值最小的区域的位置,确定所述第一角度。
可选地,所述最小运动矢量值为所述第一预设帧数图像中对应相同位置的图像块的运动矢量值的平均值中的最小值,其中,所述图像块包括多个像素点。
可选地,所述图像块包括多个编码树形单元CTU,所述图像块的运动 矢量值为所述多个CTU的运动矢量值的平均值,其中,每个CTU的运动矢量值为所述每个CTU的所有像素点的运动矢量的水平分量的绝对值与垂直分量的绝对值之和。
可选地,上述旋转单元730和编码单元740可以都由编码器实现,也可以分开实现,例如,旋转单元730由可以对视频进行旋转的工具实现,编码单元740由编码器实现。
应理解,上述本发明实施例的视频编码的装置可以是芯片,其具体可以由电路实现,但本发明实施例对具体的实现形式不做限定。
图8示出了本发明实施例的计算机系统800的示意性框图。
如图8所示,该计算机系统800可以包括处理器810和存储器820。
应理解,该计算机系统800还可以包括其他计算机系统中通常所包括的部件,例如,输入输出设备、通信接口等,本发明实施例对此并不限定。
存储器820用于存储计算机可执行指令。
存储器820可以是各种种类的存储器,例如可以包括高速随机存取存储器(Random Access Memory,RAM),还可以包括非不稳定的存储器(non-volatile memory),例如至少一个磁盘存储器,本发明实施例对此并不限定。
处理器810用于访问该存储器820,并执行该计算机可执行指令,以进行上述本发明实施例的视频编码的方法中的操作。
处理器810可以包括微处理器,现场可编程门阵列(Field-Programmable Gate Array,FPGA),中央处理器(Central Processing unit,CPU),图形处理器(Graphics Processing Unit,GPU)等,本发明实施例对此并不限定。
本发明实施例的视频编码的装置和计算机系统可对应于本发明实施例的视频编码的方法的执行主体,并且视频编码的装置和计算机系统中的各个模块的上述和其它操作和/或功能分别为了实现前述各个方法的相应流程,为了简洁,在此不再赘述。
本发明实施例还提供了一种电子设备,该电子设备可以包括上述本发明各种实施例的视频编码的装置或者计算机系统。
本发明实施例还提供了一种计算机存储介质,该计算机存储介质中存储有程序代码,该程序代码可以用于指示执行上述本发明实施例的视频编码的方法。
应理解,在本发明实施例中,术语“和/或”仅仅是一种描述关联对象的关联关系,表示可以存在三种关系。例如,A和/或B,可以表示:单独存在A,同时存在A和B,单独存在B这三种情况。另外,本文中字符“/”,一般表示前后关联对象是一种“或”的关系。
本领域普通技术人员可以意识到,结合本文中所公开的实施例描述的各示例的单元及算法步骤,能够以电子硬件、计算机软件或者二者的结合来实现,为了清楚地说明硬件和软件的可互换性,在上述说明中已经按照功能一般性地描述了各示例的组成及步骤。这些功能究竟以硬件还是软件方式来执行,取决于技术方案的特定应用和设计约束条件。专业技术人员可以对每个特定的应用来使用不同方法来实现所描述的功能,但是这种实现不应认为超出本发明的范围。
所属领域的技术人员可以清楚地了解到,为了描述的方便和简洁,上述描述的系统、装置和单元的具体工作过程,可以参考前述方法实施例中的对应过程,在此不再赘述。
在本申请所提供的几个实施例中,应该理解到,所揭露的系统、装置和方法,可以通过其它的方式实现。例如,以上所描述的装置实施例仅仅是示意性的,例如,所述单元的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,例如多个单元或组件可以结合或者可以集成到另一个系统,或一些特征可以忽略,或不执行。另外,所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些接口、装置或单元的间接耦合或通信连接,也可以是电的,机械的或其它的形式连接。
所述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部单元来实现本发明实施例方案的目的。
另外,在本发明各个实施例中的各功能单元可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以是两个或两个以上单元集成在一个单元中。上述集成的单元既可以采用硬件的形式实现,也可以采用软件功能单元的形式实现。
所述集成的单元如果以软件功能单元的形式实现并作为独立的产品销售或使用时,可以存储在一个计算机可读取存储介质中。基于这样的理解, 本发明的技术方案本质上或者说对现有技术做出贡献的部分,或者该技术方案的全部或部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质中,包括若干指令用以使得一台计算机设备(可以是个人计算机,服务器,或者网络设备等)执行本发明各个实施例所述方法的全部或部分步骤。而前述的存储介质包括:U盘、移动硬盘、只读存储器(ROM,Read-Only Memory)、随机存取存储器(RAM,Random Access Memory)、磁碟或者光盘等各种可以存储程序代码的介质。
以上所述,仅为本发明的具体实施方式,但本发明的保护范围并不局限于此,任何熟悉本技术领域的技术人员在本发明揭露的技术范围内,可轻易想到各种等效的修改或替换,这些修改或替换都应涵盖在本发明的保护范围之内。因此,本发明的保护范围应以权利要求的保护范围为准。

Claims (25)

  1. 一种视频编码的方法,其特征在于,包括:
    通过图像信号处理器获取360度全景视频中第一预设帧数图像的运动矢量信息;
    根据所述运动矢量信息,确定最小运动矢量值和极点运动矢量值,其中,所述最小运动矢量值为所述第一预设帧数图像中运动矢量值最小的区域的运动矢量值,所述极点运动矢量值为所述第一预设帧数图像中对应所述360度全景视频的两极区域的运动矢量值;
    根据所述最小运动矢量值、所述极点运动矢量值,以及运动矢量值门限,确定所述360度全景视频中第二预设帧数图像的旋转角度;
    根据所述旋转角度,对所述第二预设帧数图像进行旋转,然后再进行编码。
  2. 根据权利要求1所述的方法,其特征在于,所述方法还包括:
    根据所述360度全景视频的运动搜索范围确定所述运动矢量值门限。
  3. 根据权利要求2所述的方法,其特征在于,所述方法还包括:
    通过所述图像信号处理器确定所述运动搜索范围。
  4. 根据权利要求1至3中任一项所述的方法,其特征在于,所述第一预设帧数图像的运动矢量信息为所述图像信号处理器中所述第一预设帧数图像的局部运动矢量或全局运动矢量。
  5. 根据权利要求1至4中任一项所述的方法,其特征在于,所述第一预设帧数图像为所述360全景视频的第一随机切入点后的预设帧数的图像。
  6. 根据权利要求5所述的方法,其特征在于,所述第二预设帧数图像为所述360全景视频的所述第一随机切入点到第二随机切入点之间的预设帧数的图像,其中,所述第二随机切入点为所述第一随机切入点的下一个随机切入点。
  7. 根据权利要求1至6中任一项所述的方法,其特征在于,所述第一预设帧数小于或等于所述第二预设帧数。
  8. 根据权利要求1至7中任一项所述的方法,其特征在于,所述根据所述最小运动矢量值、所述极点运动矢量值,以及运动矢量值门限,确定所述360度全景视频中第二预设帧数图像的旋转角度,包括:
    若所述极点运动矢量值小于所述运动矢量值门限,则确定所述旋转角度 为0;或者,
    若所述极点运动矢量值不小于所述运动矢量值门限,且所述最小运动矢量值小于所述运动矢量值门限,则确定对所述旋转角度为第一角度;或者,
    若所述极点运动矢量值不小于所述运动矢量值门限,所述最小运动矢量值不小于所述运动矢量值门限,且所述极点运动矢量值大于预定倍数的所述最小运动矢量值,则确定对所述旋转角度为第一角度;或者,
    若所述极点运动矢量值不小于所述运动矢量值门限,所述最小运动矢量值不小于所述运动矢量值门限,且所述极点运动矢量值不大于预定倍数的所述最小运动矢量值,则确定所述旋转角度为0;
    其中,所述第一角度为根据所述第一预设帧数图像中运动矢量值最小的区域的位置确定的角度。
  9. 根据权利要求8所述的方法,其特征在于,所述预定倍数为8。
  10. 根据权利要求8或9所述的方法,其特征在于,所述方法还包括:
    根据所述第一预设帧数图像中运动矢量值最小的区域的位置,确定所述第一角度。
  11. 根据权利要求1至10中任一项所述的方法,其特征在于,所述最小运动矢量值为所述第一预设帧数图像中对应相同位置的图像块的运动矢量值的平均值中的最小值,其中,所述图像块包括多个像素点。
  12. 根据权利要求11所述的方法,其特征在于,所述图像块包括多个编码树形单元CTU,所述图像块的运动矢量值为所述多个CTU的运动矢量值的平均值,其中,每个CTU的运动矢量值为所述每个CTU的所有像素点的运动矢量的水平分量的绝对值与垂直分量的绝对值之和。
  13. 一种视频编码的装置,其特征在于,包括:
    图像信号处理器,用于获取360度全景视频中第一预设帧数图像的运动矢量信息;
    处理单元,用于根据所述运动矢量信息,确定最小运动矢量值和极点运动矢量值,其中,所述最小运动矢量值为所述第一预设帧数图像中运动矢量值最小的区域的运动矢量值,所述极点运动矢量值为所述第一预设帧数图像中对应所述360度全景视频的两极区域的运动矢量值;以及,根据所述最小运动矢量值、所述极点运动矢量值,以及运动矢量值门限,确定所述360度全景视频中第二预设帧数图像的旋转角度;
    旋转单元,用于根据所述旋转角度,对所述第二预设帧数图像进行旋转;
    编码单元,用于对旋转后的图像进行编码。
  14. 根据权利要求13所述的装置,其特征在于,所述处理单元还用于:
    根据所述360度全景视频的运动搜索范围确定所述运动矢量值门限。
  15. 根据权利要求14所述的装置,其特征在于,所述图像信号处理器还用于:
    确定所述运动搜索范围。
  16. 根据权利要求13至15中任一项所述的装置,其特征在于,所述第一预设帧数图像的运动矢量信息为所述图像信号处理器中所述第一预设帧数图像的局部运动矢量或全局运动矢量。
  17. 根据权利要求13至16中任一项所述的装置,其特征在于,所述第一预设帧数图像为所述360全景视频的第一随机切入点后的预设帧数的图像。
  18. 根据权利要求17所述的装置,其特征在于,所述第二预设帧数图像为所述360全景视频的所述第一随机切入点到第二随机切入点之间的预设帧数的图像,其中,所述第二随机切入点为所述第一随机切入点的下一个随机切入点。
  19. 根据权利要求13至18中任一项所述的装置,其特征在于,所述第一预设帧数小于或等于所述第二预设帧数。
  20. 根据权利要求13至19中任一项所述的装置,其特征在于,所述处理单元具体用于:
    若所述极点运动矢量值小于所述运动矢量值门限,则确定所述旋转角度为0;或者,
    若所述极点运动矢量值不小于所述运动矢量值门限,且所述最小运动矢量值小于所述运动矢量值门限,则确定对所述旋转角度为第一角度;或者,
    若所述极点运动矢量值不小于所述运动矢量值门限,所述最小运动矢量值不小于所述运动矢量值门限,且所述极点运动矢量值大于预定倍数的所述最小运动矢量值,则确定对所述旋转角度为第一角度;或者,
    若所述极点运动矢量值不小于所述运动矢量值门限,所述最小运动矢量值不小于所述运动矢量值门限,且所述极点运动矢量值不大于预定倍数的所述最小运动矢量值,则确定所述旋转角度为0;
    其中,所述第一角度为根据所述第一预设帧数图像中运动矢量值最小的区域的位置确定的角度。
  21. 根据权利要求20所述的装置,其特征在于,所述预定倍数为8。
  22. 根据权利要求20或21所述的装置,其特征在于,所述处理单元还用于:
    根据所述第一预设帧数图像中运动矢量值最小的区域的位置,确定所述第一角度。
  23. 根据权利要求13至22中任一项所述的装置,其特征在于,所述最小运动矢量值为所述第一预设帧数图像中对应相同位置的图像块的运动矢量值的平均值中的最小值,其中,所述图像块包括多个像素点。
  24. 根据权利要求23所述的装置,其特征在于,所述图像块包括多个编码树形单元CTU,所述图像块的运动矢量值为所述多个CTU的运动矢量值的平均值,其中,每个CTU的运动矢量值为所述每个CTU的所有像素点的运动矢量的水平分量的绝对值与垂直分量的绝对值之和。
  25. 一种计算机系统,其特征在于,包括:
    存储器,用于存储计算机可执行指令;
    处理器,用于访问所述存储器,并执行所述计算机可执行指令,以进行根据权利要求1至12中任一项所述的方法中的操作。
PCT/CN2017/119000 2017-12-27 2017-12-27 视频编码的方法、装置和计算机系统 WO2019127100A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2017/119000 WO2019127100A1 (zh) 2017-12-27 2017-12-27 视频编码的方法、装置和计算机系统
CN201780018384.1A CN108886616A (zh) 2017-12-27 2017-12-27 视频编码的方法、装置和计算机系统

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2017/119000 WO2019127100A1 (zh) 2017-12-27 2017-12-27 视频编码的方法、装置和计算机系统

Publications (1)

Publication Number Publication Date
WO2019127100A1 true WO2019127100A1 (zh) 2019-07-04

Family

ID=64325696

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/119000 WO2019127100A1 (zh) 2017-12-27 2017-12-27 视频编码的方法、装置和计算机系统

Country Status (2)

Country Link
CN (1) CN108886616A (zh)
WO (1) WO2019127100A1 (zh)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111294648A (zh) * 2020-02-20 2020-06-16 成都纵横自动化技术股份有限公司 一种无人机空地视频传输方法
CN112367486B (zh) * 2020-10-30 2023-03-28 维沃移动通信有限公司 视频处理方法及装置

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102549622A (zh) * 2009-09-29 2012-07-04 北京大学 用于处理体图像数据的方法
CN104063843A (zh) * 2014-06-18 2014-09-24 长春理工大学 一种基于中心投影的集成立体成像元素图像生成的方法
WO2017027884A1 (en) * 2015-08-13 2017-02-16 Legend3D, Inc. System and method for removing camera rotation from a panoramic video

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10636121B2 (en) * 2016-01-12 2020-04-28 Shanghaitech University Calibration method and apparatus for panoramic stereo video system
CN107135397B (zh) * 2017-04-28 2018-07-06 中国科学技术大学 一种全景视频编码方法和装置

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102549622A (zh) * 2009-09-29 2012-07-04 北京大学 用于处理体图像数据的方法
CN104063843A (zh) * 2014-06-18 2014-09-24 长春理工大学 一种基于中心投影的集成立体成像元素图像生成的方法
WO2017027884A1 (en) * 2015-08-13 2017-02-16 Legend3D, Inc. System and method for removing camera rotation from a panoramic video

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
JILL BOYCE ET AL.: "Spherical rotation orientation SEI for HEVC and AVC coding of 360 video", (JOINT COLLABORATIVE TEAM ON VIDEO CODING OF ISO/IEC JTC1/SC29/WG11 AND ITU-T SG.16), 4 January 2017 (2017-01-04) - 20 January 2017 (2017-01-20), Geneva, XP030118131 *

Also Published As

Publication number Publication date
CN108886616A (zh) 2018-11-23

Similar Documents

Publication Publication Date Title
US11218729B2 (en) Coding multiview video
US10600233B2 (en) Parameterizing 3D scenes for volumetric viewing
KR102273199B1 (ko) 곡선 뷰 비디오 인코딩/디코딩에서 효율성 향상을 위한 시스템 및 방법
WO2019034807A1 (en) SEQUENTIAL CODING AND DECODING OF VOLUMETRIC VIDEO
TW201916685A (zh) 用於處理360°vr幀序列的方法及裝置
US11138460B2 (en) Image processing method and apparatus
WO2019157717A1 (zh) 运动补偿的方法、装置和计算机系统
WO2018206551A1 (en) Coding spherical video data
EP3434021B1 (en) Method, apparatus and stream of formatting an immersive video for legacy and immersive rendering devices
US20200045342A1 (en) Methods, devices and stream to encode global rotation motion compensated images
US20200145695A1 (en) Apparatus and method for decoding a panoramic video
WO2019127100A1 (zh) 视频编码的方法、装置和计算机系统
US20210150665A1 (en) Image processing method and device
US11196977B2 (en) Unified coding of 3D objects and scenes
WO2023280266A1 (zh) 鱼眼图像压缩、鱼眼视频流压缩以及全景视频生成方法
WO2019157718A1 (zh) 运动补偿的方法、装置和计算机系统
US20230379495A1 (en) A method and apparatus for encoding mpi-based volumetric video
Groth et al. Wavelet-Based Fast Decoding of 360 Videos
Jiang et al. Semi-Regular Geometric Kernel Encoding & Reconstruction for Video Compression
JP2022551064A (ja) 容積ビデオを符号化、送信、及び復号化するための方法及び装置

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17936784

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 17936784

Country of ref document: EP

Kind code of ref document: A1