WO2021057684A1 - Video decoding method and apparatus, video encoding method and apparatus, storage medium, and electronic device - Google Patents

Video decoding method and apparatus, video encoding method and apparatus, storage medium, and electronic device

Info

Publication number
WO2021057684A1
Authority
WO
WIPO (PCT)
Prior art keywords
area
resolution
region
identification value
decoding
Prior art date
Application number
PCT/CN2020/116642
Other languages
English (en)
French (fr)
Inventor
高欣玮
李蔚然
谷沉沉
Original Assignee
Tencent Technology (Shenzhen) Company Limited (腾讯科技(深圳)有限公司)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology (Shenzhen) Company Limited
Publication of WO2021057684A1 publication Critical patent/WO2021057684A1/zh
Priority to US17/469,729 priority Critical patent/US11968379B2/en

Links

Images

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N 21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N 21/44 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N 21/4402 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/30 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/70 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N 19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N 19/119 Adaptive subdivision aspects, e.g. subdivision of a picture into rectangular or non-rectangular coding blocks
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N 19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N 19/132 Sampling, masking or truncation of coding units, e.g. adaptive resampling, frame skipping, frame interpolation or high-frequency transform coefficient masking
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N 19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N 19/167 Position within a video image, e.g. region of interest [ROI]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N 19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N 19/17 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N 19/174 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a slice, e.g. a line of blocks or a group of blocks
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N 19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N 19/17 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N 19/176 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/90 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using coding techniques not provided for in groups H04N19/10-H04N19/85, e.g. fractals
    • H04N 19/96 Tree coding, e.g. quad-tree coding
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N 21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N 21/234 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • H04N 21/2343 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N 21/83 Generation or processing of protective or descriptive data associated with content; Content structuring
    • H04N 21/845 Structuring of content, e.g. decomposing content into time segments
    • H04N 21/8456 Structuring of content, e.g. decomposing content into time segments by decomposing the content in the time domain, e.g. in time segments

Definitions

  • This application relates to the computer field, and in particular, to a video decoding method and device, a video encoding method and device, a storage medium, and an electronic device.
  • If all the blocks in a frame of a video are encoded at a high resolution, then when the transmission bandwidth is relatively small (for example, smaller than the bandwidth threshold Th shown in FIG. 1), the peak signal-to-noise ratio PSNR1 obtained when the blocks in the frame are encoded at the high resolution is lower than the peak signal-to-noise ratio PSNR2 obtained when they are encoded at a low resolution; that is to say, when the transmission bandwidth is small, the peak signal-to-noise ratio PSNR1 of high-resolution encoding is relatively small and the distortion is relatively large.
  • Similarly, if all the blocks in a frame are encoded at a low resolution, then when the transmission bandwidth is relatively large, the corresponding peak signal-to-noise ratio PSNR3 is lower than the peak signal-to-noise ratio PSNR4 obtained when the blocks in the frame are encoded at a high resolution; that is, when the transmission bandwidth is large, the peak signal-to-noise ratio PSNR3 of low-resolution encoding is relatively small and the distortion is relatively large.
  • Moreover, for different types of videos, different frames of the same video, or different blocks of the same frame, the intersection point D shown in FIG. 1 moves, which increases the difficulty, in the prior art, of choosing which resolution to use for encoding the frames of a video.
  • a video decoding method including:
  • the multiple resolutions include at least two different resolutions
  • the resolution corresponding to each area is used to decode each area of the plurality of areas.
  • a video coding method including:
  • a syntax element is added to the encoded data corresponding to each area according to the resolution corresponding to each area, where the syntax element is used to indicate the resolution adopted for encoding each area.
  • a video decoding device includes:
  • the first acquisition module is configured to acquire a to-be-decoded video frame, wherein the to-be-decoded video frame is divided into multiple regions;
  • the second acquiring module is used to acquire the syntax element carried in the data to be decoded corresponding to each of the multiple regions, wherein the syntax element is used to indicate the resolution used to decode each region, and the multiple resolutions used for decoding the multiple regions include at least two different resolutions;
  • the decoding module is configured to decode each of the multiple regions by using the resolution corresponding to each region.
  • the first acquisition module is used for one of the following:
  • the multiple regions are multiple video blocks obtained by dividing the to-be-decoded video frame based on a predetermined video codec standard
  • the multiple regions are regions obtained by dividing the to-be-decoded video frame in response to the received input region dividing instruction;
  • the multiple areas are multiple tile areas.
  • a video encoding device includes:
  • the third acquisition module is configured to acquire a video frame to be encoded, wherein the video frame to be encoded is divided into multiple regions;
  • the encoding module is configured to encode each of the multiple regions with the corresponding one of multiple resolutions to obtain encoded data corresponding to each region, wherein the multiple resolutions include at least two different resolutions;
  • the adding module is configured to add a syntax element to the encoded data corresponding to each area according to the resolution corresponding to each area, wherein the syntax element is used to indicate the resolution adopted for encoding each area.
  • the fifth determining unit includes:
  • the first determining subunit is configured to determine that the identification value corresponding to each area is a first identification value when the resolution corresponding to the area is the same as the resolution corresponding to the previous area of the area;
  • the second determining subunit is configured to determine that the identification value corresponding to each area is a second identification value when the resolution corresponding to the area is different from the resolution corresponding to the previous area of the area.
  • One or more non-volatile computer-readable storage media storing computer-readable instructions are provided; when executed by one or more processors, the computer-readable instructions cause the one or more processors to perform the following steps:
  • the multiple resolutions include at least two different resolutions
  • the resolution corresponding to each area is used to decode each area of the plurality of areas.
  • An electronic device comprising a memory and one or more processors is provided, wherein computer-readable instructions are stored in the memory; when executed, the computer-readable instructions cause the one or more processors to perform the following steps:
  • the multiple resolutions include at least two different resolutions
  • the resolution corresponding to each area is used to decode each area of the plurality of areas.
  • One or more non-volatile computer-readable storage media storing computer-readable instructions are provided; when executed by one or more processors, the computer-readable instructions cause the one or more processors to perform the following steps:
  • a syntax element is added to the encoded data corresponding to each area according to the resolution corresponding to each area, where the syntax element is used to indicate the resolution adopted for encoding each area.
  • An electronic device comprising a memory and one or more processors is provided, wherein computer-readable instructions are stored in the memory; when executed, the computer-readable instructions cause the one or more processors to perform the following steps:
  • a syntax element is added to the encoded data corresponding to each area according to the resolution corresponding to each area, where the syntax element is used to indicate the resolution adopted for encoding each area.
  • Figure 1 is a schematic diagram of the peak signal-to-noise ratio of the encoding and decoding methods in the related art
  • Fig. 2 is a schematic diagram of an optional video decoding method according to an embodiment of the present application.
  • Fig. 3 is a schematic diagram of an application environment of an optional video decoding method according to an embodiment of the present application
  • Fig. 4 is a schematic diagram of an optional video decoding method according to an optional implementation manner of the present application.
  • Fig. 5 is a schematic diagram of another optional video decoding method according to an optional implementation manner of the present application.
  • FIG. 6 is a schematic diagram of yet another optional video decoding method according to an optional implementation manner of the present application.
  • FIG. 7 is a schematic diagram of still another optional video decoding method according to an optional implementation manner of the present application.
  • Fig. 8 is a schematic diagram of an optional video encoding method according to an embodiment of the present application.
  • Fig. 9 is a schematic diagram of an application environment of an optional video encoding method according to an embodiment of the present application.
  • FIG. 10 is a schematic diagram of an optional video encoding method according to an optional implementation manner of the present application.
  • Fig. 11 is a schematic diagram of an optional video decoding device according to an embodiment of the present application.
  • Fig. 12 is a schematic diagram of an optional video encoding device according to an embodiment of the present application.
  • FIG. 13 is a schematic diagram of an application scenario of an optional video encoding and decoding method according to an embodiment of the present application.
  • FIG. 14 is a schematic diagram of an application scenario of another optional video encoding and decoding method according to an embodiment of the present application.
  • Fig. 15 is a schematic diagram of an optional electronic device according to an embodiment of the present application.
  • A video decoding method is provided; as shown in FIG. 2, the method includes:
  • the above-mentioned video decoding method may be applied to the hardware environment formed by the server 302 and the client 304 as shown in FIG. 3.
  • the server 302 obtains the to-be-decoded video frame, where the to-be-decoded video frame is divided into multiple regions; obtains the syntax element carried in the to-be-decoded data corresponding to each of the multiple regions, where the syntax element is used to indicate the resolution used to decode each region, and the multiple resolutions used to decode the multiple regions include at least two different resolutions; and decodes each of the multiple regions using the resolution corresponding to that region.
  • the server 302 sends the decoded video to the client 304 for playing.
  • the above-mentioned video decoding method may be, but is not limited to being, applied to audio and video processing scenarios.
  • for example, client A and client B make a video call: each of client A and client B captures video pictures, encodes the captured pictures, and sends the encoded video to the other party, and the receiving party decodes the received video and plays the decoded video.
  • the above-mentioned video decoding method can also be, but is not limited to being, applied to scenarios such as video file playback and live video streaming.
  • the aforementioned client can be, but is not limited to, various types of applications, such as online education applications, instant messaging applications, community space applications, game applications, shopping applications, browser applications, financial applications, multimedia applications, and live streaming applications.
  • specifically, the method can be, but is not limited to being, applied to scenarios in which audio and video are processed in the above instant messaging application or in the above multimedia application, so as to avoid large fluctuations of the peak signal-to-noise ratio when the video is encoded and decoded.
  • the above is merely an example, and this embodiment does not impose any limitation on this.
  • different regions in the to-be-decoded video frame are obtained by encoding with different resolutions.
  • the video frame to be decoded is divided into 4 areas, namely area 1, area 2, area 3 and area 4.
  • area 1 is coded with resolution 1
  • area 2 and area 3 are coded with resolution 2.
  • area 4 is coded with resolution 3.
  • the encoding information is indicated by the syntax elements carried in the data to be decoded.
  • the decoder obtains the different resolutions used in different regions by obtaining the syntax elements, and then decodes them using the resolutions corresponding to each region.
  • the multiple regions included in the video frame to be decoded are decoded using at least two different resolutions.
  • the syntax element used to indicate the resolution used to decode each region may be a piece of data located at a fixed position of the video frame to be decoded, where different data values at that position represent different resolutions.
  • the syntax element can be obtained by looking up that position in the video frame to be decoded, so as to determine the different resolutions of the regions.
  • a video frame to be decoded is obtained, where the video frame to be decoded is divided into multiple regions, including: region 1, region 2, region 3, and region 4.
  • the syntax elements carried in the data to be decoded corresponding to each of the multiple regions are obtained, where the syntax element corresponding to region 1 indicates that the resolution used for decoding region 1 is resolution 1, the syntax element corresponding to region 2 indicates that the resolution used for decoding region 2 is resolution 2, the syntax element corresponding to region 3 indicates that the resolution used for decoding region 3 is resolution 2, and the syntax element corresponding to region 4 indicates that the resolution used for decoding region 4 is resolution 3; resolution 1 is used to decode region 1, resolution 2 is used to decode regions 2 and 3, and resolution 3 is used to decode region 4.
  • obtaining the syntax element carried in the to-be-decoded data corresponding to each of the multiple regions includes:
  • the identification value corresponding to each region is acquired, where the identification value corresponding to each region is used to indicate the resolution used for decoding each region.
  • identification values can be directly used in the syntax element to indicate different resolutions.
  • resolution 1 is represented by 00
  • resolution 2 is represented by 01
  • resolution 3 is represented by 10
  • resolution 4 is represented by 11. It should be noted that this is not the only way in which identification values may represent resolutions; any identification-value scheme capable of distinguishing the resolutions can be used to indicate the different resolutions adopted in different regions.
  • a video frame to be decoded is obtained, where the video frame to be decoded is divided into multiple regions, including: region 1, region 2, region 3, and region 4.
  • the identification value corresponding to each of the multiple regions is obtained, where the identification value corresponding to region 1 is 00, so it can be determined that the resolution used for decoding region 1 is resolution 1; the identification value corresponding to region 2 is 01, so the resolution used for decoding region 2 is resolution 2; the identification value corresponding to region 3 is 01, so the resolution used for decoding region 3 is resolution 2; and the identification value corresponding to region 4 is 10, so the resolution used for decoding region 4 is resolution 3. Resolution 1 is used to decode region 1, resolution 2 is used to decode regions 2 and 3, and resolution 3 is used to decode region 4.
  • obtaining the syntax element carried in the to-be-decoded data corresponding to each of the multiple regions includes:
  • the identification value corresponding to each area is obtained, where the identification value corresponding to each area is used to indicate the relationship between the resolution used to decode that area and the resolution used to decode the previous area of that area; the resolution used for decoding each area is then determined according to the identification value and the resolution used for decoding the previous area of that area.
  • the identification value corresponding to the current area may be used to indicate the relationship between the resolution of the current area and the resolution of the previous area of the current area. Then the resolution of the current area is determined according to the relationship and the resolution of the previous area.
  • determining the resolution used for decoding each region according to the identification value and the resolution used for decoding the previous region of each region further includes:
  • in the case where the identification value corresponding to each area is the first identification value, it is determined that the resolution used for decoding the area is the resolution corresponding to the previous area of the area; in the case where the identification value corresponding to the area is the second identification value, it is determined that the resolution used for decoding the area is a resolution different from the resolution corresponding to the previous area of the area.
  • the relationship between resolutions may include, but is not limited to: the same resolution or different resolutions.
  • the first identification value is used to indicate that the resolution is the same, and the second identification value is used to indicate that the resolution is different.
  • for the first area of each frame, the identification value can be used to directly indicate the resolution adopted by that area, or the identification value can be used to indicate the relationship between its resolution and the resolution of the last area of the previous frame.
  • 0 is used to indicate the same resolution
  • 1 is used to indicate different resolutions.
  • the resolutions used for decoding include two resolutions, resolution A and resolution B. The video frame to be decoded is obtained, where the video frame to be decoded is divided into multiple regions, including region 1, region 2, region 3, and region 4, and the identification value corresponding to each of the multiple regions is obtained. The identification value corresponding to region 1 is 0, and it can be learned that the resolution used for the last region of the previous frame is resolution A, so it can be determined that the resolution used for decoding region 1 is resolution A; the identification value corresponding to region 2 is 1, so the resolution used for decoding region 2 is resolution B; the identification value corresponding to region 3 is 0, so the resolution used for decoding region 3 is resolution B; and the identification value corresponding to region 4 is 0, so the resolution used for decoding region 4 is resolution B.
  • the division of the to-be-decoded video frame into multiple regions includes one of the following:
  • the multiple regions are multiple video blocks obtained by dividing the to-be-decoded video frame based on a predetermined video codec standard
  • the multiple regions are regions obtained by dividing the to-be-decoded video frame in response to the acquired input region dividing instruction;
  • the multiple areas are multiple Tile (tile) areas.
  • the multiple regions can be, but are not limited to being, obtained using various division methods, for example, the video-block division methods of a standard protocol (binary tree, ternary tree, quad tree, and so on), with each video block forming one region.
  • alternatively, the division can be indicated by an input region-dividing instruction; for example, as shown in FIG. 7, the smaller video window during a video call is divided into one area, referred to as area 1, and the larger video window, that is, the part other than the smaller video window, is divided into another area, referred to as area 2.
  • the area division may also follow another division standard, for example, different tile areas obtained by the tile-area division method.
  • A video encoding method is provided; as shown in FIG. 8, the method includes:
  • S806 Add a syntax element to the encoded data corresponding to each area according to the resolution corresponding to each area, where the syntax element is used to indicate the resolution adopted for encoding each area.
  • the foregoing video encoding method may be applied to the hardware environment constituted by the server 902, the server 302, the client 904, and the client 304 as shown in FIG. 9.
  • the server 902 obtains the to-be-encoded video frame collected by the client 904, where the to-be-encoded video frame is divided into multiple regions; encodes each of the multiple regions with the corresponding one of multiple resolutions to obtain the encoded data corresponding to each region, wherein the multiple resolutions include at least two different resolutions; and adds a syntax element to the encoded data corresponding to each region according to the resolution corresponding to that region, where the syntax element is used to indicate the resolution used to encode the region.
  • the server 902 sends the encoded video to the server 302 for decoding.
  • the server 302 sends the decoded video to the client 304 for playing.
  • the above-mentioned video encoding method may be, but is not limited to being, applied to audio and video processing scenarios.
  • for example, client A and client B make a video call: each of client A and client B captures video pictures, encodes the captured pictures, and sends the encoded video to the other party, and the receiving party decodes the received video and plays the decoded video.
  • the above-mentioned video encoding method can also be, but is not limited to being, applied to scenarios such as video file playback and live video streaming.
  • the aforementioned client can be, but is not limited to, various types of applications, such as online education applications, instant messaging applications, community space applications, game applications, shopping applications, browser applications, financial applications, multimedia applications, and live streaming applications.
  • specifically, the method can be, but is not limited to being, applied to scenarios in which audio and video are processed in the above instant messaging application or in the above multimedia application, so as to avoid large fluctuations of the peak signal-to-noise ratio when the video is encoded and decoded.
  • the above is merely an example, and this embodiment does not impose any limitation on this.
  • different regions in the to-be-encoded video frame are encoded with different resolutions.
  • the video frame to be encoded is divided into 4 areas, namely area 1, area 2, area 3, and area 4.
  • area 1 is encoded with resolution 1, and a syntax element representing resolution 1 is added for area 1;
  • area 2 and area 3 are encoded with resolution 2, and syntax elements representing resolution 2 are added for area 2 and area 3;
  • area 4 is encoded with resolution 3, and a syntax element representing resolution 3 is added for area 4.
  • At least two different resolutions are used to encode multiple regions included in the video frame to be encoded.
  • the syntax element used to indicate the resolution used to encode each region may be a piece of data located at a fixed position of the video frame to be decoded, where different data values at that position represent different resolutions.
  • a syntax element representing the resolution corresponding to the region can be added at this position.
  • a video frame to be encoded is obtained, where the video frame to be encoded is divided into multiple regions, including: region 1, region 2, region 3, and region 4.
  • Resolution 1 encodes region 1
  • resolution 2 is used to encode regions 2 and 3
  • resolution 3 is used to encode region 4.
  • syntax element 1 representing resolution 1 is added to the encoded data corresponding to area 1, syntax element 2 representing resolution 2 is added to the encoded data corresponding to area 2 and area 3, and syntax element 3 representing resolution 3 is added to the encoded data corresponding to area 4.
  • adding a syntax element to the encoded data corresponding to each region according to the resolution corresponding to each region includes:
  • the identification value corresponding to each region is added as the syntax element to the coded data corresponding to each region.
  • identification values may be directly used in the syntax element to indicate different resolutions.
  • resolution 1 is represented by 00
  • resolution 2 is represented by 01
  • resolution 3 is represented by 10
  • resolution 4 is represented by 11. It should be noted that this is not the only way in which identification values may represent resolutions; any identification-value scheme capable of distinguishing the resolutions can be used to indicate the different resolutions adopted in different regions.
  • adding a syntax element to the encoded data corresponding to each region according to the resolution corresponding to each region includes:
  • the identification value corresponding to each region is determined according to the relationship between the resolution corresponding to the region and the resolution corresponding to the previous region of the region; and the identification value corresponding to each region is added to the encoded data corresponding to the region as the syntax element.
  • the identification value corresponding to the current area may be used to indicate the relationship between the resolution of the current area and the resolution of the previous area of the current area.
  • the identification value corresponding to each area is determined according to the relationship between the resolution of the current area and the resolution of the previous area.
  • determining the identification value corresponding to each area according to the relationship between the resolution corresponding to each area and the resolution corresponding to the previous area of each area includes:
  • in the case where the resolution corresponding to each area is the same as the resolution corresponding to the previous area of the area, it is determined that the identification value corresponding to the area is the first identification value; in the case where the resolution corresponding to the area is different from the resolution corresponding to the previous area of the area, it is determined that the identification value corresponding to the area is the second identification value.
  • the relationship between resolutions may include, but is not limited to: the same resolution or different resolutions.
  • the first identification value is used to indicate that the resolution is the same, and the second identification value is used to indicate that the resolution is different.
  • for the first area of each frame, the identification value can be used to directly indicate the resolution adopted by that area, or the identification value can be used to indicate the relationship between its resolution and the resolution of the last area of the previous frame.
  • the device includes:
  • the first obtaining module 112 is configured to obtain a to-be-decoded video frame, where the to-be-decoded video frame is divided into multiple regions;
  • the second obtaining module 114 is configured to obtain a syntax element carried in the data to be decoded corresponding to each of the multiple regions, where the syntax element is used to indicate the resolution used to decode each of the regions,
  • the multiple resolutions used for decoding the multiple regions include at least two different resolutions;
  • the decoding module 116 is configured to use the resolution corresponding to each area to decode each area of the plurality of areas.
  • the second obtaining module 114 includes:
  • the first acquiring unit is configured to acquire the identification value corresponding to each region, wherein the identification value corresponding to each region is used to indicate the resolution adopted for decoding the region.
  • the second obtaining module 114 includes:
  • the second acquiring unit is configured to acquire the identification value corresponding to each area, wherein the identification value corresponding to each area is used to indicate the resolution used for decoding each area and the decoding of each area. The relationship between the resolutions used in the previous area of each area;
  • the first determining unit is configured to determine the resolution used for decoding each area according to the identification value and the resolution used for decoding the previous area of each area.
  • the second obtaining module 114 includes:
  • the second determining unit is configured to determine, when it is determined that the identification value corresponding to each area is the first identification value, that the resolution used for decoding the area is the resolution corresponding to the previous area of the area;
  • the third determining unit is configured to determine, when it is determined that the identification value corresponding to each area is the second identification value, that the resolution used for decoding the area is a resolution different from the resolution corresponding to the previous area of the area.
  • the first obtaining module 112 is used for one of the following:
  • the multiple regions are multiple video blocks obtained by dividing the to-be-decoded video frame based on a predetermined video codec standard
  • the multiple regions are regions obtained by dividing the to-be-decoded video frame in response to the acquired input region dividing instruction;
  • the multiple areas are multiple tile areas.
  • the device includes:
  • the third acquiring module 122 is configured to acquire a video frame to be encoded, wherein the video frame to be encoded is divided into multiple regions;
  • the encoding module 124 is configured to encode each of the multiple regions with a resolution corresponding to the multiple resolutions to obtain encoded data corresponding to each of the regions, wherein the multiple resolutions include At least two different resolutions;
  • the adding module 126 is configured to add a syntax element to the encoded data corresponding to each area according to the resolution corresponding to each area, wherein the syntax element is used to indicate the resolution adopted for encoding each area .
  • the adding module 126 includes:
  • the fourth determining unit is configured to determine, among multiple identification values, the identification value corresponding to the resolution of each region, wherein different resolutions among the multiple resolutions correspond to different identification values among the multiple identification values;
  • the first adding unit is configured to add the identification value corresponding to each region as the syntax element to the coded data corresponding to each region.
  • the adding module 126 includes:
  • a fifth determining unit configured to determine the identification value corresponding to each area according to the relationship between the resolution corresponding to each area and the resolution corresponding to the previous area of each area;
  • the second adding unit is configured to add the identification value corresponding to each area as the syntax element to the coded data corresponding to each area.
  • the fifth determining unit includes:
  • the first determining subunit is configured to determine that the identification value corresponding to each area is a first identification value when the resolution corresponding to the area is the same as the resolution corresponding to the previous area of the area;
  • the second determining subunit is configured to determine that the identification value corresponding to each area is a second identification value when the resolution corresponding to the area is different from the resolution corresponding to the previous area of the area.
  • the application environment of the embodiment of the present application may, but is not limited to, refer to the application environment in the foregoing embodiment, which will not be repeated in this embodiment.
  • the embodiment of the present application provides an optional specific application example for implementing the above-mentioned video encoding and decoding method.
  • the foregoing video encoding and decoding method may be, but not limited to, applied to the scene of encoding and decoding a video as shown in FIG. 13.
  • the area in the t-th frame is divided into different Tile areas, as shown in FIG. 13, Tile1 area, Tile2 area, Tile3 area, and Tile4 area.
  • the division method in FIG. 13 is only an example, and the embodiment of the present application does not limit the number and shape of regions obtained by dividing a frame.
  • for each tile area, the cost of encoding it with each candidate resolution is calculated, and the resolution corresponding to the smallest cost is used as the resolution for that tile area.
  • for example, with resolution 1, resolution 2, and resolution 3 as the candidate resolutions for a tile area, if the cost corresponding to resolution 2 is the smallest for the Tile1 area, resolution 2 is used to encode the blocks in the Tile1 area.
  • in this way, resolution 2 is used to encode the blocks in the Tile1 area, resolution 1 is used to encode the blocks in the Tile2 area, resolution 1 is used to encode the blocks in the Tile3 area, and resolution 3 is used to encode the blocks in the Tile4 area.
  • a corresponding flag bit may be used in the encoding process to indicate the corresponding resolution; for example, for a block encoded using the high resolution, the corresponding flag bit is set to 0, and for a block encoded using the low resolution, the corresponding flag bit is set to 1.
  • this setting is only an example, and other flag-bit settings can also be used; for example, for a block encoded using the high resolution, the corresponding flag bit is set to 1, and for a block encoded using the low resolution, the corresponding flag bit is set to 0.
  • alternatively, the flag bit can indicate the relationship with the previous block: if the resolution of the current block is the same as that of the previous block when encoding, the flag bit corresponding to the current block is set to 0; if the resolution of the current block is different from that of the previous block when encoding, the flag bit corresponding to the current block is set to 1.
  • this setting is also just an example, and other flag-bit settings can be used: if the resolution of the current block is the same as that of the previous block, the flag bit corresponding to the current block is set to 1; if the resolution used for encoding the current block is different from that of the previous block, the flag bit corresponding to the current block is set to 0.
  • depending on the indication scheme adopted, the number of bits to be transmitted differs.
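  • As a rough, illustrative sketch of the per-tile selection and flag signalling just described (the cost function, identifier names, and data layout below are assumptions made for this example, not part of this application):

```python
# Sketch only: pick, for each tile, the candidate resolution with the smallest cost,
# then emit a flag bit per tile (0 = same resolution as the previous tile, 1 = different).
# As a simplification, the first tile is marked "different"; the text above also allows
# comparing it with the last region of the previous frame instead.

def choose_tile_resolutions(tiles, candidate_resolutions, cost_fn):
    """Return the chosen resolution and a relative flag bit for every tile."""
    chosen, flags, previous = [], [], None
    for tile in tiles:
        best = min(candidate_resolutions, key=lambda res: cost_fn(tile, res))
        chosen.append(best)
        flags.append(0 if best == previous else 1)
        previous = best
    return chosen, flags

# Toy usage with precomputed costs for each (tile, resolution) pair; any
# rate-distortion cost supplied by an encoder could be used instead.
costs = {("Tile1", 1): 9, ("Tile1", 2): 5, ("Tile1", 3): 7,
         ("Tile2", 1): 4, ("Tile2", 2): 6, ("Tile2", 3): 8,
         ("Tile3", 1): 3, ("Tile3", 2): 5, ("Tile3", 3): 6,
         ("Tile4", 1): 7, ("Tile4", 2): 6, ("Tile4", 3): 2}
print(choose_tile_resolutions(["Tile1", "Tile2", "Tile3", "Tile4"], [1, 2, 3],
                              lambda t, r: costs[(t, r)]))
# ([2, 1, 1, 3], [1, 1, 0, 1]) -- matches the Tile1..Tile4 example above
```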
  • an electronic device for implementing the foregoing video decoding method is further provided.
  • the electronic device includes: one or more (only one is shown in the figure) processor 1502, a memory 1504, a sensor 1506, an encoder 1508, and a transmission device 1510.
  • Computer readable instructions are stored in the memory, and the processor is configured to execute the steps in any of the foregoing method embodiments through the computer readable instructions.
  • the above-mentioned electronic device may be located in at least one network device among a plurality of network devices in a computer network.
  • the foregoing processor may be configured to execute the following steps through computer-readable instructions:
  • the multiple resolutions include at least two different resolutions
  • the resolution corresponding to each area is used to decode each area of the plurality of areas.
  • the structure shown in FIG. 15 is only for illustration, and the electronic device may also be a smartphone (such as an Android phone or an iOS phone), a tablet computer, a palmtop computer, a mobile Internet device (MID), a PAD, or another terminal device.
  • FIG. 15 does not limit the structure of the above-mentioned electronic device.
  • the electronic device may also include more or fewer components (such as a network interface, a display device, etc.) than shown in FIG. 15, or have a configuration different from that shown in FIG. 15.
  • the memory 1504 can be used to store computer-readable instructions and modules, such as computer-readable instructions/modules corresponding to the video decoding method and device in the embodiments of the present application.
  • the processor 1502 runs the computer-readable instructions and modules stored in the memory 1504 to perform various functional applications and data processing, that is, to implement the above-mentioned video decoding method.
  • the memory 1504 may include a high-speed random access memory, and may also include a non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory.
  • the memory 1504 may further include a memory remotely provided with respect to the processor 1502, and these remote memories may be connected to the terminal through a network. Examples of the aforementioned networks include, but are not limited to, the Internet, corporate intranets, local area networks, mobile communication networks, and combinations thereof.
  • the aforementioned transmission device 1510 is used to receive or send data via a network.
  • the foregoing specific examples of the network may include wired networks and wireless networks.
  • the transmission device 1510 includes a network adapter (Network Interface Controller, NIC), which can be connected to other network devices and routers via a network cable so as to communicate with the Internet or a local area network.
  • the transmission device 1510 is a radio frequency (RF) module, which is used to communicate with the Internet in a wireless manner.
  • the memory 1504 is used to store application programs.
  • the embodiment of the present application also provides a storage medium in which computer-readable instructions are stored, wherein the computer-readable instructions are configured to execute the steps in any one of the foregoing method embodiments when running.
  • the aforementioned storage medium may be configured to store computer-readable instructions for executing the following steps:
  • the multiple resolutions include at least two different resolutions
  • the resolution corresponding to each area is used to decode each area of the plurality of areas.
  • the storage medium is further configured to store computer-readable instructions for executing the steps included in the method in the foregoing embodiment, which will not be repeated in this embodiment.
  • the computer-readable instructions may be stored in a computer-readable storage medium, which may include: a flash disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or the like.
  • the integrated unit in the foregoing embodiment is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in the foregoing computer-readable storage medium.
  • the technical solutions of the present application, in essence, or the part contributing to the existing technology, or all or part of the technical solutions, can be embodied in the form of a software product; the computer software product is stored in a storage medium and includes several computer-readable instructions that enable one or more computer devices (which may be personal computers, servers, network devices, or the like) to perform all or part of the steps of the methods described in the various embodiments of the present application.
  • the disclosed client can be implemented in other ways.
  • the device embodiments described above are only illustrative.
  • the division of the units is only a logical function division, and there may be other division methods in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, units or modules, and may be in electrical or other forms.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
  • the functional units in the various embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
  • the above-mentioned integrated unit can be implemented in the form of hardware or software functional unit.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

A video decoding method, comprising: obtaining a video frame to be decoded, wherein the video frame to be decoded is divided into a plurality of regions; obtaining a syntax element carried in the to-be-decoded data corresponding to each of the plurality of regions, wherein the syntax element is used to indicate the resolution used for decoding each region, and the multiple resolutions used for decoding the plurality of regions include at least two different resolutions; and decoding each of the plurality of regions using the resolution corresponding to that region.

Description

Video decoding method and apparatus, video encoding method and apparatus, storage medium, and electronic device
This application claims priority to Chinese Patent Application No. 201910927094.X, entitled "Video Decoding Method and Apparatus", filed with the Chinese Patent Office on September 27, 2019, the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the field of computers, and in particular, to a video decoding method and apparatus, a video encoding method and apparatus, a storage medium, and an electronic device.
Background
In existing video encoding processes, as shown in FIG. 1, if all the blocks in a frame of a video are encoded at a high resolution, then when the transmission bandwidth is relatively small (for example, smaller than the bandwidth threshold Th shown in FIG. 1), the peak signal-to-noise ratio PSNR1 obtained when the blocks in the frame are encoded at the high resolution is lower than the peak signal-to-noise ratio PSNR2 obtained when they are encoded at a low resolution. In other words, when the transmission bandwidth is small, the peak signal-to-noise ratio PSNR1 of high-resolution encoding is relatively small and the distortion is relatively large.
Similarly, if all the blocks in a frame of a video are encoded at a low resolution, then when the transmission bandwidth is relatively large (for example, larger than the bandwidth threshold Th shown in FIG. 1), the peak signal-to-noise ratio PSNR3 obtained when the blocks in the frame are encoded at the low resolution is lower than the peak signal-to-noise ratio PSNR4 obtained when they are encoded at a high resolution. In other words, when the transmission bandwidth is large, the peak signal-to-noise ratio PSNR3 of low-resolution encoding is relatively small and the distortion is relatively large.
Moreover, for different types of videos, for different frames of the same video, or for different blocks of the same frame, the intersection point D shown in FIG. 1 moves, which makes it harder, in the prior art, to choose which resolution to use for encoding the frames of a video.
No effective solution to the above problems has yet been proposed.
Summary
A video decoding method, comprising:
obtaining a video frame to be decoded, wherein the video frame to be decoded is divided into a plurality of regions;
obtaining a syntax element carried in the to-be-decoded data corresponding to each of the plurality of regions, wherein the syntax element is used to indicate the resolution used for decoding each region, and the multiple resolutions used for decoding the plurality of regions include at least two different resolutions; and
decoding each of the plurality of regions using the resolution corresponding to that region.
A video encoding method, comprising:
obtaining a video frame to be encoded, wherein the video frame to be encoded is divided into a plurality of regions;
encoding each of the plurality of regions with the corresponding one of multiple resolutions to obtain encoded data corresponding to each region, wherein the multiple resolutions include at least two different resolutions; and
adding a syntax element to the encoded data corresponding to each region according to the resolution corresponding to that region, wherein the syntax element is used to indicate the resolution used for encoding the region.
A video decoding apparatus, comprising:
a first obtaining module, configured to obtain a video frame to be decoded, wherein the video frame to be decoded is divided into a plurality of regions;
a second obtaining module, configured to obtain a syntax element carried in the to-be-decoded data corresponding to each of the plurality of regions, wherein the syntax element is used to indicate the resolution used for decoding each region, and the multiple resolutions used for decoding the plurality of regions include at least two different resolutions; and
a decoding module, configured to decode each of the plurality of regions using the resolution corresponding to that region.
Optionally, the first obtaining module is used for one of the following:
the plurality of regions are multiple video blocks obtained by dividing the video frame to be decoded based on a predetermined video codec standard;
the plurality of regions are regions obtained by dividing the video frame to be decoded in response to an obtained input region-dividing instruction; and
the plurality of regions are multiple tile regions.
A video encoding apparatus, comprising:
a third obtaining module, configured to obtain a video frame to be encoded, wherein the video frame to be encoded is divided into a plurality of regions;
an encoding module, configured to encode each of the plurality of regions with the corresponding one of multiple resolutions to obtain encoded data corresponding to each region, wherein the multiple resolutions include at least two different resolutions; and
an adding module, configured to add a syntax element to the encoded data corresponding to each region according to the resolution corresponding to that region, wherein the syntax element is used to indicate the resolution used for encoding the region.
Optionally, the fifth determining unit includes:
a first determining subunit, configured to determine that the identification value corresponding to each region is a first identification value when the resolution corresponding to the region is the same as the resolution corresponding to the previous region of the region; and
a second determining subunit, configured to determine that the identification value corresponding to each region is a second identification value when the resolution corresponding to the region is different from the resolution corresponding to the previous region of the region.
One or more non-volatile computer-readable storage media storing computer-readable instructions, wherein the computer-readable instructions, when executed by one or more processors, cause the one or more processors to perform the following steps:
obtaining a video frame to be decoded, wherein the video frame to be decoded is divided into a plurality of regions;
obtaining a syntax element carried in the to-be-decoded data corresponding to each of the plurality of regions, wherein the syntax element is used to indicate the resolution used for decoding each region, and the multiple resolutions used for decoding the plurality of regions include at least two different resolutions; and
decoding each of the plurality of regions using the resolution corresponding to that region.
An electronic device, comprising a memory and one or more processors, wherein the memory stores computer-readable instructions that, when executed by the processors, cause the one or more processors to perform the following steps:
obtaining a video frame to be decoded, wherein the video frame to be decoded is divided into a plurality of regions;
obtaining a syntax element carried in the to-be-decoded data corresponding to each of the plurality of regions, wherein the syntax element is used to indicate the resolution used for decoding each region, and the multiple resolutions used for decoding the plurality of regions include at least two different resolutions; and
decoding each of the plurality of regions using the resolution corresponding to that region.
One or more non-volatile computer-readable storage media storing computer-readable instructions, wherein the computer-readable instructions, when executed by one or more processors, cause the one or more processors to perform the following steps:
obtaining a video frame to be encoded, wherein the video frame to be encoded is divided into a plurality of regions;
encoding each of the plurality of regions with the corresponding one of multiple resolutions to obtain encoded data corresponding to each region, wherein the multiple resolutions include at least two different resolutions; and
adding a syntax element to the encoded data corresponding to each region according to the resolution corresponding to that region, wherein the syntax element is used to indicate the resolution used for encoding the region.
An electronic device, comprising a memory and one or more processors, wherein the memory stores computer-readable instructions that, when executed by the processors, cause the one or more processors to perform the following steps:
obtaining a video frame to be encoded, wherein the video frame to be encoded is divided into a plurality of regions;
encoding each of the plurality of regions with the corresponding one of multiple resolutions to obtain encoded data corresponding to each region, wherein the multiple resolutions include at least two different resolutions; and
adding a syntax element to the encoded data corresponding to each region according to the resolution corresponding to that region, wherein the syntax element is used to indicate the resolution used for encoding the region.
Brief Description of the Drawings
In order to describe the technical solutions in the embodiments of this application more clearly, the accompanying drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of this application, and a person of ordinary skill in the art can obtain other drawings based on these drawings without creative effort.
FIG. 1 is a schematic diagram of the peak signal-to-noise ratio of encoding and decoding methods in the related art;
FIG. 2 is a schematic diagram of an optional video decoding method according to an embodiment of this application;
FIG. 3 is a schematic diagram of an application environment of an optional video decoding method according to an embodiment of this application;
FIG. 4 is a schematic diagram of an optional video decoding method according to an optional implementation of this application;
FIG. 5 is a schematic diagram of another optional video decoding method according to an optional implementation of this application;
FIG. 6 is a schematic diagram of yet another optional video decoding method according to an optional implementation of this application;
FIG. 7 is a schematic diagram of still another optional video decoding method according to an optional implementation of this application;
FIG. 8 is a schematic diagram of an optional video encoding method according to an embodiment of this application;
FIG. 9 is a schematic diagram of an application environment of an optional video encoding method according to an embodiment of this application;
FIG. 10 is a schematic diagram of an optional video encoding method according to an optional implementation of this application;
FIG. 11 is a schematic diagram of an optional video decoding apparatus according to an embodiment of this application;
FIG. 12 is a schematic diagram of an optional video encoding apparatus according to an embodiment of this application;
FIG. 13 is a schematic diagram of an application scenario of an optional video encoding and decoding method according to an embodiment of this application;
FIG. 14 is a schematic diagram of an application scenario of another optional video encoding and decoding method according to an embodiment of this application; and
FIG. 15 is a schematic diagram of an optional electronic device according to an embodiment of this application.
Detailed Description
To enable a person skilled in the art to better understand the solutions of this application, the technical solutions in the embodiments of this application are described clearly and completely below with reference to the accompanying drawings in the embodiments of this application. Obviously, the described embodiments are only some rather than all of the embodiments of this application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of this application without creative effort shall fall within the protection scope of this application.
It should be noted that the terms "first", "second", and the like in the specification, the claims, and the above drawings of this application are used to distinguish similar objects and are not necessarily used to describe a particular order or sequence. It should be understood that data used in this way are interchangeable where appropriate, so that the embodiments of this application described herein can be implemented in an order other than those illustrated or described herein. In addition, the terms "include" and "have" and any variants thereof are intended to cover non-exclusive inclusion; for example, a process, method, system, product, or device that includes a series of steps or units is not necessarily limited to those expressly listed steps or units, but may include other steps or units not expressly listed or inherent to the process, method, product, or device.
According to one aspect of the embodiments of this application, a video decoding method is provided. As shown in FIG. 2, the method includes:
S202: Obtain a video frame to be decoded, wherein the video frame to be decoded is divided into a plurality of regions.
S204: Obtain a syntax element carried in the to-be-decoded data corresponding to each of the plurality of regions, wherein the syntax element is used to indicate the resolution used for decoding each region, and the multiple resolutions used for decoding the plurality of regions include at least two different resolutions.
S206: Decode each of the plurality of regions using the resolution corresponding to that region.
Optionally, in this embodiment, the above video decoding method may be applied to the hardware environment formed by the server 302 and the client 304 shown in FIG. 3. As shown in FIG. 3, the server 302 obtains a video frame to be decoded, wherein the video frame to be decoded is divided into a plurality of regions; obtains a syntax element carried in the to-be-decoded data corresponding to each of the plurality of regions, wherein the syntax element is used to indicate the resolution used for decoding each region, and the multiple resolutions used for decoding the plurality of regions include at least two different resolutions; and decodes each of the plurality of regions using the resolution corresponding to that region. The server 302 sends the decoded video to the client 304 for playback.
Optionally, in this embodiment, the above video decoding method may be, but is not limited to being, applied to audio and video processing scenarios. For example, client A and client B make a video call: client A and client B each capture video pictures, encode the captured pictures, and send the encoded video to the other party; the receiving party decodes the received video and plays the decoded video.
Optionally, in this embodiment, the above video decoding method may also be, but is not limited to being, applied to scenarios such as video file playback and live video streaming.
The above client may be, but is not limited to, various types of applications, for example, an online education application, an instant messaging application, a community space application, a game application, a shopping application, a browser application, a financial application, a multimedia application, a live streaming application, and the like. Specifically, the method may be, but is not limited to being, applied to scenarios in which audio and video are processed in the above instant messaging application or in the above multimedia application, so as to avoid large fluctuations of the peak signal-to-noise ratio when the video is encoded and decoded. The above is merely an example, and this embodiment imposes no limitation thereon.
Optionally, in this embodiment, different regions in the video frame to be decoded are obtained by encoding with different resolutions. For example, the video frame to be decoded is divided into four regions, namely region 1, region 2, region 3, and region 4, where region 1 is encoded with resolution 1, region 2 and region 3 are encoded with resolution 2, and region 4 is encoded with resolution 3. This encoding information is indicated by the syntax elements carried in the to-be-decoded data. By obtaining the syntax elements, the decoder learns the different resolutions used for the different regions and decodes each region with its corresponding resolution.
Optionally, in this embodiment, the plurality of regions included in the video frame to be decoded are decoded using at least two different resolutions.
Optionally, in this embodiment, the syntax element used to indicate the resolution used for decoding each region may be a piece of data located at a fixed position of the video frame to be decoded, where different data values at that position represent different resolutions. The syntax element can be obtained by looking up that position in the video frame to be decoded, so as to determine the different resolutions of the regions.
In an optional implementation, as shown in FIG. 4, a video frame to be decoded is obtained, wherein the video frame to be decoded is divided into a plurality of regions, including region 1, region 2, region 3, and region 4. The syntax element carried in the to-be-decoded data corresponding to each region is obtained, where the syntax element corresponding to region 1 indicates that the resolution used for decoding region 1 is resolution 1, the syntax element corresponding to region 2 indicates that the resolution used for decoding region 2 is resolution 2, the syntax element corresponding to region 3 indicates that the resolution used for decoding region 3 is resolution 2, and the syntax element corresponding to region 4 indicates that the resolution used for decoding region 4 is resolution 3. Region 1 is decoded with resolution 1, region 2 and region 3 are decoded with resolution 2, and region 4 is decoded with resolution 3.
It can be seen that, through the above steps, different blocks in a frame of a video are adaptively encoded with corresponding resolutions. In this way, whether the transmission bandwidth is relatively small or relatively large, the corresponding peak signal-to-noise ratio is relatively large and the distortion is relatively small, so that the peak signal-to-noise ratio varies only within a small range while remaining relatively large. This achieves the technical effect of avoiding large fluctuations of the peak signal-to-noise ratio when the video is encoded and decoded, thereby solving the technical problem in the related art that encoding and decoding a video with a single resolution leads to large fluctuations of the peak signal-to-noise ratio.
As an optional solution, obtaining the syntax element carried in the to-be-decoded data corresponding to each of the plurality of regions includes:
obtaining the identification value corresponding to each region, wherein the identification value corresponding to each region is used to indicate the resolution used for decoding the region.
Optionally, in this embodiment, different identification values may be used directly in the syntax element to represent different resolutions. For example, resolution 1 is represented by 00, resolution 2 by 01, resolution 3 by 10, and resolution 4 by 11. It should be noted that this is not the only way in which identification values may represent resolutions; any identification-value scheme capable of distinguishing the resolutions may be used to indicate the different resolutions used for different regions.
In an optional implementation, as shown in FIG. 5, a video frame to be decoded is obtained, wherein the video frame to be decoded is divided into a plurality of regions, including region 1, region 2, region 3, and region 4, and the identification value corresponding to each region is obtained. The identification value corresponding to region 1 is 00, so it can be determined that the resolution used for decoding region 1 is resolution 1; the identification value corresponding to region 2 is 01, so the resolution used for decoding region 2 is resolution 2; the identification value corresponding to region 3 is 01, so the resolution used for decoding region 3 is resolution 2; and the identification value corresponding to region 4 is 10, so the resolution used for decoding region 4 is resolution 3. Region 1 is decoded with resolution 1, region 2 and region 3 are decoded with resolution 2, and region 4 is decoded with resolution 3.
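The following is a minimal illustrative sketch of the decoder-side mapping in the example above; the function and identifier names are hypothetical and are not syntax defined by this application.

```python
# Illustrative only: mirrors the example above (00 -> resolution 1, 01 -> resolution 2,
# 10 -> resolution 3, 11 -> resolution 4); not an actual bitstream parser.
ID_TO_RESOLUTION = {
    0b00: "resolution_1",
    0b01: "resolution_2",
    0b10: "resolution_3",
    0b11: "resolution_4",
}

def parse_region_resolutions(region_id_values):
    """Map the 2-bit identification value carried by each region's
    to-be-decoded data to the resolution used for decoding that region."""
    return [ID_TO_RESOLUTION[v] for v in region_id_values]

# Example from the text: region 1 -> 00, region 2 -> 01, region 3 -> 01, region 4 -> 10
print(parse_region_resolutions([0b00, 0b01, 0b01, 0b10]))
# ['resolution_1', 'resolution_2', 'resolution_2', 'resolution_3']
```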
As an optional solution, obtaining the syntax element carried in the to-be-decoded data corresponding to each of the plurality of regions includes:
obtaining the identification value corresponding to each region, wherein the identification value corresponding to each region is used to indicate the relationship between the resolution used for decoding the region and the resolution used for decoding the previous region of the region; and determining the resolution used for decoding each region according to the identification value and the resolution used for decoding the previous region of the region.
Optionally, in this embodiment, the identification value corresponding to the current region may be used to indicate the relationship between the resolution of the current region and the resolution of the previous region of the current region. The resolution of the current region is then determined according to this relationship and the resolution of the previous region.
As an optional solution, determining the resolution used for decoding each region according to the identification value and the resolution used for decoding the previous region of the region further includes:
when it is determined that the identification value corresponding to a region is the first identification value, determining that the resolution used for decoding the region is the resolution corresponding to the previous region of the region; and when it is determined that the identification value corresponding to a region is the second identification value, determining that the resolution used for decoding the region is a resolution different from the resolution corresponding to the previous region of the region.
Optionally, in this embodiment, the relationship between resolutions may include, but is not limited to, the resolutions being the same or the resolutions being different. The first identification value indicates that the resolutions are the same, and the second identification value indicates that they are different. For example, 1 may indicate that the resolutions are the same and 0 that they are different, or 0 may indicate that the resolutions are the same and 1 that they are different.
Optionally, in this embodiment, for the first region of each frame, the identification value may directly indicate the resolution used for that region, or the identification value may indicate the relationship between its resolution and the resolution of the last region of the previous frame.
In the above optional implementation, as shown in FIG. 6, 0 indicates that the resolutions are the same and 1 indicates that they are different, and two resolutions, resolution A and resolution B, are used for decoding. A video frame to be decoded is obtained, wherein the video frame to be decoded is divided into a plurality of regions, including region 1, region 2, region 3, and region 4, and the identification value corresponding to each region is obtained. The identification value corresponding to region 1 is 0, and it can be learned that the resolution used for the last region of the previous frame is resolution A, so it can be determined that the resolution used for decoding region 1 is resolution A; the identification value corresponding to region 2 is 1, so the resolution used for decoding region 2 is resolution B; the identification value corresponding to region 3 is 0, so the resolution used for decoding region 3 is resolution B; and the identification value corresponding to region 4 is 0, so the resolution used for decoding region 4 is resolution B.
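A similar sketch, under the same caveat that all names are illustrative assumptions, shows how a decoder could derive each region's resolution from the relative identification values of FIG. 6 when only two candidate resolutions, A and B, are in use.

```python
# Illustrative sketch of the relative scheme above: 0 = same resolution as the previous
# region, 1 = different. With only two candidate resolutions, "different" is unambiguous.
def resolve_resolutions(flags, prev_resolution, candidates=("A", "B")):
    """Derive each region's resolution from its flag and the resolution of the preceding
    region (for the first region, the last region of the previous frame)."""
    resolutions = []
    current = prev_resolution
    for flag in flags:
        if flag == 1:  # different from the previous region: switch to the other candidate
            current = candidates[1] if current == candidates[0] else candidates[0]
        resolutions.append(current)
    return resolutions

# Example from the text: flags 0, 1, 0, 0 with the previous frame's last region using A
print(resolve_resolutions([0, 1, 0, 0], prev_resolution="A"))
# ['A', 'B', 'B', 'B']
```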
作为一种可选的方案,所述待解码视频帧被划分为多个区域包括以下之一:
所述多个区域是基于预先确定的视频编解码标准对所述待解码视频帧进行划分得到的多个视频块;
所述多个区域是响应于获取到的输入的区域划分指令对所述待解码视频帧进行划分得到的区域;
所述多个区域是多个Tile(分片)区域。
可选地，在本实施例中，多个区域可以但不限于采用各种划分方式，例如：采用标准协议中视频块的划分方式，二叉树、三叉树、四叉树等等，每一个视频块为一个区域，或者，可以通过输入的区域划分指令指示区域的划分方式，比如：如图7所示，将视频通话过程中的较小的视频窗口划分为一个区域，作为区域1，较大的视频窗口或者说除较小的视频窗口之外的部分划分为一个区域，作为区域2。区域的划分方式还可以是其他的划分标准，比如：采用分片Tile区域的划分方式划分出的不同的Tile区域。
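作为其中一种划分方式的示意，下面的 Python 片段给出将一帧按均匀网格划分为若干 Tile 区域的简化实现。实际标准中的 Tile 划分由相应的参数集语法控制，此处的划分粒度与返回的数据结构均为本文假设。

```python
# 示意性代码：将一帧按均匀网格划分为多个 Tile 区域
# （实际标准中的 Tile 划分由参数集语法控制，这里仅为简化示例）

def split_into_tiles(width, height, tile_cols, tile_rows):
    tiles = []
    tile_w, tile_h = width // tile_cols, height // tile_rows
    for r in range(tile_rows):
        for c in range(tile_cols):
            tiles.append({
                "x": c * tile_w, "y": r * tile_h,   # 区域左上角位置
                "w": tile_w, "h": tile_h,           # 区域宽高
            })
    return tiles

# 将 1920x1080 的帧划分为 2x2 共 4 个 Tile 区域
print(split_into_tiles(1920, 1080, 2, 2))
```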
根据本申请实施例的另一个方面,提供了一种视频编码方法,如图8所示,该方法包括:
S802,获取待编码视频帧,其中,所述待编码视频帧被划分为多个区域;
S804,采用多个分辨率中对应的分辨率对所述多个区域中的每个区域进行编码,得到所述每个区域对应的编码数据,其中,所述多个分辨率包括至少两个不同的分辨率;
S806,根据所述每个区域对应的分辨率为所述每个区域对应的编码数据添加语法元素,其中,所述语法元素用于指示编码所述每个区域所采用的分辨率。
可选地,在本实施例中,上述视频编码方法可以应用于如图9所示的服务器902、服务器302、客户端904和客户端304所构成的硬件环境中。如图9所示,服务器902获取到客户端904采集的待编码视频帧,其中,待编码视频帧被划分为多个区域;采用多个分辨率中对应的分辨率对多个区域中的每个区域进行编码,得到每个区域对应的编码数据,其中,多个分辨率包括至少两个不同的分辨率;根据每个区域对应的分辨率为每个区域对应的编码数据添加语法元素,其中,语法元素用于指示编码每个区域所采用的分辨率。服务器902将编码后得到的视频发送给服务器302进行解码。服务器302将解码后的视频发送给客户端304进行播放。
可选地，在本实施例中，上述视频编码方法可以但不限于应用于音视频处理的场景中。比如：客户端A与客户端B进行视频通话，客户端A侧和客户端B侧分别采集视频画面，对采集到的视频画面进行编码，将编码后的视频发送给对方，由对方对接收到的视频进行解码，并播放解码后的视频。
可选地,在本实施例中,上述视频编码方法还可以但不限于应用于视频文件的播放、视频直播等场景中。
其中,上述客户端可以但不限于为各种类型的应用,例如,在线教育应用、即时通讯应用、社区空间应用、游戏应用、购物应用、浏览器应用、金融应用、多媒体应用、直播应用等。具体的,可以但不限于应用于在上述即时通讯应用中对音视频进行处理的场景中,或还可以但不限于应用于在上述多媒体应用中对音视频进行处理的场景中,以避免对视频进行编解码的峰值信噪比波动较大。上述仅是一种示例,本实施例中对此不做任何限定。
可选地,在本实施例中,待编码视频帧中的不同区域采用不同分辨率进行编码。比如:待编码视频帧被划分为4个区域,分别是区域1、区域2、区域3和区域4,其中,区域1是采用分辨率1编码的,为区域1添加用于表示分辨率1的语法元素,区域2和区域3是采用分辨率2编码的,为区域2和区域3添加用于表示分辨率2的语法元素,区域4是采用分辨率3编码的,为区域4添加用于表示分辨率3的语法元素。
可选地,在本实施例中,至少使用两个不同的分辨率对待编码视频帧所包括的多个区域进行编码。
可选地,在本实施例中,用于指示编码每个区域所采用的分辨率的语法元素可以是位于待解码视频帧的固定位置上的一段数据,在该位置上不同的数据值代表了不同的分辨率。可以将代表区域对应的分辨率的语法元素添加在该位置上。
在一个可选的实施方式中，如图10所示，获取到待编码视频帧，其中，待编码视频帧被划分为多个区域，包括：区域1、区域2、区域3和区域4，采用分辨率1对区域1进行编码，采用分辨率2对区域2和区域3进行编码，采用分辨率3对区域4进行编码。为区域1对应的编码数据添加用于表示分辨率1的语法元素1，为区域2和区域3对应的编码数据添加用于表示分辨率2的语法元素2，为区域4对应的编码数据添加用于表示分辨率3的语法元素3。
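结合图10的例子，编码端的处理可以用如下示意性 Python 代码说明：先按各区域选定的分辨率编码，再为每个区域的编码数据附加指示该分辨率的语法元素。其中 encode_region 为本文假设的占位函数，并非实际编码器接口。

```python
# 示意性代码：编码端按区域选定的分辨率编码，并为编码数据添加语法元素
# （encode_region 为占位函数，并非实际编码器接口）

def encode_frame(regions_with_resolution):
    encoded = []
    for region, resolution in regions_with_resolution:
        bitstream = encode_region(region, resolution)
        # 将指示分辨率的语法元素附加到该区域的编码数据中
        encoded.append({"syntax_element": {"resolution": resolution}, "data": bitstream})
    return encoded

def encode_region(region, resolution):
    return b"..."  # 占位：实际实现中此处完成下采样、预测、变换与熵编码

# 图10的例子：区域1~4 分别采用分辨率1、2、2、3
regions = [("区域1", "分辨率1"), ("区域2", "分辨率2"), ("区域3", "分辨率2"), ("区域4", "分辨率3")]
print(encode_frame(regions))
```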
可见,通过上述步骤,对于视频中的一帧中的不同块自适应采用对应的分辨率进行编码,这样无论是在传输的带宽比较小的情况下,还是在传输的带宽比较大的情况下,对应的峰值信噪比都相对较大,失真相对较小,从而保证了峰值信噪比能够在一个较小的范围内变化,并且峰值信噪比都相对较大,从而实现了避免对视频进行编解码的峰值信噪比波动较大的技术效果,进而解决了相关技术中采用相同分辨率对视频进行编解码导致峰值信噪比波动较大的技术问题。
作为一种可选的方案,根据所述每个区域对应的分辨率为所述每个区域对应的编码数据添加语法元素,包括:
确定所述每个区域所对应的分辨率在多个标识值中所对应的标识值,其中,所述多个分辨率中不同的分辨率在所述多个标识值中对应不同的标识值;将所述每个区域所对应的标识值作为所述语法元素添加到所述每个区域对应的编码数据中。
可选地,在本实施例中,在语法元素中可以直接用不同的标识值来表示不同的分辨率。比如:分辨率1用00表示,分辨率2用01表示,分辨率3用10表示,分辨率4用11表示。需要说明的是,标识值表示分辨率的方式不仅于此,可以采用各种能够区分分辨率的标识值表示方式来指示不同区域采用的不同分辨率。
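编码端的这一步相当于解码端查找表的逆映射，下面的 Python 片段仅作示意，标识值的取值与比特宽度均为本文假设。

```python
# 示意性代码：编码端把区域所用分辨率映射为标识值并作为语法元素写入
# （映射关系与比特宽度均为本文示例的假设）

RESOLUTION_TO_ID = {"分辨率1": 0b00, "分辨率2": 0b01, "分辨率3": 0b10, "分辨率4": 0b11}

def id_for_resolution(resolution: str) -> int:
    return RESOLUTION_TO_ID[resolution]

# 区域1~4 分别采用分辨率1、2、2、3，对应标识值 00、01、01、10
print([format(id_for_resolution(r), "02b") for r in ["分辨率1", "分辨率2", "分辨率2", "分辨率3"]])
```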
作为一种可选的方案,根据所述每个区域对应的分辨率为所述每个区域对应的编码数据添加语法元素,包括:
根据所述每个区域对应的分辨率与所述每个区域的前一个区域对应的分辨率之间的关系确定所述每个区域对应的标识值;将所述每个区域所对应的标识值作为所述语法元素添加到所述每个区域对应的编码数据中。
可选地,在本实施例中,当前区域对应的标识值可以是用来指示当前区域的分辨率与当前区域的前一个区域的分辨率之间的关系的。根据当前区域的分辨率与前一个区域的分辨率之间的关系确定出每个区域对应的标识值。
作为一种可选的方案,根据所述每个区域对应的分辨率与所述每个区域的前一个区域对应的分辨率之间的关系确定所述每个区域对应的标识值,包括:
在所述每个区域对应的分辨率与所述每个区域的前一个区域对应的分辨率相同的情况下,确定所述每个区域对应的标识值为第一标识值;在所述每个区域对应的分辨率与所述每个区域的前一个区域对应的分辨率不同的情况下,确定所述每个区域对应的标识值为第二标识值。
可选地,在本实施例中,分辨率之间的关系可以但不限于包括:分辨率相同或者分辨率不同。使用第一标识值来表示分辨率相同,使用第二标识值来表示分辨率不同。例如:使用1来标识分辨率相同,使用0来表示分辨率不同,或者,使用0来表示分辨率相同,使用1来表示分辨率不同。
可选地,在本实施例中,对于每一个帧中的第一个区域来说,可以使用标识值来直接表示该区域所采用的分辨率,或者,也可以使用标识值来表示其与前一帧中最后一个区域的分辨率之间的关系。
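下面的示意性 Python 代码演示编码端如何根据各区域分辨率与前一区域分辨率的关系生成标识值。其中假设第一标识值为 0（相同）、第二标识值为 1（不同），且本帧第一个区域与前一帧最后一个区域比较，这些取值均为本段示例的假设。

```python
# 示意性代码：根据各区域分辨率与前一区域分辨率的关系生成标识值
# （假设 0 表示相同、1 表示不同，首个区域与前一帧最后一个区域比较）

def flags_from_resolutions(resolutions, prev_frame_last_resolution):
    flags = []
    prev = prev_frame_last_resolution
    for res in resolutions:
        flags.append(0 if res == prev else 1)   # 相同记第一标识值 0，不同记第二标识值 1
        prev = res
    return flags

# 前一帧最后一个区域为分辨率A，本帧四个区域依次为 A、B、B、B
print(flags_from_resolutions(["A", "B", "B", "B"], "A"))  # [0, 1, 0, 0]
```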
需要说明的是,对于前述的各方法实施例,为了简单描述,故将其都表述为一系列的动作组合,但是本领域技术人员应该知悉,本申请并不受所描述的动作顺序的限制,因为依据本申请,某些步骤可以采用其他顺序或者同时进行。其次,本领域技术人员也应该知悉,说明书中所描述的实施例均属于优选实施例,所涉及的动作和模块并不一定是本申请所必须的。
根据本申请实施例的另一个方面,还提供了一种用于实施上述视频解码方法的视频解码装置,如图11所示,该装置包括:
第一获取模块112,用于获取待解码视频帧,其中,所述待解码视频帧被划分为多个区域;
第二获取模块114,用于获取所述多个区域中每个区域对应的待解码数据中携带的语法元素,其中,所述语法元素用于指示解码所述每个区域所采用的分辨率,解码所述多个区域所采用的多个分辨率包括至少两个不同的分辨率;
解码模块116,用于采用每个区域所对应的分辨率对所述多个区域中的每个区域进行解码。
可选地,所述第二获取模块114,包括:
第一获取单元，用于获取所述每个区域所对应的标识值，其中，所述每个区域所对应的标识值用于指示解码所述每个区域所采用的分辨率。
可选地,所述第二获取模块114,包括:
第二获取单元,用于获取所述每个区域所对应的标识值,其中,所述每个区域所对应的标识值用于指示解码所述每个区域所采用的分辨率与解码所述每个区域的前一个区域所采用的分辨率之间的关系;
第一确定单元,用于根据所述标识值和解码所述每个区域的前一个区域所采用的分辨率确定解码所述每个区域所采用的分辨率。
可选地,所述第二获取模块114,包括:
第二确定单元,用于在确定所述每个区域对应的标识值为第一标识值的情况下,确定解码所述每个区域所采用的分辨率为所述每个区域的前一个区域对应的分辨率;
第三确定单元,用于在确定所述每个区域对应的标识值为第二标识值的情况下,确定解码所述每个区域所采用的分辨率为与所述每个区域的前一个区域对应的分辨率不同的分辨率。
可选地，所述第一获取模块112用于以下之一：
所述多个区域是基于预先确定的视频编解码标准对所述待解码视频帧进行划分得到的多个视频块;
所述多个区域是响应于获取到的输入的区域划分指令对所述待解码视频帧进行划分得到的区域;
所述多个区域是多个Tile区域。
根据本申请实施例的另一个方面，还提供了一种用于实施上述视频编码方法的视频编码装置，如图12所示，该装置包括：
第三获取模块122,用于获取待编码视频帧,其中,所述待编码视频帧被划分为多个区域;
编码模块124,用于采用多个分辨率中对应的分辨率对所述多个区域中的每个区域进行编码,得到所述每个区域对应的编码数据,其中,所述多个分辨率包括至少两个不同的分辨率;
添加模块126,用于根据所述每个区域对应的分辨率为所述每个区域对应的编码数据添加语法元素,其中,所述语法元素用于指示编码所述每个区域所采用的分辨率。
可选地,所述添加模块126,包括:
第四确定单元,用于确定所述每个区域所对应的分辨率在多个标识值中所对应的标识值,其中,所述多个分辨率中不同的分辨率在所述多个标识值中对应不同的标识值;
第一添加单元,用于将所述每个区域所对应的标识值作为所述语法元素添加到所述每个区域对应的编码数据中。
可选地,所述添加模块126,包括:
第五确定单元,用于根据所述每个区域对应的分辨率与所述每个区域的前一个区域对应的分辨率之间的关系确定所述每个区域对应的标识值;
第二添加单元,用于将所述每个区域所对应的标识值作为所述语法元素添加到所述每个区域对应的编码数据中。
可选地,所述第五确定单元,包括:
第一确定子单元,用于在所述每个区域对应的分辨率与所述每个区域的前一个区域对应的分辨率相同的情况下,确定所述每个区域对应的标识值为第一标识值;
第二确定子单元，用于在所述每个区域对应的分辨率与所述每个区域的前一个区域对应的分辨率不同的情况下，确定所述每个区域对应的标识值为第二标识值。
本申请实施例的应用环境可以但不限于参照上述实施例中的应用环境，本实施例中对此不再赘述。本申请实施例提供了用于实施上述视频编解码方法的一种可选的具体应用示例。
作为一种可选的实施例,上述视频编解码方法可以但不限于应用于如图13所示的对视频进行编解码处理的场景中。在本场景中,对于视频中待编码的第t帧,将第t帧中的区域划分为不同的Tile区域,如图13所示,Tile1区域,Tile2区域,Tile3区域,Tile4区域。图13中的划分方式只是一种示例,本申请实施例对一帧划分得到的区域的数量和形状不做限定。
然后，在不同Tile区域采用不同分辨率分别计算率失真代价(Rate Distortion Cost)，使用代价最小的分辨率作为该Tile区域上使用的分辨率。例如，对于Tile1区域，分别使用预先确定的分辨率集合中的分辨率1、分辨率2、分辨率3计算对应的代价，其中，分辨率2对应的代价最小，则使用分辨率2对Tile1区域中的块进行编码。
通过上述方式,确定出使用分辨率2对Tile1区域中的块进行编码,使用分辨率1对Tile2区域中的块进行编码,使用分辨率1对Tile3区域中的块进行编码,使用分辨率3对Tile4区域中的块进行编码。
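上述“按率失真代价选择分辨率”的过程可用如下 Python 片段示意。其中各分辨率对应的代价数值为本文假设的示例数据，实际代价通常以失真与码率的加权和（D+λ·R）的形式计算。

```python
# 示意性代码：对每个 Tile 区域在候选分辨率集合上比较率失真代价，
# 取代价最小者作为该区域的编码分辨率（代价数值为本文假设的示例数据）

def select_resolution(costs_per_resolution):
    """costs_per_resolution: {分辨率: 率失真代价，即 D + λ·R}"""
    return min(costs_per_resolution, key=costs_per_resolution.get)

# 假设对 Tile1 区域分别以分辨率1/2/3 试编码后得到如下代价
tile1_costs = {"分辨率1": 128.5, "分辨率2": 96.2, "分辨率3": 110.7}
print(select_resolution(tile1_costs))  # 分辨率2 代价最小，故 Tile1 采用分辨率2
```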
在一个可选的实施例中,为了使得解码端知道一个视频帧中的不同块在编码时所使用的分辨率,可以在编码的过程中使用对应的标志位来表示对应的分辨率,例如,对于已编码的一个帧中使用高分辨率进行编码的块,将对应的标志位设置为0,对于使用低分辨率进行编码的块,将对应的标志位设置为1。当然,这种设置方式只是一种示例,还可以采用其他的标志位设置方式,例如,使用高分辨率进行编码的块,将对应的标志位设置为1,对于使用低分辨率进行编码的块,将对应的标志位设置为0。
作为另一个可选的实施例，为了使得解码端知道一个视频帧中的不同块在编码时所使用的分辨率，可以在编码的过程中使用对应的标志位来表示对应的分辨率，例如，若当前块相对于前一个块在编码时使用的分辨率一致，则将当前块对应的标志位设置为0；若当前块相对于前一个块在编码时使用的分辨率不同，则将当前块对应的标志位设置为1。当然，这种设置方式只是一种示例，还可以采用其他的标志位设置方式，例如，若当前块相对于前一个块在编码时使用的分辨率一致，则将当前块对应的标志位设置为1；若当前块相对于前一个块在编码时使用的分辨率不同，则将当前块对应的标志位设置为0。
对于上述不同实施例中的标志位设置方式，经过熵编码之后所得到的待传输比特数是不一样的。
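为了直观说明不同标志位设置方式对应的待传输比特数可能不同，下面给出一段粗略的 Python 估算，用标志位序列的零阶熵作为熵编码后比特数的近似。其中块的分辨率序列以及“首块与前一帧最后一块（假设为高分辨率）比较”均为本文示例的假设，并非实际熵编码器的计算方式。

```python
# 示意性代码：用零阶熵粗略比较两种标志位方案所需比特数
# （仅为示例估计，并非实际熵编码器；块的分辨率序列为本文假设）

from collections import Counter
from math import log2

def entropy_bits(symbols):
    counts, total = Counter(symbols), len(symbols)
    return sum(-c * log2(c / total) for c in counts.values())

blocks = ["高", "高", "高", "高", "低", "低", "低", "低"]     # 各块使用的分辨率
direct   = [0 if b == "高" else 1 for b in blocks]            # 方案一：0 表示高分辨率，1 表示低分辨率
relative = [0 if b == (["高"] + blocks)[i] else 1             # 方案二：与前一个块相同记 0，不同记 1
            for i, b in enumerate(blocks)]                    # 首块与前一帧最后一块（假设为高分辨率）比较
print(direct, round(entropy_bits(direct), 2))                 # [0,0,0,0,1,1,1,1] 约 8.0 比特
print(relative, round(entropy_bits(relative), 2))             # [0,0,0,0,1,0,0,0] 约 4.35 比特
```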
如图14所示,在本申请的视频编码过程中,对于视频中的一帧中的不同块自适应采用对应的分辨率进行编码,这样无论是在传输的带宽比较小(例如,小于图14中所示的带宽阈值Th)的情况下,还是在传输的带宽比较大(例如,大于图14中所示的带宽阈值Th)的情况下,对应的峰值信噪比都相对较大,失真相对较小。
此外,由于对于视频中的一帧中的不同块自适应采用对应的分辨率进行编码,从而不需要在对视频中的帧进行编码时根据不同类型的视频或同一视频的不同帧或同一帧中不同的块所对应的交点(如,图1中的交点)来选择对应的分辨率,降低了编码复杂度。
根据本申请实施例的又一个方面,还提供了一种用于实施上述视频解码的电子装置,如图15所示,该电子装置包括:一个或多个(图中仅示出一个)处理器1502、存储器1504、传感器1506、编码器1508以及传输装置1510,该存储器中存储有计算机可读指令,该处理器被设置为通过计算机可读指令执行上述任一项方法实施例中的步骤。
可选地,在本实施例中,上述电子装置可以位于计算机网络的多个网络设备中的至少一个网络设备。
可选地,在本实施例中,上述处理器可以被设置为通过计算机可读指令执行以下步骤:
获取待解码视频帧,其中,所述待解码视频帧被划分为多个区域;
获取所述多个区域中每个区域对应的待解码数据中携带的语法元素,其中,所述语法元素用于指示解码所述每个区域所采用的分辨率,解码所述多个区域所采用的多个分辨率包括至少两个不同的分辨率;
采用每个区域所对应的分辨率对所述多个区域中的每个区域进行解码。
可选地，本领域普通技术人员可以理解，图15所示的结构仅为示意，电子装置也可以是智能手机(如Android手机、iOS手机等)、平板电脑、掌上电脑以及移动互联网设备(Mobile Internet Devices,MID)、PAD等终端设备。图15并不对上述电子装置的结构造成限定。例如，电子装置还可包括比图15中所示更多或者更少的组件(如网络接口、显示装置等)，或者具有与图15所示不同的配置。
其中，存储器1504可用于存储计算机可读指令以及模块，如本申请实施例中的视频解码方法和装置对应的计算机可读指令/模块，处理器1502通过运行存储在存储器1504内的计算机可读指令以及模块，从而执行各种功能应用以及数据处理，即实现上述的视频解码方法。存储器1504可包括高速随机存取存储器，还可以包括非易失性存储器，如一个或者多个磁性存储装置、闪存、或者其他非易失性固态存储器。在一些实例中，存储器1504可进一步包括相对于处理器1502远程设置的存储器，这些远程存储器可以通过网络连接至终端。上述网络的实例包括但不限于互联网、企业内部网、局域网、移动通信网及其组合。
上述的传输装置1510用于经由一个网络接收或者发送数据。上述的网络具体实例可包括有线网络及无线网络。在一个实例中，传输装置1510包括一个网络适配器(Network Interface Controller,NIC)，其可通过网线与路由器等其他网络设备相连，从而可与互联网或局域网进行通讯。在一个实例中，传输装置1510为射频(Radio Frequency,RF)模块，其用于通过无线方式与互联网进行通讯。
其中,具体地,存储器1504用于存储应用程序。
本申请的实施例还提供了一种存储介质，该存储介质中存储有计算机可读指令，其中，该计算机可读指令被设置为运行时执行上述任一项方法实施例中的步骤。
可选地,在本实施例中,上述存储介质可以被设置为存储用于执行以下步骤的计算机可读指令:
获取待解码视频帧,其中,所述待解码视频帧被划分为多个区域;
获取所述多个区域中每个区域对应的待解码数据中携带的语法元素,其中,所述语法元素用于指示解码所述每个区域所采用的分辨率,解码所述多个区域所采用的多个分辨率包括至少两个不同的分辨率;
采用每个区域所对应的分辨率对所述多个区域中的每个区域进行解码。
可选地,存储介质还被设置为存储用于执行上述实施例中的方法中所包括的步骤的计算机可读指令,本实施例中对此不再赘述。
可选地，在本实施例中，本领域普通技术人员可以理解上述实施例的各种方法中的全部或部分步骤是可以通过计算机可读指令来指令终端设备相关的硬件来完成，该计算机可读指令可以存储于一计算机可读存储介质中，存储介质可以包括：闪存盘、只读存储器(Read-Only Memory,ROM)、随机存取存储器(Random Access Memory,RAM)、磁盘或光盘等。
上述本申请实施例序号仅仅为了描述,不代表实施例的优劣。
上述实施例中的集成的单元如果以软件功能单元的形式实现并作为独立的产品销售或使用时,可以存储在上述计算机可读取的存储介质中。基于这样的理解,本申请的技术方案本质上或者说对现有技术做出贡献的部分或者该技术方案的全部或部分可以以软件产品的形式体现出来,该计算机软件产品存储在存储介质中,包括若干计算机可读指令用以使得一台或多台计算机设备(可为个人计算机、服务器或者网络设备等)执行本申请各个实施例所述方法的全部或部分步骤。
在本申请的上述实施例中,对各个实施例的描述都各有侧重,某个实施例中没有详述的部分,可以参见其他实施例的相关描述。
在本申请所提供的几个实施例中，应该理解到，所揭露的客户端，可通过其它的方式实现。其中，以上所描述的装置实施例仅仅是示意性的，例如所述单元的划分，仅仅为一种逻辑功能划分，实际实现时可以有另外的划分方式，例如多个单元或组件可以结合或者可以集成到另一个系统，或一些特征可以忽略，或不执行。另一点，所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些接口，单元或模块的间接耦合或通信连接，可以是电性或其它的形式。
所述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部单元来实现本实施例方案的目的。
另外,在本申请各个实施例中的各功能单元可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。上述集成的单元既可以采用硬件的形式实现,也可以采用软件功能单元的形式实现。
以上所述仅是本申请的优选实施方式,应当指出,对于本技术领域的普通技术人员来说,在不脱离本申请原理的前提下,还可以做出若干改进和润饰,这些改进和润饰也应视为本申请的保护范围。

Claims (20)

  1. 一种视频解码方法,由电子装置执行,所述方法包括:
    获取待解码视频帧,其中,所述待解码视频帧被划分为多个区域;
    获取所述多个区域中每个区域对应的待解码数据中携带的语法元素,其中,所述语法元素用于指示解码所述每个区域所采用的分辨率,解码所述多个区域所采用的多个分辨率包括至少两个不同的分辨率;及
    采用每个区域所对应的分辨率对所述多个区域中的每个区域进行解码。
  2. 根据权利要求1所述的方法,其特征在于,所述获取所述多个区域中每个区域对应的待解码数据中携带的语法元素包括:
    获取所述每个区域所对应的标识值,其中,所述每个区域所对应的标识值用于指示解码所述每个区域所采用的分辨率。
  3. 根据权利要求1所述的方法,其特征在于,所述获取所述多个区域中每个区域对应的待解码数据中携带的语法元素包括:
    获取所述每个区域所对应的标识值,其中,所述每个区域所对应的标识值用于指示解码所述每个区域所采用的分辨率与解码所述每个区域的前一个区域所采用的分辨率之间的关系;及
    根据所述标识值和解码所述每个区域的前一个区域所采用的分辨率确定解码所述每个区域所采用的分辨率。
  4. 根据权利要求3所述的方法,其特征在于,所述根据所述标识值和解码所述每个区域的前一个区域所采用的分辨率确定解码所述每个区域所采用的分辨率包括:
    在确定所述每个区域对应的标识值为第一标识值的情况下,确定解码所述每个区域所采用的分辨率为所述每个区域的前一个区域对应的分辨率;及
    在确定所述每个区域对应的标识值为第二标识值的情况下,确定解码所述每个区域所采用的分辨率为与所述每个区域的前一个区域对应的分辨率不同的分辨率。
  5. 根据权利要求1所述的方法，其特征在于，所述待解码视频帧被划分为多个区域的方式包括以下之一：
    所述多个区域是基于预先确定的视频编解码标准对所述待解码视频帧进行划分得到的多个视频块;
    所述多个区域是响应于获取到的输入的区域划分指令对所述待解码视频帧进行划分得到的区域;及
    所述多个区域是多个分片区域。
  6. 一种视频编码方法,由电子装置执行,所述方法包括:
    获取待编码视频帧,其中,所述待编码视频帧被划分为多个区域;
    采用多个分辨率中对应的分辨率对所述多个区域中的每个区域进行编码,得到所述每个区域对应的编码数据,其中,所述多个分辨率包括至少两个不同的分辨率;及
    根据所述每个区域对应的分辨率为所述每个区域对应的编码数据添加语法元素,其中,所述语法元素用于指示编码所述每个区域所采用的分辨率。
  7. 根据权利要求6所述的方法,其特征在于,所述根据所述每个区域对应的分辨率为所述每个区域对应的编码数据添加语法元素包括:
    确定所述每个区域所对应的分辨率在多个标识值中所对应的标识值,其中,所述多个分辨率中不同的分辨率在所述多个标识值中对应不同的标识值;及
    将所述每个区域所对应的标识值作为所述语法元素添加到所述每个区域对应的编码数据中。
  8. 根据权利要求6所述的方法,其特征在于,所述根据所述每个区域对应的分辨率为所述每个区域对应的编码数据添加语法元素包括:
    根据所述每个区域对应的分辨率与所述每个区域的前一个区域对应的分辨率之间的关系确定所述每个区域对应的标识值;及
    将所述每个区域所对应的标识值作为所述语法元素添加到所述每个区域对应的编码数据中。
  9. 根据权利要求8所述的方法，其特征在于，所述根据所述每个区域对应的分辨率与所述每个区域的前一个区域对应的分辨率之间的关系确定所述每个区域对应的标识值包括：
    在所述每个区域对应的分辨率与所述每个区域的前一个区域对应的分辨率相同的情况下,确定所述每个区域对应的标识值为第一标识值;及
    在所述每个区域对应的分辨率与所述每个区域的前一个区域对应的分辨率不同的情况下,确定所述每个区域对应的标识值为第二标识值。
  10. 一种视频解码装置,包括:
    第一获取模块,用于获取待解码视频帧,其中,所述待解码视频帧被划分为多个区域;
    第二获取模块,用于获取所述多个区域中每个区域对应的待解码数据中携带的语法元素,其中,所述语法元素用于指示解码所述每个区域所采用的分辨率,解码所述多个区域所采用的多个分辨率包括至少两个不同的分辨率;及
    解码模块,用于采用每个区域所对应的分辨率对所述多个区域中的每个区域进行解码。
  11. 根据权利要求10所述的装置,其特征在于,所述第二获取模块,包括:
    第一获取单元,用于获取所述每个区域所对应的标识值,其中,所述每个区域所对应的标识值用于指示解码所述每个区域所采用的分辨率。
  12. 根据权利要求10所述的装置,其特征在于,所述第二获取模块,包括:
    第二获取单元,用于获取所述每个区域所对应的标识值,其中,所述每个区域所对应的标识值用于指示解码所述每个区域所采用的分辨率与解码所述每个区域的前一个区域所采用的分辨率之间的关系;及
    第一确定单元,用于根据所述标识值和解码所述每个区域的前一个区域所采用的分辨率确定解码所述每个区域所采用的分辨率。
  13. 根据权利要求12所述的装置,其特征在于,所述第二获取模块,包括:
    第二确定单元,用于在确定所述每个区域对应的标识值为第一标识值的情况下,确定解码所述每个区域所采用的分辨率为所述每个区域的前一个区域对应的分辨率;及
    第三确定单元,用于在确定所述每个区域对应的标识值为第二标识值的情况下,确定解码所述每个区域所采用的分辨率为与所述每个区域的前一个区域对应的分辨率不同的分辨率。
  14. 根据权利要求10所述的装置,其特征在于,所述第一获取模块用于以下之一:
    所述多个区域是基于预先确定的视频编解码标准对所述待解码视频帧进行划分得到的多个视频块;
    所述多个区域是响应于获取到的输入的区域划分指令对所述待解码视频帧进行划分得到的区域;及
    所述多个区域是多个分片区域。
  15. 一种视频编码装置,包括:
    第三获取模块,用于获取待编码视频帧,其中,所述待编码视频帧被划分为多个区域;
    编码模块,用于采用多个分辨率中对应的分辨率对所述多个区域中的每个区域进行编码,得到所述每个区域对应的编码数据,其中,所述多个分辨率包括至少两个不同的分辨率;及
    添加模块,用于根据所述每个区域对应的分辨率为所述每个区域对应的编码数据添加语法元素,其中,所述语法元素用于指示编码所述每个区域所采用的分辨率。
  16. 根据权利要求15所述的装置,其特征在于,所述添加模块,包括:
    第四确定单元,用于确定所述每个区域所对应的分辨率在多个标识值中所对应的标识值,其中,所述多个分辨率中不同的分辨率在所述多个标识值中对应不同的标识值;及
    第一添加单元，用于将所述每个区域所对应的标识值作为所述语法元素添加到所述每个区域对应的编码数据中。
  17. 根据权利要求15所述的装置,其特征在于,所述添加模块,包括:
    第五确定单元,用于根据所述每个区域对应的分辨率与所述每个区域的前一个区域对应的分辨率之间的关系确定所述每个区域对应的标识值;及
    第二添加单元,用于将所述每个区域所对应的标识值作为所述语法元素添加到所述每个区域对应的编码数据中。
  18. 根据权利要求17所述的装置,其特征在于,所述第五确定单元,包括:
    第一确定子单元,用于在所述每个区域对应的分辨率与所述每个区域的前一个区域对应的分辨率相同的情况下,确定所述每个区域对应的标识值为第一标识值;及
    第二确定子单元,用于在所述每个区域对应的分辨率与所述每个区域的前一个区域对应的分辨率不同的情况下,确定所述每个区域对应的标识值为第二标识值。
  19. 一个或多个存储有计算机可读指令的非易失性计算机可读存储介质,计算机可读指令被一个或多个处理器执行时,使得一个或多个处理器执行如权利要求1至9中任一项所述的方法。
  20. 一种电子装置,包括存储器和一个或多个处理器,存储器中储存有计算机可读指令,计算机可读指令被处理器执行时,使得一个或多个处理器执行如权利要求1至9中任一项所述的方法。
PCT/CN2020/116642 2019-09-27 2020-09-22 视频解码方法和装置、视频编码方法和装置、存储介质及电子装置 WO2021057684A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US17/469,729 US11968379B2 (en) 2019-09-27 2021-09-08 Video decoding method and apparatus, video encoding method and apparatus, storage medium, and electronic device

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910927094.XA CN110650357B (zh) 2019-09-27 2019-09-27 视频解码方法及装置
CN201910927094.X 2019-09-27

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/469,729 Continuation US11968379B2 (en) 2019-09-27 2021-09-08 Video decoding method and apparatus, video encoding method and apparatus, storage medium, and electronic device

Publications (1)

Publication Number Publication Date
WO2021057684A1 true WO2021057684A1 (zh) 2021-04-01

Family

ID=68993008

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/116642 WO2021057684A1 (zh) 2019-09-27 2020-09-22 视频解码方法和装置、视频编码方法和装置、存储介质及电子装置

Country Status (3)

Country Link
US (1) US11968379B2 (zh)
CN (1) CN110650357B (zh)
WO (1) WO2021057684A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114827669A (zh) * 2022-03-31 2022-07-29 杭州网易智企科技有限公司 一种视频数据的传输方法、装置、介质及设备

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110650357B (zh) * 2019-09-27 2023-02-10 腾讯科技(深圳)有限公司 视频解码方法及装置
US11863786B2 (en) * 2021-05-21 2024-01-02 Varjo Technologies Oy Method of transporting a framebuffer

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105827866A (zh) * 2016-05-18 2016-08-03 努比亚技术有限公司 一种移动终端及控制方法
CN109194923A (zh) * 2018-10-18 2019-01-11 眸芯科技(上海)有限公司 基于局部非均匀分辨率的视频图像处理设备、系统及方法
CN109525842A (zh) * 2018-10-30 2019-03-26 深圳威尔视觉传媒有限公司 基于位置的多Tile排列编码方法、装置、设备和解码方法
WO2019059721A1 (ko) * 2017-09-21 2019-03-28 에스케이텔레콤 주식회사 해상도 향상 기법을 이용한 영상의 부호화 및 복호화
CN110121885A (zh) * 2016-12-29 2019-08-13 索尼互动娱乐股份有限公司 用于利用注视跟踪的vr、低等待时间无线hmd视频流传输的有凹视频链接
CN110650357A (zh) * 2019-09-27 2020-01-03 腾讯科技(深圳)有限公司 视频解码方法及装置

Family Cites Families (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101282479B (zh) * 2008-05-06 2011-01-19 武汉大学 基于感兴趣区域的空域分辨率可调整编解码方法
CN102883157B (zh) * 2011-07-12 2015-09-09 浙江大学 视频编码方法和视频编码器
US9774881B2 (en) * 2014-01-08 2017-09-26 Microsoft Technology Licensing, Llc Representing motion vectors in an encoded bitstream
US10200713B2 (en) * 2015-05-11 2019-02-05 Qualcomm Incorporated Search region determination for inter coding within a particular picture of video data
US10225546B2 (en) * 2016-02-26 2019-03-05 Qualcomm Incorporated Independent multi-resolution coding
CN107155107B (zh) * 2017-03-21 2018-08-03 腾讯科技(深圳)有限公司 视频编码方法和装置、视频解码方法和装置
CN112292855B (zh) * 2018-04-09 2024-06-04 Sk电信有限公司 用于对图像进行编码/解码的方法和装置
CN115665412A (zh) * 2018-12-17 2023-01-31 华为技术有限公司 视频译码中用于光栅扫描分块组和矩形分块组的分块组分配
WO2020140063A1 (en) * 2018-12-27 2020-07-02 Futurewei Technologies, Inc. Flexible tile signaling in video coding
WO2020141904A1 (ko) * 2019-01-02 2020-07-09 주식회사 엑스리스 영상 신호 부호화/복호화 방법 및 이를 위한 장치
CN114450956A (zh) * 2019-08-06 2022-05-06 Op方案有限责任公司 自适应分辨率管理中的帧缓冲
CN110677721B (zh) * 2019-09-27 2022-09-13 腾讯科技(深圳)有限公司 视频编解码方法和装置及存储介质
WO2021112652A1 (ko) * 2019-12-05 2021-06-10 한국전자통신연구원 영역 차등적 영상 부호화/복호화를 위한 방법, 장치 및 기록 매체

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114827669A (zh) * 2022-03-31 2022-07-29 杭州网易智企科技有限公司 一种视频数据的传输方法、装置、介质及设备
CN114827669B (zh) * 2022-03-31 2023-08-18 杭州网易智企科技有限公司 一种视频数据的传输方法、装置、介质及设备

Also Published As

Publication number Publication date
US20210409738A1 (en) 2021-12-30
CN110650357A (zh) 2020-01-03
CN110650357B (zh) 2023-02-10
US11968379B2 (en) 2024-04-23

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20869398

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20869398

Country of ref document: EP

Kind code of ref document: A1