CN110545431A - video decoding method and device and video encoding method and device - Google Patents

Publication number
CN110545431A
Authority
CN
China
Prior art keywords
resolution
region
video frame
identification value
syntax element
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910927099.2A
Other languages
Chinese (zh)
Other versions
CN110545431B (en
Inventor
高欣玮
李蔚然
谷沉沉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN201910927099.2A
Publication of CN110545431A
Application granted
Publication of CN110545431B
Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/136 Incoming video signal characteristics or properties
    • H04N19/146 Data rate or code amount at the encoder output
    • H04N19/147 Data rate or code amount at the encoder output according to rate distortion criteria
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/174 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a slice, e.g. a line of blocks or a group of blocks
    • H04N19/184 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being bits, e.g. of the compressed video stream
    • H04N19/44 Decoders specially adapted therefor, e.g. video decoders which are asymmetric with respect to the encoder
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51 Motion estimation or motion compensation
    • H04N19/53 Multi-resolution motion estimation; Hierarchical motion estimation
    • H04N19/70 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The invention discloses a video decoding method and device and a video encoding method and device. The decoding method comprises: acquiring a video frame to be decoded, wherein the video frame to be decoded is divided into a plurality of regions; acquiring a first syntax element carried in the data to be decoded corresponding to each of the plurality of regions, wherein the first syntax element is used for indicating a relationship between a first resolution and a second resolution, and the plurality of resolutions adopted for encoding the plurality of regions include at least two different resolutions; determining the first resolution corresponding to each region according to the first syntax element and the second resolution; and decoding each of the plurality of regions using the first resolution corresponding to that region. The invention solves the technical problem in the related art that the peak signal-to-noise ratio fluctuates widely when a video is encoded and decoded at the same resolution.

Description

Video decoding method and device and video encoding method and device
Technical Field
The present invention relates to the field of communications, and in particular to a video decoding method and apparatus and a video encoding method and apparatus.
Background
In a conventional video encoding process, as shown in FIG. 1, if the different blocks in a frame of a video are all encoded at high resolution, then when the transmission bandwidth is relatively small (e.g., smaller than the bandwidth threshold Th shown in FIG. 1), the peak signal-to-noise ratio PSNR1 obtained with high-resolution encoding is lower than the peak signal-to-noise ratio PSNR2 obtained with low-resolution encoding. That is, at a small transmission bandwidth, high-resolution encoding yields a relatively small peak signal-to-noise ratio (PSNR1) and relatively large distortion.
Similarly, if the different blocks in a frame of the video are all encoded at low resolution, then when the transmission bandwidth is relatively large (e.g., larger than the bandwidth threshold Th shown in FIG. 1), the peak signal-to-noise ratio PSNR3 obtained with low-resolution encoding is lower than the peak signal-to-noise ratio PSNR4 obtained with high-resolution encoding. That is, at a large transmission bandwidth, low-resolution encoding yields a relatively small peak signal-to-noise ratio (PSNR3) and relatively large distortion.
In addition, the intersection point D shown in FIG. 1 may move for different types of videos, different frames in the same video, or different blocks in the same frame, which makes it harder in the prior art to select the resolution at which to encode a frame of the video.
No effective solution to the above problems has yet been proposed.
Disclosure of Invention
The embodiments of the invention provide a video decoding method and device and a video encoding method and device, which at least solve the technical problem in the related art that the peak signal-to-noise ratio fluctuates widely because a video is encoded and decoded at the same resolution.
According to an aspect of an embodiment of the present invention, there is provided a video decoding method including:
acquiring a video frame to be decoded, wherein the video frame to be decoded is divided into a plurality of regions;
acquiring a first syntax element carried in the data to be decoded corresponding to each of the plurality of regions, wherein the first syntax element is used for indicating a relationship between a first resolution and a second resolution, the first resolution is the resolution adopted for encoding each region, the second resolution is the resolution of the reference region corresponding to each region in a reference video frame of the video frame to be decoded, and the plurality of resolutions adopted for encoding the plurality of regions include at least two different resolutions;
determining the first resolution corresponding to each region according to the first syntax element and the second resolution; and
decoding each of the plurality of regions with the first resolution corresponding to that region.
According to another aspect of the embodiments of the present invention, there is also provided a video encoding method, including:
acquiring a video frame to be encoded, wherein the video frame to be encoded is divided into a plurality of regions;
encoding each of the plurality of regions with a corresponding one of a plurality of resolutions to obtain encoded data corresponding to each region, wherein the plurality of resolutions include at least two different resolutions; and
adding a first syntax element to the encoded data corresponding to each region according to the relationship between the first resolution corresponding to each region and the second resolution of the reference region corresponding to each region in a reference video frame of the video frame to be encoded, wherein the first syntax element is used for indicating the resolution adopted for encoding each region.
According to another aspect of the embodiments of the present invention, there is also provided a video decoding apparatus, including:
A first obtaining module, configured to acquire a video frame to be decoded, where the video frame to be decoded is divided into a plurality of regions;
A second obtaining module, configured to obtain a first syntax element carried in the data to be decoded corresponding to each of the plurality of regions, where the first syntax element is used to indicate a relationship between a first resolution and a second resolution, the first resolution is the resolution used for encoding each region, the second resolution is the resolution used for encoding the reference region corresponding to each region in a reference video frame of the video frame to be decoded, and the plurality of resolutions used for encoding the plurality of regions include at least two different resolutions;
A first determining module, configured to determine, according to the first syntax element and the second resolution, the first resolution corresponding to each region;
A first processing module, configured to decode each of the plurality of regions with the first resolution corresponding to that region.
Optionally, the first determining module includes:
a first determining unit, configured to determine a relationship between the second resolution and the first resolution corresponding to each region according to a relationship between the identification value corresponding to each region and the identification value corresponding to the reference region;
A second determining unit, configured to determine the first resolution corresponding to each region according to the relationship between the second resolution and the first resolution corresponding to each region, and the second resolution corresponding to the reference region.
Optionally, the first determining module includes:
A third determining unit, configured to determine the second resolution as the first resolution corresponding to each region if it is determined that the identification value corresponding to each region is the first identification value, where the first identification value is used to indicate that the first resolution corresponding to each region is the same as the second resolution;
a fourth determining unit, configured to determine, as the first resolution corresponding to each region, a resolution different from the second resolution if it is determined that the identification value corresponding to each region is the second identification value, where the second identification value is used to indicate that the first resolution corresponding to each region is different from the second resolution.
Optionally, the first determining module includes:
A first obtaining unit, configured to obtain an identification value corresponding to each region and a third identification value corresponding to the second resolution;
A fifth determining unit, configured to perform an operation on the identification value corresponding to each region and the third identification value, and determine an operation result as a fourth identification value corresponding to the first resolution of each region;
A second obtaining unit, configured to obtain, as the first resolution corresponding to each region, a resolution corresponding to the fourth identification value in multiple resolutions.
Optionally, the apparatus further comprises:
A third obtaining module, configured to obtain, after the video frame to be decoded is obtained, a second syntax element corresponding to each region, where the second syntax element is used to indicate a position of the reference video frame;
A second determining module, configured to determine the reference video frame indicated by the second syntax element.
Optionally, the dividing of the video frame to be encoded into a plurality of regions comprises one of:
The plurality of regions are a plurality of video blocks obtained by dividing the video frame to be encoded based on a predetermined video coding and decoding standard;
The plurality of regions are obtained by dividing the video frame to be encoded in response to an obtained input region division instruction;
The plurality of regions are a plurality of Tile regions.
According to another aspect of the embodiments of the present invention, there is also provided a video encoding apparatus, including:
A fourth obtaining module, configured to acquire a video frame to be encoded, where the video frame to be encoded is divided into a plurality of regions;
A second processing module, configured to encode each of the multiple regions with a corresponding resolution of multiple resolutions to obtain encoded data corresponding to each region, where the multiple resolutions include at least two different resolutions;
A first adding module, configured to add a first syntax element to the encoded data corresponding to each region according to a relationship between the first resolution corresponding to each region and a second resolution of a reference region corresponding to each region in a reference video frame of the video frame to be encoded, where the first syntax element is used to indicate a resolution at which each region is encoded.
Optionally, the first adding module includes:
A sixth determining unit, configured to determine an identification value corresponding to each region according to a relationship between the first resolution corresponding to each region and the second resolution of the reference region;
An adding unit, configured to add the identification value corresponding to each region, as the first syntax element, to the encoded data corresponding to each region.
Optionally, the sixth determining unit includes:
A first determining subunit, configured to determine, when the first resolution corresponding to each region is the same as the second resolution corresponding to the reference region, that the identification value corresponding to each region is a first identification value;
A second determining subunit, configured to determine, when the first resolution corresponding to each region is different from the second resolution corresponding to the reference region, that the identification value corresponding to each region is a second identification value.
Optionally, the sixth determining unit includes:
A third determining subunit, configured to determine, among a plurality of identification values, a fourth identification value corresponding to the first resolution of each region and a third identification value corresponding to the second resolution of the reference region, where different resolutions in the plurality of resolutions correspond to different identification values in the plurality of identification values;
A fourth determining subunit, configured to perform an operation on the third identification value and the fourth identification value, and determine the identification value corresponding to each region according to the operation result.
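The work of the third and fourth determining subunits can be illustrated with a minimal sketch. The text does not specify which operation is performed on the two identification values, so the sketch assumes subtraction, and it assumes that a resolution's index in a candidate list serves as its identification value. The function name `encode_identification_value` and the `candidates` parameter are hypothetical.

```python
from typing import Sequence, Tuple

Resolution = Tuple[int, int]  # (width, height)


def encode_identification_value(
    first_resolution: Resolution,
    second_resolution: Resolution,
    candidates: Sequence[Resolution],
) -> int:
    """Derive the identification value written as the first syntax element.

    Assumptions (not stated in the source): each candidate's list index is
    its identification value, and the operation is "fourth minus third".
    """
    fourth = candidates.index(first_resolution)   # value for the region's own resolution
    third = candidates.index(second_resolution)   # value for the reference region's resolution
    return fourth - third
```

Under these assumptions a value of 0 means the region keeps the resolution of its reference region, and a nonzero value gives the signed offset between the two entries in the candidate list.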
Optionally, the apparatus further comprises:
A second adding module, configured to add a second syntax element to the encoded data corresponding to each region after the encoded data corresponding to each region is obtained by encoding each of the plurality of regions with a corresponding one of the plurality of resolutions, where the second syntax element is used to indicate a position of the reference video frame.
According to another aspect of the embodiments of the present invention, there is also provided a storage medium storing a computer program, wherein the computer program is configured to perform, when run, any one of the methods described above.
According to another aspect of the embodiments of the present invention, there is also provided an electronic apparatus, including a memory and a processor, wherein the memory stores therein a computer program, and the processor is configured to execute the method described in any one of the above through the computer program.
In the embodiments of the invention, a video frame to be decoded is acquired, wherein the video frame to be decoded is divided into a plurality of regions; a first syntax element carried in the data to be decoded corresponding to each of the plurality of regions is acquired, wherein the first syntax element is used for indicating a relationship between a first resolution and a second resolution, the first resolution is the resolution adopted for encoding each region, the second resolution is the resolution of the reference region corresponding to each region in a reference video frame of the video frame to be decoded, and the plurality of resolutions adopted for encoding the plurality of regions include at least two different resolutions; the first resolution corresponding to each region is determined according to the first syntax element and the second resolution; and each of the plurality of regions is decoded with the first resolution corresponding to that region. Because different blocks in a frame of the video are adaptively encoded at their corresponding resolutions, the peak signal-to-noise ratio stays relatively large and the distortion relatively small whether the transmission bandwidth is small or large. The peak signal-to-noise ratio therefore varies only within a small range, which achieves the technical effect of avoiding large fluctuations of the peak signal-to-noise ratio in video encoding and decoding, and solves the technical problem in the related art that the peak signal-to-noise ratio fluctuates widely because a video is encoded and decoded at the same resolution.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the invention without limiting the invention. In the drawings:
FIG. 1 is a schematic diagram of the peak signal-to-noise ratio of encoding and decoding in the related art;
FIG. 2 is a schematic diagram of an optional video decoding method according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of an application environment of an optional video decoding method according to an embodiment of the present invention;
FIG. 4 is a first schematic diagram of an optional video decoding method according to an optional embodiment of the present invention;
FIG. 5 is a second schematic diagram of an optional video decoding method according to an optional embodiment of the present invention;
FIG. 6 is a schematic diagram of an optional video decoding method according to an embodiment of the present invention;
FIG. 7 is a schematic diagram of an application environment of an optional video decoding method according to an embodiment of the present invention;
FIG. 8 is a schematic diagram of an optional video encoding method according to an optional embodiment of the present invention;
FIG. 9 is a schematic diagram of an optional video decoding apparatus according to an embodiment of the present invention;
FIG. 10 is a schematic diagram of an optional video encoding apparatus according to an embodiment of the present invention;
FIG. 11 is a schematic diagram of an application scenario of an optional video encoding and decoding method according to an embodiment of the present invention;
FIG. 12 is a schematic diagram of an application scenario of an optional video encoding and decoding method according to an embodiment of the present invention; and
FIG. 13 is a schematic diagram of an optional electronic device according to an embodiment of the invention.
Detailed Description
In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
According to an aspect of an embodiment of the present invention, there is provided a video decoding method. As shown in FIG. 2, the method includes:
S202, acquiring a video frame to be decoded, wherein the video frame to be decoded is divided into a plurality of regions;
S204, acquiring a first syntax element carried in the data to be decoded corresponding to each of the plurality of regions, where the first syntax element is used to indicate a relationship between a first resolution and a second resolution, the first resolution is the resolution used for encoding each region, the second resolution is the resolution of the reference region corresponding to each region in a reference video frame of the video frame to be decoded, and the plurality of resolutions used for encoding the plurality of regions include at least two different resolutions;
S206, determining the first resolution corresponding to each region according to the first syntax element and the second resolution;
S208, decoding each region in the plurality of regions by adopting the first resolution corresponding to each region.
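Steps S202 to S208 can be sketched as a per-region loop. This is only an illustration under stated assumptions: the `RegionData` container, the `decode_frame` function, and the two callbacks `resolve` and `decode_region` are hypothetical names, and the rule that maps a syntax element to a resolution is deliberately left to the caller because it depends on the embodiment.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Resolution = Tuple[int, int]  # (width, height)


@dataclass
class RegionData:
    """Hypothetical container for one region's data to be decoded."""
    region_id: int
    first_syntax_element: int  # relationship between first and second resolution
    payload: bytes             # encoded samples, opaque in this sketch


def decode_frame(
    regions: List[RegionData],
    reference_resolutions: Dict[int, Resolution],
    resolve: Callable[[int, Resolution], Resolution],
    decode_region: Callable[[bytes, Resolution], object],
) -> Dict[int, object]:
    """Decode every region of one frame at its own resolution."""
    decoded = {}
    for region in regions:                                # S202: frame already split into regions
        element = region.first_syntax_element             # S204: read the first syntax element
        second = reference_resolutions[region.region_id]  # resolution of the reference region
        first = resolve(element, second)                  # S206: derive the first resolution
        decoded[region.region_id] = decode_region(region.payload, first)  # S208
    return decoded
```

Passing the resolution rule in as `resolve` keeps the loop independent of whether the syntax element is a same/different flag or a difference between identification values.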
Optionally, in this embodiment, the video decoding method may be applied to a hardware environment formed by the server 302 and the client 304 shown in FIG. 3. As shown in FIG. 3, the server 302 acquires a video frame to be decoded, wherein the video frame to be decoded is divided into a plurality of regions; acquires a first syntax element carried in the data to be decoded corresponding to each of the plurality of regions, wherein the first syntax element is used for indicating a relationship between a first resolution and a second resolution, the first resolution is the resolution adopted for encoding each region, the second resolution is the resolution of the reference region corresponding to each region in a reference video frame of the video frame to be decoded, and the plurality of resolutions adopted for encoding the plurality of regions include at least two different resolutions; determines the first resolution corresponding to each region according to the first syntax element and the second resolution; and decodes each of the plurality of regions with the first resolution corresponding to that region. The server 302 then sends the decoded video to the client 304 for playing.
Optionally, in this embodiment, the video decoding method may be applied to, but is not limited to, audio and video processing scenarios. For example, client A and client B conduct a video call: each side captures video pictures, encodes them, and sends the encoded video to the other side, which decodes the received video and plays it.
Optionally, in this embodiment, the video decoding method may also be applied to, but not limited to, scenes such as playing of video files, live video, and the like.
The client may be, but is not limited to, any of various types of applications, such as an online education application, an instant messaging application, a community space application, a game application, a shopping application, a browser application, a financial application, a multimedia application, a live streaming application, and the like. Specifically, the method may be applied to, but is not limited to, scenarios in which audio and video are processed in an instant messaging application or in a multimedia application, so as to avoid large fluctuations of the peak signal-to-noise ratio of video encoding and decoding. The above is only an example, and this embodiment is not limited thereto.
Optionally, in this embodiment, different regions in the video frame to be decoded are encoded at different resolutions. For example, a video frame to be decoded is divided into four regions, region 1, region 2, region 3, and region 4, where region 1 is encoded with resolution 1, regions 2 and 3 are encoded with resolution 2, and region 4 is encoded with resolution 3. This encoding information is indicated by the syntax elements carried in the data to be decoded; by acquiring the syntax elements, the decoding end learns the resolution adopted for each region and decodes each region at its corresponding resolution.
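The four-region example can be written down as a small lookup table. The sketch below is illustrative only: the source names the resolutions just "resolution 1", "resolution 2", and "resolution 3", so the pixel dimensions in `RESOLUTIONS` are invented placeholders, and both helper functions are hypothetical.

```python
# Hypothetical pixel sizes; the source only names "resolution 1/2/3".
RESOLUTIONS = {1: (1920, 1080), 2: (1280, 720), 3: (640, 360)}

# Region number -> resolution number, following the four-region example.
REGION_RESOLUTION = {1: 1, 2: 2, 3: 2, 4: 3}


def resolutions_in_use(region_resolution):
    """Return the set of distinct resolution numbers used by a frame."""
    return set(region_resolution.values())


def uses_mixed_resolution(region_resolution):
    """True when the regions use at least two different resolutions."""
    return len(resolutions_in_use(region_resolution)) >= 2
```

A decoder that has recovered such a table from the syntax elements can look up `RESOLUTIONS[REGION_RESOLUTION[n]]` before decoding region `n`.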
Optionally, in this embodiment, the plurality of regions included in the video frame to be encoded are encoded using at least two different resolutions.
Optionally, in this embodiment, the syntax element indicating the resolution at which each region is encoded may be a piece of data located at a fixed position in the video frame to be decoded, where different data values at that position represent different resolutions. The syntax element can be retrieved by looking up that position in the video frame to be decoded, thereby determining the resolution of each region.
Optionally, in this embodiment, the first syntax element is configured to indicate a relationship between a first resolution at which each region is encoded and a second resolution at which a reference region corresponding to each region in a reference video frame of the video frame to be decoded is encoded. That is, the resolution of each region in the current video frame is represented by the relationship between the current video frame and the corresponding video region in the reference video frame.
in an optional embodiment, as shown in fig. 4, acquiring a video frame to be decoded, where the video frame to be decoded is divided into a plurality of regions, includes: region 1, region 2, region 3, and region 4, a reference video frame of a video frame to be decoded is divided into a plurality of regions, including: a region 1 ', a region 2', a region 3 ', and a region 4', wherein the region 1 corresponds to the region 1 ', the region 2 corresponds to the region 2', the region 3 corresponds to the region 3 ', and the region 4 corresponds to the region 4', a first syntax element carried in data to be decoded corresponding to each of a plurality of regions in a video frame to be decoded is obtained, wherein the first syntax element corresponding to the region 1 is used to indicate that the resolution adopted by the coding region 1 is the same as the resolution adopted by the coding region 1 ', the first syntax element corresponding to the region 2 is used to indicate that the resolution adopted by the coding region 2 is the same as the resolution adopted by the coding region 2', the first syntax element corresponding to the region 3 is used to indicate that the resolution adopted by the coding region 3 is the same as the resolution adopted by the coding region 3 ', and the first syntax element corresponding to the region 4 is used to indicate that the resolution adopted by the coding region 4 is the same as the resolution adopted by the coding region 4', region 1 is decoded with the resolution of region 1 ', region 2 is decoded with the resolution of region 2', region 3 is decoded with the resolution of region 3 ', and region 4 is decoded with the resolution of region 4'.
Therefore, through the above steps, different blocks in one frame of the video are adaptively encoded with their corresponding resolutions, so that, whether the transmission bandwidth is relatively small or relatively large, the peak signal-to-noise ratio remains relatively large, the distortion remains relatively small, and the peak signal-to-noise ratio varies only within a relatively small range. This achieves the technical effect of avoiding the large fluctuation of the peak signal-to-noise ratio that occurs when the video is encoded and decoded with a single resolution, and thereby solves the technical problem in the related art that encoding and decoding the video with the same resolution causes large fluctuation of the peak signal-to-noise ratio.
As an optional scheme, determining, according to the first syntax element and the second resolution, a first resolution corresponding to each region includes:
S1, determining the relationship between the second resolution and the first resolution corresponding to each region according to the relationship between the identification value corresponding to each region and the identification value corresponding to the reference region;
S2, determining the first resolution corresponding to each region according to the relationship between the second resolution and the first resolution corresponding to each region and the second resolution.
Optionally, in this embodiment, the first syntax element may be, but is not limited to, used to represent a relationship between the identification value corresponding to each region and the identification value corresponding to the reference region. For example, the identification value corresponding to the region is used to indicate the level of the resolution adopted by the region, and the difference between the resolution levels may be determined as the first syntax element. Then retrieving the first syntax element may determine a difference between the resolution level of the current region and the resolution level of the reference region, and then determine the resolution of the current region based on the determined resolution of the reference region.
As an optional scheme, determining, according to the first syntax element and the second resolution, a first resolution corresponding to each region includes:
S1, in a case that the identification value corresponding to each region is determined to be a first identification value, determining the second resolution as the first resolution corresponding to each region, wherein the first identification value is used to indicate that the first resolution corresponding to each region is the same as the second resolution;
S2, in case that the identification value corresponding to each region is determined to be a second identification value, determining a resolution different from the second resolution as the first resolution corresponding to each region, wherein the second identification value is used to indicate that the first resolution corresponding to each region is different from the second resolution.
Optionally, in this embodiment, the first syntax element may be, but is not limited to being, used to indicate whether the resolution of the current region is the same as the resolution of the reference region. A first syntax element equal to the first identification value indicates that the two resolutions are the same, and a first syntax element equal to the second identification value indicates that they are different. For example, a first syntax element of 1 indicates that they are the same, and a first syntax element of 0 indicates that they are different. Alternatively, a first syntax element of 0 indicates that the two are the same, and a first syntax element of 1 indicates that the two are different.
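As a non-limiting illustration, the flag-based determination of steps S1 and S2 may be sketched as follows. The sketch assumes that exactly two candidate resolutions exist, so that "different from the second resolution" identifies a unique resolution, and it assumes the convention that 0 means "same" and 1 means "different". The function name `resolve_by_flag` and the constants are hypothetical and are not defined by the embodiment.

```python
# Assumed convention (one of the two conventions mentioned in the text):
# 0 means "same as the reference region", 1 means "different".
SAME, DIFFERENT = 0, 1


def resolve_by_flag(flag, ref_resolution, resolutions):
    """Return the first resolution of a region from its flag.

    `ref_resolution` is the second resolution (that of the reference region).
    `resolutions` is the assumed two-entry set of candidate resolutions.
    """
    if len(resolutions) != 2 or ref_resolution not in resolutions:
        raise ValueError("sketch assumes exactly two known resolutions")
    if flag == SAME:
        return ref_resolution
    if flag == DIFFERENT:
        # With two candidates, "different" can only mean the other one.
        return next(r for r in resolutions if r != ref_resolution)
    raise ValueError("unknown identification value")
```

When more than two resolutions are available, a single same/different flag no longer identifies the resolution by itself, and further information would be needed.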
As an optional scheme, determining, according to the first syntax element and the second resolution, a first resolution corresponding to each region includes:
S1, acquiring an identification value corresponding to each region and a third identification value corresponding to the second resolution;
S2, performing a calculation on the identification value corresponding to each region and the third identification value, and determining the calculation result as a fourth identification value corresponding to the resolution of each region;
S3, acquiring, from the plurality of resolutions, the resolution corresponding to the fourth identification value as the first resolution corresponding to each region.
Optionally, in this embodiment, the first syntax element may be an identification value corresponding to each region, and the identification value corresponding to the resolution of the reference region is obtained in a previous decoding process. The fourth identification value corresponding to the resolution of each region may be determined by calculation between the identification values. Different identification values may correspond to different resolutions among the plurality of resolutions. And determining the resolution corresponding to the fourth identification value from the plurality of resolutions as the resolution corresponding to each area.
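As a non-limiting illustration, steps S1 to S3 may be sketched as follows, assuming that the calculation is an addition and that the identification values are consecutive integers. The table `RESOLUTION_BY_ID`, its resolution values, and the function name are hypothetical.

```python
# Hypothetical mapping from identification value to resolution.
RESOLUTION_BY_ID = {0: (320, 180), 1: (640, 360), 2: (1280, 720)}


def resolve_by_id_sum(region_id_value, third_id_value):
    """Return (fourth identification value, first resolution).

    `region_id_value` is the value carried as the first syntax element.
    `third_id_value` is the identification value of the reference region's
    resolution, known from the earlier decoding of the reference frame.
    """
    fourth_id_value = region_id_value + third_id_value  # assumed: addition
    if fourth_id_value not in RESOLUTION_BY_ID:
        raise ValueError("result does not map to a known resolution")
    return fourth_id_value, RESOLUTION_BY_ID[fourth_id_value]
```

For instance, with a reference region at identification value 2 and a signalled value of -1, the region is decoded at the resolution with identification value 1.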
As an optional scheme, after obtaining the video frame to be decoded, the method further includes:
S1, obtaining a second syntax element corresponding to each region, wherein the second syntax element is used for indicating the position of the reference video frame;
S2, determining the reference video frame indicated by the second syntax element.
Optionally, in this embodiment, a second syntax element may be used to indicate the position of the reference video frame, and the reference video frame is determined and acquired according to the indication of the second syntax element. For example, the second syntax element may represent the distance between the current video frame and the reference video frame.
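As a non-limiting illustration, locating the reference video frame from a distance value may be sketched as follows. The sketch assumes that frames are indexed by integers, that the reference frame precedes the current frame, and that decoded frames are kept in a dictionary; the function name and the container are hypothetical.

```python
def locate_reference_frame(current_index, distance, decoded_frames):
    """Return the frame that lies `distance` frames before the current one.

    `decoded_frames` maps frame index -> already decoded frame (assumed).
    """
    if distance <= 0:
        raise ValueError("sketch assumes the reference precedes the current frame")
    ref_index = current_index - distance
    if ref_index not in decoded_frames:
        raise KeyError("reference frame has not been decoded")
    return decoded_frames[ref_index]
```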
As an optional scheme, the dividing of the video frame to be decoded into a plurality of regions includes one of:
S1, the plurality of regions are a plurality of video blocks obtained by dividing the video frame to be decoded based on a predetermined video coding and decoding standard;
S2, the plurality of regions are obtained by dividing the video frame to be decoded in response to an obtained input region division instruction;
S3, the plurality of regions being a plurality of Tile regions.
Optionally, in this embodiment, the plurality of regions may be obtained in various ways. For example, the video block division of a standard protocol, such as binary tree, ternary tree, or quadtree division, may be used, with each video block being one region. Alternatively, the division may be indicated by an input region division instruction; for example, as shown in fig. 5, the smaller video window during a video call is taken as region 1, and the larger video window, or the portion other than the smaller video window, is taken as region 2. Other division standards may also be used, for example dividing the frame into different Tile regions.
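As a non-limiting illustration, a uniform grid division of a frame into rectangular regions may be sketched as follows. The function name and the grid parameters are hypothetical, and the sketch ignores the block-alignment constraints that a real coding standard places on Tile boundaries.

```python
def split_into_tiles(width, height, cols, rows):
    """Return (x, y, w, h) rectangles covering the frame, row by row."""
    if cols <= 0 or rows <= 0:
        raise ValueError("cols and rows must be positive")
    # Integer boundaries; the last column/row absorbs any remainder.
    xs = [width * c // cols for c in range(cols + 1)]
    ys = [height * r // rows for r in range(rows + 1)]
    return [
        (xs[c], ys[r], xs[c + 1] - xs[c], ys[r + 1] - ys[r])
        for r in range(rows)
        for c in range(cols)
    ]
```

A 2 x 2 grid yields four regions comparable to region 1 to region 4 in the examples above.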
According to another aspect of the embodiments of the present invention, there is provided a video encoding method, as shown in fig. 6, the method including:
S602, acquiring a video frame to be encoded, wherein the video frame to be encoded is divided into a plurality of areas;
S604, encoding each of the plurality of regions by using a corresponding resolution of a plurality of resolutions, to obtain encoded data corresponding to each region, where the plurality of resolutions include at least two different resolutions;
S606, adding a first syntax element to the encoded data corresponding to each region according to a relationship between the first resolution corresponding to each region and the second resolution of the reference region corresponding to each region in the reference video frame of the video frame to be encoded, where the first syntax element is used to indicate a resolution used for encoding each region.
Alternatively, in this embodiment, the video encoding method may be applied to a hardware environment formed by the server 702, the server 302, the client 704, and the client 304 shown in fig. 7. As shown in fig. 7, a server 702 obtains a video frame to be encoded collected by a client 704, where the video frame to be encoded is divided into a plurality of regions; encoding each of the plurality of regions by using a corresponding resolution of the plurality of resolutions to obtain encoded data corresponding to each region, wherein the plurality of resolutions include at least two different resolutions; and adding a first syntax element to the coded data corresponding to each region according to the relation between the first resolution corresponding to each region and the second resolution of the reference region corresponding to each region in the reference video frame of the video frame to be coded, wherein the first syntax element is used for indicating the first resolution adopted for coding each region. The server 702 sends the encoded video to the server 302 for decoding. The server 302 sends the decoded video to the client 304 for playing.
Optionally, in this embodiment, the video encoding method may be applied to, but is not limited to, audio and video processing scenarios. For example, when client A and client B conduct a video call, each side captures video pictures, encodes the captured video pictures, and sends the encoded video to the other side; the other side decodes the received video and plays the decoded video.
Optionally, in this embodiment, the video encoding method may also be applied to, but not limited to, scenes such as playing of video files, live video broadcasts, and the like.
The client may be, but is not limited to, various types of applications, such as an online education application, an instant messaging application, a community space application, a game application, a shopping application, a browser application, a financial application, a multimedia application, a live broadcast application, and the like. Specifically, the method may be applied to, but is not limited to, a scenario in which audio and video are processed in the instant messaging application or in the multimedia application, so as to avoid large fluctuation of the peak signal-to-noise ratio when the video is encoded and decoded. The above is only an example, and this embodiment is not limited thereto.
Optionally, in this embodiment, different regions in the video frame to be encoded are encoded with different resolutions. Such as: a video frame to be encoded is divided into 4 regions, which are region 1, region 2, region 3, and region 4, respectively, where region 1 is encoded with resolution 1, syntax elements for indicating the relationship between resolution 1 and the resolution of reference region 1 are added to region 1, regions 2 and 3 are encoded with resolution 2, syntax elements for indicating the relationship between resolution 2 and the resolutions of reference regions 2 and 3 are added to regions 2 and 3, respectively, region 4 is encoded with resolution 3, and syntax elements for indicating the relationship between resolution 3 and the resolution of reference region 4 are added to region 4.
optionally, in this embodiment, a plurality of regions included in the video frame to be encoded are encoded using at least two different resolutions.
Optionally, in this embodiment, the syntax element indicating the resolution at which each region is encoded may be a piece of data located at a fixed position of the video frame to be decoded, where different data values at that position represent different resolutions. A syntax element representing the resolution corresponding to the region may be added at this position.
Therefore, through the above steps, different blocks in one frame of the video are adaptively encoded with their corresponding resolutions, so that, whether the transmission bandwidth is relatively small or relatively large, the peak signal-to-noise ratio remains relatively large, the distortion remains relatively small, and the peak signal-to-noise ratio varies only within a relatively small range. This achieves the technical effect of avoiding the large fluctuation of the peak signal-to-noise ratio that occurs when the video is encoded and decoded with a single resolution, and thereby solves the technical problem in the related art that encoding and decoding the video with the same resolution causes large fluctuation of the peak signal-to-noise ratio.
As an optional scheme, adding a first syntax element to the encoded data corresponding to each region according to a relationship between the first resolution corresponding to each region and the second resolution of the reference region corresponding to each region in the reference video frame of the video frame to be encoded includes:
S1, determining the identification value corresponding to each region according to the relation between the first resolution corresponding to each region and the second resolution of the reference region;
S2, adding the identification value corresponding to each region as the first syntax element to the encoded data corresponding to each region.
Optionally, in this embodiment, the relationship between the resolutions may be represented by an identification value, and the identification value may represent whether the resolutions are the same, the difference between the levels corresponding to the resolutions, and the like.
Optionally, in this embodiment, the identification value corresponding to each region may be added at the position of the first syntax element.
As an optional scheme, determining the identification value corresponding to each region according to the relationship between the first resolution corresponding to each region and the second resolution of the reference region includes:
S1, determining the identification value corresponding to each region as the first identification value when the first resolution corresponding to each region is the same as the second resolution corresponding to the reference region;
S2, determining the identification value corresponding to each region as a second identification value when the first resolution corresponding to each region is different from the second resolution corresponding to the reference region.
Optionally, in this embodiment, if the resolution of each region is the same as the resolution of the reference region, this may be represented by a first identification value, and if the resolution of each region is different from the resolution of the reference region, this may be represented by a second identification value. For example, 1 means that they are the same, and 0 means that they are different. Alternatively, 1 means that they are different, and 0 means that they are the same.
As an optional scheme, determining the identification value corresponding to each region according to the relationship between the first resolution corresponding to each region and the second resolution of the reference region includes:
S1, determining a fourth identification value corresponding to the first resolution corresponding to each region in a plurality of identification values, and a third identification value corresponding to the second resolution corresponding to the reference region in the plurality of identification values, wherein different resolutions in the plurality of resolutions correspond to different identification values in the plurality of identification values;
S2, performing a calculation on the third identification value and the fourth identification value, and determining the identification value corresponding to each region according to the calculation result.
Optionally, in this embodiment, different resolutions of the plurality of resolutions correspond to different identification values among the plurality of identification values; that is, the resolution of each region and the resolution of the reference region each correspond to an identification value among the plurality of identification values. For example, the resolution of each region corresponds to a fourth identification value, and the resolution of the reference region corresponds to a third identification value. The difference between the two may be determined as the identification value corresponding to each region, that is, the first syntax element.
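As a non-limiting illustration, deriving the first syntax element at the encoding end as a difference of identification values may be sketched as follows. The table `ID_BY_RESOLUTION`, its values, the sign convention (fourth minus third), and the function name are hypothetical.

```python
# Hypothetical mapping from resolution to identification value.
ID_BY_RESOLUTION = {(320, 180): 0, (640, 360): 1, (1280, 720): 2}


def first_syntax_element(first_resolution, second_resolution):
    """Return the identification value signalled for one region."""
    fourth_id_value = ID_BY_RESOLUTION[first_resolution]   # current region
    third_id_value = ID_BY_RESOLUTION[second_resolution]   # reference region
    # Assumed sign convention: current minus reference, so that a decoder
    # can recover the fourth value by adding the third value back.
    return fourth_id_value - third_id_value
```

Under this convention a value of 0 also conveys that the two resolutions are the same.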
As an optional scheme, after each of the plurality of regions is encoded by using a corresponding resolution of a plurality of resolutions, and encoded data corresponding to each region is obtained, the method further includes:
S1, adding a second syntax element to the encoded data corresponding to each region, wherein the second syntax element is used for indicating the position of the reference video frame.
Optionally, in this embodiment, a second syntax element may be used to indicate the position of the reference video frame, and the decoding end may determine and obtain the reference video frame according to the indication of the second syntax element. For example, the second syntax element may represent the distance between the current video frame and the reference video frame.
In an alternative embodiment, as shown in fig. 8, a current frame t is divided into 4 regions, where the resolution corresponding to each region is resolution 1, resolution 2, and resolution 2, respectively, a reference frame t-k is divided into 4 reference regions, the resolution corresponding to each reference region is resolution 1, resolution 2, and resolution 2, respectively, and then the obtained first syntax elements corresponding to the respective regions are 0, 1, 0, and 0, where 0 represents the same resolution, and 1 represents a different resolution. And the second syntax element may be determined to be k. The first syntax element and the second syntax element are added to the encoded data.
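As a non-limiting illustration, producing the per-region first syntax elements and the second syntax element for one frame may be sketched as follows, using the convention of this example that 0 represents the same resolution and 1 represents a different resolution. The function name and the sample resolution lists in the usage note are hypothetical and are not the values of fig. 8.

```python
def build_syntax_elements(current_resolutions, reference_resolutions, t, ref_t):
    """Return (first syntax elements per region, second syntax element).

    `t` and `ref_t` are the indices of the current and reference frames.
    """
    if len(current_resolutions) != len(reference_resolutions):
        raise ValueError("each region needs exactly one reference region")
    # 0: same resolution as the reference region, 1: different resolution.
    flags = [
        0 if cur == ref else 1
        for cur, ref in zip(current_resolutions, reference_resolutions)
    ]
    return flags, t - ref_t  # the distance k to the reference frame
```

With the hypothetical inputs `["res1", "res1", "res2", "res2"]` for the current frame and `["res1", "res2", "res2", "res2"]` for the reference frame, only the second region differs from its reference region.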
It should be noted that, for simplicity of description, the above-mentioned method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present invention is not limited by the order of acts, as some steps may occur in other orders or concurrently in accordance with the invention. Further, those skilled in the art should also appreciate that the embodiments described in the specification are preferred embodiments and that the acts and modules referred to are not necessarily required by the invention.
According to another aspect of the embodiments of the present invention, there is also provided a video decoding apparatus for implementing the above-described video decoding method, as shown in fig. 9, the apparatus including:
1) A first obtaining module 92, configured to obtain a video frame to be decoded, where the video frame to be decoded is divided into a plurality of regions;
2) a second obtaining module 94, configured to obtain a first syntax element carried in the to-be-decoded data corresponding to each of the multiple regions, where the first syntax element is used to indicate a relationship between a first resolution and a second resolution, where the first resolution is a resolution used for encoding the each region, the second resolution is a resolution used for encoding a reference region corresponding to the each region in a reference video frame of the to-be-decoded video frame, and the multiple resolutions used for encoding the multiple regions include at least two different resolutions;
3) A first determining module 96, configured to determine, according to the first syntax element and the second resolution, a first resolution corresponding to each of the regions;
4) a decoding module 98, configured to decode each of the plurality of regions with the first resolution corresponding to each of the plurality of regions.
Optionally, the first determining module includes:
a first determining unit, configured to determine a relationship between the second resolution and the first resolution corresponding to each region according to a relationship between the identification value corresponding to each region and the identification value corresponding to the reference region;
A second determining unit, configured to determine the first resolution corresponding to each region according to the relationship between the second resolution and the first resolution corresponding to each region, and the second resolution.
Optionally, the first determining module includes:
a third determining unit, configured to determine the second resolution as the first resolution corresponding to each region if it is determined that the identification value corresponding to each region is the first identification value, where the first identification value is used to indicate that the first resolution corresponding to each region is the same as the second resolution;
A fourth determining unit, configured to determine, as the first resolution corresponding to each region, a resolution different from the second resolution if it is determined that the identification value corresponding to each region is the second identification value, where the second identification value is used to indicate that the first resolution corresponding to each region is different from the second resolution.
Optionally, the first determining module includes:
A first obtaining unit, configured to obtain an identification value corresponding to each of the regions and a third identification value corresponding to a resolution of the reference region;
a fifth determining unit, configured to determine a sum of the identification value corresponding to each region and the third identification value as a fourth identification value corresponding to the first resolution of each region;
a second obtaining unit, configured to obtain, as the first resolution corresponding to each region, a resolution corresponding to the fourth identification value in multiple resolutions.
Optionally, the apparatus further comprises:
a third obtaining module, configured to obtain, after obtaining the video frame to be decoded, a second syntax element corresponding to each region, where the second syntax element is used to indicate a position of the reference video frame;
a second determination module to determine the reference video frame indicated by the second syntax element.
Optionally, the dividing of the video frame to be decoded into a plurality of regions comprises one of:
The plurality of regions are a plurality of video blocks obtained by dividing the video frame to be decoded based on a predetermined video coding and decoding standard;
The plurality of regions are obtained by dividing the video frame to be decoded in response to an obtained input region division instruction;
The plurality of regions are a plurality of Tile regions.
according to another aspect of the embodiments of the present invention, there is also provided a video encoding apparatus for implementing the above-described video encoding method, as shown in fig. 10, the apparatus including:
1) A fourth obtaining module 102, configured to obtain a video frame to be encoded, where the video frame to be encoded is divided into a plurality of regions;
2) An encoding module 104, configured to encode each of the multiple regions with a corresponding resolution of multiple resolutions to obtain encoded data corresponding to each region, where the multiple resolutions include at least two different resolutions;
3) A first adding module 106, configured to add a first syntax element to the encoded data corresponding to each region according to a relationship between the first resolution corresponding to each region and the second resolution of the reference region corresponding to each region in the reference video frame of the video frame to be encoded, where the first syntax element is used to indicate a resolution at which each region is encoded.
optionally, the first adding module includes:
A sixth determining unit, configured to determine an identification value corresponding to each region according to a relationship between the first resolution corresponding to each region and the second resolution of the reference region;
and an adding unit, configured to add the identification value corresponding to each region as the first syntax element to the encoded data corresponding to each region.
optionally, the sixth determining unit includes:
a first determining subunit, configured to determine, when the first resolution corresponding to each region is the same as the second resolution corresponding to the reference region, that the identification value corresponding to each region is a first identification value;
a second determining subunit, configured to determine, when the first resolution corresponding to each region is different from the second resolution corresponding to the reference region, that the identification value corresponding to each region is a second identification value.
Optionally, the sixth determining unit includes:
A third determining subunit, configured to determine a fourth identification value corresponding to the first resolution corresponding to each region in a plurality of identification values, and a third identification value corresponding to the second resolution corresponding to the reference region in the plurality of identification values, where different resolutions in the plurality of resolutions correspond to different identification values in the plurality of identification values;
and the fourth determining subunit is configured to perform an operation on the third identification value and the fourth identification value, and determine an identification value corresponding to each region from an operation result.
optionally, the apparatus further comprises:
a second adding module, configured to add a second syntax element to the encoded data corresponding to each region after the encoded data corresponding to each region is obtained by encoding each region in the plurality of regions with a corresponding resolution in the plurality of resolutions, where the second syntax element is used to indicate a position of the reference video frame.
For the application environment of this embodiment of the present invention, reference may be made to the application environment in the above embodiments, which is not described here again. The embodiment of the present invention provides an optional specific application example for implementing the above video encoding and decoding methods.
As an alternative embodiment, the above-mentioned video coding and decoding method can be applied, but not limited to, in the scenario of coding and decoding the video as shown in fig. 11. In this scenario, for the tth frame to be encoded in the video, the area in the tth frame is divided into different Tile areas, as shown in fig. 11, Tile1 area, Tile2 area, Tile3 area, and Tile4 area. The division manner in fig. 11 is only an example, and the number and the shape of the regions obtained by dividing one frame are not limited in the embodiment of the present invention, for example, one frame may be further divided into an ROI (Region of Interest) and a non-ROI.
Then, the rate-distortion cost is calculated for each Tile region at each of the different resolutions, and the resolution corresponding to the minimum rate-distortion cost is used as the resolution for that Tile region. For example, for the Tile1 region, the corresponding costs are calculated using resolution 1, resolution 2, and resolution 3 in a predetermined resolution set; if the cost corresponding to resolution 2 is the minimum, the blocks in the Tile1 region are encoded using resolution 2.
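As a non-limiting illustration, selecting the resolution with the minimum cost may be sketched as follows. The sketch assumes the commonly used Lagrangian form J = D + lambda * R for the rate-distortion cost, which the embodiment does not specify; the function names and the sample measurements are hypothetical.

```python
def rd_cost(distortion, bits, lam):
    """Assumed Lagrangian rate-distortion cost J = D + lambda * R."""
    return distortion + lam * bits


def choose_resolution(measurements, lam):
    """Return the candidate resolution with the minimum cost.

    `measurements` maps each candidate resolution to a
    (distortion, bits) pair obtained by trial-encoding the region.
    """
    if not measurements:
        raise ValueError("at least one candidate resolution is required")
    return min(measurements, key=lambda res: rd_cost(*measurements[res], lam))
```

In practice the (distortion, bits) pairs would come from trial encodes of the Tile region at each candidate resolution.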
In an optional embodiment, in order to let the decoding end know the resolution used when encoding different blocks in a video frame, a corresponding flag bit may be used during encoding to indicate the corresponding resolution. For example, for a block in a frame that is encoded using high resolution, the corresponding flag bit is set to 0, and for a block encoded using low resolution, the corresponding flag bit is set to 1. Of course, this setting is only an example, and other flag bit settings may be adopted; for example, the corresponding flag bit is set to 1 for a block encoded with high resolution and to 0 for a block encoded with low resolution.
As another optional embodiment, in order to let the decoding end know the resolution used when encoding different blocks in a video frame, a corresponding flag bit may be used during encoding to indicate the corresponding resolution. For example, if the resolution used to encode the current block is the same as the resolution used to encode the previous block, the flag bit corresponding to the current block is set to 0; if the resolution used to encode the current block is different from that used to encode the previous block, the flag bit corresponding to the current block is set to 1. Of course, this setting is only an example, and other flag bit settings may be adopted; for example, the flag bit corresponding to the current block is set to 1 if the two resolutions are the same and to 0 if they are different.
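As a non-limiting illustration, signalling each block relative to the previous block may be sketched as follows, with 0 meaning "same as the previous block" and 1 meaning "different". The sketch assumes exactly two resolutions and an initial resolution known to both ends before the first block; the function names are hypothetical.

```python
def encode_relative_flags(resolutions, initial):
    """Return one flag per block: 0 if unchanged from the previous block."""
    flags, previous = [], initial
    for resolution in resolutions:
        flags.append(0 if resolution == previous else 1)
        previous = resolution
    return flags


def decode_relative_flags(flags, initial, high, low):
    """Invert encode_relative_flags for a two-resolution set {high, low}."""
    resolutions, previous = [], initial
    for flag in flags:
        if flag == 0:
            current = previous
        else:
            # With two candidates, "different" means the other one.
            current = low if previous == high else high
        resolutions.append(current)
        previous = current
    return resolutions
```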
For the flag bit setting manner in the different embodiments, after entropy encoding, the obtained bits to be transmitted are different.
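As a non-limiting illustration of why the two flag settings may lead to different bit counts, the empirical entropy of a flag sequence can serve as a rough proxy for its size after entropy encoding. This is only an approximation, since a practical entropy coder typically uses context modelling; the function name and the sample sequences are hypothetical.

```python
import math
from collections import Counter


def empirical_entropy_bits(flags):
    """Return the zeroth-order entropy of a flag sequence, in bits per flag."""
    if not flags:
        raise ValueError("flag sequence must not be empty")
    total = len(flags)
    counts = Counter(flags)
    return -sum(n / total * math.log2(n / total) for n in counts.values())
```

For eight blocks whose first half is encoded at high resolution and second half at low resolution, absolute flags such as `[0, 0, 0, 0, 1, 1, 1, 1]` have a higher empirical entropy than the relative flags `[0, 0, 0, 0, 1, 0, 0, 0]` describing the same blocks.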
As shown in fig. 12, in the video encoding process of the present invention, different blocks in a frame of a video are adaptively encoded with their corresponding resolutions, so that the corresponding peak signal-to-noise ratio is relatively large and the distortion is relatively small both when the bandwidth is relatively small (e.g., smaller than the bandwidth threshold Th shown in fig. 12) and when the bandwidth is relatively large (e.g., larger than the bandwidth threshold Th shown in fig. 12).
In addition, since different blocks in one frame of the video are adaptively encoded with their corresponding resolutions, there is no need, when encoding the frames of the video, to select the resolution according to the intersections (e.g., the intersections in fig. 1) corresponding to different types of videos, different frames of the same video, or different blocks in the same frame, which reduces the encoding complexity.
According to still another aspect of the embodiments of the present invention, there is also provided an electronic apparatus for implementing the above method. As shown in fig. 13, the electronic apparatus includes: one or more processors 1302 (only one is shown in the figure), a memory 1304, a sensor 1306, an encoder 1308, and a transmission device 1310, where a computer program is stored in the memory, and the processor is configured to perform the steps of any of the above method embodiments by means of the computer program.
Optionally, in this embodiment, the electronic apparatus may be located in at least one network device of a plurality of network devices of a computer network.
Optionally, in this embodiment, the processor may be configured to execute the following steps by a computer program:
S1, acquiring a video frame to be decoded, wherein the video frame to be decoded is divided into a plurality of areas;
S2, obtaining a first syntax element carried in the to-be-decoded data corresponding to each of the multiple regions, where the first syntax element is used to indicate a relationship between a first resolution and a second resolution, the first resolution is a resolution used for encoding the each region, the second resolution is a resolution used for encoding a reference region corresponding to the each region in a reference video frame of the to-be-decoded video frame, and the multiple resolutions used for encoding the multiple regions include at least two different resolutions;
s3, determining a first resolution corresponding to each of the regions according to the first syntax element and the second resolution;
S4, decoding each of the plurality of regions with the first resolution corresponding to each of the plurality of regions.
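Steps S1 to S4 can be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation of the embodiment: the first syntax element is modelled as an index offset into a resolution table shared by encoder and decoder, which is only one possible form of the "relationship" between the first and second resolutions, and the wrap-around arithmetic is likewise an assumption. The names `RESOLUTIONS`, `Region`, `resolve_first_resolution` and `decode_frame` are hypothetical, and the actual decoding of the region payload is replaced by a placeholder comment.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Resolution = Tuple[int, int]  # (width, height)

# Assumed table of candidate resolutions shared by encoder and decoder.
RESOLUTIONS: List[Resolution] = [(1920, 1080), (1280, 720), (960, 540)]


@dataclass
class Region:
    region_id: int
    first_syntax_element: int  # assumed: index offset relative to the reference region
    payload: bytes


def resolve_first_resolution(syntax_element: int,
                             second_resolution: Resolution) -> Resolution:
    """S3: derive the region's resolution from the syntax element and the
    resolution of its reference region."""
    ref_index = RESOLUTIONS.index(second_resolution)
    return RESOLUTIONS[(ref_index + syntax_element) % len(RESOLUTIONS)]


def decode_frame(regions: List[Region],
                 reference_resolutions: Dict[int, Resolution]) -> Dict[int, Resolution]:
    """Run S1-S4 for one frame and return the resolution chosen per region."""
    chosen = {}
    for region in regions:                      # S1: frame already split into regions
        element = region.first_syntax_element   # S2: read the first syntax element
        second = reference_resolutions[region.region_id]
        first = resolve_first_resolution(element, second)  # S3
        # S4: a real decoder would decode region.payload at `first` here.
        chosen[region.region_id] = first
    return chosen


frame = [Region(0, 0, b""), Region(1, 2, b""), Region(2, 1, b"")]
reference = {0: (1920, 1080), 1: (1280, 720), 2: (1280, 720)}
result = decode_frame(frame, reference)
```

In this sketch a syntax element of 0 means "same resolution as the reference region", so a region that does not change resolution costs the smallest possible value to signal.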
Optionally, those skilled in the art will understand that the structure shown in Fig. 13 is only illustrative, and the electronic device may also be a terminal device such as a smart phone (e.g., an Android phone, an iOS phone, etc.), a tablet computer, a palmtop computer, a Mobile Internet Device (MID), or a PAD. Fig. 13 does not limit the structure of the electronic device. For example, the electronic device may also include more or fewer components (e.g., network interfaces, display devices, etc.) than shown in Fig. 13, or have a different configuration from that shown in Fig. 13.
The memory 1304 may be used to store software programs and modules, such as the program instructions/modules corresponding to the video decoding method and apparatus in the embodiments of the present invention. The processor 1302 executes various functional applications and data processing by running the software programs and modules stored in the memory 1304, that is, it implements the video decoding method described above. The memory 1304 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 1304 may further include memory located remotely from the processor 1302, which can be connected to the terminal over a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The transmission device 1310 is used for receiving or transmitting data via a network. Examples of the network include wired networks and wireless networks. In one example, the transmission device 1310 includes a Network Interface Controller (NIC), which can be connected to a router and other network devices via a network cable so as to communicate with the internet or a local area network. In one example, the transmission device 1310 is a Radio Frequency (RF) module, which is used for communicating with the internet wirelessly.
Specifically, the memory 1304 is used for storing application programs.
Embodiments of the present invention also provide a storage medium having a computer program stored therein, wherein the computer program is arranged to perform the steps of any of the above method embodiments when executed.
Optionally, in the present embodiment, the storage medium may be configured to store a computer program for executing the following steps:
S1, acquiring a video frame to be decoded, wherein the video frame to be decoded is divided into a plurality of regions;
S2, obtaining a first syntax element carried in the to-be-decoded data corresponding to each of the plurality of regions, wherein the first syntax element is used to indicate a relationship between a first resolution and a second resolution, the first resolution is the resolution used for encoding that region, the second resolution is the resolution used for encoding the reference region corresponding to that region in a reference video frame of the video frame to be decoded, and the resolutions used for encoding the plurality of regions include at least two different resolutions;
S3, determining the first resolution corresponding to each region according to the first syntax element and the second resolution;
S4, decoding each of the plurality of regions at the first resolution corresponding to that region.
Optionally, the storage medium is further configured to store a computer program for executing the steps included in the methods of the foregoing embodiments, which are not described again in this embodiment.
Optionally, in this embodiment, a person skilled in the art will understand that all or part of the steps in the methods of the foregoing embodiments may be implemented by a program instructing hardware associated with the terminal device, where the program may be stored in a computer-readable storage medium, and the storage medium may include: flash disks, Read-Only Memories (ROMs), Random Access Memories (RAMs), magnetic or optical disks, and the like.
The above serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
The integrated unit in the above embodiments, if implemented in the form of a software functional unit and sold or used as a separate product, may be stored in the above computer-readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing one or more computer devices (which may be personal computers, servers, network devices, etc.) to execute all or part of the steps of the methods according to the embodiments of the present invention.
In the above embodiments of the present invention, the description of each embodiment has its own emphasis; for parts that are not described in detail in a certain embodiment, reference may be made to the related descriptions of other embodiments.
In the several embodiments provided in the present application, it should be understood that the disclosed client may be implemented in other manners. The above-described apparatus embodiments are merely illustrative. For example, the division of the units is only a division by logical function, and other divisions are possible in an actual implementation: a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, units or modules, and may be in an electrical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit. The integrated unit can be realized in the form of hardware, or in the form of a software functional unit.
The foregoing is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make various improvements and refinements without departing from the principle of the present invention, and these improvements and refinements should also be regarded as falling within the protection scope of the present invention.

Claims (15)

1. A video decoding method, comprising:
acquiring a video frame to be decoded, wherein the video frame to be decoded is divided into a plurality of regions;
acquiring a first syntax element corresponding to each of the plurality of regions, wherein the first syntax element is used for indicating a relationship between a first resolution and a second resolution, the first resolution is a resolution used for encoding each region, the second resolution is a resolution used for encoding a reference region corresponding to each region in a reference video frame of the video frame to be decoded, and the plurality of resolutions used for encoding the plurality of regions include at least two different resolutions;
determining a first resolution corresponding to each region according to the first syntax element and the second resolution;
and decoding each region in the plurality of regions with the first resolution corresponding to each region.
2. The method of claim 1, wherein determining the first resolution for each region according to the first syntax element and the second resolution comprises:
determining the relationship between the second resolution and the first resolution corresponding to each region according to the relationship between the identification value corresponding to each region and the identification value corresponding to the reference region;
and determining the first resolution corresponding to each region according to the second resolution and the relationship between the second resolution and the first resolution corresponding to each region.
3. The method of claim 1, wherein determining the first resolution for each region according to the first syntax element and the second resolution comprises:
determining the second resolution as a first resolution corresponding to each region in the case that the identification value corresponding to each region is determined to be a first identification value, wherein the first identification value is used for indicating that the first resolution corresponding to each region is the same as the second resolution;
determining a resolution different from the second resolution as the first resolution corresponding to each region in the case that the identification value corresponding to each region is determined to be the second identification value, wherein the second identification value is used for indicating that the first resolution corresponding to each region is different from the second resolution.
4. The method of claim 1, wherein determining the first resolution for each region according to the first syntax element and the second resolution comprises:
acquiring an identification value corresponding to each region and a third identification value corresponding to the second resolution;
calculating the identification value corresponding to each region and the third identification value, and determining a calculation result as a fourth identification value corresponding to the resolution of each region;
and acquiring the resolution corresponding to the fourth identification value in a plurality of resolutions as the first resolution corresponding to each region.
5. The method of claim 1, wherein after obtaining the video frame to be decoded, the method further comprises:
obtaining a second syntax element corresponding to each region, wherein the second syntax element is used for indicating the position of the reference video frame;
determining the reference video frame indicated by the second syntax element.
6. The method of claim 1, wherein the dividing of the video frame to be decoded into the plurality of regions comprises one of:
the plurality of regions are a plurality of video blocks obtained by dividing the video frame to be decoded based on a predetermined video coding and decoding standard;
the plurality of regions are obtained by dividing the video frame to be decoded in response to the obtained input region dividing instruction;
the plurality of regions are a plurality of Tile regions.
7. A video encoding method, comprising:
acquiring a video frame to be encoded, wherein the video frame to be encoded is divided into a plurality of regions;
encoding each of the plurality of regions by using a corresponding resolution of a plurality of resolutions to obtain encoded data corresponding to each region, wherein the plurality of resolutions include at least two different resolutions;
adding a first syntax element to the coded data corresponding to each region according to the relationship between the first resolution corresponding to each region and the second resolution of the reference region corresponding to each region in the reference video frame of the video frame to be coded, wherein the first syntax element is used for indicating the resolution adopted for coding each region.
8. The method of claim 7, wherein adding a first syntax element to the encoded data corresponding to each region according to a relationship between the first resolution corresponding to each region and the second resolution of the reference region corresponding to each region in the reference video frame of the video frame to be encoded comprises:
determining an identification value corresponding to each region according to a relation between the first resolution corresponding to each region and the second resolution of the reference region;
and adding the identification value corresponding to each region as the first syntax element into the coded data corresponding to each region.
9. The method of claim 8, wherein determining the identification value corresponding to each region according to the relationship between the first resolution corresponding to each region and the second resolution of the reference region comprises:
in the case that the first resolution corresponding to each region is the same as the second resolution corresponding to the reference region, determining the identification value corresponding to each region as a first identification value;
and in the case that the first resolution corresponding to each region is different from the second resolution corresponding to the reference region, determining the identification value corresponding to each region as a second identification value.
10. The method according to claim 8, wherein determining the identification value corresponding to each region according to the relationship between the resolution corresponding to each region and the resolution of the reference region comprises:
determining a fourth identification value corresponding to the first resolution corresponding to each region in a plurality of identification values, and a third identification value corresponding to the second resolution corresponding to the reference region in the plurality of identification values, wherein different resolutions in the plurality of resolutions correspond to different identification values in the plurality of identification values;
and calculating the third identification value and the fourth identification value, and determining the calculation result as the identification value corresponding to each region.
11. The method of claim 6, wherein after encoding each of the plurality of regions at a corresponding one of a plurality of resolutions to obtain encoded data corresponding to each of the regions, the method further comprises:
adding a second syntax element to the coded data corresponding to each region, wherein the second syntax element is used for indicating the position of the reference video frame.
12. A video decoding apparatus, comprising:
a first obtaining module, configured to acquire a video frame to be decoded, wherein the video frame to be decoded is divided into a plurality of regions;
a second obtaining module, configured to obtain a first syntax element carried in data to be decoded corresponding to each of the multiple regions, where the first syntax element is used to indicate a relationship between a first resolution and a second resolution, the first resolution is a resolution used for encoding each region, the second resolution is a resolution used for encoding a reference region corresponding to each region in a reference video frame of the video frame to be decoded, and a plurality of resolutions used for encoding the multiple regions include at least two different resolutions;
a first determining module, configured to determine, according to the first syntax element and the second resolution, a first resolution corresponding to each of the regions;
a decoding module, configured to decode each of the plurality of regions at the first resolution corresponding to that region.
13. A video encoding apparatus, comprising:
a fourth obtaining module, configured to acquire a video frame to be encoded, wherein the video frame to be encoded is divided into a plurality of regions;
an encoding module, configured to encode each of the plurality of regions by using a corresponding resolution of a plurality of resolutions to obtain encoded data corresponding to each of the plurality of regions, where the plurality of resolutions include at least two different resolutions;
a first adding module, configured to add a first syntax element to the encoded data corresponding to each region according to a relationship between the first resolution corresponding to each region and a second resolution of a reference region corresponding to each region in a reference video frame of the video frame to be encoded, where the first syntax element is used to indicate a resolution at which each region is encoded.
14. A storage medium, in which a computer program is stored, wherein the computer program is arranged to perform the method of any of claims 1 to 10 when executed.
15. An electronic device comprising a memory and a processor, characterized in that the memory has stored therein a computer program, the processor being arranged to execute the method of any of claims 1 to 10 by means of the computer program.
CN201910927099.2A 2019-09-27 2019-09-27 Video decoding method and device, video encoding method and device Active CN110545431B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910927099.2A CN110545431B (en) 2019-09-27 2019-09-27 Video decoding method and device, video encoding method and device

Publications (2)

Publication Number Publication Date
CN110545431A (en) 2019-12-06
CN110545431B (en) 2023-10-24

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant