CN117412064A - Video decoding method, device, electronic equipment and computer storage medium - Google Patents


Info

Publication number
CN117412064A
Authority
CN
China
Prior art keywords
video
decoded
size representation
value
size
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311016906.8A
Other languages
Chinese (zh)
Inventor
李志强
蓝得标
周武君
康伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen TCL Digital Technology Co Ltd
Original Assignee
Shenzhen TCL Digital Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen TCL Digital Technology Co Ltd
Priority to CN202311016906.8A
Publication of CN117412064A
Legal status: Pending

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/44 Decoders specially adapted therefor, e.g. video decoders which are asymmetric with respect to the encoder
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/172 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/90 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using coding techniques not provided for in groups H04N19/10-H04N19/85, e.g. fractals

Abstract

The embodiment of the application discloses a video decoding method, a video decoding device, electronic equipment and a computer storage medium; in the embodiment of the application, a video to be decoded is obtained, and a plurality of size representation values corresponding to a MaxTbLog2SizeY standard are obtained; screening out the size representation values meeting the preset size conditions from the plurality of size representation values to obtain target size representation values; taking the target size representation value as a predicted value of MaxTbLog2SizeY of the video to be decoded; and decoding the video to be decoded according to the predicted value of the MaxTbLog2SizeY. The embodiment of the application can improve the flexibility of video decoding.

Description

Video decoding method, device, electronic equipment and computer storage medium
Technical Field
The present invention relates to the field of video processing technologies, and in particular, to a video decoding method, a video decoding device, an electronic device, and a computer storage medium.
Background
As video can record and transmit various contents, the application field of video is increasing. When transmitting video, in order to reduce the bandwidth used for video transmission, the video is encoded by an encoding algorithm and then transmitted.
For example, video may be encoded by High Efficiency Video Coding (HEVC), which produces an encoded video stream. To decode the stream, the decoder first determines the value corresponding to MaxTbLog2SizeY in the encoded video stream and then decodes the encoded video stream using that value. However, because decoding can proceed only after the value corresponding to MaxTbLog2SizeY has been obtained from the encoded video stream, the flexibility of decoding is low.
Disclosure of Invention
The embodiment of the application provides a video decoding method, a video decoding device, electronic equipment and a computer storage medium, which can solve the technical problem that decoding flexibility is low because decoding can be performed only after the value corresponding to MaxTbLog2SizeY has been acquired from the encoded video stream.
The embodiment of the application provides a video decoding method, which comprises the following steps:
acquiring a video to be decoded and acquiring a plurality of size representation values corresponding to a MaxTbLog2SizeY standard;
screening out the size representation values meeting the preset size conditions from the plurality of size representation values to obtain target size representation values;
taking the target size representation value as a predicted value of the MaxTbLog2SizeY of the video to be decoded;
and decoding the video to be decoded according to the predicted value of the MaxTbLog2SizeY.
Accordingly, an embodiment of the present application provides a video decoding apparatus, including:
the acquisition module is used for acquiring videos to be decoded and acquiring a plurality of size representation values corresponding to the MaxTbLog2SizeY standard;
the screening module is used for screening the size representation values meeting the preset size conditions from the plurality of size representation values to obtain target size representation values;
the prediction module is used for taking the target size representation value as a predicted value of the MaxTbLog2SizeY of the video to be decoded;
and the decoding module is used for decoding the video to be decoded according to the predicted value of the MaxTbLog2SizeY.
In addition, the embodiment of the application also provides electronic equipment, which comprises a processor and a memory, wherein the memory stores a computer program, and the processor is used for running the computer program in the memory to realize the video decoding method provided by the embodiment of the application.
In addition, the embodiment of the application further provides a computer readable storage medium, and the computer readable storage medium stores a computer program, and the computer program is suitable for being loaded by a processor to execute any video decoding method provided by the embodiment of the application.
In addition, the embodiment of the application further provides a computer program product, which comprises a computer program, and the computer program realizes any video decoding method provided by the embodiment of the application when being executed by a processor.
In the embodiment of the application, a video to be decoded is obtained, and a plurality of size representation values corresponding to the MaxTbLog2SizeY standard are obtained; a size representation value meeting a preset size condition is screened out from the plurality of size representation values to obtain a target size representation value; the target size representation value is taken as a predicted value of MaxTbLog2SizeY of the video to be decoded; and the video to be decoded is decoded according to the predicted value of MaxTbLog2SizeY. The video to be decoded can therefore be decoded without obtaining the true value corresponding to MaxTbLog2SizeY from it, which adds a further way of decoding the video to be decoded and improves the flexibility of decoding.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the description of the embodiments will be briefly introduced below, it being obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a schematic flow chart of a video decoding method according to an embodiment of the present application;
Fig. 2 is a schematic diagram of a transform block provided by an embodiment of the present application;
Fig. 3 is a flowchart of another video decoding method according to an embodiment of the present application;
Fig. 4 is a schematic structural diagram of a video decoding apparatus according to an embodiment of the present application;
Fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The following description of the embodiments of the present application will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are only some, but not all, of the embodiments of the present application. All other embodiments, which can be made by those skilled in the art based on the embodiments herein without making any inventive effort, are intended to be within the scope of the present application.
The embodiment of the application provides a video decoding method, a video decoding device, electronic equipment and a computer storage medium. The video decoding device may be integrated in an electronic apparatus, which may be a server or a terminal.
The server may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, content delivery network (CDN) services, and big data and artificial intelligence platforms.
The terminal may be, but is not limited to, a smart phone, a tablet computer, a notebook computer, a desktop computer, a smart speaker, a smart watch, etc. The terminal and the server may be directly or indirectly connected through wired or wireless communication, which is not limited herein.
In addition, "plurality" in the embodiments of the present application means two or more. "first" and "second" and the like in the embodiments of the present application are used for distinguishing descriptions and are not to be construed as implying relative importance.
Detailed descriptions are given below. The order in which the following embodiments are described is not intended to limit the preferred order of the embodiments.
In the present embodiment, description will be made from the viewpoint of a video decoding apparatus, and for convenience of explanation of the video decoding method of the present application, a detailed explanation will be made below with the video decoding apparatus integrated in a terminal, that is, with the terminal as an execution subject.
Referring to fig. 1, fig. 1 is a flowchart of a video decoding method according to an embodiment of the present application. The video decoding method may include:
S101, acquiring a video to be decoded and acquiring a plurality of size representation values corresponding to the MaxTbLog2SizeY standard.
The video to be decoded may be a video obtained by encoding an original video with a transform-block-based coding method. The coding method may be selected according to the practical situation; for example, it may be High Efficiency Video Coding (HEVC) or Versatile Video Coding (VVC), which is not limited herein.
For example, when an efficient video encoding and decoding method is adopted, the process of encoding the original video may be:
Each video frame in the original video is segmented by the video coding layer (Video Coding Layer, VCL) to obtain the coding tree blocks (Coding Tree Blocks, CTB) corresponding to each video frame; the coding tree blocks are divided to obtain coding blocks (Coding Blocks, CB); the coding blocks are divided to obtain prediction blocks (Prediction Blocks, PB) and transform blocks (Transform Blocks, TB); the video frames are then encoded according to the prediction blocks and the transform blocks to obtain a candidate encoded video; and finally the candidate encoded video is packaged into different NAL units by the network abstraction layer (Network Abstraction Layer, NAL) to obtain the video to be decoded (each VPS, SPS, PPS, SEI, I frame or P frame can be called a NAL unit, NALU).
In the process of dividing a video frame to obtain transform blocks, the video frame may be divided according to the plurality of size representation values corresponding to the MaxTbLog2SizeY standard. For example, when the original video is encoded with HEVC, the plurality of size representation values corresponding to the MaxTbLog2SizeY standard may be 2, 3, 4 and 5, where 2 corresponds to 4×4 pixels, 3 corresponds to 8×8 pixels, 4 corresponds to 16×16 pixels, and 5 corresponds to 32×32 pixels; that is, a value n corresponds to a transform block of 2^n × 2^n pixels. The video frames of the original video may then be divided using at least one of 4×4 pixels, 8×8 pixels, 16×16 pixels, and 32×32 pixels.
Alternatively, the plurality of size representation values corresponding to the MaxTbLog2SizeY standard may include 2, 3, 4, and 5. The size of the transform block corresponding to 5 is 32×32 pixels, the size of the transform block corresponding to 4 is 16×16 pixels, the size of the transform block corresponding to 3 is 8×8 pixels, and the size of the transform block corresponding to 2 is 4×4 pixels.
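Because each size representation value is a base-2 logarithm of the transform block side length, the pixel size can be recovered with a left shift. The following is a minimal Python sketch of that correspondence; the names `SIZE_REPRESENTATION_VALUES` and `transform_block_side` are illustrative only and do not come from the application.

```python
# Size representation values listed for HEVC in the description above.
SIZE_REPRESENTATION_VALUES = (2, 3, 4, 5)


def transform_block_side(size_value: int) -> int:
    """Return the side length in pixels for a log2 size representation value."""
    if size_value not in SIZE_REPRESENTATION_VALUES:
        raise ValueError(f"unsupported size representation value: {size_value}")
    # A value n corresponds to a block of 2**n by 2**n pixels.
    return 1 << size_value


for value in SIZE_REPRESENTATION_VALUES:
    side = transform_block_side(value)
    print(f"{value} -> {side}x{side} pixels")
```

Running the loop lists the four pairs given in the text, from 4x4 pixels for 2 up to 32x32 pixels for 5.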
S102, screening out the size representation values meeting the preset size conditions from the plurality of size representation values to obtain a target size representation value.
The preset size condition may be set according to an actual situation, for example, a largest size representation value of the plurality of size representation values may be used as a size representation value that satisfies the preset size condition, or a smallest size representation value of the plurality of size representation values may be used as a size representation value that satisfies the preset size condition, or one size representation value of the plurality of size representation values may be randomly used as a size representation value that satisfies the preset size condition, which is not limited herein.
When the largest size representation value in the plurality of size representation values is used as the size representation value meeting the preset size condition, the process of screening the size representation value meeting the preset size condition from the plurality of size representation values to obtain the target size representation value can be as follows:
screening out the largest size representation value from the plurality of size representation values;
and taking the maximum size representation value as a size representation value meeting the preset size condition to obtain a target size representation value.
For example, when the original video is encoded by using the efficient video encoding and decoding method, the largest size representation value of the plurality of size representation values is 5, and then the 5 is taken as the target size representation value.
Because the actual value of MaxTbLog2SizeY in a video is usually 5, in the embodiment of the present application the largest size representation value among the plurality of size representation values is used as the size representation value meeting the preset size condition to obtain the target size representation value, so that decoding the video to be decoded according to the largest size representation value is likely to succeed.
When the size representation value is represented in the form of a pixel, one size representation value may include a height pixel size and a width pixel size, and in this case, in the process of screening out the maximum size representation value, the maximum height pixel size and/or the maximum width pixel size may be used as the maximum size representation value.
When the size representation value in the plurality of size representation values is randomly used as the size representation value meeting the preset size condition, the process of screening the size representation value meeting the preset size condition from the plurality of size representation values to obtain the target size representation value can be as follows:
screening out the size representation values from the plurality of size representation values according to a random rule to obtain candidate size representation values;
and taking the candidate size representation value as a size representation value meeting the preset size condition to obtain a target size representation value.
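The three preset size conditions mentioned above (largest, smallest, random) can be sketched as a single screening function. This is a minimal illustration under stated assumptions: the function name `screen_target_size_value`, the string labels for the conditions, and the optional `rng` parameter are all hypothetical and are not defined by the application.

```python
import random
from typing import Optional, Sequence


def screen_target_size_value(
    size_values: Sequence[int],
    condition: str = "largest",
    rng: Optional[random.Random] = None,
) -> int:
    """Pick the target size representation value under a preset size condition."""
    if not size_values:
        raise ValueError("no size representation values to screen")
    if condition == "largest":
        return max(size_values)
    if condition == "smallest":
        return min(size_values)
    if condition == "random":
        # Use the caller's generator if given, else the module-level one.
        chooser = rng if rng is not None else random
        return chooser.choice(list(size_values))
    raise ValueError(f"unknown preset size condition: {condition}")


# The predicted value of MaxTbLog2SizeY is simply the screened target value.
predicted_max_tb_log2_size_y = screen_target_size_value([2, 3, 4, 5])
```

With the default condition, the sketch returns 5 for the HEVC values 2, 3, 4 and 5, matching the example in the text.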
S103, taking the target size representation value as a predicted value corresponding to MaxTbLog2SizeY of the video to be decoded.
After obtaining the target size representation value, the terminal takes the target size representation value as a predicted value corresponding to the MaxTbLog2SizeY of the video to be decoded, namely, the target size representation value is assigned to the MaxTbLog2SizeY.
S104, decoding the video to be decoded according to the predicted value of the MaxTbLog2SizeY.
After obtaining the predicted value of MaxTbLog2SizeY of the video to be decoded, the terminal can decode the video to be decoded according to the predicted value of MaxTbLog2SizeY of the video to be decoded.
For example, as shown in fig. 2, suppose the actual value of MaxTbLog2SizeY of the video to be decoded is 5, that is, the size of the corresponding transform block of the video to be decoded is 32×32 pixels, and the size representation values are 2, 3, 4 and 5, where 2 corresponds to 4×4 pixels, 3 corresponds to 8×8 pixels, 4 corresponds to 16×16 pixels, and 5 corresponds to 32×32 pixels. When the largest of the plurality of size representation values is used as the target size representation value, the predicted value of MaxTbLog2SizeY is 5, which is the same as the actual value of MaxTbLog2SizeY, so decoding the video to be decoded with the predicted value of MaxTbLog2SizeY succeeds.
Optionally, the video to be decoded includes a plurality of video key frames to be decoded (key frames are also referred to as I frames). When the video to be decoded is decoded according to the predicted value of MaxTbLog2SizeY, all of the video key frames to be decoded may be decoded according to the same predicted value, that is, the predicted value of MaxTbLog2SizeY is the same for every video key frame to be decoded. Alternatively, a target video key frame to be decoded may be screened out from the video to be decoded and then decoded according to the predicted value of MaxTbLog2SizeY; in this case, the predicted values of MaxTbLog2SizeY of the individual video key frames to be decoded may be the same or different, and the process of screening out the size representation value meeting the preset size condition from the plurality of size representation values to obtain the target size representation value may be:
screening target video key frames to be decoded from the video to be decoded;
screening size characterization values meeting preset size conditions from the plurality of size characterization values to obtain target size characterization values aiming at target video key frames to be decoded;
the target size representation value is used as a predicted value corresponding to MaxTbLog2SizeY of the video to be decoded, and the video to be decoded is decoded according to the predicted value of MaxTbLog2SizeY, which comprises the following steps:
taking the target size representation value as a numerical value of MaxTbLog2SizeY of a target video key frame to be decoded;
and decoding the target video key frame to be decoded according to the predicted value of the MaxTbLog2SizeY.
At this time, if the target video key frame to be decoded is successfully decoded, taking the next video key frame of the target video key frame to be decoded in the video to be decoded as the target video key frame to be decoded, and returning to execute the step of screening the size representation value meeting the preset size condition from the multiple size representation values to obtain the target size representation value for the target video key frame to be decoded until all the video frames to be decoded in the video to be decoded are successfully decoded.
In some embodiments, the process of screening the size representation values satisfying the preset size condition from the plurality of size representation values to obtain the target size representation value for the target video key frame to be decoded may be:
judging whether a target video key frame to be decoded has a previous video key frame in the video to be decoded or not;
if the target video key frame to be decoded has no previous video key frame in the video to be decoded, screening out the size representation value meeting the preset size condition from the plurality of size representation values to obtain the target size representation value for the target video key frame to be decoded.
The previous video key frame is the key frame that precedes and is closest to the target video key frame to be decoded.
For example, suppose the key frames in the video to be decoded are the 1st, 10th and 20th video frames to be decoded. If the target video key frame to be decoded is the 1st video frame to be decoded, it has no previous video key frame in the video to be decoded. If the target video key frame to be decoded is the 10th video frame to be decoded, it has a previous video key frame in the video to be decoded, namely the 1st video frame to be decoded (which has already been successfully decoded at this point).
In the embodiment of the application, when the target video key frame to be decoded does not have the previous video key frame in the video to be decoded, the size representation value meeting the preset size condition is screened from a plurality of size representation values, and the target size representation value is obtained.
At this time, after determining whether the target video key frame to be decoded has the previous video key frame in the video to be decoded, the method may further include:
if the target video key frame to be decoded has the previous video key frame in the video to be decoded, acquiring a predicted value of MaxTbLog2SizeY of the previous video key frame;
and taking the predicted value of MaxTbLog2SizeY of the previous video key frame as the predicted value of MaxTbLog2SizeY of the target video key frame to be decoded.
For example, if the target video key frame to be decoded is the 10th video frame to be decoded in the video to be decoded, it has a previous video key frame in the video to be decoded, namely the 1st video frame to be decoded. If the predicted value of MaxTbLog2SizeY with which the 1st video frame to be decoded was successfully decoded is 5, then 5 is taken as the target size representation value for the target video key frame to be decoded.
Because the actual value of MaxTbLog2SizeY of the target video key frame to be decoded may be the same as that of the previous video key frame, in this embodiment, when the target video key frame to be decoded has no previous video key frame in the video to be decoded, the size representation value meeting the preset size condition is screened out from the plurality of size representation values to obtain the target size representation value for the target video key frame to be decoded; when the target video key frame to be decoded has a previous video key frame in the video to be decoded, the predicted value of MaxTbLog2SizeY of the previous video key frame is obtained and used as the predicted value of MaxTbLog2SizeY of the target video key frame to be decoded. The target video key frame to be decoded is then more likely to be decoded successfully when it is subsequently decoded according to its predicted value of MaxTbLog2SizeY.
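The branch on whether a previous video key frame exists can be sketched as follows. This is only an illustration: the function `predict_for_key_frame`, its parameters, and the dictionary of earlier predictions are hypothetical names, the largest-value condition is assumed for the screening step, and falling back to screening when the previous key frame has no recorded prediction is an assumption not stated in the text.

```python
from typing import Dict, Optional, Sequence


def predict_for_key_frame(
    key_frame_index: int,
    key_frame_indices: Sequence[int],
    decoded_predictions: Dict[int, int],
    size_values: Sequence[int] = (2, 3, 4, 5),
) -> int:
    """Predict MaxTbLog2SizeY for one key frame.

    key_frame_indices: frame numbers of the key frames, in ascending order.
    decoded_predictions: predicted values of key frames already decoded.
    """
    earlier = [i for i in key_frame_indices if i < key_frame_index]
    previous: Optional[int] = earlier[-1] if earlier else None
    if previous is not None and previous in decoded_predictions:
        # A previous key frame exists: reuse its predicted value.
        return decoded_predictions[previous]
    # No previous key frame (or, by assumption, no recorded prediction):
    # screen the values, here with the largest-value condition.
    return max(size_values)


key_frames = [1, 10, 20]
first = predict_for_key_frame(1, key_frames, {})           # screened value
tenth = predict_for_key_frame(10, key_frames, {1: first})  # reused value
```

In the example from the text, frame 1 has no previous key frame and receives the screened value 5, and frame 10 then reuses that value.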
Optionally, since the prediction value of MaxTbLog2SizeY may be incorrect, resulting in decoding failure, in some embodiments, decoding the video to be decoded according to the prediction value of MaxTbLog2SizeY includes:
decoding the video to be decoded according to the predicted value of MaxTbLog2SizeY;
if decoding fails, returning to execute the step of screening the size representation value meeting the preset size condition from the plurality of size representation values to obtain the target size representation value until the video to be decoded is successfully decoded.
For example, suppose the plurality of size representation values are 2, 3, 4 and 5, the largest size representation value is taken as the target size representation value, and the true size of the largest transform block is 16×16 pixels. On the first decoding attempt, 5 is taken as the target size representation value, that is, the predicted value of MaxTbLog2SizeY is 5, and the video to be decoded is decoded according to 5. When this decoding fails, 4 is taken as the target size representation value, that is, the predicted value of MaxTbLog2SizeY is 4, and the video to be decoded is decoded according to 4. This decoding succeeds, and the process stops.
If decoding fails, the step of returning to execute the step of screening the size representation value meeting the preset size condition from the plurality of size representation values to obtain the target size representation value until the video to be decoded is successfully decoded may be:
if decoding fails, taking the first size representation values as the plurality of size representation values, wherein the first size representation values are the size representation values other than the target size representation value;
and returning to execute the step of screening the size representation values meeting the preset size condition from the plurality of size representation values to obtain the target size representation value, so as to screen the target size representation value from the plurality of size representation values according to the rule from large to small until the video to be decoded is successfully decoded.
When decoding fails, the first size representation values, that is, the size representation values other than the target size representation value, are taken as the plurality of size representation values, and the step of screening out the size representation value meeting the preset size condition from the plurality of size representation values to obtain the target size representation value is executed again until the video to be decoded is successfully decoded. In this way, target size representation values are screened out from the plurality of size representation values in order from large to small.
For example, suppose the plurality of size representation values are 2, 3, 4 and 5, the largest size representation value is taken as the target size representation value, and the true value of MaxTbLog2SizeY is 4. On the first decoding attempt, 5 is taken as the target size representation value, that is, the predicted value of MaxTbLog2SizeY is 5, and decoding is performed according to 5. If this decoding fails, the first size representation values are 2, 3 and 4, and 4 is taken as the target size representation value, that is, the predicted value of MaxTbLog2SizeY is now 4. Decoding is then performed according to 4, succeeds, and the process stops.
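The retry procedure can be sketched as a loop that removes each failed target value and screens again. This is a minimal sketch under assumptions: `decode_with_retries` and the `try_decode` callback are hypothetical names, and the stand-in decoder below simply succeeds when the predicted value equals an assumed true value, which is a simplification of how a real decoder detects failure.

```python
from typing import Callable, List, Optional, Sequence


def decode_with_retries(
    size_values: Sequence[int],
    try_decode: Callable[[int], bool],
) -> Optional[int]:
    """Try predicted values from largest to smallest until decoding succeeds.

    try_decode is a hypothetical callback that decodes with the given
    predicted MaxTbLog2SizeY and returns True on success.
    Returns the value that worked, or None if every value failed.
    """
    remaining: List[int] = list(size_values)
    while remaining:
        target = max(remaining)   # screen the largest remaining value
        if try_decode(target):
            return target
        remaining.remove(target)  # keep only the other values and retry
    return None


# Example from the text: the true value is 4, so 5 fails and 4 succeeds.
attempts: List[int] = []


def fake_decoder(predicted: int) -> bool:
    attempts.append(predicted)
    return predicted == 4


result = decode_with_retries([2, 3, 4, 5], fake_decoder)
print(result, attempts)  # prints: 4 [5, 4]
```

Removing the failed value before screening again is what produces the large-to-small order without any explicit sorting.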
Optionally, when the target video key frame to be decoded is decoded according to the predicted value of MaxTbLog2SizeY, the process of decoding the video to be decoded according to the predicted value of MaxTbLog2SizeY and, if decoding fails, returning to execute the step of screening out the size representation value meeting the preset size condition from the plurality of size representation values to obtain the target size representation value until the video to be decoded is successfully decoded may be:
decoding the target video key frame to be decoded according to the predicted value of MaxTbLog2SizeY;
if decoding fails, returning to execute the step of screening the size representation value meeting the preset size condition from the plurality of size representation values to obtain the target size representation value of the target video key frame to be decoded until the video to be decoded is successfully decoded.
At this time, if the target video key frame to be decoded is successfully decoded, taking the next video key frame of the target video key frame to be decoded in the video to be decoded as the target video key frame to be decoded, and returning to execute the step of screening the size representation value meeting the preset size condition from the multiple size representation values to obtain the target size representation value aiming at the target video key frame to be decoded.
If decoding fails, the step of returning to execute the step of screening the size representation value meeting the preset size condition from the plurality of size representation values to obtain the target size representation value of the target video key frame to be decoded until the video to be decoded is successfully decoded may be:
if decoding fails, taking the first size representation values as the plurality of size representation values, wherein the first size representation values are the size representation values other than the target size representation value;
and returning to execute the step of screening the size representation values meeting the preset size condition from the plurality of size representation values to obtain the target size representation values of the target video key frames to be decoded, so as to screen the target size representation values from the plurality of size representation values according to the rule from large to small until the target video key frames to be decoded are successfully decoded.
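The retry described above can be sketched as follows. This is only an illustration under stated assumptions, not the applicant's implementation: `try_decode` is a hypothetical callback standing in for a real decoder and is assumed to return `True` on success, and the function name is invented for this example.

```python
from typing import Callable, Iterable, Optional


def decode_key_frame_with_retry(
    key_frame: bytes,
    size_values: Iterable[int],
    try_decode: Callable[[bytes, int], bool],  # hypothetical decoder callback
) -> Optional[int]:
    """Return the MaxTbLog2SizeY prediction that decoded the key frame, or None."""
    remaining = set(size_values)
    while remaining:
        # Screening step: take the largest remaining value (large-to-small rule).
        target = max(remaining)
        if try_decode(key_frame, target):
            return target
        # On failure, keep only the values other than the failed target value.
        remaining.discard(target)
    return None
```

Removing the failed value from the pool before screening again is what makes the repeated "largest value" selection walk through the candidates in descending order.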
In other embodiments, to improve the user experience, after decoding the target video key frame to be decoded according to the prediction value of MaxTbLog2SizeY, the method may further include:
and if decoding of the target video key frame to be decoded according to the predicted value of MaxTbLog2SizeY fails, displaying preset display content.
When decoding fails, the content corresponding to the target video key frame to be decoded cannot be displayed, so preset display content can be displayed instead, which improves the user experience.
The preset display content may be selected according to the actual situation. For example, it may be a black screen, the picture corresponding to the previous video frame of the target video key frame to be decoded, or a pause screen, which is not limited herein.
The previous video frame of the target video key frame to be decoded refers to the video frame that immediately precedes the target video key frame to be decoded. For example, if the target video key frame to be decoded is the 10th video frame in the video to be decoded, the previous video frame is the 9th video frame in the video to be decoded.
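As a rough illustration of how the preset display content might be chosen, the sketch below maps a decoding result to what is shown. The `Fallback` enum, the `choose_display` function, the returned labels and the 1-based frame numbering are all assumptions made for this example; the source does not prescribe any particular interface.

```python
from enum import Enum


class Fallback(Enum):
    BLACK_SCREEN = "black_screen"
    PREVIOUS_FRAME = "previous_frame"
    PAUSE_SCREEN = "pause_screen"


def choose_display(decoded_ok: bool, frame_number: int, policy: Fallback) -> str:
    """Return a label for what to show for the key frame at frame_number (1-based)."""
    if decoded_ok:
        return f"frame:{frame_number}"
    if policy is Fallback.PREVIOUS_FRAME and frame_number > 1:
        # The previous frame is the one immediately before the failed key frame.
        return f"frame:{frame_number - 1}"
    if policy is Fallback.PAUSE_SCREEN:
        return "pause_screen"
    # Black screen, also used when no previous frame exists.
    return "black_screen"
```

For instance, a failed 10th frame with the previous-frame policy resolves to the 9th frame, matching the example in the text.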
As can be seen from the above, in the embodiment of the present application, a video to be decoded is obtained, and a plurality of size representation values corresponding to the MaxTbLog2SizeY standard are obtained; a size representation value meeting a preset size condition is screened out from the plurality of size representation values to obtain a target size representation value; the target size representation value is taken as a predicted value of MaxTbLog2SizeY of the video to be decoded; and the video to be decoded is decoded according to the predicted value of MaxTbLog2SizeY. The video to be decoded can therefore be decoded without obtaining the true value of MaxTbLog2SizeY from the video itself, which adds a further way of decoding the video to be decoded and improves the flexibility of decoding.
The video decoding method provided in the embodiment of the present application is further described below with reference to fig. 3. In this embodiment, the size representation values are 2, 3, 4 and 5, where 2 corresponds to 4×4 pixels, 3 corresponds to 8×8 pixels, 4 corresponds to 16×16 pixels, and 5 corresponds to 32×32 pixels, and the largest of the plurality of size representation values is taken as the target size representation value.
The video to be decoded is acquired and decoding is initialized. MaxTbLog2SizeY is set to 5 and assigned to the decoder, which then decodes the video to be decoded, and it is determined whether the decoder returns a decoder error. If the decoder returns a decoder error, indicating that decoding failed, MaxTbLog2SizeY is set to 4 and assigned to the decoder, and decoding is attempted again. If a decoder error is returned again, MaxTbLog2SizeY is set to 3 and the attempt is repeated; if that also fails, MaxTbLog2SizeY is set to 2 and the attempt is repeated once more. At any step, if the decoder does not return a decoder error, decoding has succeeded. If the decoder still returns a decoder error with MaxTbLog2SizeY set to 2, decoding has failed and the decoding process is exited.
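The correspondence between size representation values and pixel sizes is a power of two: a value n denotes a block of 2^n by 2^n pixels. A minimal sketch, with names chosen only for this example:

```python
SIZE_VALUES = (2, 3, 4, 5)


def block_side_in_pixels(log2_size: int) -> int:
    """Convert a log2 size representation value into a block side length."""
    return 1 << log2_size


# Map each size representation value to its (width, height) in pixels.
BLOCK_SIZES = {v: (block_side_in_pixels(v),) * 2 for v in SIZE_VALUES}

# The screening rule of this embodiment: the largest value is the target.
target_size_value = max(SIZE_VALUES)
```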
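The flow of fig. 3 can be sketched with a stand-in decoder. Everything here is hypothetical: `StubDecoder`, `DecoderError` and the `max_tb_log2_size_y` attribute are invented for illustration and do not correspond to any real decoder API; the stub simply fails unless it is given the value the stream was actually encoded with.

```python
class DecoderError(Exception):
    """Raised by the stand-in decoder when decoding fails."""


class StubDecoder:
    """Hypothetical decoder that only succeeds with the stream's true value."""

    def __init__(self, true_max_tb_log2_size_y: int) -> None:
        self._true_value = true_max_tb_log2_size_y
        self.max_tb_log2_size_y = None

    def decode(self, video: bytes) -> str:
        if self.max_tb_log2_size_y != self._true_value:
            raise DecoderError("decoder error")
        return "decoded"


def decode_with_fallback(decoder, video, candidates=(5, 4, 3, 2)):
    """Try each candidate in order; return (value that worked or None, attempts)."""
    attempts = []
    for value in candidates:
        decoder.max_tb_log2_size_y = value  # assign the predicted value
        attempts.append(value)
        try:
            decoder.decode(video)
        except DecoderError:
            continue  # decoding failed, fall through to the next smaller value
        return value, attempts
    return None, attempts  # every candidate failed: exit the decoding process
```

With a stream whose true value is 3, the sketch tries 5, then 4, then succeeds at 3; with a true value outside the candidate set, all four attempts fail and `None` is returned.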
In order to better implement the video decoding method provided by the embodiment of the present application, the embodiment of the present application also provides an apparatus based on the video decoding method. The terms have the same meanings as in the video decoding method described above, and for specific implementation details reference may be made to the description of the method embodiments.
For example, as shown in fig. 4, the video decoding apparatus may include:
The obtaining module 401 is configured to obtain a video to be decoded and obtain a plurality of size representation values corresponding to the MaxTbLog2SizeY standard.
The screening module 402 is configured to screen out the size representation value meeting the preset size condition from the plurality of size representation values to obtain a target size representation value.
The prediction module 403 is configured to take the target size representation value as a predicted value of MaxTbLog2SizeY of the video to be decoded.
The decoding module 404 is configured to decode the video to be decoded according to the prediction value of MaxTbLog2SizeY.
Optionally, the screening module 402 is specifically configured to perform:
screening out the largest size representation value from the plurality of size representation values;
and taking the maximum size representation value as a size representation value meeting the preset size condition to obtain a target size representation value.
Optionally, the decoding module 404 is specifically configured to perform:
decoding the video to be decoded according to the predicted value of MaxTbLog2SizeY;
if decoding fails, taking the first size representation values as the plurality of size representation values, the first size representation values being the size representation values other than the target size representation value;
and returning to the step of screening out the largest size representation value from the plurality of size representation values, so that target size representation values are screened from the plurality of size representation values in descending order until the video to be decoded is successfully decoded.
Optionally, the screening module 402 is specifically configured to perform:
screening out the size representation values from the plurality of size representation values according to a random rule to obtain candidate size representation values;
and taking the candidate size representation value as a size representation value meeting the preset size condition to obtain a target size representation value.
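The random screening rule could look like the sketch below. The function name and the optional `rng` parameter are assumptions for this example; the source says only that a value is screened out "according to a random rule" and does not specify a distribution, so a uniform choice is assumed.

```python
import random
from typing import Optional, Sequence


def screen_random(size_values: Sequence[int],
                  rng: Optional[random.Random] = None) -> int:
    """Pick one candidate size representation value uniformly at random."""
    if rng is None:
        rng = random.Random()
    # The chosen candidate is treated as meeting the preset size condition.
    return rng.choice(list(size_values))
```

Passing a seeded `random.Random` instance makes the choice reproducible, which can be convenient when testing the retry path.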
Optionally, the screening module 402 is specifically configured to perform:
screening target video key frames to be decoded from the video to be decoded;
judging whether a target video key frame to be decoded has a previous video key frame in the video to be decoded or not;
if the target video key frame to be decoded does not have a previous video key frame in the video to be decoded, screening out the size representation value meeting the preset size condition from the plurality of size representation values to obtain the target size representation value for the target video key frame to be decoded.
Accordingly, the prediction module 403 is specifically configured to perform:
and taking the target size representation value as a predicted value of MaxTbLog2SizeY of the target video key frame to be decoded.
Accordingly, the decoding module 404 is specifically configured to perform:
and decoding the target video key frame to be decoded according to the predicted value of the MaxTbLog2SizeY.
Optionally, the screening module 402 is further configured to perform:
if the target video key frame to be decoded has the previous video key frame in the video to be decoded, acquiring a predicted value of MaxTbLog2SizeY of the previous video key frame;
and taking the predicted value of MaxTbLog2SizeY of the previous video key frame as the predicted value of MaxTbLog2SizeY of the target video key frame to be decoded.
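The branch on whether a previous video key frame exists can be sketched as a small lookup. The `predictions` dictionary, keyed by the key frame's position among key frames, and the function name are hypothetical; the sketch also assumes the "largest value" screening rule for the first key frame.

```python
from typing import Dict, Sequence


def predict_for_key_frame(index: int,
                          predictions: Dict[int, int],
                          size_values: Sequence[int]) -> int:
    """Predict MaxTbLog2SizeY for key frame `index` (0 is the first key frame)."""
    previous = predictions.get(index - 1)
    if previous is not None:
        # A previous key frame exists: reuse its predicted value.
        value = previous
    else:
        # No previous key frame: screen a value from the candidates.
        value = max(size_values)
    predictions[index] = value
    return value
```

Reusing the earlier prediction means the candidate search normally runs only for the first key frame, since later key frames start from a value that has already worked.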
Optionally, the video decoding apparatus further includes:
a display module for performing:
if decoding of the target video key frame to be decoded fails according to the predicted value of MaxTbLog2SizeY, displaying the picture corresponding to the video frame immediately preceding the target video key frame to be decoded.
In the specific implementation, each module may be implemented as an independent entity, or may be combined arbitrarily, and implemented as the same entity or a plurality of entities, and the specific implementation and the corresponding beneficial effects of each module may be referred to the foregoing method embodiments, which are not described herein again.
The embodiment of the present application also provides an electronic device, which may be a server or a terminal. Fig. 5 shows a schematic structural diagram of the electronic device according to the embodiment of the present application. Specifically:
the electronic device may include a processor 501 having one or more processing cores, a memory 502 comprising one or more computer-readable storage media, a power supply 503, an input unit 504, and other components. It will be appreciated by those skilled in the art that the electronic device structure shown in fig. 5 does not limit the electronic device, which may include more or fewer components than shown, combine certain components, or use a different arrangement of components. Wherein:
the processor 501 is a control center of the electronic device, connects various parts of the entire electronic device using various interfaces and lines, performs various functions of the electronic device and processes data by running or executing computer programs and/or modules stored in the memory 502, and invoking data stored in the memory 502. Optionally, processor 501 may include one or more processing cores; preferably, the processor 501 may integrate an application processor that primarily handles operating systems, user interfaces, applications, etc., with a modem processor that primarily handles wireless communications. It will be appreciated that the modem processor described above may not be integrated into the processor 501.
The memory 502 may be used to store computer programs and modules, and the processor 501 performs various functional applications and data processing by executing the computer programs and modules stored in the memory 502. The memory 502 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, a computer program (such as a sound playing function, an image playing function, etc.) required for at least one function, and the like; the storage data area may store data created according to the use of the electronic device, etc. In addition, memory 502 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid-state storage device. Accordingly, the memory 502 may also include a memory controller to provide access to the memory 502 by the processor 501.
The electronic device further comprises a power supply 503 for powering the various components, preferably the power supply 503 is logically connected to the processor 501 via a power management system, whereby the functions of managing charging, discharging, and power consumption are performed by the power management system. The power supply 503 may also include one or more of any of a direct current or alternating current power supply, a recharging system, a power failure detection circuit, a power converter or inverter, a power status indicator, and the like.
The electronic device may further comprise an input unit 504, which input unit 504 may be used for receiving input digital or character information and for generating keyboard, mouse, joystick, optical or trackball signal inputs in connection with user settings and function control.
Although not shown, the electronic device may further include a display unit or the like, which is not described herein. In particular, in this embodiment, the processor 501 in the electronic device loads executable files corresponding to the processes of one or more computer programs into the memory 502 according to the following instructions, and the processor 501 executes the computer programs stored in the memory 502, so as to implement various functions, for example:
acquiring a video to be decoded and acquiring a plurality of size representation values corresponding to a MaxTbLog2SizeY standard;
screening out a size representation value meeting a preset size condition from a plurality of size representation values to obtain a target size representation value;
taking the target size representation value as a predicted value of MaxTbLog2SizeY of the video to be decoded;
and decoding the video to be decoded according to the predicted value of the MaxTbLog2SizeY.
The specific embodiments and the corresponding beneficial effects of the above operations can be referred to the above detailed description of the video decoding method, and are not described herein.
It will be appreciated by those of ordinary skill in the art that all or part of the steps of the various methods of the above embodiments may be performed by a computer program, or by computer program control related hardware, which may be stored in a computer readable storage medium and loaded and executed by a processor.
To this end, embodiments of the present application provide a computer readable storage medium having stored therein a computer program that is capable of being loaded by a processor to perform the steps of any of the video decoding methods provided by embodiments of the present application. For example, the computer program may perform the steps of:
acquiring a video to be decoded and acquiring a plurality of size representation values corresponding to a MaxTbLog2SizeY standard;
screening out a size representation value meeting a preset size condition from a plurality of size representation values to obtain a target size representation value;
taking the target size representation value as a predicted value of MaxTbLog2SizeY of the video to be decoded;
and decoding the video to be decoded according to the predicted value of the MaxTbLog2SizeY.
The specific embodiments and the corresponding beneficial effects of each of the above operations can be found in the foregoing embodiments, and are not described herein again.
Wherein the computer-readable storage medium may comprise: a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, and the like.
Since the computer program stored in the computer readable storage medium may execute the steps in any video decoding method provided in the embodiments of the present application, the beneficial effects that any video decoding method provided in the embodiments of the present application may be achieved are detailed in the previous embodiments, and will not be described herein.
According to one aspect of the present application, a computer program product or computer program is provided, which includes computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium and executes them, so that the computer device performs the video decoding method described above.
The foregoing has described in detail a video decoding method, apparatus, electronic device and computer storage medium provided by the embodiments of the present application, and specific examples have been applied to illustrate the principles and embodiments of the present application, where the foregoing description of the embodiments is only for aiding in understanding the method and core idea of the present application; meanwhile, those skilled in the art will have variations in the specific embodiments and application scope in light of the ideas of the present application, and the present description should not be construed as limiting the present application in view of the above.

Claims (10)

1. A video decoding method, comprising:
acquiring a video to be decoded and acquiring a plurality of size representation values corresponding to a MaxTbLog2SizeY standard;
screening out the size representation values meeting the preset size conditions from the plurality of size representation values to obtain target size representation values;
taking the target size representation value as a predicted value of MaxTbLog2SizeY of the video to be decoded;
and decoding the video to be decoded according to the predicted value of the MaxTbLog2SizeY.
2. The video decoding method according to claim 1, wherein the step of screening the size representation values satisfying a preset size condition from the plurality of size representation values to obtain a target size representation value includes:
screening out the largest size representation value from the plurality of size representation values;
and taking the maximum size representation value as a size representation value meeting the preset size condition to obtain a target size representation value.
3. The video decoding method according to claim 2, wherein the decoding the video to be decoded according to the prediction value of MaxTbLog2SizeY includes:
decoding the video to be decoded according to the predicted value of the MaxTbLog2SizeY;
if decoding fails, a first size representation value is taken as the plurality of size representation values, the first size representation value being a size representation value other than the target size representation,
and returning to the step of screening the largest size representation value from the plurality of size representation values to screen the target size representation value from the plurality of size representation values according to the rule from large to small until the video to be decoded is successfully decoded.
4. The video decoding method according to claim 1, wherein the step of screening the size representation values satisfying a preset size condition from the plurality of size representation values to obtain a target size representation value includes:
screening out the size representation values from the plurality of size representation values according to a random rule to obtain candidate size representation values;
and taking the candidate size representation value as a size representation value meeting the preset size condition to obtain a target size representation value.
5. The video decoding method according to claim 1, wherein the step of screening the size representation values satisfying a preset size condition from the plurality of size representation values to obtain a target size representation value includes:
screening out target video key frames to be decoded from the video to be decoded;
judging whether the target video key frame to be decoded has a previous video key frame in the video to be decoded;
if the target video key frame to be decoded does not have the previous video key frame in the video to be decoded, screening out the size representation values meeting the preset size conditions from the plurality of size representation values to obtain a target size representation value aiming at the target video key frame to be decoded;
the taking the target size representation value as a predicted value of MaxTbLog2SizeY of the video to be decoded, and the decoding the video to be decoded according to the predicted value of the MaxTbLog2SizeY, comprise:
taking the target size representation value as a predicted value of MaxTbLog2SizeY of the target video key frame to be decoded;
and decoding the target video key frame to be decoded according to the predicted value of the MaxTbLog2SizeY.
6. The video decoding method of claim 5, further comprising, after said determining whether the target video key frame to be decoded has a previous video key frame in the video to be decoded:
if the target video key frame to be decoded has a previous video key frame in the video to be decoded, acquiring a predicted value of MaxTbLog2SizeY of the previous video key frame;
and taking the predicted value of the MaxTbLog2SizeY of the previous video key frame as the predicted value of the MaxTbLog2SizeY of the target video key frame to be decoded.
7. The video decoding method of claim 5, wherein the method further comprises:
and if decoding of the target video key frame to be decoded fails according to the predicted value of the MaxTbLog2SizeY, displaying the picture corresponding to the video frame immediately preceding the target video key frame to be decoded.
8. A video decoding apparatus, comprising:
the acquisition module is used for acquiring videos to be decoded and acquiring a plurality of size representation values corresponding to the MaxTbLog2SizeY standard;
the screening module is used for screening the size representation values meeting the preset size conditions from the plurality of size representation values to obtain target size representation values;
the prediction module is used for taking the target size representation value as a prediction value of the MaxTbLog2SizeY of the video to be decoded;
and the decoding module is used for decoding the video to be decoded according to the predicted value of the MaxTbLog2SizeY.
9. An electronic device comprising a processor and a memory, the memory storing a computer program, the processor being configured to execute the computer program in the memory to perform the video decoding method of any one of claims 1 to 7.
10. A computer readable storage medium, characterized in that the computer readable storage medium stores a computer program adapted to be loaded by a processor for performing the video decoding method of any of claims 1 to 7.
CN202311016906.8A 2023-08-11 2023-08-11 Video decoding method, device, electronic equipment and computer storage medium Pending CN117412064A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311016906.8A CN117412064A (en) 2023-08-11 2023-08-11 Video decoding method, device, electronic equipment and computer storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311016906.8A CN117412064A (en) 2023-08-11 2023-08-11 Video decoding method, device, electronic equipment and computer storage medium

Publications (1)

Publication Number Publication Date
CN117412064A true CN117412064A (en) 2024-01-16

Family

ID=89487762

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311016906.8A Pending CN117412064A (en) 2023-08-11 2023-08-11 Video decoding method, device, electronic equipment and computer storage medium

Country Status (1)

Country Link
CN (1) CN117412064A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination